LLM Proxy

HTTP proxy for LLM APIs with streaming support and chunk processing.

Usage

./llm-proxy          Start the proxy
./llm-proxy -h      Show help

Configuration

Configuration file (optional): llm-proxy.toml

Environment variables take priority over the config file.

Variable       Alternative       Description             Default
UPSTREAM_URL   OPENAI_API_BASE   Upstream LLM API URL    https://api.openai.com/v1/chat/completions
LISTEN_ADDR    -                 Listen address          127.0.0.1:8080
API_KEY        OPENAI_API_KEY    Upstream API key        -
INSECURE       -                 Skip TLS verification   false

Example

# Via environment variables
UPSTREAM_URL=https://api.openai.com/v1/chat/completions \
API_KEY=sk-... \
LISTEN_ADDR=127.0.0.1:8080 \
./llm-proxy

# Via config file (llm-proxy.toml)
upstream_url = "https://api.openai.com/v1/chat/completions"
listen_addr = "127.0.0.1:8080"
api_key = "sk-..."
insecure = false
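
The precedence between environment variables and llm-proxy.toml can be implemented roughly as in the Go sketch below. The Config struct, its field names, and the use of github.com/BurntSushi/toml are assumptions for illustration; the actual loader in llm-proxy may differ.

package main

import (
    "log"
    "os"

    "github.com/BurntSushi/toml"
)

// Config mirrors llm-proxy.toml. Field names are assumed for this sketch.
type Config struct {
    UpstreamURL string `toml:"upstream_url"`
    ListenAddr  string `toml:"listen_addr"`
    APIKey      string `toml:"api_key"`
    Insecure    bool   `toml:"insecure"`
}

// loadConfig applies defaults, then the optional config file, then
// environment variables (including the OPENAI_* alternatives), so that
// environment variables win.
func loadConfig(path string) (Config, error) {
    cfg := Config{
        UpstreamURL: "https://api.openai.com/v1/chat/completions",
        ListenAddr:  "127.0.0.1:8080",
    }
    if _, err := os.Stat(path); err == nil {
        if _, err := toml.DecodeFile(path, &cfg); err != nil {
            return cfg, err
        }
    }
    if v := firstEnv("UPSTREAM_URL", "OPENAI_API_BASE"); v != "" {
        cfg.UpstreamURL = v
    }
    if v := os.Getenv("LISTEN_ADDR"); v != "" {
        cfg.ListenAddr = v
    }
    if v := firstEnv("API_KEY", "OPENAI_API_KEY"); v != "" {
        cfg.APIKey = v
    }
    if os.Getenv("INSECURE") == "true" {
        cfg.Insecure = true
    }
    return cfg, nil
}

// firstEnv returns the value of the first variable in names that is set
// and non-empty.
func firstEnv(names ...string) string {
    for _, n := range names {
        if v := os.Getenv(n); v != "" {
            return v
        }
    }
    return ""
}

func main() {
    cfg, err := loadConfig("llm-proxy.toml")
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("upstream=%s listen=%s", cfg.UpstreamURL, cfg.ListenAddr)
}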

Endpoints

  • GET /health - Health check
  • /* - Proxies all requests to upstream
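
This routing can be pictured as in the minimal Go sketch below, built on net/http and httputil.ReverseProxy. Whether llm-proxy rewrites the request path to the upstream URL or preserves it, and the exact Authorization header it sets, are assumptions here, not confirmed behavior.

package main

import (
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
    "os"
)

func main() {
    upstream, err := url.Parse("https://api.openai.com/v1/chat/completions")
    if err != nil {
        log.Fatal(err)
    }
    apiKey := os.Getenv("API_KEY")

    // Forward every request to the single upstream URL, attaching the key.
    proxy := &httputil.ReverseProxy{
        Director: func(req *http.Request) {
            req.URL.Scheme = upstream.Scheme
            req.URL.Host = upstream.Host
            req.URL.Path = upstream.Path
            req.Host = upstream.Host
            req.Header.Set("Authorization", "Bearer "+apiKey)
        },
    }

    mux := http.NewServeMux()
    // GET /health is answered locally, without touching the upstream.
    mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(http.StatusOK)
        w.Write([]byte("ok"))
    })
    // Everything else is proxied.
    mux.Handle("/", proxy)

    log.Fatal(http.ListenAndServe("127.0.0.1:8080", mux))
}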

Streaming

The proxy supports SSE (text/event-stream) and NDJSON (application/x-ndjson) streaming. Each chunk is passed through processChunk() before being forwarded to the client.
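
The per-chunk forwarding might look like the Go sketch below. processChunk is named in this README but its signature is not shown; the pass-through stub and the line-oriented handling of SSE/NDJSON are assumptions for illustration.

package proxy

import (
    "bufio"
    "errors"
    "io"
    "net/http"
)

// processChunk stands in for the real implementation; here it forwards
// the chunk unchanged.
func processChunk(line []byte) []byte { return line }

// streamLines copies a line-oriented stream (SSE events or NDJSON rows)
// from the upstream body to the client, passing each line through
// processChunk and flushing so tokens reach the client as they arrive.
func streamLines(w http.ResponseWriter, upstream io.Reader) error {
    flusher, ok := w.(http.Flusher)
    if !ok {
        return errors.New("response writer does not support flushing")
    }

    scanner := bufio.NewScanner(upstream)
    // Allow events larger than the 64 KiB default scanner buffer.
    scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)

    for scanner.Scan() {
        out := processChunk(scanner.Bytes())
        if _, err := w.Write(out); err != nil {
            return err
        }
        if _, err := w.Write([]byte{'\n'}); err != nil {
            return err
        }
        flusher.Flush()
    }
    return scanner.Err()
}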
