# LLM Proxy
HTTP proxy for LLM APIs with streaming support and chunk processing.
## Usage
```bash
./llm-proxy      # Start the proxy
./llm-proxy -h   # Show help
```
## Configuration
Configuration file (optional): `llm-proxy.toml`
Environment variables take priority over the config file, as illustrated after the table.

| Variable | Alternative | Description | Default |
|----------|-------------|-------------|---------|
| `UPSTREAM_URL` | `OPENAI_API_BASE` | Upstream LLM API URL | `https://api.openai.com/v1/chat/completions` |
| `LISTEN_ADDR` | - | Listen address | `127.0.0.1:8080` |
| `API_KEY` | `OPENAI_API_KEY` | Upstream API key | - |
| `INSECURE` | - | Skip TLS verification | `false` |
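
As a quick illustration of that precedence (the file contents and override address here are assumed values for the example): if `llm-proxy.toml` sets `listen_addr = "127.0.0.1:8080"`, an environment variable overrides it for a single run.

```bash
# llm-proxy.toml contains: listen_addr = "127.0.0.1:8080"
# The environment variable below takes priority for this invocation.
LISTEN_ADDR=0.0.0.0:9090 ./llm-proxy
```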
## Examples
```bash
# Via environment variables
UPSTREAM_URL=https://api.openai.com/v1/chat/completions \
API_KEY=sk-... \
LISTEN_ADDR=127.0.0.1:8080 \
./llm-proxy
```
```toml
# Via config file (llm-proxy.toml)
upstream_url = "https://api.openai.com/v1/chat/completions"
listen_addr = "127.0.0.1:8080"
api_key = "sk-..."
insecure = false
```
## Endpoints
- `GET /health` - Health check
- `/*` - Proxies all other requests to the upstream API
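
With the proxy running on the default listen address, the health endpoint can be checked directly; the Streaming section below shows a proxied request:

```bash
curl http://127.0.0.1:8080/health
```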
## Streaming
Supports SSE (`text/event-stream`) and NDJSON (`application/x-ndjson`) streaming. Each chunk is processed via `processChunk()` before forwarding.
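
To see streaming end to end, a `curl -N` request is a quick check (assuming an OpenAI-compatible upstream; the request path and model name are illustrative):

```bash
# -N disables curl's output buffering so chunks print as they arrive
curl -N http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "stream": true, "messages": [{"role": "user", "content": "Hi"}]}'
```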