diff --git a/README.md b/README.md index 9584fba43..dbd73678d 100644 --- a/README.md +++ b/README.md @@ -41,6 +41,105 @@ Get 10% OFF GLM CODING PLAN:https://z.ai/subscribe?ic=8JVLJQFSKB - OpenAI-compatible upstream providers via config (e.g., OpenRouter) - Reusable Go SDK for embedding the proxy (see `docs/sdk-usage.md`) +## Operational Enhancements + +This fork includes additional "proxy ops" features beyond the mainline release to improve third-party provider integrations: + +### Core Features +- Environment-based secret loading via `os.environ/NAME` +- Strict YAML parsing via `strict-config` / `CLIPROXY_STRICT_CONFIG` +- Optional encryption-at-rest for `auth-dir` credentials + atomic/locked writes +- Prometheus metrics endpoint (configurable `/metrics`) + optional auth gate (`metrics.require-auth`) +- In-memory response cache (LRU+TTL) for non-streaming JSON endpoints +- Rate limiting (global / per-key parallelism + per-key RPM + per-key TPM) +- Request/response size limits (`limits.max-*-size-mb`) +- Request body guardrail (reject `api_base` / `base_url` by default) +- Virtual keys (managed client keys) + budgets + pricing-based spend tracking +- Fallback chains (`fallback-chains`) + exponential backoff retries (`retry-policy`) +- Pass-through endpoints (`pass-through.endpoints[]`) for forwarding extra routes upstream +- Health endpoints (`/health/liveness`, `/health/readiness`) + optional background probes +- Sensitive-data masking (request logs + redacted management config view) + +### Health-Based Routing & Smart Load Balancing + +CLIProxyAPIPlus now includes intelligent routing and health tracking based on production-grade proxy patterns: + +#### Features + +**Health Tracking System** +- Automatic monitoring of credential health based on failure rates and response latency +- Four health status levels: HEALTHY, DEGRADED, COOLDOWN, ERROR +- Rolling window metrics (configurable 60-second default) +- Per-credential and per-model statistics tracking +- P95/P99 
latency percentile calculations +- Automatic cooldown integration + +**Advanced Routing Strategies** +- **`fill-first`**: Drain one credential before moving to the next (default) +- **`round-robin`**: Sequential credential rotation +- **`random`**: Random credential selection +- **`least-busy`**: Select credential with fewest active requests (load balancing) +- **`lowest-latency`**: Select credential with best P95 latency (performance optimization) + +**Health-Aware Routing** +- Automatically filter out COOLDOWN and ERROR credentials +- Prefer HEALTHY credentials over DEGRADED when `prefer-healthy: true` +- Graceful fallback to all credentials when no healthy ones available + +#### Configuration Example + +```yaml +# Health tracking configuration +health-tracking: + enable: true + window-seconds: 60 # Rolling window for failure rate calculation + failure-threshold: 0.5 # 50% failure rate triggers ERROR status + degraded-threshold: 0.1 # 10% failure rate triggers DEGRADED status + min-requests: 5 # Minimum requests before tracking + cleanup-interval: 300 # Cleanup old data every 5 minutes + +# Enhanced routing configuration +routing: + strategy: "least-busy" # fill-first, round-robin, random, least-busy, lowest-latency + health-aware: true # Filter unhealthy credentials (COOLDOWN, ERROR) + prefer-healthy: true # Prioritize HEALTHY over DEGRADED credentials +``` + +#### Routing Strategy Comparison + +| Strategy | Best For | How It Works | +|----------|----------|--------------| +| `fill-first` | Staggering rolling caps | Uses the first available credential (by ID) until it cools down | +| `round-robin` | Even distribution, predictable | Cycles through credentials sequentially | +| `random` | Simple load balancing | Randomly selects from available credentials | +| `least-busy` | Optimal load distribution | Selects credential with fewest active requests | +| `lowest-latency` | Performance-critical apps | Selects credential with best P95 latency | + +#### Health Status 
Levels + +- **HEALTHY**: Normal operation, low failure rates +- **DEGRADED**: Elevated failure rates (above degraded-threshold but below failure-threshold) +- **COOLDOWN**: Temporarily unavailable due to errors or rate limits +- **ERROR**: High failure rates (above failure-threshold) or persistent errors + +#### Benefits + +- **Improved reliability** by avoiding unhealthy credentials when `health-aware` routing is enabled +- **Better tail latency** when `lowest-latency` is enabled and health tracking has enough data +- **Smarter load balancing** with `least-busy` using in-flight request counts +- **Automatic recovery** from cooldown windows as health improves + +See: +- `docs/operations.md` + +### Future work + +These are high-value ideas that remain on the roadmap: +- OpenTelemetry tracing + external integrations (Langfuse/Sentry/webhooks) +- Redis-backed distributed cache/rate limits for multi-instance deployments +- DB-backed virtual key store + async spend log writer +- Broader endpoint coverage via native translators (beyond pass-through) + ## Getting Started CLIProxyAPI Guides: [https://help.router-for.me/](https://help.router-for.me/) diff --git a/config.example.yaml b/config.example.yaml index 61f51d475..b0e1f2db5 100644 --- a/config.example.yaml +++ b/config.example.yaml @@ -1,3 +1,15 @@ +# Server host/interface. Use "127.0.0.1" or "localhost" to restrict access to local machine only. +host: "" + +# Any string value can be sourced from an environment variable by using: +# os.environ/ENV_VAR_NAME +# Example: +# remote-management: +# secret-key: os.environ/MANAGEMENT_PASSWORD + +# Strict YAML parsing (reject unknown fields). Useful to catch typos. +# strict-config: true + # Server port port: 8317 @@ -21,9 +33,25 @@ remote-management: # Disable the bundled management control panel asset download and HTTP route when true. disable-control-panel: false + # Allow downloading auth JSON files via management endpoints from non-localhost clients. 
+ # Disabled by default to reduce the risk of credential exfiltration. + allow-auth-file-download: false + + # GitHub repository for the management control panel. Accepts a repository URL or releases API URL. + panel-github-repository: "https://github.com/router-for-me/Cli-Proxy-API-Management-Center" + # Authentication directory (supports ~ for home directory) auth-dir: "~/.cli-proxy-api" +# Auth file storage settings (credentials saved under auth-dir as *.json) +auth-storage: + # Encrypt auth JSON at rest. If omitted, encryption is auto-enabled when an encryption key is present. + # encrypt: true + # Encryption key secret. Prefer setting via env (CLIPROXY_AUTH_ENCRYPTION_KEY) and referencing it: + # encryption-key: os.environ/CLIPROXY_AUTH_ENCRYPTION_KEY + # Allow reading legacy plaintext auth JSON when encryption is enabled (best-effort migrates to encrypted). + # allow-plaintext-fallback: true + # API keys for authentication api-keys: - "your-api-key-1" @@ -41,12 +69,24 @@ usage-statistics-enabled: false # Proxy URL. Supports socks5/http/https protocols. Example: socks5://user:pass@192.168.1.1:1080/ proxy-url: "" +# Security guardrails. When disabled (default), requests containing api_base/base_url fields are rejected. +# security: +# allow-client-side-credentials: false + +# Request/response size limits (max_request_size_mb/max_response_size_mb). +# limits: +# max-request-size-mb: 10 +# max-response-size-mb: 50 + # Number of times to retry a request. Retries will occur if the HTTP response code is 403, 408, 500, 502, 503, or 504. request-retry: 3 # Maximum wait time in seconds for a cooled-down credential before triggering a retry. max-retry-interval: 30 +# When true, disable quota backoff cooldown scheduling for 429 errors (not recommended). 
+disable-cooling: false + # Quota exceeded behavior quota-exceeded: switch-project: true # Whether to automatically switch to another project when a quota is exceeded @@ -55,6 +95,116 @@ quota-exceeded: # When true, enable authentication for the WebSocket API (/v1/ws). ws-auth: false +# Response caching configuration +# cache: +# enable: true # Enable response caching +# max-size: 1000 # Maximum number of cached responses +# ttl: 300 # Cache TTL in seconds (default: 5 minutes) + +# Rate limiting configuration +# rate-limits: +# enable: true # Enable rate limiting +# max-parallel-requests: 100 # Maximum concurrent requests globally +# max-per-key: 10 # Maximum concurrent requests per API key +# max-rpm: 60 # Maximum requests per minute per key +# max-tpm: 120000 # Maximum tokens per minute per API key + +# Prometheus metrics configuration +# metrics: +# enable: true # Enable metrics endpoint +# endpoint: "/metrics" # HTTP path for metrics +# require-auth: false # When true, /metrics requires normal API auth + +# Credential cooldown configuration +# cooldown: +# enable: true # Enable automatic cooldown on errors +# duration: 60 # Cooldown duration in seconds +# trigger-on: # HTTP status codes that trigger cooldown +# - 429 +# - 500 +# - 502 +# - 503 +# - 504 + +# Routing / selection strategy when multiple credentials match. +# routing: +# strategy: "fill-first" # fill-first (default), round-robin, random, least-busy, lowest-latency +# health-aware: true # Filter unhealthy credentials (COOLDOWN, ERROR) +# prefer-healthy: true # Prefer HEALTHY over DEGRADED when health-aware +# fill-first-max-inflight-per-auth: 4 # Default: 4 (nil). 0 = unlimited +# fill-first-spillover: "next-auth" # next-auth (default), least-busy + +# Health tracking (feeds health-aware routing + readiness checks). 
+# health-tracking: +# enable: true +# window-seconds: 60 +# failure-threshold: 0.5 +# degraded-threshold: 0.1 +# min-requests: 5 +# cleanup-interval: 300 + +# Fallback chains (model/provider failover). +# Fallbacks are attempted on transient failures (network, 408, 429, 5xx). +# fallback-chains: +# enable: true +# chains: +# - primary-model: "gpt-4o" +# primary-provider: "openai" # optional +# fallbacks: +# - model: "claude-3-5-sonnet-20241022" +# provider: "claude" +# - model: "gemini-2.0-flash-exp" +# provider: "gemini" + +# Retry policy (exponential backoff). +# Applies to transient failures (network, 408, 5xx). 429 relies on cooldown/Retry-After instead. +# retry-policy: +# enable: true +# max-retries: 3 +# initial-delay-ms: 1000 +# max-delay-ms: 30000 +# multiplier: 2.0 +# jitter: 0.1 + +# Streaming behavior (SSE keep-alives + safe stream bootstrap retries). +# streaming: +# keepalive-seconds: 15 # Default: 15 (nil). <= 0 disables keep-alives +# bootstrap-retries: 2 # Default: 2 (nil). 0 disables bootstrap retries + +# Virtual keys (managed client keys). +# virtual-keys: +# enable: true +# store-file: "" # default: /virtual_keys.json +# flush-interval: 5 # seconds + +# Pricing table (for spend/budget enforcement on virtual keys). +# pricing: +# enable: true +# default: +# input-per-1k: 0.0 +# output-per-1k: 0.0 +# models: +# - match: "gpt-4o*" +# input-per-1k: 5.0 +# output-per-1k: 15.0 + +# Pass-through endpoints (forward unimplemented routes upstream). +# pass-through: +# enable: true +# endpoints: +# - path: "/v1/rerank" +# method: "POST" +# base-url: "https://api.openai.com" # note: do not include /v1 to avoid double /v1/v1 +# timeout: 60 +# headers: +# Authorization: "Bearer os.environ/OPENAI_API_KEY" + +# Health endpoints + optional background probes (lightweight TCP dials). 
+# health: +# background-checks: +# enable: true +# interval: 300 # seconds + # Gemini API keys # gemini-api-key: # - api-key: "AIzaSy...01" diff --git a/docs/operations.md b/docs/operations.md new file mode 100644 index 000000000..981820568 --- /dev/null +++ b/docs/operations.md @@ -0,0 +1,377 @@ +# Operations (Security + Observability) + +This proxy borrows operational patterns from production-grade systems: environment-based secret loading, safe credential storage, guardrails (rate limits / cooldowns), response caching, and Prometheus metrics. + +## Environment-Sourced Secrets (`os.environ/`) + +Any string value in `config.yaml` can be set from an environment variable by using the prefix: + +```yaml +some-key: os.environ/MY_ENV_VAR +``` + +The config loader resolves these references after YAML unmarshal (works for nested structs, slices, and maps). + +If the env var is missing, startup fails (unless running in optional/cloud-deploy mode). + +- Keeps secrets out of `config.yaml` by referring to env vars instead of hard-coding secrets. +- Makes it easier to run the same config across machines/environments. + +### Safety note (no “secret persistence”) +When `os.environ/` references are resolved, config normalization steps that would normally write back to disk are skipped to avoid accidentally writing the resolved secret into `config.yaml`. + +## Strict Config Parsing (Reject Unknown YAML Fields) + +Strongly typed proxies typically surface unknown fields quickly. 
In Go/YAML it’s easy to silently ignore typos, so CLIProxyAPI supports strict parsing: + +```yaml +strict-config: true +``` + +You can also force strict parsing via env: +- `CLIPROXY_STRICT_CONFIG=true` + +## Encrypted Auth Storage (auth-dir) + +Auth JSON files under `auth-dir` can be encrypted-at-rest and are always written using: +- file locking +- atomic replace +- `0600` permissions + +Config: + +```yaml +auth-storage: + encrypt: true + encryption-key: os.environ/CLIPROXY_AUTH_ENCRYPTION_KEY + allow-plaintext-fallback: true +``` + +Also supported via env: `CLIPROXY_AUTH_ENCRYPTION_KEY` (or legacy `CLI_PROXY_API_AUTH_ENCRYPTION_KEY`). + +### What gets encrypted +- Files under `auth-dir` (typically `*.json`) created by login flows or uploaded via management endpoints. +- The stored format is an **envelope JSON** (AES-256-GCM). The plaintext JSON is only recovered in-memory. + +### Migration behavior +If encryption is enabled and `allow-plaintext-fallback: true`, legacy plaintext auth files are still readable and will be best-effort rewritten into the encrypted envelope format. + +### Remote stores (Postgres/Object store) +If you mirror auth files to Postgres/S3-backed stores, the raw bytes are stored as-is. When encryption is enabled, those remote payloads remain encrypted envelopes. + +## Prometheus Metrics + +Enable the metrics endpoint: + +```yaml +metrics: + enable: true + endpoint: "/metrics" + require-auth: false +``` + +Metrics include request counts/latency, token totals, cache hits/misses, rate-limit rejections, and cooldown counters. 
+ +Key metric names: +- `cliproxy_requests_total` +- `cliproxy_request_duration_ms` +- `cliproxy_tokens_input_total` / `cliproxy_tokens_output_total` +- `cliproxy_cache_hits_total` / `cliproxy_cache_misses_total` +- `cliproxy_ratelimit_rejections_total` +- `cliproxy_cooldowns_triggered_total` + +## Response Cache + +Enable in-memory response caching: + +```yaml +cache: + enable: true + max-size: 1000 + ttl: 300 +``` + +### What is cached +- Only **non-streaming** requests. +- Only JSON responses with **2xx** status. +- Applies to: + - `POST /v1/chat/completions` + - `POST /v1/completions` + - `POST /v1/responses` (OpenAI Responses API) + - `POST /v1/messages` + +### Cache key +Cache keys include the authenticated `apiKey` + method + path + query + request body, so different users/inputs do not collide. + +### Response header +Responses served from the cache return `X-CLIProxy-Cache: HIT` (uncached attempts return `X-CLIProxy-Cache: MISS`). + +## Rate Limits + +Configure concurrency, RPM, and TPM limits: + +```yaml +rate-limits: + enable: true + max-parallel-requests: 100 + max-per-key: 10 + max-rpm: 60 + max-tpm: 120000 +``` + +Rate-limited requests return HTTP `429` with `{"error":"rate_limited", ...}` and increment `cliproxy_ratelimit_rejections_total`. + +### Token-Per-Minute (TPM) + +TPM limits protect upstream quotas from a small number of very large requests. + +Notes: +- TPM is tracked per authenticated principal (`cfg:` for static `api-keys`, `vk:` for virtual keys). +- Tokens are recorded after request completion (usage plugin), so enforcement is best-effort and may allow brief bursts. + +## Request/Response Size Limits + +CLIProxyAPI supports request/response size caps: + +```yaml +limits: + max-request-size-mb: 10 + max-response-size-mb: 50 +``` + +Behavior: +- Request bodies above the cap return HTTP `413`. +- When `max-response-size-mb` is set, non-streaming upstream responses larger than the cap return HTTP `502`. 
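The request-size behavior described above (bodies over the cap answer HTTP `413`) could be approximated with the standard library as follows. This is a minimal sketch, not the proxy's actual implementation: `limitRequestBody` and `statusFor` are hypothetical names introduced for illustration, and the cap is expressed in bytes rather than megabytes to keep the example small.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// limitRequestBody is a hypothetical middleware (not part of CLIProxyAPI):
// it caps the request body at maxBytes and answers 413 when the cap is exceeded.
func limitRequestBody(maxBytes int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// MaxBytesReader makes reads fail once more than maxBytes were consumed.
		r.Body = http.MaxBytesReader(w, r.Body, maxBytes)
		body, err := io.ReadAll(r.Body)
		if err != nil {
			var tooLarge *http.MaxBytesError
			if errors.As(err, &tooLarge) {
				http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
				return
			}
			http.Error(w, "failed to read request body", http.StatusBadRequest)
			return
		}
		// Hand the already-buffered body to the next handler.
		r.Body = io.NopCloser(bytes.NewReader(body))
		next.ServeHTTP(w, r)
	})
}

// statusFor sends a body of bodySize bytes through the middleware and
// reports the resulting HTTP status code.
func statusFor(maxBytes int64, bodySize int) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	payload := strings.NewReader(strings.Repeat("x", bodySize))
	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", payload)
	rec := httptest.NewRecorder()
	limitRequestBody(maxBytes, ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(16, 8))  // body within the cap
	fmt.Println(statusFor(16, 64)) // body over the cap
}
```

Buffering the whole body before calling the next handler keeps the sketch simple; a real proxy may prefer to stream and abort mid-read instead.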
+ +## Cooldown Override + +Optionally apply a fixed cooldown window for specific HTTP status codes: + +```yaml +cooldown: + enable: true + duration: 60 + trigger-on: [429, 500, 502, 503, 504] +``` + +This is a simple “guardrail cooldown” that prevents immediate re-selection of a credential after repeated error codes. If the upstream returns `Retry-After`, that value is honored/extended. + +Note: quota backoff for 429 is still controlled separately via `disable-cooling`. + +## Fallback Chains (Cross-Provider Failover) + +Fallback chains provide model/provider failover on transient failures (network, 408, 429, 5xx): + +```yaml +fallback-chains: + enable: true + chains: + - primary-model: "gpt-4o" + fallbacks: + - model: "claude-3-5-sonnet-20241022" + provider: "claude" +``` + +When a fallback succeeds, responses include `X-CLIProxy-Fallback` headers for debugging. + +## Retry Policy (Exponential Backoff) + +`retry-policy` adds exponential backoff retries for transient failures (network, 408, 5xx): + +```yaml +retry-policy: + enable: true + max-retries: 3 + initial-delay-ms: 1000 + max-delay-ms: 30000 + multiplier: 2.0 + jitter: 0.1 +``` + +Notes: +- 429 is intentionally not retried via backoff; prefer cooldown/Retry-After. +- This is additive to the existing cooldown-based `request-retry` behavior. +- For OpenAI-compatible upstreams, you can pass `Idempotency-Key` to reduce duplicate charges when retries occur. 
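To make the `retry-policy` fields concrete, the sketch below shows one common way to derive a delay schedule from `initial-delay-ms`, `max-delay-ms`, `multiplier`, and `jitter`. The exact formula the proxy uses is not stated in this document, so treat the type `retryPolicy`, its methods, and the symmetric-jitter choice as assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"time"
)

// retryPolicy mirrors the documented config keys; the struct itself is hypothetical.
type retryPolicy struct {
	MaxRetries   int
	InitialDelay time.Duration
	MaxDelay     time.Duration
	Multiplier   float64
	Jitter       float64 // fraction of the base delay, e.g. 0.1 = +/-10%
}

// examplePolicy returns the values used in the config example above.
func examplePolicy() retryPolicy {
	return retryPolicy{
		MaxRetries:   3,
		InitialDelay: 1000 * time.Millisecond,
		MaxDelay:     30000 * time.Millisecond,
		Multiplier:   2.0,
		Jitter:       0.1,
	}
}

// baseDelay returns the un-jittered delay before retry number attempt (0-based):
// InitialDelay * Multiplier^attempt, capped at MaxDelay.
func (p retryPolicy) baseDelay(attempt int) time.Duration {
	d := float64(p.InitialDelay) * math.Pow(p.Multiplier, float64(attempt))
	if d > float64(p.MaxDelay) {
		d = float64(p.MaxDelay)
	}
	return time.Duration(d)
}

// delay applies symmetric jitter to the base delay (assumed behavior).
// Note that jitter is applied after the cap, so the result can slightly exceed MaxDelay.
func (p retryPolicy) delay(attempt int, rnd *rand.Rand) time.Duration {
	base := float64(p.baseDelay(attempt))
	factor := 1 + p.Jitter*(2*rnd.Float64()-1) // in [1-Jitter, 1+Jitter)
	return time.Duration(base * factor)
}

func main() {
	p := examplePolicy()
	rnd := rand.New(rand.NewSource(1))
	for attempt := 0; attempt < p.MaxRetries; attempt++ {
		fmt.Printf("retry %d: base=%v jittered=%v\n",
			attempt+1, p.baseDelay(attempt), p.delay(attempt, rnd))
	}
}
```

Under these assumptions the un-jittered schedule for the example config is 1s, 2s, 4s, and any later attempt would be capped at 30s.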
+ +## Routing Strategy + +When multiple credentials match, you can choose a selection strategy: + +```yaml +routing: + strategy: "fill-first" # fill-first (default), round-robin, random, least-busy, lowest-latency + health-aware: true # Filter unhealthy credentials (COOLDOWN, ERROR) + prefer-healthy: true # Prefer HEALTHY over DEGRADED when health-aware + fill-first-max-inflight-per-auth: 4 # 0 = unlimited + fill-first-spillover: "next-auth" # next-auth (default), least-busy +``` + +Notes: +- `least-busy` uses in-flight request counts; `lowest-latency` requires `health-tracking.enable: true`. +- `fill-first` “burns” one account first; spillover prevents overload under bursty concurrency. +- `next-auth` preserves deterministic “drain first”; `least-busy` maximizes throughput. + +### Fill-first spillover (recommended for “many creds”) + +`fill-first` intentionally “burns” one account first (to stagger rolling-window subscription caps), but with many concurrent terminals it can also overload a single credential, leading to avoidable `429` errors. Use `fill-first-max-inflight-per-auth` and `fill-first-spillover` to keep the intent while enabling safe spillover. + +- When the preferred credential is at capacity (`max-inflight`), selection spills over to another credential instead of overloading one. +- `next-auth` preserves deterministic “drain first”; `least-busy` maximizes throughput under bursty load. + +Health-aware filtering uses `health-aware` and `prefer-healthy` (requires `health-tracking.enable: true`). + +## Streaming (Keep-Alives + Safe Bootstrap Retries) + +Streaming failures are only safe to “retry/fail over” **before any bytes are written** to the client. After that, a retry would duplicate/diverge output. 
+ +```yaml +streaming: + keepalive-seconds: 15 # SSE heartbeats (: keep-alive\n\n); <= 0 disables + bootstrap-retries: 2 # retries allowed before first byte; 0 disables +``` + +Notes: +- Keep-alives reduce idle timeouts (Cloudflare/Nginx/proxies) during long pauses between chunks. +- Bootstrap retries/fallbacks only run if the stream fails before producing any payload (safe failover). + +## “10 Terminals / Many Subscriptions” Recommended Defaults + +This configuration biases toward **predictable** routing (burn one account first) while reducing avoidable interruptions under bursty concurrency: + +```yaml +routing: + strategy: "fill-first" + health-aware: true + prefer-healthy: true + fill-first-max-inflight-per-auth: 4 + fill-first-spillover: "next-auth" + +health-tracking: + enable: true + +cooldown: + enable: true + duration: 60 + trigger-on: [429, 500, 502, 503, 504] + +retry-policy: + enable: true + max-retries: 3 + initial-delay-ms: 1000 + max-delay-ms: 30000 + multiplier: 2.0 + jitter: 0.1 + +streaming: + keepalive-seconds: 15 + bootstrap-retries: 2 +``` + +## Request Body Guardrails (Client-Side Upstream Targets) + +To prevent redirect attacks, CLIProxyAPI blocks `api_base` / `base_url` in request bodies by default: + +```yaml +security: + allow-client-side-credentials: false +``` + +When disabled (default), requests containing `api_base` or `base_url` are rejected with HTTP `400`. + +## Virtual Keys (Managed Client Keys) + +This pattern generates per-user/team keys without editing `config.yaml`. 
+ +Enable: + +```yaml +virtual-keys: + enable: true +``` + +Management endpoints (require management key): +- `GET /v0/management/virtual-keys` +- `POST /v0/management/virtual-keys` (returns plaintext key once) +- `DELETE /v0/management/virtual-keys/:selector` +- `GET /v0/management/virtual-keys/:selector/budget` + +Policy enforcement (automatic for `vk:*` principals): +- Budget caps (tokens and/or USD) with fixed windows +- Model allowlists (wildcards) +- Per-key model aliases (`model_aliases`) applied by rewriting the request JSON `model` + +## Pricing (Spend Tracking) + +Virtual-key cost budgets require pricing rules: + +```yaml +pricing: + enable: true + models: + - match: "gpt-4o*" + input-per-1k: 5.0 + output-per-1k: 15.0 +``` + +When `pricing.enable: false`, virtual keys can still enforce token budgets, but cost budgets will return `cost_unknown`. + +## Pass-Through Endpoints + +Pass-through routes forward requests to an upstream base URL without writing a full translator. + +```yaml +pass-through: + enable: true + endpoints: + - path: "/v1/rerank" + method: "POST" + base-url: "https://api.openai.com" + timeout: 60 + headers: + Authorization: "Bearer os.environ/OPENAI_API_KEY" +``` + +Security behavior: +- Hop-by-hop headers are stripped. +- Proxy auth headers (`Authorization`, `X-Goog-Api-Key`, `X-Api-Key`) are stripped and must be provided via `headers`. +- If the proxy key was provided via query (`?key=` / `?auth_token=`), that parameter is removed from the forwarded query string. + +## Health Endpoints + Background Probes + +Endpoints: +- `GET /health/liveness` (fast, no upstream calls) +- `GET /health/readiness` (feature status + optional probe summary) +- `GET /health` (alias for readiness) + +Optional background probes: + +```yaml +health: + background-checks: + enable: true + interval: 300 +``` + +Probes are lightweight TCP connectivity checks to configured provider base URLs (no auth, no quota usage). 
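A "lightweight TCP connectivity check" of the kind described above can be sketched with the standard library alone. The function names `probeAddress` and `probeTCP`, the default-port handling, and the 3-second timeout are assumptions for illustration; the proxy's real probe logic may differ.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"time"
)

// probeAddress derives a host:port dial target from a provider base URL,
// falling back to the scheme's default port when none is given.
func probeAddress(baseURL string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	if u.Hostname() == "" {
		return "", fmt.Errorf("no host in %q", baseURL)
	}
	port := u.Port()
	if port == "" {
		switch u.Scheme {
		case "https":
			port = "443"
		case "http":
			port = "80"
		default:
			return "", fmt.Errorf("unsupported scheme %q", u.Scheme)
		}
	}
	return net.JoinHostPort(u.Hostname(), port), nil
}

// probeTCP opens and immediately closes a TCP connection. No HTTP request is
// sent, so no credentials are needed and no upstream quota is consumed.
func probeTCP(baseURL string, timeout time.Duration) error {
	addr, err := probeAddress(baseURL)
	if err != nil {
		return err
	}
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	// The result depends on network reachability from the machine running this.
	err := probeTCP("https://api.openai.com", 3*time.Second)
	fmt.Println("probe error:", err)
}
```

A dial-only probe confirms DNS resolution and TCP reachability but says nothing about TLS validity or credential health, which is why it pairs naturally with the request-driven health tracking described earlier.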
+ +## Management API Hardening + +- Auth file downloads are blocked for non-local clients by default. +- To allow it, set: + ```yaml + remote-management: + allow-auth-file-download: true + ``` + +### Auth file download behavior +- By default, downloads return the stored bytes (encrypted envelope if encryption is enabled). +- `GET /v0/management/auth-files/download?name=...&decrypt=1` is **localhost-only** and returns plaintext JSON (requires encryption key when files are encrypted). + +New endpoints: +- `GET /v0/management/auth-files/errors` +- `GET /v0/management/auth-providers` +- `GET /v0/management/virtual-keys` (+ create/revoke/budget) + +### Config Redaction + +`GET /v0/management/config` returns a redacted config view (API keys/tokens masked). Use `GET /v0/management/config.yaml` to fetch the raw file (preserves comments). diff --git a/docs/sdk-advanced.md b/docs/sdk-advanced.md index 3a9d3e500..334216258 100644 --- a/docs/sdk-advanced.md +++ b/docs/sdk-advanced.md @@ -60,6 +60,7 @@ func (Executor) Refresh(ctx context.Context, a *coreauth.Auth) (*coreauth.Auth, Register the executor with the core manager before starting the service: ```go +// nil selector uses the default "fill-first" selection strategy. 
core := coreauth.NewManager(coreauth.NewFileStore(cfg.AuthDir), nil, nil) core.RegisterExecutor(myprov.Executor{}) svc, _ := cliproxy.NewBuilder().WithConfig(cfg).WithConfigPath(cfgPath).WithCoreAuthManager(core).Build() @@ -135,4 +136,3 @@ The embedded server calls this automatically for built‑in providers; for custo - Enable request logging: Management API GET/PUT `/v0/management/request-log` - Toggle debug logs: Management API GET/PUT `/v0/management/debug` - Hot reload changes in `config.yaml` and `auths/` are picked up automatically by the watcher - diff --git a/docs/sdk-advanced_CN.md b/docs/sdk-advanced_CN.md index 25e6e83c9..c9ed8b57f 100644 --- a/docs/sdk-advanced_CN.md +++ b/docs/sdk-advanced_CN.md @@ -55,6 +55,7 @@ func (Executor) Refresh(ctx context.Context, a *coreauth.Auth) (*coreauth.Auth, 在启动服务前将执行器注册到核心管理器: ```go +// selector 传 nil 时默认使用 "fill-first" 选择策略。 core := coreauth.NewManager(coreauth.NewFileStore(cfg.AuthDir), nil, nil) core.RegisterExecutor(myprov.Executor{}) svc, _ := cliproxy.NewBuilder().WithConfig(cfg).WithConfigPath(cfgPath).WithCoreAuthManager(core).Build() @@ -128,4 +129,3 @@ cliproxy.GlobalModelRegistry().RegisterClient(authID, "myprov", models) - 启用请求日志:管理 API GET/PUT `/v0/management/request-log` - 切换调试日志:管理 API GET/PUT `/v0/management/debug` - 热更新:`config.yaml` 与 `auths/` 变化会自动被侦测并应用 - diff --git a/docs/sdk-usage.md b/docs/sdk-usage.md index 55e7d5f9a..8a425dc35 100644 --- a/docs/sdk-usage.md +++ b/docs/sdk-usage.md @@ -81,6 +81,7 @@ These options mirror the internals used by the CLI server. The service uses a core `auth.Manager` for selection, execution, and auto‑refresh. When embedding, you can provide your own manager to customize transports or hooks: ```go +// nil selector uses the default "fill-first" selection strategy. 
core := coreauth.NewManager(coreauth.NewFileStore(cfg.AuthDir), nil, nil) core.SetRoundTripperProvider(myRTProvider) // per‑auth *http.Transport diff --git a/docs/sdk-usage_CN.md b/docs/sdk-usage_CN.md index b87f9aa1f..135ccf0b7 100644 --- a/docs/sdk-usage_CN.md +++ b/docs/sdk-usage_CN.md @@ -81,6 +81,7 @@ svc, _ := cliproxy.NewBuilder(). 服务内部使用核心 `auth.Manager` 负责选择、执行、自动刷新。内嵌时可自定义其传输或钩子: ```go +// selector 传 nil 时默认使用 "fill-first" 选择策略。 core := coreauth.NewManager(coreauth.NewFileStore(cfg.AuthDir), nil, nil) core.SetRoundTripperProvider(myRTProvider) // 按账户返回 *http.Transport @@ -161,4 +162,3 @@ _ = svc.Shutdown(ctx) - 热更新:`config.yaml` 与 `auths/` 变化会被自动侦测并应用。 - 请求日志可通过管理 API 在运行时开关。 - `gemini-web.*` 相关配置在内嵌服务器中会被遵循。 -