[WIP] Add flags that remove unneeded imports#26294
[WIP] Add flags that remove unneeded imports#26294Michael-RZ-Berri wants to merge 1 commit intolitellm_internal_stagingfrom
Conversation
|
Michael Riad Zaky seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
Greptile SummaryThis PR defers ~25 optional feature router imports behind a new
Confidence Score: 3/5Not safe to merge — the env var naming/documentation issue can silently disable guardrails and other previously-unconditional endpoints in production. Two P1 findings: (1) the LITELLM_SUPPORTED_DB_OBJECTS env var is reused with semantics different from the existing YAML key, and the docstring example gives values that are not valid feature-router names, meaning any deployment following the docs would silently lose endpoint availability; (2) guardrails, compliance, jwt-mappings, config-overrides, and access-groups were always registered before this PR and are now silently dropped when the env var is set. The memory-saving approach is sound but needs a distinct env var name or corrected documentation and startup logging before merge. litellm/proxy/proxy_server.py — _should_import_feature, _register_optional_feature_routers, and their call site at module scope.
|
| Filename | Overview |
|---|---|
| litellm/proxy/proxy_server.py | Introduces _register_optional_feature_routers to lazy-load ~25 feature routers behind the LITELLM_SUPPORTED_DB_OBJECTS env var, but the env var name conflicts with the existing YAML key semantics, the documented example values are not valid router names, and previously-unconditional routers (guardrails, compliance, jwt-mappings, config-overrides) are silently dropped when the env var is set. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[proxy_server.py module import] --> B{LITELLM_SUPPORTED_DB_OBJECTS set?}
B -- "No (default)" --> C[_should_import_feature returns True for all]
B -- "Yes (CSV)" --> D[_should_import_feature checks name in allowed set]
C --> E[All optional routers imported & registered]
D --> F{Feature name in CSV?}
F -- Yes --> G[Router imported & app.include_router called]
F -- No --> H[Router skipped — endpoints return 404]
G --> I[FastAPI app ready]
H --> I
E --> I
subgraph "Core routers always loaded"
J[key_management_router]
K[team_router]
L[spend_management_router]
M[response_router / batches_router]
end
subgraph "Previously-unconditional now gated"
N[guardrails_router]
O[compliance_router]
P[jwt_key_mapping_router]
Q[config_override_router]
R[usage_ai_router]
end
subgraph "New lazy-load targets memory savings"
T[MCP routers ~300 MB]
U[agent + a2a routers]
V[vector_store routers]
W[anthropic / google / langfuse passthrough]
end
Reviews (1): Last reviewed commit: "add flags that remove unneeded imports" | Re-trigger Greptile
| LITELLM_SUPPORTED_DB_OBJECTS: "models,cache_settings,anthropic_beta_headers" | ||
|
|
||
| to skip importing feature routers for everything else. | ||
| """ | ||
| raw = os.environ.get("LITELLM_SUPPORTED_DB_OBJECTS") | ||
| if raw is None: | ||
| return True | ||
| allowed = {s.strip() for s in raw.split(",") if s.strip()} | ||
| return name in allowed | ||
|
|
||
|
|
||
| def _register_optional_feature_routers(app: FastAPI) -> None: |
There was a problem hiding this comment.
Env var controls routers only — not DB loading — but documented example conflates both
_should_import_feature reads LITELLM_SUPPORTED_DB_OBJECTS to gate router registration, but _should_load_db_object reads general_settings.get("supported_db_objects") from YAML — they are completely separate mechanisms. The docstring's example value "models,cache_settings,anthropic_beta_headers" contains zero feature-router names, so setting LITELLM_SUPPORTED_DB_OBJECTS to that value would silently disable every optional router (guardrails, MCP, agents, SCIM, evals, etc.) while doing nothing to DB-object loading. Users following the docstring example to tune memory would lose endpoint availability with no error or warning.
The gate should use a dedicated env var (e.g. LITELLM_ENABLED_FEATURES) with separate, clearly documented values, or the docstring must show a realistic example that includes the actual router keys (guardrails, mcp, agents, …) rather than DB-object names.
| router as vector_store_router, | ||
| ) | ||
| from litellm.proxy.vector_store_endpoints.management_endpoints import ( | ||
| router as vector_store_management_router, | ||
| ) | ||
| from litellm.proxy.vector_store_files_endpoints.endpoints import ( |
There was a problem hiding this comment.
guardrails_router newly gated; routes return 404 when env var is set without "guardrails"
guardrails_router was unconditionally registered before this PR. It is now gated behind _should_import_feature("guardrails"). Any deployment that sets LITELLM_SUPPORTED_DB_OBJECTS for memory savings (e.g. "mcp" only) will silently remove all guardrail endpoints (/guardrails, /guardrails/{id}, etc.) — even when guardrails are configured in the YAML. The same regression applies to compliance_router, jwt_key_mapping_router, usage_ai_router, config_override_router, and access_group_router, which were also always-on before this PR. No warning is emitted when these routers are skipped.
| app.include_router(ui_discovery_endpoints_router) | ||
| app.include_router(agent_endpoints_router) | ||
| app.include_router(compliance_router) | ||
| app.include_router(a2a_router) | ||
| app.include_router(access_group_router) | ||
|
|
||
| # Mount optional feature routers. Gated by the LITELLM_SUPPORTED_DB_OBJECTS env | ||
| # var — unset means "load everything" (backward compat), set to a CSV means | ||
| # only those features' routers import. Must run at module scope (before any | ||
| # request arrives) so the imports actually stay out of the process. | ||
| _register_optional_feature_routers(app) | ||
|
|
||
|
|
||
| async def _stream_mcp_asgi_response( |
There was a problem hiding this comment.
No startup warning when optional routers are suppressed
When LITELLM_SUPPORTED_DB_OBJECTS is set and some features are skipped, the server starts silently. Operators have no way to know which endpoints are inactive without reading source code. A brief startup log (e.g. logger.info("Optional routers disabled: %s", disabled_list)) would make misconfiguration much easier to diagnose.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| append_agents_to_model_info, | ||
| ) | ||
| from litellm.proxy.analytics_endpoints.analytics_endpoints import ( | ||
| router as analytics_router, | ||
| ) | ||
| from litellm.proxy.anthropic_endpoints.claude_code_endpoints import ( | ||
| claude_code_marketplace_router, | ||
| ) | ||
| from litellm.proxy.anthropic_endpoints.endpoints import router as anthropic_router | ||
| from litellm.proxy.anthropic_endpoints.skills_endpoints import ( | ||
| router as anthropic_skills_router, | ||
| ) | ||
| # claude_code_marketplace_router, anthropic_router, anthropic_skills_router | ||
| # are all gated inside _register_optional_feature_routers |
There was a problem hiding this comment.
global_agent_registry still unconditionally imported while agent_endpoints_router is gated
The agent router is now lazy-loaded when "agents" is in the env var, but global_agent_registry (and append_agents_to_model_group / append_agents_to_model_info from agent_endpoints.model_list_helpers) are still imported at module scope. If the agent framework is the source of significant memory overhead, these unconditional imports limit the savings even when "agents" is excluded.
Relevant issues
Memory usage between 1.86.1 and 1.87.3 spiked from 900mb to around 2gb, this adds flags that can remove some of the new imports (like MCP, agent related imports) that cause the memory spike if the user does not use them.
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
tests/test_litellm/directory, Adding at least 1 test is a hard requirement - see detailsmake test-unit@greptileaiand received a Confidence Score of at least 4/5 before requesting a maintainer reviewScreenshots / Proof of Fix
Measured RSS memory usage goes down from 2.2gb without flags to ~800-900mb with flags.
Type
🐛 Bug Fix
🚄 Infrastructure
Changes
Changes in proxy_server file