Skip to content

[WIP] Add flags that remove unneeded imports#26294

Open
Michael-RZ-Berri wants to merge 1 commit intolitellm_internal_stagingfrom
litellm_importMemReduction
Open

[WIP] Add flags that remove unneeded imports#26294
Michael-RZ-Berri wants to merge 1 commit intolitellm_internal_stagingfrom
litellm_importMemReduction

Conversation

@Michael-RZ-Berri
Copy link
Copy Markdown
Collaborator

Relevant issues

Memory usage between 1.86.1 and 1.87.3 spiked from 900mb to around 2gb, this adds flags that can remove some of the new imports (like MCP, agent related imports) that cause the memory spike if the user does not use them.

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Screenshots / Proof of Fix

Measured RSS memory usage goes down from 2.2gb without flags to ~800-900mb with flags.

Type

🐛 Bug Fix
🚄 Infrastructure

Changes

Changes in proxy_server file

@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Michael Riad Zaky seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 23, 2026

Greptile Summary

This PR defers ~25 optional feature router imports behind a new LITELLM_SUPPORTED_DB_OBJECTS env var check to reduce proxy memory usage (~800 MB savings demonstrated). When the env var is unset, all routers load as before (backwards-compatible); when set to a CSV of feature names, only those routers are imported.

  • The env var name reuses LITELLM_SUPPORTED_DB_OBJECTS, which already has YAML-key semantics for DB-object loading (_should_load_db_object reads general_settings, not this env var). The docstring example values (\"models,cache_settings,anthropic_beta_headers\") are DB-object names, not feature-router names — any deployment using that example would silently suppress all optional endpoints (guardrails, MCP, agents, SCIM, evals, …) with no error.
  • Several routers that were unconditionally registered before this PR — guardrails_router, compliance_router, jwt_key_mapping_router, usage_ai_router, config_override_router, access_group_router — are now gated, so setting the env var for memory savings but omitting these names silently removes those API endpoints.

Confidence Score: 3/5

Not safe to merge — the env var naming/documentation issue can silently disable guardrails and other previously-unconditional endpoints in production.

Two P1 findings: (1) the LITELLM_SUPPORTED_DB_OBJECTS env var is reused with semantics different from the existing YAML key, and the docstring example gives values that are not valid feature-router names, meaning any deployment following the docs would silently lose endpoint availability; (2) guardrails, compliance, jwt-mappings, config-overrides, and access-groups were always registered before this PR and are now silently dropped when the env var is set. The memory-saving approach is sound but needs a distinct env var name or corrected documentation and startup logging before merge.

litellm/proxy/proxy_server.py — _should_import_feature, _register_optional_feature_routers, and their call site at module scope.

Important Files Changed

Filename Overview
litellm/proxy/proxy_server.py Introduces _register_optional_feature_routers to lazy-load ~25 feature routers behind the LITELLM_SUPPORTED_DB_OBJECTS env var, but the env var name conflicts with the existing YAML key semantics, the documented example values are not valid router names, and previously-unconditional routers (guardrails, compliance, jwt-mappings, config-overrides) are silently dropped when the env var is set.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[proxy_server.py module import] --> B{LITELLM_SUPPORTED_DB_OBJECTS set?}
    B -- "No (default)" --> C[_should_import_feature returns True for all]
    B -- "Yes (CSV)" --> D[_should_import_feature checks name in allowed set]
    C --> E[All optional routers imported & registered]
    D --> F{Feature name in CSV?}
    F -- Yes --> G[Router imported & app.include_router called]
    F -- No --> H[Router skipped — endpoints return 404]
    G --> I[FastAPI app ready]
    H --> I
    E --> I

    subgraph "Core routers always loaded"
        J[key_management_router]
        K[team_router]
        L[spend_management_router]
        M[response_router / batches_router]
    end

    subgraph "Previously-unconditional now gated"
        N[guardrails_router]
        O[compliance_router]
        P[jwt_key_mapping_router]
        Q[config_override_router]
        R[usage_ai_router]
    end

    subgraph "New lazy-load targets memory savings"
        T[MCP routers ~300 MB]
        U[agent + a2a routers]
        V[vector_store routers]
        W[anthropic / google / langfuse passthrough]
    end
Loading

Reviews (1): Last reviewed commit: "add flags that remove unneeded imports" | Re-trigger Greptile

Comment on lines +14064 to +14075
LITELLM_SUPPORTED_DB_OBJECTS: "models,cache_settings,anthropic_beta_headers"

to skip importing feature routers for everything else.
"""
raw = os.environ.get("LITELLM_SUPPORTED_DB_OBJECTS")
if raw is None:
return True
allowed = {s.strip() for s in raw.split(",") if s.strip()}
return name in allowed


def _register_optional_feature_routers(app: FastAPI) -> None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Env var controls routers only — not DB loading — but documented example conflates both

_should_import_feature reads LITELLM_SUPPORTED_DB_OBJECTS to gate router registration, but _should_load_db_object reads general_settings.get("supported_db_objects") from YAML — they are completely separate mechanisms. The docstring's example value "models,cache_settings,anthropic_beta_headers" contains zero feature-router names, so setting LITELLM_SUPPORTED_DB_OBJECTS to that value would silently disable every optional router (guardrails, MCP, agents, SCIM, evals, etc.) while doing nothing to DB-object loading. Users following the docstring example to tune memory would lose endpoint availability with no error or warning.

The gate should use a dedicated env var (e.g. LITELLM_ENABLED_FEATURES) with separate, clearly documented values, or the docstring must show a realistic example that includes the actual router keys (guardrails, mcp, agents, …) rather than DB-object names.

Comment on lines +14140 to +14145
router as vector_store_router,
)
from litellm.proxy.vector_store_endpoints.management_endpoints import (
router as vector_store_management_router,
)
from litellm.proxy.vector_store_files_endpoints.endpoints import (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 guardrails_router newly gated; routes return 404 when env var is set without "guardrails"

guardrails_router was unconditionally registered before this PR. It is now gated behind _should_import_feature("guardrails"). Any deployment that sets LITELLM_SUPPORTED_DB_OBJECTS for memory savings (e.g. "mcp" only) will silently remove all guardrail endpoints (/guardrails, /guardrails/{id}, etc.) — even when guardrails are configured in the YAML. The same regression applies to compliance_router, jwt_key_mapping_router, usage_ai_router, config_override_router, and access_group_router, which were also always-on before this PR. No warning is emitted when these routers are skipped.

Comment on lines 14346 to 14355
app.include_router(ui_discovery_endpoints_router)
app.include_router(agent_endpoints_router)
app.include_router(compliance_router)
app.include_router(a2a_router)
app.include_router(access_group_router)

# Mount optional feature routers. Gated by the LITELLM_SUPPORTED_DB_OBJECTS env
# var — unset means "load everything" (backward compat), set to a CSV means
# only those features' routers import. Must run at module scope (before any
# request arrives) so the imports actually stay out of the process.
_register_optional_feature_routers(app)


async def _stream_mcp_asgi_response(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No startup warning when optional routers are suppressed

When LITELLM_SUPPORTED_DB_OBJECTS is set and some features are skipped, the server starts silently. Operators have no way to know which endpoints are inactive without reading source code. A brief startup log (e.g. logger.info("Optional routers disabled: %s", disabled_list)) would make misconfiguration much easier to diagnose.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines 246 to +252
append_agents_to_model_info,
)
from litellm.proxy.analytics_endpoints.analytics_endpoints import (
router as analytics_router,
)
from litellm.proxy.anthropic_endpoints.claude_code_endpoints import (
claude_code_marketplace_router,
)
from litellm.proxy.anthropic_endpoints.endpoints import router as anthropic_router
from litellm.proxy.anthropic_endpoints.skills_endpoints import (
router as anthropic_skills_router,
)
# claude_code_marketplace_router, anthropic_router, anthropic_skills_router
# are all gated inside _register_optional_feature_routers
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 global_agent_registry still unconditionally imported while agent_endpoints_router is gated

The agent router is now lazy-loaded when "agents" is in the env var, but global_agent_registry (and append_agents_to_model_group / append_agents_to_model_info from agent_endpoints.model_list_helpers) are still imported at module scope. If the agent framework is the source of significant memory overhead, these unconditional imports limit the savings even when "agents" is excluded.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants