429 handling inconsistent across providers → retry amplification under concurrency #10566

@SirBrenton

Checked other resources

  • This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

// Current behavior in errors.ts:45 (langchain-anthropic)
if (e.status === 429) {
  // No Retry-After inspection
  // No distinction between transient vs quota exhaustion
  throw new RateLimitError(e);
}

// Missing: semantic classification before retry
// Retry-After: 30  → WAIT (retry after delay)
// Retry-After: 600 → STOP (quota exhaustion, do not retry)
// No header        → CAP (reduce concurrency first)
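A minimal sketch of the classification step described in the comments above. Everything here is hypothetical: `classify429`, `parseRetryAfterMs`, and the 120-second cutoff are illustrative choices, not existing LangChain.js APIs or settings. The only grounded inputs are the three cases listed above (30 → WAIT, 600 → STOP, no header → CAP) and the fact that `Retry-After` may carry either a number of seconds or an HTTP-date.

```typescript
// Hypothetical helper names; none of this exists in LangChain.js today.
type RateLimitAction = "WAIT" | "STOP" | "CAP";

interface RateLimitDecision {
  action: RateLimitAction;
  delayMs?: number; // present only when a Retry-After value was parsed
}

// Assumed cutoff: delays at or above this are treated as quota exhaustion.
const STOP_THRESHOLD_MS = 120_000;

// Retry-After is either delay-seconds ("30") or an HTTP-date.
function parseRetryAfterMs(
  value: string | null | undefined,
  nowMs: number = Date.now(),
): number | undefined {
  if (value == null) return undefined;
  const trimmed = value.trim();
  if (trimmed === "") return undefined;
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  return Math.max(0, dateMs - nowMs);
}

function classify429(
  retryAfter: string | null | undefined,
  nowMs: number = Date.now(),
): RateLimitDecision {
  const delayMs = parseRetryAfterMs(retryAfter, nowMs);
  if (delayMs === undefined) return { action: "CAP" }; // no usable header
  if (delayMs >= STOP_THRESHOLD_MS) return { action: "STOP", delayMs };
  return { action: "WAIT", delayMs };
}
```

A provider error handler could call `classify429(headers.get("retry-after"))` and attach the result to the thrown error, so the retry layer can skip retries on `STOP` and hand off to a fallback instead.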

Error Message and Stack Trace (if applicable)

429 "You have exceeded the 5-hour usage quota. It will reset at [time]"
→ retried repeatedly
→ same message processed 90+ times
→ fallback never triggered

Description

I ran pitstop-check against the provider error handlers and found the same pattern in all four provider packages (six call sites):

  • libs/providers/langchain-anthropic/src/utils/errors.ts:45 — 429 without Retry-After
  • libs/providers/langchain-openai/src/utils/client.ts:54 — 429 without Retry-After
  • libs/providers/langchain-google/src/utils/errors.ts:191 — 429 without Retry-After, CAP vs WAIT not distinguished
  • libs/providers/langchain-google/src/chat_models/base.ts:648 — no max elapsed bound
  • libs/providers/langchain-openrouter/src/utils/errors.ts:44 — 429 without Retry-After, CAP vs WAIT not distinguished
  • libs/providers/langchain-openrouter/src/chat_models/index.ts:478 — no max elapsed bound
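For the two "no max elapsed bound" findings, one possible shape for a bounded retry loop is sketched below. The names `RetryBounds`, `nextDelayMs`, and `retryWithBudget` are hypothetical and the exponential backoff is an assumption; this is not how the listed files are currently written. The point is only that the loop stops on whichever limit is reached first: attempt count or total elapsed time.

```typescript
// Hypothetical names; a sketch of an attempt bound plus an elapsed-time bound.
interface RetryBounds {
  maxAttempts: number;  // total calls allowed, including the first
  maxElapsedMs: number; // wall-clock budget across all attempts and waits
  baseDelayMs: number;  // first backoff delay; doubles on each failure
}

// `attempt` is the zero-based index of the attempt that just failed.
// Returns the delay before the next attempt, or null to give up.
function nextDelayMs(
  attempt: number,
  elapsedMs: number,
  bounds: RetryBounds,
): number | null {
  if (attempt + 1 >= bounds.maxAttempts) return null;
  const delayMs = bounds.baseDelayMs * 2 ** attempt;
  // Give up if waiting would push the total past the elapsed budget.
  return elapsedMs + delayMs > bounds.maxElapsedMs ? null : delayMs;
}

async function retryWithBudget<T>(
  fn: () => Promise<T>,
  bounds: RetryBounds,
): Promise<T> {
  const start = Date.now();
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const delayMs = nextDelayMs(attempt, Date.now() - start, bounds);
      if (delayMs === null) throw err; // bounds exhausted: surface the error
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Rethrowing once the budget is spent is what lets a caller's fallback path run, rather than the request staying inside the retry loop.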

Different 429 cases require different handling:

  • short-lived pressure → retry after delay
  • concurrency pressure → reduce parallelism before retry
  • quota / billing exhaustion → do not retry
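The "reduce parallelism before retry" case above could be handled with an adaptive concurrency limit. The sketch below is an assumption about one way to do it (halve on a 429 without `Retry-After`, grow by one on success); `AdaptiveLimit` is a hypothetical class, not something LangChain.js provides.

```typescript
// Hypothetical class: tracks how many requests may be in flight at once.
class AdaptiveLimit {
  private limit: number;
  private readonly max: number;
  private readonly min: number;

  constructor(max: number, min: number = 1) {
    this.max = max;
    this.min = min;
    this.limit = max;
  }

  get current(): number {
    return this.limit;
  }

  // Called on a 429 that carries no Retry-After: back off sharply.
  onRateLimited(): void {
    this.limit = Math.max(this.min, Math.floor(this.limit / 2));
  }

  // Called on a successful response: recover slowly toward the maximum.
  onSuccess(): void {
    this.limit = Math.min(this.max, this.limit + 1);
  }
}
```

A scheduler would read `current` before dispatching new requests, so that pressure is relieved by sending fewer calls at once rather than by retrying at the same rate.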

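When all three cases are collapsed into a single retry branch, retries against an exhausted quota cannot succeed, so callers either loop until they give up or, under concurrency, amplify load on the provider.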

Seen in a live system: quota-window 429 retried 90+ times, fallback never triggered.

Static check: github.com/SirBrenton/pitstop-check

cc Hunter Lovell (@hntrl) Christian Bromann (@christian-bromann) (via Jacob Lee (@jacoblee93))

System Info

Static analysis only — no runtime reproduction required.
Pattern confirmed in current main branch (cloned 2026-03-31).
