429 handling inconsistent across providers → retry amplification under concurrency #10566

### Checked other resources

- [x] This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
- [x] I added a very descriptive title to this issue.
- [x] I searched the LangChain.js documentation with the integrated search.
- [x] I used the GitHub search to find a similar question and didn't find it.
- [x] I am sure that this is a bug in LangChain.js rather than my code.
- [x] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

### Example Code

```typescript
// Current behavior in errors.ts:45 (langchain-anthropic)
if (e.status === 429) {
  // No Retry-After inspection
  // No distinction between transient vs quota exhaustion
  throw new RateLimitError(e);
}

// Missing: semantic classification before retry
// Retry-After: 30  → WAIT (retry after delay)
// Retry-After: 600 → STOP (quota exhaustion, do not retry)
// No header        → CAP (reduce concurrency first)
```


### Error Message and Stack Trace (if applicable)

```text
429 "You have exceeded the 5-hour usage quota. It will reset at [time]"
→ retried repeatedly
→ same message processed 90+ times
→ fallback never triggered
```

### Description

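I ran `pitstop-check` against the provider error handlers and found the same pattern in all four providers: 429 responses are handled without inspecting `Retry-After`. Two of them (Google and OpenRouter) also have a retry path with no maximum elapsed-time bound: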

- `libs/providers/langchain-anthropic/src/utils/errors.ts:45` — 429 without `Retry-After`
- `libs/providers/langchain-openai/src/utils/client.ts:54` — 429 without `Retry-After`
- `libs/providers/langchain-google/src/utils/errors.ts:191` — 429 without `Retry-After`, CAP vs WAIT not distinguished
- `libs/providers/langchain-google/src/chat_models/base.ts:648` — no max elapsed bound
- `libs/providers/langchain-openrouter/src/utils/errors.ts:44` — 429 without `Retry-After`, CAP vs WAIT not distinguished
- `libs/providers/langchain-openrouter/src/chat_models/index.ts:478` — no max elapsed bound
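For the "no max elapsed bound" findings, here is a minimal sketch of what a bounded retry loop could look like. It is not taken from any LangChain.js package: `RetryBudget`, `retryWithinBudget`, and the exponential backoff schedule are hypothetical names and choices used only for illustration. The point is that the loop gives up when the next sleep would exceed a total time ceiling, not only when attempts run out.

```typescript
// Hypothetical helper: names and defaults are illustrative, not LangChain.js API.
interface RetryBudget {
  maxAttempts: number; // ceiling on the number of calls to fn
  maxElapsedMs: number; // ceiling on total time spent, sleeps included
  baseDelayMs: number; // first backoff delay; doubles on each attempt
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithinBudget<T>(
  fn: () => Promise<T>,
  budget: RetryBudget,
  shouldRetry: (err: unknown) => boolean = () => true,
): Promise<T> {
  const startedAt = Date.now();
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Exponential backoff: base, 2x base, 4x base, ...
      const delayMs = budget.baseDelayMs * 2 ** (attempt - 1);
      const elapsedMs = Date.now() - startedAt;
      const outOfAttempts = attempt >= budget.maxAttempts;
      // Stop if sleeping again would push total time past the ceiling.
      const outOfTime = elapsedMs + delayMs > budget.maxElapsedMs;
      if (!shouldRetry(err) || outOfAttempts || outOfTime) throw err;
      await sleep(delayMs);
    }
  }
}
```

With a bound like this, a 429 that can never succeed surfaces to the caller (and to any fallback) after a known worst-case delay instead of looping.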

Different 429 cases require different handling:
- short-lived pressure → retry after delay
- concurrency pressure → reduce parallelism before retry
- quota / billing exhaustion → do not retry

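When all three are collapsed into a single branch, quota-exhaustion 429s are retried even though those retries cannot succeed, so callers loop, and under concurrency the retries amplify load.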
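One way to separate the three cases before any retry decision is made is sketched below. This is an assumption-laden illustration, not existing LangChain.js code: `Decision`, `classify429`, `parseRetryAfterSeconds`, and the 300-second cutoff are all hypothetical. The cutoff is chosen only so that `Retry-After: 30` maps to WAIT and `Retry-After: 600` maps to STOP, as in the example above; a real threshold would need to be tuned per provider.

```typescript
// Hypothetical names throughout; none of this is part of a LangChain.js package.
type Decision =
  | { kind: "WAIT"; delayMs: number } // short-lived pressure: retry after the delay
  | { kind: "STOP" } // quota / billing exhaustion: do not retry
  | { kind: "CAP" }; // no hint from the server: reduce concurrency first

// Assumed cutoff: longer delays are treated as a quota window, not a blip.
const STOP_THRESHOLD_SECONDS = 300;

// Retry-After carries either delay-seconds or an HTTP-date.
function parseRetryAfterSeconds(
  value: string | null | undefined,
  nowMs: number = Date.now(),
): number | undefined {
  if (value == null) return undefined;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed);
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  // A date in the past means "retry now".
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}

function classify429(
  retryAfter: string | null | undefined,
  nowMs: number = Date.now(),
): Decision {
  const seconds = parseRetryAfterSeconds(retryAfter, nowMs);
  if (seconds === undefined) return { kind: "CAP" };
  if (seconds > STOP_THRESHOLD_SECONDS) return { kind: "STOP" };
  return { kind: "WAIT", delayMs: seconds * 1000 };
}
```

A caller would read the `Retry-After` header from the 429 response, pass it to `classify429`, and only enter the retry loop on `WAIT`; `STOP` would be surfaced immediately so a fallback can run.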

Seen in a live system: quota-window 429 retried 90+ times, fallback never triggered.

Static check: `github.com/SirBrenton/pitstop-check`

cc @hntrl @christian-bromann (via @jacoblee93)

### System Info

Static analysis only — no runtime reproduction required.
Pattern confirmed in current main branch (cloned 2026-03-31).

