
FallbackModel and Provider/Client SDK Retry Behavior might be conflicting (no hint in pydantic-ai docs) #3267

@LysanderKie

Description

Summary

The FallbackModel documentation doesn't mention that underlying provider SDKs (such as the OpenAI SDK) may have built-in retry logic that can significantly delay or prevent the fallback model from being triggered. This leads to unexpected behavior: rate limit errors (429) can be retried for up to 60 seconds before the fallback activates, rather than the FallbackModel immediately switching models.

Current Documentation

The current documentation states:

"By default, the FallbackModel only moves on to the next model if the current model raises a ModelHTTPError. You can customize this behavior by passing a custom fallback_on argument to the FallbackModel constructor."

And shows this example:

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel

openai_model = OpenAIChatModel('gpt-4o')
anthropic_model = AnthropicModel('claude-3-5-sonnet-latest')
fallback_model = FallbackModel(openai_model, anthropic_model)

agent = Agent(fallback_model)
response = agent.run_sync('What is the capital of France?')
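For context, the `fallback_on` argument mentioned in the docs can take a predicate. A minimal sketch of a predicate that falls back only on rate limits might look like this (using a stand-in exception class so the sketch runs without pydantic-ai; the real `pydantic_ai.exceptions.ModelHTTPError` also carries a `status_code` attribute):

```python
class ModelHTTPError(Exception):
    """Stand-in for pydantic_ai.exceptions.ModelHTTPError."""

    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def fallback_on_rate_limit(exc: Exception) -> bool:
    """Trigger fallback only for rate-limit (429) responses."""
    return isinstance(exc, ModelHTTPError) and exc.status_code == 429


# With the real library this would be passed as:
#   FallbackModel(openai_model, anthropic_model, fallback_on=fallback_on_rate_limit)
print(fallback_on_rate_limit(ModelHTTPError(429)))  # True
print(fallback_on_rate_limit(ModelHTTPError(500)))  # False
```

Note that even with such a predicate, the predicate only runs once the exception reaches pydantic-ai, which is exactly what SDK-level retries delay.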

The Problem

When following this example, users may experience:

  1. Rate limit errors (429) retry for extended periods instead of immediately falling back to the secondary model
  2. Console output showing retry attempts: {"event": "Retrying request to /chat/completions in 60.000000 seconds"}
  3. Delayed fallback activation due to the OpenAI SDK's default max_retries=2 behavior

This happens because:

  • The OpenAI SDK has DEFAULT_MAX_RETRIES = 2 built-in
  • On 429 errors, it respects the Retry-After header (up to 60 seconds)
  • These retries happen before the FallbackModel ever sees the error
  • The FallbackModel only activates after all SDK-level retries are exhausted

Expected Behavior

Users expect that when using FallbackModel, a rate limit error immediately triggers fallback to the secondary model rather than waiting through multiple retry attempts.

Solution

The issue can be worked around by configuring the provider client to disable retries (not an elegant solution, but it gets the job done):

import openai

from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.azure import AzureProvider

# Create an Azure OpenAI client with SDK-level retries disabled
openai_client = openai.AsyncAzureOpenAI(
    api_key=settings.OPENAI_API_KEY,
    azure_endpoint=settings.OPENAI_API_BASE,
    api_version=settings.OPENAI_API_VERSION,
    max_retries=0,  # Critical: disable SDK-level retries
)

openai_model = OpenAIChatModel(
    'gpt-4o',
    provider=AzureProvider(openai_client=openai_client),
)

anthropic_model = AnthropicModel('claude-3-5-sonnet-latest')
fallback_model = FallbackModel(openai_model, anthropic_model)

Suggested Pydantic-AI Docs Improvements

I suggest adding a section to the FallbackModel documentation that covers:

  1. Provider SDK Retry Behavior: Mention that provider SDKs often have built-in retry logic
  2. Disabling Retries for Immediate Fallback: Show how to configure max_retries=0 for common providers
  3. Rate Limit Handling: Explain that rate limits may retry for extended periods if not configured properly
  4. Best Practices: Recommend disabling provider-level retries when using FallbackModel to ensure immediate fallback (if that's the expected behavior).

Related

This affects all provider integrations that wrap SDKs with built-in retry logic, not just OpenAI. Consider adding similar guidance for other providers in their respective documentation pages.

Python, Pydantic AI & LLM client version

"pydantic-ai==1.1.0"
"pydantic==2.12.3"
