[Bug]: Cannot get past 50 RPS #6592

@vutrung96

What happened?

I have OpenAI tier 5 usage, which should give me 30,000 RPM = 500 RPS with "gpt-4o-mini". However, I struggle to get past 50 RPS.

The minimal replication:

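import asyncio

from litellm import acompletion

async def main():
    # Build 2000 identical completion requests and run them concurrently.
    tasks = [acompletion(
        model="gpt-4o-mini",
        messages=[
          {"role": "system", "content": "You're an agent who answers yes or no"},
          {"role": "user", "content": "Is the sky blue?"},
        ],
    ) for _ in range(2000)]
    return await asyncio.gather(*tasks)

asyncio.run(main())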

With litellm I only get about 50 items/second, as opposed to ~500 items/second when sending raw HTTP requests.
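For anyone who wants to compare the two paths on equal terms, here is a minimal sketch of a throughput harness. It is not taken from litellm: `measure_throughput` and `fake_request` are hypothetical names, and `fake_request` is a stand-in that only sleeps, so the sketch runs without an API key. Replacing it with a coroutine that calls `acompletion(...)` or a raw HTTP client should, under these assumptions, give comparable requests/second numbers for both.

```python
import asyncio
import time


async def measure_throughput(make_request, n_requests, max_concurrency):
    """Run make_request() n_requests times; return (results, requests per second)."""
    # Cap the number of in-flight requests so both code paths face the same limit.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_call():
        async with semaphore:
            return await make_request()

    start = time.perf_counter()
    results = await asyncio.gather(*(bounded_call() for _ in range(n_requests)))
    elapsed = time.perf_counter() - start
    return results, n_requests / elapsed


async def fake_request():
    # Hypothetical stand-in for a real call such as acompletion(...).
    await asyncio.sleep(0.01)
    return "yes"


results, rps = asyncio.run(measure_throughput(fake_request, 200, 50))
print(f"{len(results)} requests at {rps:.0f} requests/second")
```

The concurrency cap is an assumption of the sketch rather than something the report uses; setting `max_concurrency` equal to `n_requests` removes it.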

Relevant log output

 16%|█████████████████████▌                                                                                                                 | 320/2000 [00:09<00:40, 41.49it/s]

Labels: bug, stale
