Add GPT-5 to Code Eval #180
base: main
Conversation
Walkthrough

The UI title/header was updated. The default fallback for model1 changed to “Claude Opus 4.1”. The model catalog replaced “Claude Sonnet 4” and “GPT-4.1” with “Claude Opus 4.1” and “GPT-5”. Response routing now sends “GPT” models to OpenAI (OPENAI_API_KEY, max_completion_tokens) and others to OpenRouter (OPENROUTER_API_KEY, max_tokens).

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant App as App UI
    participant MS as ModelService
    participant OA as OpenAI API
    participant OR as OpenRouter API
    User->>App: Select models / submit prompt
    App->>MS: get_model_response_async(model_name, prompt)
    alt model_name contains "GPT"
        MS->>OA: acompletion(api_key=OPENAI_API_KEY, max_completion_tokens=2000)
        OA-->>MS: response stream
    else
        MS->>OR: acompletion(api_key=OPENROUTER_API_KEY, max_tokens=2000)
        OR-->>MS: response stream
    end
    MS-->>App: stream tokens / final text
    App-->>User: display comparison output
```
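For orientation, a minimal sketch of the consumer side of this flow, i.e. how streamed chunks from either provider could be accumulated into the final text shown to the user (the function name and chunk handling here are illustrative assumptions, not code from this PR):

```python
async def collect_stream(response) -> str:
    """Accumulate streamed LiteLLM chunks (OpenAI-style deltas) into one string."""
    parts = []
    async for chunk in response:
        # Each streamed chunk carries an incremental delta; content may be None on the final chunk.
        delta = chunk.choices[0].delta.content or ""
        parts.append(delta)
    return "".join(parts)
```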
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs
Actionable comments posted: 2
🔭 Outside diff range comments (2)
code-model-comparison/model_service.py (1)

75-84: Reflect correct API key environment variable in error messages

Adjust the exception handler in code-model-comparison/model_service.py (around lines 75–84) so that invalid/missing API key errors reference OPENAI_API_KEY when routing to OpenAI and OPENROUTER_API_KEY otherwise.

• File: code-model-comparison/model_service.py
• Lines: 75–84

```diff
 except Exception as e:
     error_msg = f"Error generating response: {str(e)}"
-    if "api_key" in str(e).lower() or "authentication" in str(e).lower():
-        error_msg = "Error: Invalid or missing API key. Please check your OPENROUTER_API_KEY configuration."
+    if "api_key" in str(e).lower() or "authentication" in str(e).lower():
+        # Dynamically reference the correct env var based on model routing
+        env_var = "OPENAI_API_KEY" if "GPT" in model_name else "OPENROUTER_API_KEY"
+        error_msg = (
+            f"Error: Invalid or missing API key. "
+            f"Please check your {env_var} configuration."
+        )
     elif "quota" in str(e).lower() or "limit" in str(e).lower():
         error_msg = "Error: API quota exceeded or rate limit reached. Please try again later."
     elif "model" in str(e).lower():
         error_msg = f"Error: Model '{model_name}' is not available or has issues. Please try a different model."
```

code-model-comparison/app.py (1)
65-71: Broken logo URL (double protocol) in “Powered by” strip.

The second image src has a duplicated https://, so the Opik logo won’t render.

```diff
- <img src="https://files.buildwithfern.com/https://opik.docs.buildwithfern.com/docs/opik/2025-08-01T07:08:31.326Z/img/logo-dark-mode.svg" width="100">
+ <img src="https://files.buildwithfern.com/opik.docs.buildwithfern.com/docs/opik/2025-08-01T07:08:31.326Z/img/logo-dark-mode.svg" width="100">
```
🧹 Nitpick comments (1)
code-model-comparison/app.py (1)
91-94: Unreachable fallback branch; simplify default selection.

Because you stop execution when all_models is empty (lines 82–85), the `else "Claude Opus 4.1"` fallback on line 92 is unreachable. Simplify to avoid config drift.

```diff
-if default_model1 not in all_models:
-    default_model1 = all_models[0] if all_models else "Claude Opus 4.1"
+if default_model1 not in all_models:
+    default_model1 = all_models[0]
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)

- code-model-comparison/app.py (3 hunks)
- code-model-comparison/model_service.py (2 hunks)
🔇 Additional comments (2)
code-model-comparison/app.py (2)
13-13: Branding/title change LGTM.

62-62: Main header change LGTM.
"Claude Opus 4.1": "openrouter/anthropic/claude-opus-4.1", | ||
"Qwen3-Coder": "openrouter/qwen/qwen3-coder", | ||
"Gemini 2.5 Flash": "openrouter/google/gemini-2.5-flash", | ||
"GPT-4.1": "openrouter/openai/gpt-4.1", | ||
"GPT-5": "gpt-5", | ||
} |
💡 Verification agent
🧩 Analysis chain
Model catalog updates look consistent, but verify identifiers exist upstream.
"Claude Opus 4.1" and "GPT-5" entries look fine. Ensure LiteLLM recognizes these identifiers and providers as configured.
Run web checks to confirm model IDs and parameter support:
If needed, I can adjust the mapping to provider-qualified IDs (e.g., openai/gpt-5) or add a fallback route.
🌐 Web query:
Does LiteLLM support the model id "gpt-5" with acompletion, and should we pass max_tokens or max_completion_tokens for streaming?
💡 Result:
Short answer: Yes — LiteLLM supports GPT‑5 model IDs (e.g. "openai/gpt-5" or "azure/gpt-5") for its completion APIs, and for GPT‑5 you should use max_completion_tokens to control the streamed completion length (LiteLLM will also accept max_tokens and map it to max_completion_tokens for GPT‑5, but passing max_completion_tokens is clearer/explicit). (docs.litellm.ai)
Quick notes and example:
- Supported IDs: LiteLLM lists OpenAI/Azure GPT‑5 model IDs such as openai/gpt-5 and openai/gpt-5-chat (and provider-prefixed variants). (docs.litellm.ai)
- Params: LiteLLM exposes both max_tokens and max_completion_tokens; for GPT‑5 it maps max_tokens → max_completion_tokens, and you can (and probably should) pass max_completion_tokens when streaming so you explicitly set the completion token cap. (docs.litellm.ai)
Example (Python):
from litellm import completion
resp = completion(
model="openai/gpt-5",
messages=[{"role":"user","content":"Write a short summary of X."}],
stream=True,
max_completion_tokens=250
)
for chunk in resp:
print(chunk.choices[0].delta)
If you want, tell me which provider/model string you’re using (openai/…, azure/…, or bare "gpt-5") and I’ll give a tailored snippet and confirm the exact param mapping for your LiteLLM version.
Citations:
- 1: https://docs.litellm.ai/release_notes/v1-75-5?utm_source=chatgpt.com
- 2: https://docs.litellm.ai/release_notes/v1-75-5?utm_source=chatgpt.com
- 3: https://docs.litellm.ai/docs/completion/input?utm_source=chatgpt.com
🌐 Web query:
Is "openrouter/anthropic/claude-opus-4.1" a valid model identifier on OpenRouter via LiteLLM?
💡 Result:
Short answer: No. The valid Anthropic Opus identifier on OpenRouter via LiteLLM uses the Claude 3 naming, e.g., openrouter/anthropic/claude-3-opus-20240229, not “claude-opus-4.1.”[1][3]
Details:
- LiteLLM’s OpenRouter convention is openrouter/provider/model-name, and Anthropic’s Opus listing on OpenRouter uses the Claude 3 format such as claude-3-opus-20240229.[1]
- LiteLLM docs state you can call any OpenRouter model by passing model="openrouter/<provider>/<model>", with examples including openrouter/anthropic/claude-2 and the Claude 3 variants, but there is no “claude-opus-4.1” pattern shown.[3]
- A third-party page shows a string like openrouter/anthropic/claude-opus-4.1, but this appears inconsistent with OpenRouter’s established Claude 3 model identifiers and should not be relied upon without confirmation on the OpenRouter models list.[2]
Recommendation:
- Use the confirmed identifier: openrouter/anthropic/claude-3-opus-20240229.[1]
- If you need the latest Sonnet/Haiku/Opus variant, check the OpenRouter model list and plug it into LiteLLM as openrouter/anthropic/<model>.[3]
Citations:
- 1: https://saptak.in/writing/2025/05/10/google-adk-masterclass-part3
- 2: https://lightning.ai/lightning-purchase-test/studios/compare-openai-gpt-5-and-claude-opus-4-1-for-code-generation
- 3: https://docs.litellm.ai/docs/providers/openrouter
- 4: https://docs.litellm.ai/docs/completion/input
- 5: https://www.bitdoze.com/google-adk-openrouter-models/
- 6: https://python.useinstructor.com/integrations/openrouter/
- 7: https://github.com/BerriAI/litellm/blob/main/README.md
- 8: https://denshub.com/en/choosing-llm-gateway/
- 9: https://www.truefoundry.com/blog/litellm-vs-openrouter
- 10: https://ai.plainenglish.io/power-up-your-ai-agents-unlocking-a-universe-of-models-with-litellm-and-openrouter-7c25262cbbe9
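If it helps, here is a minimal, hypothetical sketch of sanity-checking the recommended identifier with a one-off, non-streaming LiteLLM call (assumes OPENROUTER_API_KEY is set; swap in whatever ID the OpenRouter model list actually shows for Opus 4.1):

```python
import os
from litellm import completion

# One-off check that the recommended OpenRouter ID resolves before wiring it into the app.
resp = completion(
    model="openrouter/anthropic/claude-3-opus-20240229",
    messages=[{"role": "user", "content": "Reply with OK."}],
    api_key=os.getenv("OPENROUTER_API_KEY"),
    max_tokens=10,
)
print(resp.choices[0].message.content)
```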
Fix invalid LiteLLM model identifiers and use explicit max_completion_tokens
Please update the model mapping in code-model-comparison/model_service.py and streaming calls as follows:
- In the AVAILABLE_MODELS mapping (lines 9–13):

```diff
-    "Claude Opus 4.1": "openrouter/anthropic/claude-opus-4.1",
+    "Claude Opus 4.1": "openrouter/anthropic/claude-3-opus-20240229",  # valid OpenRouter Anthropic Opus ID
-    "GPT-5": "gpt-5",
+    "GPT-5": "openai/gpt-5",  # provider-prefixed for LiteLLM compatibility
```
- In your GPT-5 streaming completion branches (around lines 49–64): replace any usage of max_tokens with max_completion_tokens to explicitly set the streamed token cap (LiteLLM will map max_tokens → max_completion_tokens, but explicit is clearer).

```diff
 resp = completion(
     model="openai/gpt-5",
     messages=…,
     stream=True,
-    max_tokens=250
+    max_completion_tokens=250
 )
```
These changes ensure LiteLLM recognizes both model IDs and correctly applies the streaming token limit.
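As a rough sketch of what the mapping and the GPT-5 branch could look like after these changes (the identifiers and the 2000-token cap come from the suggestions above; the helper name stream_gpt5 is illustrative, not from the repository):

```python
import os
from litellm import acompletion

AVAILABLE_MODELS = {
    "Claude Opus 4.1": "openrouter/anthropic/claude-3-opus-20240229",  # per the OpenRouter check above
    "Qwen3-Coder": "openrouter/qwen/qwen3-coder",
    "Gemini 2.5 Flash": "openrouter/google/gemini-2.5-flash",
    "GPT-5": "openai/gpt-5",  # provider-prefixed for LiteLLM
}

async def stream_gpt5(messages: list):
    # Explicit completion-token cap for GPT-5, as recommended above.
    return await acompletion(
        model=AVAILABLE_MODELS["GPT-5"],
        messages=messages,
        api_key=os.getenv("OPENAI_API_KEY"),
        max_completion_tokens=2000,
        stream=True,
    )
```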
🤖 Prompt for AI Agents
In code-model-comparison/model_service.py lines 9–13 and the GPT-5 streaming
branches around lines 49–64, update the AVAILABLE_MODELS mapping so the GPT-5
entry uses the valid LiteLLM identifier ("openai/gpt-5") instead of "gpt-5", and
in all GPT-5 stream calls replace max_tokens with max_completion_tokens (e.g.,
max_completion_tokens=250) so the streamed token cap is set explicitly; keep the
rest of the call arguments the same.
if "GPT" in model_name: | ||
response = await acompletion( | ||
model=model_mapping, | ||
messages=messages, | ||
api_key=os.getenv("OPENAI_API_KEY"), | ||
max_completion_tokens=2000, | ||
stream=True, | ||
) | ||
else: | ||
response = await acompletion( | ||
model=model_mapping, | ||
messages=messages, | ||
api_key=os.getenv("OPENROUTER_API_KEY"), | ||
max_tokens=2000, | ||
stream=True, | ||
) |
Use max_tokens for LiteLLM acompletion; align params across routes.
The GPT branch passes max_completion_tokens, which acompletion may not accept. Use max_tokens in both branches to avoid parameter mismatch and parity issues.
Apply this minimal diff:
- if "GPT" in model_name:
- response = await acompletion(
- model=model_mapping,
- messages=messages,
- api_key=os.getenv("OPENAI_API_KEY"),
- max_completion_tokens=2000,
- stream=True,
- )
+ if "GPT" in model_name:
+ response = await acompletion(
+ model=model_mapping,
+ messages=messages,
+ api_key=os.getenv("OPENAI_API_KEY"),
+ max_tokens=2000,
+ stream=True,
+ )
Optionally, dedupe branches by deriving api_key and calling acompletion once:
- if "GPT" in model_name:
- response = await acompletion(
- model=model_mapping,
- messages=messages,
- api_key=os.getenv("OPENAI_API_KEY"),
- max_tokens=2000,
- stream=True,
- )
- else:
- response = await acompletion(
- model=model_mapping,
- messages=messages,
- api_key=os.getenv("OPENROUTER_API_KEY"),
- max_tokens=2000,
- stream=True,
- )
+ is_gpt = "GPT" in model_name
+ response = await acompletion(
+ model=model_mapping,
+ messages=messages,
+ api_key=os.getenv("OPENAI_API_KEY") if is_gpt else os.getenv("OPENROUTER_API_KEY"),
+ max_tokens=2000,
+ stream=True,
+ )
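For convenience, the deduplicated branch from the optional diff above written out as plain code (a sketch, assuming it lives inside the existing async response function where model_name, model_mapping, and messages are already defined; the wrapper name request_stream is hypothetical):

```python
import os
from litellm import acompletion

async def request_stream(model_name: str, model_mapping: str, messages: list):
    # Single call: only the API key differs between the GPT (OpenAI) and OpenRouter routes.
    is_gpt = "GPT" in model_name
    return await acompletion(
        model=model_mapping,
        messages=messages,
        api_key=os.getenv("OPENAI_API_KEY") if is_gpt else os.getenv("OPENROUTER_API_KEY"),
        max_tokens=2000,
        stream=True,
    )
```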
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if "GPT" in model_name: | |
response = await acompletion( | |
model=model_mapping, | |
messages=messages, | |
api_key=os.getenv("OPENAI_API_KEY"), | |
max_completion_tokens=2000, | |
stream=True, | |
) | |
else: | |
response = await acompletion( | |
model=model_mapping, | |
messages=messages, | |
api_key=os.getenv("OPENROUTER_API_KEY"), | |
max_tokens=2000, | |
stream=True, | |
) | |
if "GPT" in model_name: | |
response = await acompletion( | |
model=model_mapping, | |
messages=messages, | |
api_key=os.getenv("OPENAI_API_KEY"), | |
max_tokens=2000, | |
stream=True, | |
) | |
else: | |
response = await acompletion( | |
model=model_mapping, | |
messages=messages, | |
api_key=os.getenv("OPENROUTER_API_KEY"), | |
max_tokens=2000, | |
stream=True, | |
) |
🤖 Prompt for AI Agents
In code-model-comparison/model_service.py around lines 49 to 64, the GPT branch
passes max_completion_tokens while the other branch uses max_tokens, causing
acompletion parameter mismatch; change the GPT branch to use max_tokens=2000
(matching the other branch) and align other parameter names, and optionally
simplify by setting api_key = os.getenv("OPENAI_API_KEY") if "GPT" in model_name
else os.getenv("OPENROUTER_API_KEY") and call acompletion once with the derived
api_key and shared args (model=model_mapping, messages=messages,
api_key=api_key, max_tokens=2000, stream=True).
Summary by CodeRabbit

- New Features
- Refactor
- Style