-
-
Notifications
You must be signed in to change notification settings - Fork 636
feat: add global model aliases with cross-provider fallback #765
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add a model-aliases configuration section that maps user-friendly alias names to provider-specific model names. This enables automatic failover across providers when quota is exhausted. Features: - New model-aliases config section with aliases and providers mappings - Round-robin and fill-first routing strategies - Automatic fallback on 429, 502, 503, 504 errors - Hot-reload support for alias configuration changes - Unit and integration tests for the alias resolver Closes router-for-me#632
Summary of ChangesHello @PancakeZik, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a robust global model alias feature, allowing the system to intelligently route requests for a single logical model across multiple underlying providers. It provides automatic failover capabilities based on configurable strategies and error conditions, significantly improving the system's ability to handle provider-specific model variations and temporary service disruptions. The implementation includes hot-reload support for configuration changes, ensuring operational flexibility. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable feature for global model aliases with cross-provider fallback. The implementation is comprehensive, covering configuration, hot-reloading, and core routing logic. However, my review identified a few critical issues in the request handling logic that would prevent the feature from working as intended, along with some areas for improvement in code structure and test coverage. Specifically, there are bugs related to resolving the initial model for an alias and incorrect index handling during fallback. There's also significant code duplication in the fallback implementation. Additionally, the global resolver's initialization can be simplified, and the new resolver logic lacks sufficient unit tests.
sdk/api/handlers/handlers.go
Outdated
| // Check if this is an alias with multiple fallback targets | ||
| aliasInfo := h.getAliasTargets(modelName) | ||
|
|
||
| providers, normalizedModel, metadata, errMsg := h.getRequestDetails(modelName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a critical issue in how the initial request for an alias is handled. The function calls h.getAliasTargets(modelName) but then proceeds to call h.getRequestDetails(modelName) with the original modelName. If modelName is an alias, getRequestDetails will fail because aliases are not registered in the global model registry, causing it to return an "unknown provider" error.
The initial request should use the specific model selected from the alias, not the alias name itself.
| if len(targets) <= 1 { | ||
| // No fallback benefit if only one target | ||
| return nil | ||
| } | ||
|
|
||
| return &aliasInfo{ | ||
| targets: targets, | ||
| selectedIdx: selected.Index, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a critical bug in getAliasTargets related to how the selected provider index is handled. SelectProvider returns selected.Index, which is an index into the original resolved.Providers slice. However, the targets slice created in this function is a filtered version of resolved.Providers.
In the fallback loop, aliasInfo.selectedIdx (the original index) is compared with i (an index into the filtered targets slice). This comparison is incorrect and will cause the wrong provider to be skipped during fallback, or no provider to be skipped at all.
The selectedIdx stored in aliasInfo must be the index corresponding to the targets slice.
if len(targets) <= 1 {
// No fallback benefit if only one target
return nil
}
// Find the index of the selected provider in the filtered `targets` list.
finalSelectedIdx := -1
selectedProvider := resolved.Providers[selected.Index]
for i, t := range targets {
if t.provider == selectedProvider.Provider && t.model == selectedProvider.Model {
finalSelectedIdx = i
break
}
}
// This should not happen if the selection logic is correct, but as a safeguard:
if finalSelectedIdx == -1 {
return nil
}
return &aliasInfo{
targets: targets,
selectedIdx: finalSelectedIdx,
}| // If we have alias fallback targets and the error is fallback-eligible, try them | ||
| if err != nil && aliasInfo != nil && isAliasFallbackEligible(err) { | ||
| // Try each target except the one that was already selected | ||
| for i, target := range aliasInfo.targets { | ||
| if i == aliasInfo.selectedIdx { | ||
| continue // skip the one we already tried | ||
| } | ||
| log.Debugf("alias fallback: trying target %d/%d: provider=%s model=%s", | ||
| i+1, len(aliasInfo.targets), target.provider, target.model) | ||
|
|
||
| // Get providers for this target's model | ||
| targetProviders := util.GetProviderName(target.model) | ||
| if len(targetProviders) == 0 { | ||
| continue | ||
| } | ||
|
|
||
| // Update request with target's model | ||
| targetReq := coreexecutor.Request{ | ||
| Model: target.model, | ||
| Payload: cloneBytes(rawJSON), | ||
| Metadata: cloneMetadata(metadata), | ||
| } | ||
|
|
||
| resp, err = h.AuthManager.Execute(ctx, targetProviders, targetReq, opts) | ||
| if err == nil { | ||
| log.Debugf("alias fallback: succeeded with provider=%s model=%s", target.provider, target.model) | ||
| return cloneBytes(resp.Payload), nil | ||
| } | ||
|
|
||
| // If this error is not fallback-eligible, stop trying | ||
| if !isAliasFallbackEligible(err) { | ||
| break | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fallback logic for model aliases is duplicated in ExecuteWithAuthManager and ExecuteCountWithAuthManager (lines 460-494). This duplication makes the code harder to maintain and increases the risk of introducing inconsistencies if one block is updated and the other is not. Please refactor this logic into a shared helper function to improve code quality and maintainability.
| var ( | ||
| globalResolver *Resolver | ||
| globalResolverOnce sync.Once | ||
| globalResolverMu sync.RWMutex | ||
| ) | ||
|
|
||
| // GetGlobalResolver returns the global alias resolver instance. | ||
| // Creates a new empty resolver if not initialized. | ||
| func GetGlobalResolver() *Resolver { | ||
| globalResolverOnce.Do(func() { | ||
| globalResolver = NewResolver(nil) | ||
| }) | ||
| globalResolverMu.RLock() | ||
| defer globalResolverMu.RUnlock() | ||
| return globalResolver | ||
| } | ||
|
|
||
| // InitGlobalResolver initializes the global resolver with configuration. | ||
| // Should be called during server startup. | ||
| func InitGlobalResolver(cfg *config.ModelAliasConfig) { | ||
| globalResolverOnce.Do(func() { | ||
| globalResolver = NewResolver(cfg) | ||
| }) | ||
| globalResolverMu.Lock() | ||
| defer globalResolverMu.Unlock() | ||
| if globalResolver != nil && cfg != nil { | ||
| globalResolver.Update(cfg) | ||
| } | ||
| } | ||
|
|
||
| // UpdateGlobalResolver updates the global resolver configuration. | ||
| // Used for hot-reload. | ||
| func UpdateGlobalResolver(cfg *config.ModelAliasConfig) { | ||
| r := GetGlobalResolver() | ||
| if r != nil && cfg != nil { | ||
| r.Update(cfg) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of the global resolver singleton is overly complex. The use of sync.RWMutex is redundant given that the Resolver type is already internally thread-safe for its operations. The initialization logic in InitGlobalResolver is also confusing and performs a redundant Update call if it's the first function to initialize the resolver.
This can be greatly simplified to improve readability and maintainability by always initializing an empty resolver with sync.Once and then updating it. This removes the need for the global mutex and simplifies the initialization flow.
var (
globalResolver *Resolver
globalResolverOnce sync.Once
)
// GetGlobalResolver returns the global alias resolver instance.
// Creates a new empty resolver if not initialized.
func GetGlobalResolver() *Resolver {
globalResolverOnce.Do(func() {
globalResolver = NewResolver(nil)
})
return globalResolver
}
// InitGlobalResolver initializes the global resolver with configuration.
// Should be called during server startup.
func InitGlobalResolver(cfg *config.ModelAliasConfig) {
GetGlobalResolver().Update(cfg)
}
// UpdateGlobalResolver updates the global resolver configuration.
// Used for hot-reload.
func UpdateGlobalResolver(cfg *config.ModelAliasConfig) {
GetGlobalResolver().Update(cfg)
}| @@ -0,0 +1,120 @@ | |||
| package alias | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test suite for resolver.go is missing tests for the SelectProvider function. This function contains the core routing logic for round-robin and fill-first strategies, including stateful counter updates, and should be thoroughly tested to prevent regressions. Please add unit tests covering both strategies and edge cases like an empty provider list or providers without available credentials.
1c43984 to
f6f4085
Compare
Fixes issues identified by Gemini Code Assist: - Critical: Use selected model for initial request instead of alias name Added selectedModel field to aliasInfo and use it in ExecuteWithAuthManager - Critical: Fix index mismatch between original providers and filtered targets Track selectedIdxInTargets during the filtering loop in getAliasTargets - High: Refactor duplicated fallback logic into tryAliasFallback helper Added executeFn callback type for reusable fallback execution - Medium: Simplify global resolver singleton Removed redundant globalResolverMu mutex since sync.Once handles init and Resolver.Update has its own internal mutex - Medium: Add tests for SelectProvider function Added tests for nil input, strategies, counter initialization, and fill-first variants
f6f4085 to
59da579
Compare
Summary
This PR implements the global model alias feature I proposed in #632. I gave it a go at implementing it myself and would appreciate your consideration.
model-aliasesconfiguration section for mapping alias names to provider-specific modelsconfig.example.yamlUse Case
Different providers expose the same model with different names:
gemini-claude-opus-4-5-thinkingkiro-claude-opus-4-5-agenticWith this feature, users can define a single alias (e.g.,
opus-4.5) that maps to multiple providers. When quota is exhausted on one provider, the request automatically fails over to the next.Example Configuration
Test Plan
internal/alias/resolver_test.go)internal/alias/integration_test.go)Closes #632