Skip to content

Conversation

@PancakeZik
Copy link

Summary

This PR implements the global model alias feature I proposed in #632. I gave it a go at implementing it myself and would appreciate your consideration.

  • Adds model-aliases configuration section for mapping alias names to provider-specific models
  • Supports round-robin and fill-first routing strategies
  • Automatic fallback to next provider on rate limit (429), service unavailable (503), gateway timeout (504), or bad gateway (502) errors
  • Hot-reload support for alias configuration changes
  • Includes unit and integration tests
  • Added example configuration to config.example.yaml

Use Case

Different providers expose the same model with different names:

  • Antigravity: gemini-claude-opus-4-5-thinking
  • Kiro: kiro-claude-opus-4-5-agentic

With this feature, users can define a single alias (e.g., opus-4.5) that maps to multiple providers. When quota is exhausted on one provider, the request automatically fails over to the next.

Example Configuration

model-aliases:
  default-strategy: round-robin
  aliases:
    - alias: opus-4.5
      strategy: fill-first
      providers:
        - provider: antigravity
          model: gemini-claude-opus-4-5-thinking
        - provider: kiro
          model: kiro-claude-opus-4-5-agentic

Test Plan

  • Unit tests for alias resolver (internal/alias/resolver_test.go)
  • Integration tests for global resolver (internal/alias/integration_test.go)
  • Manual testing with real providers - verified fallback works when quota exhausted

Closes #632

Add a model-aliases configuration section that maps user-friendly alias
names to provider-specific model names. This enables automatic failover
across providers when quota is exhausted.

Features:
- New model-aliases config section with aliases and providers mappings
- Round-robin and fill-first routing strategies
- Automatic fallback on 429, 502, 503, 504 errors
- Hot-reload support for alias configuration changes
- Unit and integration tests for the alias resolver

Closes router-for-me#632
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @PancakeZik, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust global model alias feature, allowing the system to intelligently route requests for a single logical model across multiple underlying providers. It provides automatic failover capabilities based on configurable strategies and error conditions, significantly improving the system's ability to handle provider-specific model variations and temporary service disruptions. The implementation includes hot-reload support for configuration changes, ensuring operational flexibility.

Highlights

  • Global Model Aliases: Introduces a new configuration section for mapping user-friendly alias names to provider-specific models, enabling a single alias to represent multiple backend models.
  • Cross-Provider Fallback: Implements automatic fallback to the next configured provider when the current provider returns specific error codes (429, 502, 503, 504), enhancing reliability and resilience.
  • Routing Strategies: Supports 'round-robin' and 'fill-first' routing strategies for selecting providers within an alias, with a configurable default and per-alias overrides.
  • Hot-Reload Support: The model alias configuration can be hot-reloaded without restarting the server, allowing for dynamic updates to alias mappings.
  • Comprehensive Testing: Includes dedicated unit and integration tests for the alias resolver, along with manual verification of fallback behavior.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a valuable feature for global model aliases with cross-provider fallback. The implementation is comprehensive, covering configuration, hot-reloading, and core routing logic. However, my review identified a few critical issues in the request handling logic that would prevent the feature from working as intended, along with some areas for improvement in code structure and test coverage. Specifically, there are bugs related to resolving the initial model for an alias and incorrect index handling during fallback. There's also significant code duplication in the fallback implementation. Additionally, the global resolver's initialization can be simplified, and the new resolver logic lacks sufficient unit tests.

// Check if this is an alias with multiple fallback targets
aliasInfo := h.getAliasTargets(modelName)

providers, normalizedModel, metadata, errMsg := h.getRequestDetails(modelName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There is a critical issue in how the initial request for an alias is handled. The function calls h.getAliasTargets(modelName) but then proceeds to call h.getRequestDetails(modelName) with the original modelName. If modelName is an alias, getRequestDetails will fail because aliases are not registered in the global model registry, causing it to return an "unknown provider" error.

The initial request should use the specific model selected from the alias, not the alias name itself.

Comment on lines 720 to 728
if len(targets) <= 1 {
// No fallback benefit if only one target
return nil
}

return &aliasInfo{
targets: targets,
selectedIdx: selected.Index,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There is a critical bug in getAliasTargets related to how the selected provider index is handled. SelectProvider returns selected.Index, which is an index into the original resolved.Providers slice. However, the targets slice created in this function is a filtered version of resolved.Providers.

In the fallback loop, aliasInfo.selectedIdx (the original index) is compared with i (an index into the filtered targets slice). This comparison is incorrect and will cause the wrong provider to be skipped during fallback, or no provider to be skipped at all.

The selectedIdx stored in aliasInfo must be the index corresponding to the targets slice.

	if len(targets) <= 1 {
		// No fallback benefit if only one target
		return nil
	}

	// Find the index of the selected provider in the filtered `targets` list.
	finalSelectedIdx := -1
	selectedProvider := resolved.Providers[selected.Index]
	for i, t := range targets {
		if t.provider == selectedProvider.Provider && t.model == selectedProvider.Model {
			finalSelectedIdx = i
			break
		}
	}

	// This should not happen if the selection logic is correct, but as a safeguard:
	if finalSelectedIdx == -1 {
		return nil
	}

	return &aliasInfo{
		targets:     targets,
		selectedIdx: finalSelectedIdx,
	}

Comment on lines 379 to 452
// If we have alias fallback targets and the error is fallback-eligible, try them
if err != nil && aliasInfo != nil && isAliasFallbackEligible(err) {
// Try each target except the one that was already selected
for i, target := range aliasInfo.targets {
if i == aliasInfo.selectedIdx {
continue // skip the one we already tried
}
log.Debugf("alias fallback: trying target %d/%d: provider=%s model=%s",
i+1, len(aliasInfo.targets), target.provider, target.model)

// Get providers for this target's model
targetProviders := util.GetProviderName(target.model)
if len(targetProviders) == 0 {
continue
}

// Update request with target's model
targetReq := coreexecutor.Request{
Model: target.model,
Payload: cloneBytes(rawJSON),
Metadata: cloneMetadata(metadata),
}

resp, err = h.AuthManager.Execute(ctx, targetProviders, targetReq, opts)
if err == nil {
log.Debugf("alias fallback: succeeded with provider=%s model=%s", target.provider, target.model)
return cloneBytes(resp.Payload), nil
}

// If this error is not fallback-eligible, stop trying
if !isAliasFallbackEligible(err) {
break
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The fallback logic for model aliases is duplicated in ExecuteWithAuthManager and ExecuteCountWithAuthManager (lines 460-494). This duplication makes the code harder to maintain and increases the risk of introducing inconsistencies if one block is updated and the other is not. Please refactor this logic into a shared helper function to improve code quality and maintainability.

Comment on lines 9 to 33
var (
globalResolver *Resolver
globalResolverOnce sync.Once
globalResolverMu sync.RWMutex
)

// GetGlobalResolver returns the global alias resolver instance.
// Creates a new empty resolver if not initialized.
func GetGlobalResolver() *Resolver {
globalResolverOnce.Do(func() {
globalResolver = NewResolver(nil)
})
globalResolverMu.RLock()
defer globalResolverMu.RUnlock()
return globalResolver
}

// InitGlobalResolver initializes the global resolver with configuration.
// Should be called during server startup.
func InitGlobalResolver(cfg *config.ModelAliasConfig) {
globalResolverOnce.Do(func() {
globalResolver = NewResolver(cfg)
})
globalResolverMu.Lock()
defer globalResolverMu.Unlock()
if globalResolver != nil && cfg != nil {
globalResolver.Update(cfg)
}
}

// UpdateGlobalResolver updates the global resolver configuration.
// Used for hot-reload.
func UpdateGlobalResolver(cfg *config.ModelAliasConfig) {
r := GetGlobalResolver()
if r != nil && cfg != nil {
r.Update(cfg)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The implementation of the global resolver singleton is overly complex. The use of sync.RWMutex is redundant given that the Resolver type is already internally thread-safe for its operations. The initialization logic in InitGlobalResolver is also confusing and performs a redundant Update call if it's the first function to initialize the resolver.

This can be greatly simplified to improve readability and maintainability by always initializing an empty resolver with sync.Once and then updating it. This removes the need for the global mutex and simplifies the initialization flow.

var (
	globalResolver     *Resolver
	globalResolverOnce sync.Once
)

// GetGlobalResolver returns the global alias resolver instance.
// Creates a new empty resolver if not initialized.
func GetGlobalResolver() *Resolver {
	globalResolverOnce.Do(func() {
		globalResolver = NewResolver(nil)
	})
	return globalResolver
}

// InitGlobalResolver initializes the global resolver with configuration.
// Should be called during server startup.
func InitGlobalResolver(cfg *config.ModelAliasConfig) {
	GetGlobalResolver().Update(cfg)
}

// UpdateGlobalResolver updates the global resolver configuration.
// Used for hot-reload.
func UpdateGlobalResolver(cfg *config.ModelAliasConfig) {
	GetGlobalResolver().Update(cfg)
}

@@ -0,0 +1,120 @@
package alias
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The test suite for resolver.go is missing tests for the SelectProvider function. This function contains the core routing logic for round-robin and fill-first strategies, including stateful counter updates, and should be thoroughly tested to prevent regressions. Please add unit tests covering both strategies and edge cases like an empty provider list or providers without available credentials.

@PancakeZik PancakeZik force-pushed the feature/model-aliases branch from 1c43984 to f6f4085 Compare December 28, 2025 22:41
Fixes issues identified by Gemini Code Assist:

- Critical: Use selected model for initial request instead of alias name
  Added selectedModel field to aliasInfo and use it in ExecuteWithAuthManager

- Critical: Fix index mismatch between original providers and filtered targets
  Track selectedIdxInTargets during the filtering loop in getAliasTargets

- High: Refactor duplicated fallback logic into tryAliasFallback helper
  Added executeFn callback type for reusable fallback execution

- Medium: Simplify global resolver singleton
  Removed redundant globalResolverMu mutex since sync.Once handles init
  and Resolver.Update has its own internal mutex

- Medium: Add tests for SelectProvider function
  Added tests for nil input, strategies, counter initialization, and
  fill-first variants
@PancakeZik PancakeZik force-pushed the feature/model-aliases branch from f6f4085 to 59da579 Compare December 28, 2025 22:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant