@Pratham-Mishra04
Collaborator

Summary

Briefly explain the purpose of this PR and the problem it solves.

Changes

  • What was changed and why
  • Any notable design decisions or trade-offs

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Describe the steps to validate this change. Include commands and expected outcomes.

# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build

If adding new configs or environment variables, document them here.

Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

Breaking changes

  • Yes
  • No

If yes, describe impact and migration instructions.

Related issues

Link related issues and discussions. Example: Closes #123

Security considerations

Note any security implications (auth, secrets, PII, sandboxing, etc.).

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@coderabbitai
Contributor

coderabbitai bot commented Dec 10, 2025

📝 Walkthrough

Summary by CodeRabbit

  • Tests

    • Added a large, comprehensive governance test suite: end-to-end, integration, and unit tests covering budgets (VK/team/customer/provider), rate limits, in-memory sync, usage tracking, provider routing/fallbacks, deletion cascades, concurrency, and edge cases.
    • Added test scaffolding, fixtures, mock logger, and helpers for API interactions and cost calculations.
  • Chores

    • Added test dependencies and updated module metadata.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds a large suite of governance tests and test scaffolding: new unit, integration, and end-to-end tests exercising budgets, rate limits, provider routing, in-memory sync, and cascade deletions, plus test fixtures/utilities, changelog entries, and go.mod test dependencies.

Changes

Cohort / File(s) Summary
Test utilities & scaffolding
plugins/governance/test_utils.go, plugins/governance/fixtures_test.go
New test helpers and fixtures: model cost registry & CalculateCost, API test primitives (APIRequest/APIResponse, MakeRequest), ID/error helpers, GlobalTestData lifecycle/cleanup, polling helpers, and a thread-safe MockLogger with builders/assertion helpers. Many exported test types and helpers added.
End-to-end & advanced scenarios
plugins/governance/e2e_test.go, plugins/governance/advanced_scenarios_test.go, plugins/governance/edge_cases_test.go
Large E2E scenarios exercising hierarchical budgets, VK/team/customer/provider switching after exhaustion, provider routing/fallback/load‑balancing, concurrency, cascade deletions, and complex multi‑tier budget cases; multiple public test entry points added.
Budget enforcement tests
plugins/governance/vk_budget_test.go, plugins/governance/team_budget_test.go, plugins/governance/customer_budget_test.go, plugins/governance/provider_budget_test.go
Tests creating VKs/teams/customers/provider budgets and issuing chat-completion requests to exhaust budgets; validate budget-exceeded behavior and log per-request consumption.
Rate limit tests
plugins/governance/rate_limit_test.go, plugins/governance/rate_limit_enforcement_test.go
VK- and provider-level token/request rate-limit tests, enforcement and combined scenarios, in-memory usage tracking, and reset/verifications.
In-memory sync & config update tests
plugins/governance/in_memory_sync_test.go, plugins/governance/config_update_sync_test.go, plugins/governance/usage_tracking_test.go
Tests validating async propagation of budgets/rate-limits/provider configs into in-memory structures, reset semantics, and persistence/usage tracking across requests.
Resolver, store, and tracker tests
plugins/governance/resolver_test.go, plugins/governance/store_test.go, plugins/governance/tracker_test.go
Unit/integration tests for BudgetResolver evaluation, governance store operations (concurrency, budget checks, resets), and UsageTracker behaviors (streaming optimization, failed-request handling).
Sync / advanced orchestration tests
plugins/governance/config_update_sync_test.go, plugins/governance/in_memory_sync_test.go, plugins/governance/advanced_scenarios_test.go
Parallelized integration tests covering updates to budgets/rate limits/provider configs and verifying synchronized in-memory state and update/reset semantics.
Changelog & go.mod
plugins/governance/changelog.md, plugins/governance/go.mod
Adds changelog entries and updates go.mod with test dependencies (e.g., github.com/stretchr/testify) and some indirect entries.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

"I nibbled tokens in the midnight hush,
hopped through VK lanes with a quieter rush,
budgets stacked like carrots in a row,
deletions left a trail where soft winds blow,
a rabbit's test—small paws, bright nose." 🐇

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings)
  • Description check: ⚠️ Warning
    Explanation: The PR description is entirely empty: only the template is present with all sections blank and unchecked, providing no concrete information about the changes made.
    Resolution: Fill in the PR description with a summary of the governance tests added, explain the test coverage objectives, specify affected areas (Plugins/Governance), and check relevant checklist items.
  • Linked Issues check: ⚠️ Warning
    Explanation: The PR adds comprehensive governance tests but is linked to issue #123 (Files API Support), which is entirely unrelated to governance test coverage and concerns provider file upload functionality.
    Resolution: Either link the PR to relevant governance-related issues or remove the incorrect link to #123; clarify the actual testing objectives in the PR description.
  • Out of Scope Changes check: ⚠️ Warning
    Explanation: The entire PR scope appears misaligned: all changes are comprehensive governance tests, but the linked issue #123 concerns Files API (unrelated feature). This suggests either incorrect linking or scope mismatch.
    Resolution: Review and correct the linked issue to match the governance testing scope, or clarify if governance tests are prerequisites for the Files API feature work.
✅ Passed checks (2 passed)
  • Title check: ✅ Passed
    Explanation: The title 'chore: governance tests added' is concise, clear, and accurately describes the primary change: adding comprehensive test coverage for the governance plugin.
  • Docstring Coverage: ✅ Passed
    Explanation: Docstring coverage is 86.32%, which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 12-09-chore_governance_tests_added

Comment @coderabbitai help to get the list of available commands and usage tips.

@Pratham-Mishra04 force-pushed the 12-09-chore_governance_tests_added branch from 4ca3be1 to 51b5738 on December 10, 2025 13:37
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 14

🧹 Nitpick comments (21)
plugins/governance/test_utils.go (3)

142-151: Deprecated rand.Seed usage.

rand.Seed has been deprecated since Go 1.20, where the global random generator is seeded automatically. You can simplify this by removing the seed call or, better yet, use math/rand/v2 (available from Go 1.22) for new code.

 // generateRandomID generates a random ID for test resources
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }
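
A minimal, standalone sketch of the same helper without the seed call. The length parameter n is an addition for illustration and is not part of the PR's generateRandomID; the sketch relies on the Go 1.20+ behavior of an automatically seeded global generator.

```go
package main

import (
	"fmt"
	"math/rand"
)

// letters is the alphabet used for generated IDs (same set as in the snippet above).
const letters = "abcdefghijklmnopqrstuvwxyz0123456789"

// generateRandomID returns an n-character ID drawn from letters.
// No rand.Seed call is made: on Go 1.20+ the global generator is seeded automatically.
func generateRandomID(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	fmt.Println(generateRandomID(8))
}
```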

90-92: Consider adding a timeout to the HTTP client.

Tests can hang indefinitely if the server at localhost:8080 is unresponsive. Adding a timeout improves test reliability.

 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-	client := &http.Client{}
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}
 	url := fmt.Sprintf("http://localhost:8080%s", req.Path)
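
To illustrate what the timeout buys, here is a small sketch against a local httptest server that never answers in time. requestWithTimeout and demoSlowServer are hypothetical names used only for this illustration, and the 100ms value is chosen to keep the demo short rather than as a recommendation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// requestWithTimeout issues a GET using a client that gives up after timeout.
func requestWithTimeout(url string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return nil
}

// demoSlowServer starts a local server whose handler stalls, then calls it
// with a short client timeout and returns the resulting error.
func demoSlowServer() error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Stall until the client disconnects or two seconds pass.
		select {
		case <-r.Context().Done():
		case <-time.After(2 * time.Second):
		}
	}))
	defer srv.Close()
	return requestWithTimeout(srv.URL, 100*time.Millisecond)
}

func main() {
	fmt.Println("error:", demoSlowServer())
}
```

Without the Timeout field the same call would block for as long as the handler stalls.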

247-267: Unused keyPath parameter.

The keyPath parameter is declared but never used in the function body. Either remove it or use it to make the function more flexible.

-func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+func ExtractIDFromResponse(t *testing.T, resp *APIResponse) string {

Alternatively, use keyPath to search a specific key instead of iterating through the hardcoded list.
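
If keyPath is kept, one possible interpretation is a dot-separated path into the decoded response body. The sketch below is an assumption about how that could look, not the PR's implementation; extractByPath and the "virtual_key.id" layout are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// extractByPath walks a decoded JSON object along a dot-separated keyPath
// (for example "virtual_key.id") and returns the string stored there.
func extractByPath(body map[string]interface{}, keyPath string) (string, bool) {
	var current interface{} = body
	for _, key := range strings.Split(keyPath, ".") {
		obj, ok := current.(map[string]interface{})
		if !ok {
			return "", false // path goes through a non-object value
		}
		current, ok = obj[key]
		if !ok {
			return "", false // key is missing at this level
		}
	}
	id, ok := current.(string)
	return id, ok
}

func main() {
	body := map[string]interface{}{
		"virtual_key": map[string]interface{}{"id": "vk-123"},
	}
	fmt.Println(extractByPath(body, "virtual_key.id"))
}
```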

plugins/governance/provider_budget_test.go (1)

57-144: Consider extracting duplicated subtest logic into a helper function.

The OpenAIProviderBudgetExceeded and AnthropicProviderBudgetExceeded subtests contain nearly identical code (~90 lines each). Extracting a helper like runProviderBudgetTest(t, vkValue, model, providerBudget) would reduce duplication and improve maintainability.

Example structure:

func runProviderBudgetTest(t *testing.T, vkValue, model string, providerBudget float64) {
    // shared loop logic with model parameter
}

t.Run("OpenAIProviderBudgetExceeded", func(t *testing.T) {
    runProviderBudgetTest(t, vkValue, "openai/gpt-4o", 0.01)
})

t.Run("AnthropicProviderBudgetExceeded", func(t *testing.T) {
    runProviderBudgetTest(t, vkValue, "anthropic/claude-3-7-sonnet-20250219", 0.01)
})

Also applies to: 147-234

plugins/governance/config_update_sync_test.go (1)

91-91: Fixed sleeps may cause flaky tests under load.

The 500ms sleep duration assumes sync completes within that window. Under CI load or slow network conditions, this could cause intermittent failures. Consider implementing a polling helper with timeout.

// Example polling helper
func waitForCondition(t *testing.T, check func() bool, timeout time.Duration) bool {
    deadline := time.Now().Add(timeout)
    for time.Now().Before(deadline) {
        if check() {
            return true
        }
        time.Sleep(50 * time.Millisecond)
    }
    return false
}

Also applies to: 134-134, 183-183

plugins/governance/customer_budget_test.go (1)

77-94: Duplicated long prompt string across multiple test files.

The longPrompt string (~15 lines) is duplicated across vk_budget_test.go, provider_budget_test.go, and customer_budget_test.go. Consider extracting to a constant in test_utils.go.

// In test_utils.go
const TestLongPrompt = "Please provide a comprehensive and detailed response..."

Also applies to: 252-269

plugins/governance/tracker_test.go (3)

15-16: Mismatched comment and function name.

The comment says "tests successful usage tracking" but the function is named TestUsageTracker_UpdateUsage_FailedRequest. Update the comment to match the test's actual purpose:

-// TestUsageTracker_UpdateUsage_Successful tests successful usage tracking
+// TestUsageTracker_UpdateUsage_FailedRequest tests that failed requests don't update budgets
 func TestUsageTracker_UpdateUsage_FailedRequest(t *testing.T) {

76-82: The assertion assert.True(t, true) provides no value.

This assertion always passes and doesn't verify any behavior. If the goal is to ensure the code doesn't crash, the test already achieves that by reaching this point. Remove the no-op assertion:

 	tracker.UpdateUsage(context.Background(), update)

 	time.Sleep(100 * time.Millisecond)
-	// Just verify it doesn't crash
-	assert.True(t, true)
+	// Test passes if no panic occurred
 }

197-198: Mismatched comment and function name.

The comment refers to TestUsageTracker_UpdateBudgetHierarchy but the function is TestUsageTracker_Cleanup:

-// TestUsageTracker_UpdateBudgetHierarchy tests multi-level budget updates
+// TestUsageTracker_Cleanup tests that the tracker cleanup succeeds
 func TestUsageTracker_Cleanup(t *testing.T) {
plugins/governance/in_memory_sync_test.go (1)

402-403: Consider reducing the sleep duration for delete sync verification.

The 2-second sleep is relatively long compared to other tests (500ms). If this delay is necessary due to async delete propagation, consider documenting why, or if it's overly conservative, reduce it for faster test execution:

 	// Verify in-memory store is updated
-	time.Sleep(2 * time.Second)
+	time.Sleep(500 * time.Millisecond) // Allow time for async delete propagation
plugins/governance/edge_cases_test.go (2)

103-120: String conversion keeps only the last digit of the request number.

string(rune('0'+requestNum%10)) yields a single ASCII digit. Because of the % 10, the result always stays within '0'-'9', but requestNum goes up to 50, so the prompt only ever shows the last digit and cycles through 0-9 repeatedly. This is not a bug (the prompt content doesn't affect the test logic), but fmt.Sprintf would be clearer:

-		"Request number " + string(rune('0'+requestNum%10)) + ". " +
+		fmt.Sprintf("Request number %d. ", requestNum) +

Note: This requires adding "fmt" to imports.
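
A short comparison of the three forms, as a standalone sketch (the function names are illustrative only). The modulo variant drops every digit but the last, the variant without a modulo leaves the digit range at 10, and fmt.Sprintf formats the full number.

```go
package main

import "fmt"

// digitLabel mirrors the modulo pattern: only the last decimal digit survives.
func digitLabel(n int) string {
	return string(rune('0' + n%10))
}

// rawLabel mirrors the pattern without a modulo: valid only for 0 through 9.
func rawLabel(n int) string {
	return string(rune('0' + n))
}

// fullLabel formats the whole number.
func fullLabel(n int) string {
	return fmt.Sprintf("%d", n)
}

func main() {
	for _, n := range []int{7, 10, 12} {
		fmt.Printf("n=%d modulo=%q raw=%q sprintf=%q\n", n, digitLabel(n), rawLabel(n), fullLabel(n))
	}
}
```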


170-179: Potential off-by-one: shouldStop check occurs after incrementing requestNum.

The shouldStop flag is checked after requestNum++, meaning one additional request may be attempted after the budget threshold is crossed. This is likely intentional to verify the next request fails, but the logic flow is subtle. Consider restructuring for clarity:

 		requestNum++

-		if shouldStop {
-			break
-		}
-
 		if consumedBudget >= 0.08 { // Provider budget
-			shouldStop = true
+			// Budget threshold reached, next iteration should fail
+			continue
 		}
 	}

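Note that with this version the loop no longer stops on its own after the threshold is crossed; it exits only through the rejection handling earlier in the loop body or the loop's iteration limit. Alternatively, break immediately when the threshold is hit if you don't need the extra iteration.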
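
One way to make the intent explicit is to let the rejection itself end the loop. The sketch below uses a stand-in fakeGateway rather than the real server and assumes enforcement happens before each request, so the request that crosses the threshold is accepted and the following one is rejected. All names are hypothetical.

```go
package main

import "fmt"

// fakeGateway is a stand-in for the real server: it rejects a request once
// the spend recorded so far has reached the limit.
type fakeGateway struct {
	limit float64
	spent float64
}

// send charges cost and reports whether the request was accepted.
func (g *fakeGateway) send(cost float64) bool {
	if g.spent >= g.limit {
		return false
	}
	g.spent += cost
	return true
}

// exhaust sends requests until one is rejected or maxRequests is reached.
// It returns how many requests were accepted and whether a rejection was seen.
func exhaust(g *fakeGateway, cost float64, maxRequests int) (accepted int, rejected bool) {
	for i := 0; i < maxRequests; i++ {
		if !g.send(cost) {
			return accepted, true
		}
		accepted++
	}
	return accepted, false
}

func main() {
	g := &fakeGateway{limit: 0.08}
	accepted, rejected := exhaust(g, 0.03, 50)
	fmt.Println(accepted, rejected)
}
```

With this shape there is no flag to track: the test fails if the iteration limit is reached without a rejection.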

plugins/governance/usage_tracking_test.go (3)

8-129: Test requires ~35 seconds to complete due to reset timer wait.

This test waits 35 seconds for the rate limit ticker to reset. While necessary for testing the actual reset behavior, consider documenting this in the test name or adding a build tag to exclude from fast/unit test runs:

-// TestUsageTrackingRateLimitReset tests that rate limit resets happen correctly on ticker
+// TestUsageTrackingRateLimitReset tests that rate limit resets happen correctly on ticker.
+// Note: This test takes ~35 seconds due to waiting for the reset duration.
 func TestUsageTrackingRateLimitReset(t *testing.T) {
+	if testing.Short() {
+		t.Skip("Skipping slow reset test in short mode")
+	}

100-111: Response contents unused: getDataResp is fetched but only its status code is checked.

The response at lines 101-104 is retrieved, but no assertions are made on its contents before the 35-second sleep. Either add assertions to verify the rate limit state after the request, or remove this unnecessary API call:

-	// Get rate limit data after request
-	getDataResp := MakeRequest(t, APIRequest{
-		Method: "GET",
-		Path:   "/api/governance/data",
-	})
-
-	if getDataResp.StatusCode != 200 {
-		t.Fatalf("Failed to get governance data: status %d", getDataResp.StatusCode)
-	}
-
 	// Rate limit counter should have been updated
 	t.Logf("Rate limit should be tracking usage in in-memory store")

393-394: Same string conversion pattern as noted in edge_cases_test.go.

Consider using fmt.Sprintf for clearer request numbering in prompts:

-					Content: "Test request " + string(rune('0'+i)) + " for reset ticker test.",
+					Content: fmt.Sprintf("Test request %d for reset ticker test.", i),

Also applies to: 512-513

plugins/governance/rate_limit_enforcement_test.go (2)

68-70: Consider simplifying the loop control logic.

The shouldStop variable pattern adds unnecessary complexity. The loop already has exit conditions via break and the iteration limit.

-	consumedTokens := int64(0)
-	requestNum := 1
-	shouldStop := false
-
-	for requestNum <= 20 {
+	consumedTokens := int64(0)
+	for requestNum := 1; requestNum <= 20; requestNum++ {
 		resp := MakeRequest(t, APIRequest{
 			// ...
 		})
 		// ... handle response ...
-
-		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedTokens >= tokenLimit {
-			shouldStop = true
+		if consumedTokens >= tokenLimit {
+			// Next iteration will trigger rejection
+			continue
 		}
 	}

Also applies to: 119-126


585-586: Fixed sleep duration may cause test flakiness.

The 1-second sleep for async updates could be insufficient under load or in slower CI environments. Consider using a polling mechanism with timeout instead.

-	// Wait for async update
-	time.Sleep(1 * time.Second)
+	// Wait for async update with polling
+	deadline := time.Now().Add(5 * time.Second)
+	for time.Now().Before(deadline) {
+		// Check if rate limit data is updated
+		getDataResp := MakeRequest(t, APIRequest{
+			Method: "GET",
+			Path:   "/api/governance/data",
+		})
+		if getDataResp.StatusCode == 200 {
+			// Verify data is present and break
+			break
+		}
+		time.Sleep(100 * time.Millisecond)
+	}

Alternatively, consider extracting a helper like waitForAsyncUpdate(t, timeout) to standardize this pattern across tests.

plugins/governance/advanced_scenarios_test.go (2)

88-129: Extract budget exhaustion loop into a reusable helper.

This budget exhaustion pattern is duplicated across 7+ tests in this file and appears similarly in e2e_test.go. Consider extracting a helper function to reduce duplication and improve maintainability.

Add to test_utils.go:

// ExhaustBudget makes requests until budget is exceeded, returns consumed amount
func ExhaustBudget(t *testing.T, vkValue string, budgetLimit float64, maxRequests int) float64 {
    consumedBudget := 0.0
    for i := 1; i <= maxRequests; i++ {
        resp := MakeRequest(t, APIRequest{
            Method: "POST",
            Path:   "/v1/chat/completions",
            Body: ChatCompletionRequest{
                Model: "openai/gpt-4o",
                Messages: []ChatMessage{
                    {Role: "user", Content: "Hello how are you?"},
                },
            },
            VKHeader: &vkValue,
        })

        if resp.StatusCode >= 400 {
            if CheckErrorMessage(t, resp, "budget") {
                return consumedBudget
            }
            t.Fatalf("Request %d failed with unexpected error: %v", i, resp.Body)
        }

        if usage, ok := resp.Body["usage"].(map[string]interface{}); ok {
            if prompt, ok := usage["prompt_tokens"].(float64); ok {
                if completion, ok := usage["completion_tokens"].(float64); ok {
                    cost, _ := CalculateCost("openai/gpt-4o", int(prompt), int(completion))
                    consumedBudget += cost
                }
            }
        }
    }
    return consumedBudget
}
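
For readers without test_utils.go at hand, the cost bookkeeping in the helper above can be sketched as follows. This is an assumption about the shape of CalculateCost (a per-model price lookup returning a cost and an error), and the model name and prices are placeholders, not real pricing.

```go
package main

import "fmt"

// modelPrice holds hypothetical USD prices per one million tokens.
type modelPrice struct {
	inputPerMillion  float64
	outputPerMillion float64
}

// prices is an illustrative registry; the figures are placeholders.
var prices = map[string]modelPrice{
	"example/model": {inputPerMillion: 2.0, outputPerMillion: 8.0},
}

// calculateCost converts token counts into a dollar amount for a known model.
func calculateCost(model string, promptTokens, completionTokens int) (float64, error) {
	p, ok := prices[model]
	if !ok {
		return 0, fmt.Errorf("no pricing registered for model %q", model)
	}
	in := float64(promptTokens) / 1e6 * p.inputPerMillion
	out := float64(completionTokens) / 1e6 * p.outputPerMillion
	return in + out, nil
}

// usageTokens reads token counts from a decoded JSON body, where numbers
// arrive as float64 after json.Unmarshal into map[string]interface{}.
func usageTokens(body map[string]interface{}) (int, int, bool) {
	usage, ok := body["usage"].(map[string]interface{})
	if !ok {
		return 0, 0, false
	}
	prompt, okP := usage["prompt_tokens"].(float64)
	completion, okC := usage["completion_tokens"].(float64)
	if !okP || !okC {
		return 0, 0, false
	}
	return int(prompt), int(completion), true
}

func main() {
	body := map[string]interface{}{
		"usage": map[string]interface{}{"prompt_tokens": 1000.0, "completion_tokens": 500.0},
	}
	if p, c, ok := usageTokens(body); ok {
		cost, err := calculateCost("example/model", p, c)
		fmt.Println(cost, err)
	}
}
```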

150-151: Consolidate async wait pattern.

time.Sleep(500 * time.Millisecond) appears 12+ times across this file for waiting on in-memory updates. Consider defining a constant or helper function to make this consistent and easier to tune.

const asyncUpdateWait = 500 * time.Millisecond

// Or better, a helper with optional polling:
func waitForInMemoryUpdate(t *testing.T) {
    time.Sleep(asyncUpdateWait)
}
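
A polling variant of that helper could look like the sketch below. It is standalone (no *testing.T) so it can run on its own; pollUntil and demoAsyncUpdate are hypothetical names, and the asynchronous update is simulated with a goroutine rather than the real in-memory store.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// pollUntil calls check every interval until it returns true or timeout elapses.
func pollUntil(timeout, interval time.Duration, check func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if check() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

// demoAsyncUpdate simulates a value that becomes visible after delayMs and
// reports whether polling observed it within timeoutMs.
func demoAsyncUpdate(delayMs, timeoutMs int) bool {
	var synced int32
	go func() {
		time.Sleep(time.Duration(delayMs) * time.Millisecond)
		atomic.StoreInt32(&synced, 1)
	}()
	return pollUntil(time.Duration(timeoutMs)*time.Millisecond, 10*time.Millisecond, func() bool {
		return atomic.LoadInt32(&synced) == 1
	})
}

func main() {
	fmt.Println(demoAsyncUpdate(50, 2000)) // update arrives well before the timeout
	fmt.Println(demoAsyncUpdate(500, 50))  // update arrives too late
}
```

Compared with a fixed sleep, the fast path returns as soon as the condition holds, and the timeout can be generous without slowing down passing runs.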
plugins/governance/e2e_test.go (2)

1010-1014: Use a named constant for the post-reset budget threshold.

The magic number 0.01 assumes a single request costs approximately $0.003. This assumption may break if model pricing changes or if the test model is updated.

+	// Expected cost of one gpt-4o request (with margin for response variability)
+	const maxSingleRequestCost = 0.01
+
 	// The usage after reset should be approximately the cost of one request (~$0.003)
-	if usageAfterReset > 0.01 { // Allow some margin for actual LLM response
+	if usageAfterReset > maxSingleRequestCost {
 		t.Fatalf("Budget usage after reset too high: $%.6f (expected ~$0.003 for one request)", usageAfterReset)
 	}

654-656: Consider marking this as a long-running test.

The 15-second reset duration plus wait time makes this test take 16+ seconds. For faster CI feedback, consider either:

  1. Using a shorter reset duration if the system supports it (e.g., "5s")
  2. Adding a build tag to run this test only in extended test suites
+// TestRateLimitResetBoundaryConditions is a long-running test that verifies rate limit resets.
+// Consider using -short flag to skip in quick test runs.
 func TestRateLimitResetBoundaryConditions(t *testing.T) {
+	if testing.Short() {
+		t.Skip("Skipping long-running rate limit reset test in short mode")
+	}
 	t.Parallel()
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d81e9d and 51b5738.

📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go (1 hunks)
  • plugins/governance/changelog.md (1 hunks)
  • plugins/governance/config_update_sync_test.go (1 hunks)
  • plugins/governance/customer_budget_test.go (1 hunks)
  • plugins/governance/e2e_test.go (1 hunks)
  • plugins/governance/edge_cases_test.go (1 hunks)
  • plugins/governance/fixtures_test.go (1 hunks)
  • plugins/governance/go.mod (3 hunks)
  • plugins/governance/in_memory_sync_test.go (1 hunks)
  • plugins/governance/provider_budget_test.go (1 hunks)
  • plugins/governance/rate_limit_enforcement_test.go (1 hunks)
  • plugins/governance/rate_limit_test.go (1 hunks)
  • plugins/governance/resolver_test.go (1 hunks)
  • plugins/governance/store_test.go (1 hunks)
  • plugins/governance/team_budget_test.go (1 hunks)
  • plugins/governance/test_utils.go (1 hunks)
  • plugins/governance/tracker_test.go (1 hunks)
  • plugins/governance/usage_tracking_test.go (1 hunks)
  • plugins/governance/vk_budget_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/tracker_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/changelog.md
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/e2e_test.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/tracker_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/e2e_test.go
🧬 Code graph analysis (12)
plugins/governance/provider_budget_test.go (1)
plugins/governance/test_utils.go (10)
  • NewGlobalTestData (306-312)
  • APIRequest (75-80)
  • CreateVirtualKeyRequest (154-163)
  • BudgetRequest (176-179)
  • ProviderConfigRequest (166-173)
  • ExtractIDFromResponse (247-267)
  • ChatCompletionRequest (232-238)
  • ChatMessage (241-244)
  • CheckErrorMessage (271-291)
  • CalculateCost (63-72)
plugins/governance/team_budget_test.go (1)
plugins/governance/test_utils.go (8)
  • MakeRequest (90-140)
  • APIRequest (75-80)
  • CreateTeamRequest (182-186)
  • BudgetRequest (176-179)
  • ExtractIDFromResponse (247-267)
  • CreateVirtualKeyRequest (154-163)
  • CheckErrorMessage (271-291)
  • CalculateCost (63-72)
plugins/governance/resolver_test.go (7)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (68-87)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (67-74)
plugins/governance/resolver.go (1)
  • NewBudgetResolver (70-75)
core/schemas/context.go (1)
  • BifrostContext (30-40)
ui/lib/types/governance.ts (5)
  • VirtualKey (61-80)
  • RateLimit (13-25)
  • Budget (5-11)
  • Customer (37-44)
  • Team (27-35)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostContextKey (101-101)
core/schemas/models.go (1)
  • Model (109-129)
plugins/governance/in_memory_sync_test.go (2)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (306-312)
  • MakeRequest (90-140)
  • APIRequest (75-80)
  • CreateVirtualKeyRequest (154-163)
  • BudgetRequest (176-179)
  • ExtractIDFromResponse (247-267)
  • UpdateVirtualKeyRequest (209-217)
  • UpdateBudgetRequest (195-198)
  • CreateTeamRequest (182-186)
  • UpdateTeamRequest (220-223)
  • CreateCustomerRequest (189-192)
  • UpdateCustomerRequest (226-229)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
plugins/governance/edge_cases_test.go (1)
plugins/governance/test_utils.go (13)
  • NewGlobalTestData (306-312)
  • MakeRequest (90-140)
  • APIRequest (75-80)
  • CreateCustomerRequest (189-192)
  • BudgetRequest (176-179)
  • ExtractIDFromResponse (247-267)
  • CreateTeamRequest (182-186)
  • CreateVirtualKeyRequest (154-163)
  • ProviderConfigRequest (166-173)
  • ChatCompletionRequest (232-238)
  • ChatMessage (241-244)
  • CheckErrorMessage (271-291)
  • CalculateCost (63-72)
plugins/governance/fixtures_test.go (3)
core/schemas/logger.go (2)
  • LogLevel (6-6)
  • LoggerOutputType (17-17)
ui/lib/types/governance.ts (3)
  • Budget (5-11)
  • RateLimit (13-25)
  • VirtualKey (61-80)
plugins/governance/resolver.go (1)
  • Decision (15-15)
plugins/governance/rate_limit_test.go (2)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (306-312)
  • MakeRequest (90-140)
  • APIRequest (75-80)
  • CreateVirtualKeyRequest (154-163)
  • CreateRateLimitRequest (201-206)
  • ExtractIDFromResponse (247-267)
  • ChatCompletionRequest (232-238)
  • ChatMessage (241-244)
  • CheckErrorMessage (271-291)
  • ProviderConfigRequest (166-173)
  • UpdateVirtualKeyRequest (209-217)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
plugins/governance/test_utils.go (4)
ui/components/header.tsx (1)
  • Header (4-14)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (13-13)
  • Content (922-930)
plugins/governance/config_update_sync_test.go (2)
plugins/governance/test_utils.go (3)
  • NewGlobalTestData (306-312)
  • MakeRequest (90-140)
  • APIRequest (75-80)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
plugins/governance/advanced_scenarios_test.go (1)
plugins/governance/test_utils.go (17)
  • MakeRequest (90-140)
  • APIRequest (75-80)
  • CreateTeamRequest (182-186)
  • BudgetRequest (176-179)
  • ExtractIDFromResponse (247-267)
  • CreateVirtualKeyRequest (154-163)
  • ChatCompletionRequest (232-238)
  • ChatMessage (241-244)
  • CheckErrorMessage (271-291)
  • CalculateCost (63-72)
  • UpdateVirtualKeyRequest (209-217)
  • CreateCustomerRequest (189-192)
  • UpdateBudgetRequest (195-198)
  • UpdateTeamRequest (220-223)
  • UpdateCustomerRequest (226-229)
  • ProviderConfigRequest (166-173)
  • CreateRateLimitRequest (201-206)
plugins/governance/store_test.go (3)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (68-87)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (67-74)
plugins/governance/e2e_test.go (1)
plugins/governance/test_utils.go (15)
  • NewGlobalTestData (306-312)
  • MakeRequest (90-140)
  • APIRequest (75-80)
  • CreateTeamRequest (182-186)
  • BudgetRequest (176-179)
  • ExtractIDFromResponse (247-267)
  • CreateVirtualKeyRequest (154-163)
  • ChatCompletionRequest (232-238)
  • ChatMessage (241-244)
  • CheckErrorMessage (271-291)
  • CalculateCost (63-72)
  • CreateCustomerRequest (189-192)
  • ProviderConfigRequest (166-173)
  • UpdateVirtualKeyRequest (209-217)
  • CreateRateLimitRequest (201-206)
🪛 LanguageTool
plugins/governance/changelog.md

[grammar] ~1-~1: Ensure spelling is correct
Context: ...factor: extracted governance store into and interface for extensibility - refactor:...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Graphite / mergeability_check (13 identical entries)
🔇 Additional comments (36)
plugins/governance/changelog.md (1)

1-3: Clarify changelog scope relative to the PR stack.

The changelog entries on lines 1–2 describe refactoring work ("extracted governance store" and "extended rate limits handling"), but the PR description and AI summary state that this PR contains test-only changes with no production code modifications.

Per the coding guidelines, this PR is part of a stack where #1041 (this PR) depends on #1020. The refactors likely belong in #1020, not here. Verify that:

  1. The refactor entries (lines 1–2) should actually appear in the changelog for #1020 (the refactoring PR), not #1041 (the test PR).
  2. This changelog should only document the addition of tests (line 3).
  3. The entries are correctly attributed to the right PR in the stack.
plugins/governance/test_utils.go (1)

329-366: LGTM!

The cleanup order (VKs → Teams → Customers) correctly handles hierarchical dependencies, and gracefully ignores 404 errors for already-deleted resources.
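
The ordering described here can be sketched in isolation. The types and the deleteFunc callback below are hypothetical stand-ins for the GlobalTestData cleanup and its HTTP calls; only the children-before-parents order and the tolerance for 404 come from the review comment.

```go
package main

import (
	"fmt"
	"net/http"
)

// deleteFunc deletes one resource and returns the HTTP status code of the call.
type deleteFunc func(kind, id string) int

// testData tracks the IDs created during a test run.
type testData struct {
	virtualKeys []string
	teams       []string
	customers   []string
}

// cleanup deletes children before parents: virtual keys, then teams, then
// customers. A 404 is treated as already deleted. It returns a description
// of every deletion that failed for another reason.
func cleanup(d testData, del deleteFunc) []string {
	groups := []struct {
		kind string
		ids  []string
	}{
		{"virtual-key", d.virtualKeys},
		{"team", d.teams},
		{"customer", d.customers},
	}
	var failures []string
	for _, g := range groups {
		for _, id := range g.ids {
			status := del(g.kind, id)
			if status == http.StatusNotFound || (status >= 200 && status < 300) {
				continue
			}
			failures = append(failures, fmt.Sprintf("%s %s: status %d", g.kind, id, status))
		}
	}
	return failures
}

func main() {
	d := testData{virtualKeys: []string{"vk1"}, teams: []string{"t1"}, customers: []string{"c1"}}
	failures := cleanup(d, func(kind, id string) int {
		fmt.Println("delete", kind, id)
		return http.StatusOK
	})
	fmt.Println("failures:", len(failures))
}
```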

plugins/governance/store_test.go (6)

17-66: LGTM!

Well-structured table-driven test covering active, inactive, and non-existent virtual key retrieval scenarios.


68-101: LGTM!

Good concurrency stress test. The atomic counters correctly track reads and errors across goroutines.
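
For reference, the pattern praised here looks roughly like the following sketch: many goroutines read from a lock-protected map while atomic counters record outcomes. The store type is a simplified stand-in, not the governance store from the PR.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// store is a simplified stand-in for a lock-protected in-memory store.
type store struct {
	mu   sync.RWMutex
	keys map[string]bool
}

// get looks up a key under a read lock.
func (s *store) get(key string) (bool, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	active, ok := s.keys[key]
	return active, ok
}

// hammer launches workers goroutines that each perform reads lookups and
// returns the total number of hits and misses, counted atomically.
func hammer(s *store, key string, workers, reads int) (int64, int64) {
	var hits, misses int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < reads; i++ {
				if _, found := s.get(key); found {
					atomic.AddInt64(&hits, 1)
				} else {
					atomic.AddInt64(&misses, 1)
				}
			}
		}()
	}
	wg.Wait() // all increments are complete once Wait returns
	return hits, misses
}

func main() {
	s := &store{keys: map[string]bool{"vk-1": true}}
	fmt.Println(hammer(s, "vk-1", 20, 100))
}
```

Running such a test with the race detector enabled (go test -race) is what gives the counters their value as a check.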


103-163: LGTM!

Good boundary testing for budget validation including the edge case where usage equals the limit.


165-204: LGTM!

Good test of hierarchical budget validation across VK, Team, and Customer levels.


246-338: LGTM!

Reset and retrieval tests are well-structured with clear setup, action, and verification phases.


340-347: LGTM!

Standard pointer helper utilities for test data construction.

plugins/governance/rate_limit_test.go (8)

8-79: LGTM!

Good test for VK-level token rate limiting with proper cleanup. Minor note: string(rune('0'+i)) only works correctly for single digits (i < 10), which is fine here but could be replaced with fmt.Sprintf for robustness.
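
To make the single-digit caveat concrete, here is a minimal sketch comparing the two conversions. The function names digitSuffix and sprintfSuffix are hypothetical and do not appear in the test suite.

```go
package main

import "fmt"

// digitSuffix mirrors the string(rune('0'+i)) idiom. It yields a decimal
// digit only for 0 <= i <= 9; beyond that it returns whatever character
// follows '9' in the code table.
func digitSuffix(i int) string {
	return string(rune('0' + i))
}

// sprintfSuffix formats any integer as decimal text.
func sprintfSuffix(i int) string {
	return fmt.Sprintf("%d", i)
}

func main() {
	for _, i := range []int{3, 9, 10, 12} {
		fmt.Printf("i=%d rune=%q sprintf=%q\n", i, digitSuffix(i), sprintfSuffix(i))
	}
}
```

For i = 10 the rune form produces ":" rather than "10", which is harmless while the loop stays below ten iterations.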


81-152: LGTM!

Good coverage of request-based rate limiting at the VK level.


154-310: LGTM!

Good coverage of provider-level token and request rate limits.


312-378: LGTM!

Verifies that multiple providers can be configured with independent rate limits. Consider adding a follow-up test that makes requests to both providers to verify usage is tracked independently.


380-454: LGTM!

Good verification that both VK-level and provider-level rate limits can coexist.


456-590: LGTM!

Comprehensive test of rate limit update propagation to in-memory store. The 500ms sleep is acceptable for integration tests though polling with timeout would be more robust.
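
One possible shape for the polling approach is sketched below. waitFor is a hypothetical helper, not something present in test_utils.go, and the condition in main is only a stand-in for "the in-memory store reflects the update".

```go
package main

import (
	"fmt"
	"time"
)

// waitFor polls cond every interval until it returns true or the timeout
// elapses. It reports whether cond became true in time.
func waitFor(timeout, interval time.Duration, cond func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	start := time.Now()
	// Stand-in condition: becomes true roughly 30ms after start.
	synced := func() bool { return time.Since(start) > 30*time.Millisecond }
	fmt.Println("synced:", waitFor(2*time.Second, 10*time.Millisecond, synced))
}
```

Compared with a fixed 500ms sleep, this returns as soon as the condition holds and only waits the full timeout when the sync never happens.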


592-813: LGTM!

Good end-to-end tests verifying both token and request usage tracking in the in-memory store.


815-979: LGTM!

Good verification that provider-level rate limits are tracked independently in the in-memory store.

plugins/governance/team_budget_test.go (2)

7-61: LGTM!

Good test design with team budget as the limiting factor ($0.01) while VK budgets are high ($1.0), ensuring the test exercises team-level budget enforcement.


65-159: LGTM!

The test logic correctly exercises team budget enforcement by making requests until the server rejects due to budget exhaustion. The shouldStop flag appropriately allows verification that the server enforces the limit.

plugins/governance/go.mod (1)

10-10: Testify v1.11.1 is the latest stable release.

The dependency is already up-to-date with the current stable version.

plugins/governance/vk_budget_test.go (1)

8-130: Test logic is sound for verifying VK budget enforcement.

The test correctly:

  • Creates a VK with a small budget
  • Makes requests until budget is exceeded server-side
  • Validates that at least one request succeeded before rejection
  • Uses shouldStop to ensure one additional request after local budget tracking shows exceeded (to verify server enforcement)
plugins/governance/provider_budget_test.go (1)

8-55: Test setup correctly validates per-provider budget enforcement.

The test creates a VK with:

  • High overall budget ($1.0) to ensure it's not the limiting factor
  • Low per-provider budgets ($0.01 each) for OpenAI and Anthropic

This correctly isolates provider-specific budget enforcement testing.

plugins/governance/config_update_sync_test.go (2)

12-213: Comprehensive coverage for VK rate limit and budget update synchronization.

TestVKRateLimitUpdateSyncToMemory and TestVKBudgetUpdateSyncToMemory correctly verify:

  • Initial state propagation to in-memory store
  • Usage tracking after requests
  • Usage reset when new limit < current usage
  • Non-reset when new limit > current usage

The test scenarios align with expected governance behavior.


541-1062: Team, customer, and provider budget sync tests follow consistent patterns.

The remaining tests (TestTeamBudgetUpdateSyncToMemory, TestCustomerBudgetUpdateSyncToMemory, TestProviderBudgetUpdateSyncToMemory) maintain consistent structure with the VK tests and provide good coverage for hierarchical budget synchronization.

plugins/governance/customer_budget_test.go (2)

7-159: Test correctly validates shared customer budget across multiple VKs.

TestCustomerBudgetExceededWithMultipleVKs:

  • Creates customer with $0.01 budget
  • Creates 2 VKs with high individual budgets ($1.0) under the customer
  • Alternates requests between VKs to ensure customer budget is the limiting factor
  • Validates budget enforcement applies across all VKs under the customer

161-334: Test correctly validates shared customer budget across multiple teams.

TestCustomerBudgetExceededWithMultipleTeams:

  • Creates customer with $0.01 budget
  • Creates 2 teams with high budgets under the customer
  • Creates VKs under each team
  • Validates budget enforcement cascades through the team hierarchy
plugins/governance/fixtures_test.go (3)

14-66: Thread-safe MockLogger implementation is well-designed.

The MockLogger correctly:

  • Uses mutex for thread-safe log collection
  • Implements all schemas.Logger interface methods
  • Intentionally uses no-op for configuration methods (appropriate for testing)
  • Routes Fatal to errors slice rather than terminating (allowing test continuation)

68-184: Test data builders provide clean fixture construction.

The builder functions follow a consistent pattern and support various test scenarios:

  • Basic entity construction (buildVirtualKey, buildBudget, buildRateLimit)
  • Composite construction with associations (buildVirtualKeyWithBudget, buildProviderConfigWithRateLimit)
  • Entity hierarchy support (buildTeam, buildCustomer with optional budgets)

195-221: Test assertion helpers are functional but minimal.

The helpers assertDecision, assertVirtualKeyFound, and assertRateLimitInfo provide good semantic naming. The requireNoError/requireError wrappers are thin but add slight verbosity over direct require.NoError(t, err, msg) calls.

plugins/governance/in_memory_sync_test.go (2)

9-126: Well-structured in-memory sync test with proper resource cleanup.

The test follows good practices: parallel execution, unique resource names via generateRandomID(), proper cleanup with defer testData.Cleanup(t), and verification of both creation and update propagation to the in-memory store.


520-524: Reasonable handling of parallel test count discrepancies.

Using a warning log instead of failing when counts differ due to parallel test execution is pragmatic. This acknowledges the test environment constraints without masking actual bugs.

plugins/governance/edge_cases_test.go (1)

58-78: Verify budget hierarchy: VK budget ($0.01) is smaller than provider budget ($0.08).

The VK budget is set to $0.01 while the provider budget is $0.08. The comment at line 96 states "Provider budget should be the limiting factor", but the VK budget is actually more restrictive. This may cause the test to fail earlier than expected on the VK budget rather than the provider budget.

If the intent is to test provider-level enforcement specifically, consider making the VK budget larger than the provider budget:

 	// Create a VK with even tighter budget and provider-specific budgets
-	vkBudget := 0.01
+	vkBudget := 0.10 // VK budget larger than provider budget to ensure provider budget is hit first
plugins/governance/resolver_test.go (3)

15-37: Good test coverage for the happy path with clear assertions.

The test properly verifies that an allowed request returns DecisionAllow and that the virtual key is found. The use of helper functions (buildVirtualKey, assertDecision, assertVirtualKeyFound) improves readability.


375-400: Good edge case: testing when both token and request limits are exceeded simultaneously.

This test verifies that when both limits are at max, the system returns a combined DecisionRateLimited status. The assertion at line 399 properly checks that the reason contains "rate limit".


512-550: Comprehensive context population test.

The test verifies that governance identifiers (VK ID, team ID, customer ID) are properly set in the BifrostContext after evaluation. This is important for downstream processing and logging.

plugins/governance/e2e_test.go (2)

1-8: Good test structure and documentation.

The file is well-organized with clear section headers and comprehensive test coverage across governance scenarios including shared budgets, hierarchy enforcement, concurrency, and provider fallback. The use of t.Parallel() and GlobalTestData cleanup pattern is consistent.


589-595: Update test to check for the actual error message content.

The test checks for "blocked" in the error message, but the actual error reason is "Virtual key is inactive". The word "blocked" only appears in the decision Type field ("virtual_key_blocked"), not in the user-facing error message. Change the check from CheckErrorMessage(t, resp2, "blocked") to CheckErrorMessage(t, resp2, "inactive") to verify the actual message returned to users, consistent with how other tests verify budget, rate limit, and token limit errors.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-08-feat_governance_plugin_refactor branch from 5d81e9d to 0483cbe Compare December 10, 2025 18:04
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-09-chore_governance_tests_added branch 2 times, most recently from bafb048 to da59879 Compare December 11, 2025 06:05
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-08-feat_governance_plugin_refactor branch from 0483cbe to 687d705 Compare December 11, 2025 06:05
@akshaydeo akshaydeo mentioned this pull request Dec 11, 2025
18 tasks
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

♻️ Duplicate comments (6)
plugins/governance/config_update_sync_test.go (1)

54-62: Harden nested JSON access with helpers instead of chained type assertions.

Throughout this file you do chained assertions like:

data := resp.Body["data"].(map[string]interface{})
vkData := data["virtual_keys"].(map[string]interface{})[vkValue].(map[string]interface{})
rateLimit := data["rate_limits"].(map[string]interface{})[rateLimitID].(map[string]interface{})

If the response shape changes (or an earlier call fails and returns an error payload), these will panic instead of failing with a clear test message. A small helper layer would make these tests more robust and readable, for example:

func getMap(t *testing.T, m map[string]interface{}, key string) map[string]interface{} {
	t.Helper()
	v, ok := m[key]
	if !ok {
		t.Fatalf("key %q not found in map", key)
	}
	mv, ok := v.(map[string]interface{})
	if !ok {
		t.Fatalf("key %q is not a map[string]interface{}", key)
	}
	return mv
}

Then use it (and similar helpers for strings/floats) wherever you traverse data["virtual_keys"], data["rate_limits"], data["budgets"], etc., so failures are deterministic and self‑describing instead of panics.

Also applies to: 99-107, 142-150, 190-199
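
As a sketch of the "similar helpers for strings/floats" idea, a single generic lookup can cover maps, strings, and floats. This variant returns an error instead of calling t.Fatalf so it can be shown standalone; the name lookup and the sample payload are assumptions, and generics require Go 1.18 or later.

```go
package main

import "fmt"

// lookup returns m[key] asserted to type T, or a descriptive error when the
// key is missing or holds a different type.
func lookup[T any](m map[string]interface{}, key string) (T, error) {
	var zero T
	v, ok := m[key]
	if !ok {
		return zero, fmt.Errorf("key %q not found", key)
	}
	tv, ok := v.(T)
	if !ok {
		return zero, fmt.Errorf("key %q has type %T, want %T", key, v, zero)
	}
	return tv, nil
}

func main() {
	// Hypothetical payload shaped like a decoded JSON response.
	budget := map[string]interface{}{"max_limit": 0.01, "reset_duration": "1h"}
	data := map[string]interface{}{
		"budgets": map[string]interface{}{"b1": budget},
	}

	budgets, err := lookup[map[string]interface{}](data, "budgets")
	if err != nil {
		fmt.Println(err)
		return
	}
	b1, err := lookup[map[string]interface{}](budgets, "b1")
	if err != nil {
		fmt.Println(err)
		return
	}
	limit, err := lookup[float64](b1, "max_limit")
	fmt.Println(limit, err)

	// A wrong type yields an error rather than a panic.
	_, err = lookup[string](b1, "max_limit")
	fmt.Println(err)
}
```

In a test, the returned error could be passed to t.Fatalf at the call site, which keeps the failure message next to the assertion that needed the value.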

plugins/governance/provider_budget_test.go (1)

57-145: Loop early‑exit prevents exercising provider‑budget enforcement (same pattern as team budget test).

Both subtests set shouldStop once consumedBudget >= providerBudget and then break on the next iteration, which can stop the loop before any over‑budget request is sent. That means you may never see the expected “provider budget exceeded” error and will hit the final t.Fatalf even when enforcement is correct.

Recommend removing the shouldStop flag and its checks, and relying on the 50‑request cap plus the API’s own budget enforcement (as suggested in TestTeamBudgetExceededWithMultipleVKs), then applying that change consistently across both subtests.

Also applies to: 148-235
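
A minimal sketch of the recommended loop shape follows. sendUntilRejected and the fake send closure are hypothetical; the closure stands in for the governance server and assumes it rejects a request once recorded usage has reached the budget.

```go
package main

import "fmt"

// sendUntilRejected issues up to maxRequests calls and returns the 1-based
// index of the first rejected request, or 0 if none was rejected.
// consumed is tracked for logging only and never ends the loop.
func sendUntilRejected(maxRequests int, send func() (cost float64, rejected bool)) (rejectedAt int, consumed float64) {
	for i := 1; i <= maxRequests; i++ {
		cost, rejected := send()
		if rejected {
			return i, consumed
		}
		consumed += cost
	}
	return 0, consumed
}

func main() {
	const providerBudget = 0.01
	used := 0.0
	// Stand-in for the server: reject once usage has reached the budget.
	send := func() (float64, bool) {
		if used >= providerBudget {
			return 0, true
		}
		used += 0.004
		return 0.004, false
	}

	rejectedAt, consumed := sendUntilRejected(50, send)
	if rejectedAt == 0 {
		fmt.Println("never hit provider budget limit")
		return
	}
	fmt.Printf("rejected at request %d after consuming $%.3f\n", rejectedAt, consumed)
}
```

Because the loop ends only on a server rejection or the request cap, the over-budget request is always sent and the enforcement path is actually exercised.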

plugins/governance/customer_budget_test.go (1)

66-160: Adjust loop termination so customer‑budget tests actually hit the enforced limit.

Both customer‑budget tests stop iterating once consumedBudget >= customerBudget by flipping shouldStop and breaking on the next iteration. This can prevent you from ever sending the next request that should be rejected by governance, leading to the final t.Fatalf("never hit customer budget limit") even when enforcement is working.

As with the team/provider tests, drop the shouldStop flag and its checks, and just rely on the 50‑request cap plus the server’s own budget enforcement; consumedBudget can remain for logging only.

Also applies to: 241-335

plugins/governance/rate_limit_enforcement_test.go (1)

602-611: Incomplete verification of usage tracking.

The test extracts rate_limit_id but doesn't verify that usage counters were actually incremented after the request. This was flagged in a previous review and appears to still need the fix.

 	data := getDataResp.Body["data"].(map[string]interface{})
 	virtualKeysMap := data["virtual_keys"].(map[string]interface{})
 	vkData := virtualKeysMap[vkValue].(map[string]interface{})
 	rateLimitID, _ := vkData["rate_limit_id"].(string)

 	if rateLimitID != "" {
-		t.Logf("Rate limit %s is configured and tracking usage ✓", rateLimitID)
+		rateLimitsMap := data["rate_limits"].(map[string]interface{})
+		if rl, ok := rateLimitsMap[rateLimitID].(map[string]interface{}); ok {
+			tokenUsage, _ := rl["token_current_usage"].(float64)
+			if tokenUsage < float64(tokensUsed) {
+				t.Fatalf("Token usage not tracked: expected >= %d, got %.0f", tokensUsed, tokenUsage)
+			}
+			t.Logf("Rate limit %s tracked %d tokens ✓", rateLimitID, int(tokenUsage))
+		}
 	} else {
 		t.Logf("Rate limit is configured ✓")
 	}
plugins/governance/e2e_test.go (2)

809-836: MakeRequest uses t.Fatalf internally which is unsafe in goroutines.

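While individual test assertions were addressed, MakeRequest itself calls t.Fatalf on HTTP client errors or JSON marshal failures. t.Fatalf calls FailNow, which the testing package requires to be called from the goroutine running the test. Called from a spawned goroutine, it stops only that goroutine and not the test function, so the remaining goroutines keep running and the results can be confusing or misleading.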

Consider wrapping with error collection:

+	var errors []string
+	var errMu sync.Mutex
+
 	for i := 0; i < numGoroutines; i++ {
 		wg.Add(1)
 		go func(goID int) {
 			defer wg.Done()
+			defer func() {
+				if r := recover(); r != nil {
+					errMu.Lock()
+					errors = append(errors, fmt.Sprintf("goroutine %d panicked: %v", goID, r))
+					errMu.Unlock()
+				}
+			}()
 			for j := 0; j < requestsPerGoroutine; j++ {
 				// ... request logic
 			}
 		}(i)
 	}
 
 	wg.Wait()
+
+	if len(errors) > 0 {
+		t.Fatalf("Concurrent test errors: %v", errors)
+	}
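
An alternative sketch has each goroutine return an error that is collected and reported from the test goroutine. This may be preferable because recover does not intercept the runtime.Goexit that t.Fatalf triggers. runConcurrently and errRequestFailed are hypothetical names, and the approach assumes a request helper that returns an error rather than calling t.Fatalf.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errRequestFailed is a placeholder for an HTTP or marshalling failure.
var errRequestFailed = errors.New("request failed")

// runConcurrently runs work in n goroutines and returns every error produced.
func runConcurrently(n int, work func(id int) error) []error {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if err := work(id); err != nil {
				mu.Lock()
				errs = append(errs, fmt.Errorf("goroutine %d: %w", id, err))
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return errs
}

func main() {
	errs := runConcurrently(4, func(id int) error {
		if id%2 == 1 {
			return errRequestFailed
		}
		return nil
	})
	// In a test, this is where t.Fatalf would be called, on the test goroutine.
	fmt.Println("errors collected:", len(errs))
}
```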

1354-1363: Provider detection logic is fragile.

The string matching at Line 1356 checks for "gpt-4o" to identify OpenAI, but both OpenAI and Azure support gpt-4o, making this detection unreliable. The previous review suggested using explicit provider information from ExtraFields.Provider or parsing the model field by splitting on "/".

Since this was previously flagged as addressed but the code appears unchanged, verify the expected response format matches this detection logic.

#!/bin/bash
# Check if there's a provider field in response or if model format includes provider prefix
rg -n --type=go 'ExtraFields|extra_fields|provider.*response' plugins/governance/ -A3 | head -50
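
If the response model does carry a provider prefix, the split-on-"/" approach could look like the sketch below. splitProviderModel is a hypothetical helper, and the "provider/model" format is an assumption that the script above is meant to confirm.

```go
package main

import (
	"fmt"
	"strings"
)

// splitProviderModel splits "provider/model" into its parts. It returns
// ok=false, with the input as the model, when there is no provider prefix.
func splitProviderModel(s string) (provider, model string, ok bool) {
	provider, model, ok = strings.Cut(s, "/")
	if !ok || provider == "" || model == "" {
		return "", s, false
	}
	return provider, model, true
}

func main() {
	for _, m := range []string{"openai/gpt-4o", "azure/gpt-4o", "gpt-4o"} {
		provider, model, ok := splitProviderModel(m)
		fmt.Printf("%q -> provider=%q model=%q ok=%v\n", m, provider, model, ok)
	}
}
```

With an explicit prefix, "openai/gpt-4o" and "azure/gpt-4o" are distinguishable, which a substring match on "gpt-4o" cannot guarantee.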
🧹 Nitpick comments (12)
plugins/governance/tracker_test.go (1)

15-54: Align test comment with failed‑request behavior.

The comment says this test covers “successful usage tracking”, but the implementation explicitly asserts that a failed request (Success: false) does not update the budget. Please rename or reword the comment to match the actual scenario being tested.

plugins/governance/fixtures_test.go (1)

14-66: Consider formatting log messages in MockLogger if you plan to assert on them.

Right now the mock logger stores only the raw format string and drops args, so any %v placeholders won’t be expanded. That’s fine as a lightweight stub, but if future tests start asserting on log contents, you may want to fmt.Sprintf(format, args...) before appending.
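
A minimal sketch of the formatting variant is shown below. recordingLogger and its single Info method are hypothetical stand-ins; the real MockLogger implements the full schemas.Logger interface, whose exact method set is not reproduced here.

```go
package main

import (
	"fmt"
	"sync"
)

// recordingLogger stores fully formatted messages so tests can assert on
// the text that would actually be logged.
type recordingLogger struct {
	mu    sync.Mutex
	infos []string
}

// Info expands the format string with its arguments before recording it.
func (l *recordingLogger) Info(format string, args ...interface{}) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.infos = append(l.infos, fmt.Sprintf(format, args...))
}

func main() {
	l := &recordingLogger{}
	l.Info("budget %s exceeded by %d", "b1", 3)
	fmt.Println(l.infos[0])
}
```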

plugins/governance/in_memory_sync_test.go (1)

375-416: Add status checks before type assertions in VK delete in‑memory sync test.

Unlike the other tests in this file, TestInMemorySyncVirtualKeyDelete doesn’t verify that the /api/governance/data calls returned 200 before asserting getDataResp.Body["data"].(map[string]interface{}). If the endpoint errors, the test will panic rather than fail with a clear diagnostic.

For consistency and clearer failures, add StatusCode checks (and t.Fatalf on non‑200) before performing the type assertions on both getDataResp and getDataResp2.

plugins/governance/usage_tracking_test.go (1)

100-128: Test does not verify rate limit usage was actually tracked before reset.

The test fetches governance data after the request (Lines 100-108) but only logs a message without actually asserting that the rate limit counter was updated. Consider verifying the token usage counter increased before waiting for reset:

 	// Rate limit counter should have been updated
-	t.Logf("Rate limit should be tracking usage in in-memory store")
+	data := getDataResp.Body["data"].(map[string]interface{})
+	rateLimitsMap := data["rate_limits"].(map[string]interface{})
+	if rl, ok := rateLimitsMap[rateLimitID].(map[string]interface{}); ok {
+		if tokenUsage, ok := rl["token_current_usage"].(float64); ok {
+			if tokenUsage < float64(tokensUsed) {
+				t.Fatalf("Token usage not tracked: expected >= %d, got %.0f", tokensUsed, tokenUsage)
+			}
+			t.Logf("Rate limit tracking %d tokens ✓", int(tokenUsage))
+		}
+	}
plugins/governance/rate_limit_test.go (1)

74-79: Test may pass without hitting rate limit.

If requests succeed but don't consume enough tokens to hit the limit, the test logs success at Line 75 without verifying the rate limit was actually enforced. This pattern allows the test to pass even if rate limiting is broken.

Consider adding a minimum assertion:

 	if successCount > 0 {
-		t.Logf("Made %d successful requests before hitting token limit ✓", successCount)
+		// If we made all 10 requests without hitting limit, enforcement may not be working
+		if successCount >= 10 {
+			t.Logf("Warning: Made all %d requests without hitting token limit - limit may not be enforced", successCount)
+		} else {
+			t.Logf("Made %d successful requests before hitting token limit ✓", successCount)
+		}
 	} else {
 		t.Skip("Could not make requests to test token limit")
 	}
plugins/governance/advanced_scenarios_test.go (1)

92-129: Consider extracting budget exhaustion loop to a helper function.

The budget exhaustion pattern is repeated across 7+ tests with nearly identical code. Extract this to a reusable helper:

func exhaustBudget(t *testing.T, vkValue string, targetBudget float64, model string) (float64, int) {
    consumedBudget := 0.0
    requestNum := 1
    for requestNum <= 150 {
        resp := MakeRequest(t, APIRequest{
            Method: "POST",
            Path:   "/v1/chat/completions",
            Body: ChatCompletionRequest{
                Model:    model,
                Messages: []ChatMessage{{Role: "user", Content: "Hello how are you?"}},
            },
            VKHeader: &vkValue,
        })
        if resp.StatusCode >= 400 {
            if CheckErrorMessage(t, resp, "budget") {
                return consumedBudget, requestNum
            }
            t.Fatalf("Request %d failed with unexpected error: %v", requestNum, resp.Body)
        }
        // ... extract cost logic
        requestNum++
    }
    return consumedBudget, requestNum
}

This would significantly reduce code duplication and improve maintainability.

Also applies to: 257-293, 459-491, 571-603, 706-738, 859-891, 995-1027

plugins/governance/test_utils.go (6)

15-73: Model cost helpers are clear; consider clarifying intent of pricing values.

The ModelCost/TestModels + CalculateCost flow is straightforward and correct for tests. Since the per‑token prices and limits are real‑world values that may change, you might want to document that these are test fixtures (not canonical pricing) or wire them to a single source of truth if you expect frequent updates.


143-152: Avoid reseeding the global PRNG on every generateRandomID call.

Calling rand.Seed on each invocation is unnecessary and can reduce randomness under high call rates or concurrency. Seeding once at package init (or using a dedicated rand.Rand) is simpler and more idiomatic.

For example:

-func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
-	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
+func init() {
+	rand.Seed(time.Now().UnixNano())
+}
+
+func generateRandomID() string {
+	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
}
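
On Go 1.20 and later, the top-level math/rand functions are seeded randomly at program start and rand.Seed is deprecated, so a further option is to drop seeding entirely. The sketch below assumes such a toolchain; on older versions the unseeded generator would return the same sequence on every run.

```go
package main

import (
	"fmt"
	"math/rand"
)

const letters = "abcdefghijklmnopqrstuvwxyz0123456789"

// generateRandomID returns an 8-character ID drawn from letters.
// No explicit seeding: this relies on the automatic seeding in Go 1.20+.
func generateRandomID() string {
	b := make([]byte, 8)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	fmt.Println(generateRandomID())
}
```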

154-245: DTO shapes for governance requests look consistent with the UI types.

The various Create*/Update* request structs (budgets, rate limits, VKs, teams, customers, chat) align well with the governance TS types (e.g., max_limit, reset_duration, token/request limits) and use pointers appropriately for optional/patch semantics. No issues from a shape/typing standpoint. If drift with the core schemas becomes a concern later, consider consolidating on shared types instead of re‑defining them here.


247-268: ExtractIDFromResponse ignores keyPath and hard‑codes resource keys.

The helper works for virtual_key, team, and customer, but the keyPath argument is unused, which is confusing and invites misuse. Either:

  • Remove keyPath from the signature and keep this as a narrowly scoped helper for those three resources; or
  • Implement keyPath (e.g., dot‑separated navigation like "virtual_key.id" / "team.id") and drop the hard‑coded parts slice.

Given this is used across many tests, pick one approach and update call sites so the intent is explicit.
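
For the second option, a dot-separated keyPath could be implemented roughly as follows. extractByPath is a hypothetical name, and returning an error (rather than failing the test directly) is a choice made for this sketch, not the behavior of the existing helper.

```go
package main

import (
	"fmt"
	"strings"
)

// extractByPath walks body along a dot-separated path such as
// "virtual_key.id" and returns the string found at the end.
func extractByPath(body map[string]interface{}, keyPath string) (string, error) {
	var cur interface{} = body
	for _, part := range strings.Split(keyPath, ".") {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return "", fmt.Errorf("segment %q: parent is not an object", part)
		}
		cur, ok = m[part]
		if !ok {
			return "", fmt.Errorf("segment %q not found", part)
		}
	}
	s, ok := cur.(string)
	if !ok {
		return "", fmt.Errorf("value at %q is %T, not a string", keyPath, cur)
	}
	return s, nil
}

func main() {
	// Hypothetical response body.
	body := map[string]interface{}{
		"virtual_key": map[string]interface{}{"id": "vk-1"},
	}
	fmt.Println(extractByPath(body, "virtual_key.id"))
	fmt.Println(extractByPath(body, "team.id"))
}
```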


270-297: Error‑message helper is solid; case‑insensitive matching fix looks good.

CheckErrorMessage plus the updated contains now provide flexible, case‑insensitive substring checks across message, error, and raw body, which is useful for resilient tests. To improve debuggability, you might consider logging the body when expectedText isn’t found, or have this helper call t.Helper() so any subsequent assertions attribute failures to the calling test. Otherwise, this looks good, and the prior case‑sensitivity concern appears resolved.


299-367: GlobalTestData/Cleanup pattern is reasonable; consider helper annotation and idempotence.

The bookkeeping of VK/team/customer IDs and the tolerant Cleanup (only logging on 4xx≠404) is a good fit for end‑to‑end governance tests. Two small optional tweaks:

  • Add t.Helper() at the start of Cleanup so logs and any future failures point to the test using it.
  • If these slices can grow across subtests, you might optionally clear them at the end of Cleanup to make repeated calls clearly idempotent.

Functionality-wise, this is fine as is.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 51b5738 and da59879.

📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go (1 hunks)
  • plugins/governance/changelog.md (1 hunks)
  • plugins/governance/config_update_sync_test.go (1 hunks)
  • plugins/governance/customer_budget_test.go (1 hunks)
  • plugins/governance/e2e_test.go (1 hunks)
  • plugins/governance/edge_cases_test.go (1 hunks)
  • plugins/governance/fixtures_test.go (1 hunks)
  • plugins/governance/go.mod (3 hunks)
  • plugins/governance/in_memory_sync_test.go (1 hunks)
  • plugins/governance/provider_budget_test.go (1 hunks)
  • plugins/governance/rate_limit_enforcement_test.go (1 hunks)
  • plugins/governance/rate_limit_test.go (1 hunks)
  • plugins/governance/resolver_test.go (1 hunks)
  • plugins/governance/store_test.go (1 hunks)
  • plugins/governance/team_budget_test.go (1 hunks)
  • plugins/governance/test_utils.go (1 hunks)
  • plugins/governance/tracker_test.go (1 hunks)
  • plugins/governance/usage_tracking_test.go (1 hunks)
  • plugins/governance/vk_budget_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • plugins/governance/vk_budget_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/store_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/provider_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/go.mod
  • plugins/governance/tracker_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/changelog.md
  • plugins/governance/team_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/customer_budget_test.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/provider_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/customer_budget_test.go
🧬 Code graph analysis (5)
plugins/governance/config_update_sync_test.go (1)
plugins/governance/test_utils.go (14)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
plugins/governance/rate_limit_enforcement_test.go (5)
plugins/governance/test_utils.go (10)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (13-13)
  • Content (922-930)
plugins/governance/rate_limit_test.go (2)
plugins/governance/test_utils.go (8)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
plugins/governance/advanced_scenarios_test.go (1)
plugins/governance/test_utils.go (16)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • UpdateTeamRequest (221-224)
  • UpdateCustomerRequest (227-230)
  • ProviderConfigRequest (167-174)
  • CreateRateLimitRequest (202-207)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (9)
plugins/governance/go.mod (1)

10-10: Testify dependency additions align with new governance test suite.

The direct addition of github.com/stretchr/testify v1.11.1 and its transitive dependencies (go-spew, go-difflib) are appropriate for a comprehensive test suite. Testify is the idiomatic assertion/mocking library for Go and these indirect dependencies are expected transitive inclusions.

Since the PR adds extensive governance tests (per the AI summary), please verify that testify v1.11.1 is the latest stable version, has no known security advisories, and does not conflict with testify versions pinned in other modules (github.com/maximhq/bifrost/core or github.com/maximhq/bifrost/framework). You can verify by checking pkg.go.dev (the Go module index), or by running go mod graph | grep testify to inspect the resolved dependency tree.

Also applies to: 38-38, 81-81

plugins/governance/changelog.md (1)

1-3: Changelog entries read well and match the governance test refactor.

The bullets accurately summarize the store refactor, rate-limit handling, and added e2e tests; no changes needed.

plugins/governance/rate_limit_enforcement_test.go (1)

1-6: LGTM on file structure and imports.

The test file is well-organized with proper package declaration, minimal imports, and parallel test execution enabled throughout.

plugins/governance/rate_limit_test.go (2)

456-590: Well-implemented rate limit sync verification.

This test properly verifies that rate limit changes sync to the in-memory store, including checking:

  • Rate limit ID persistence after update
  • Token max limit value update
  • Token reset duration persistence
  • Usage counter existence

Good use of assertions with clear failure messages.


665-813: Comprehensive usage tracking verification.

This test properly verifies both token and request usage counters are incremented after a request, and confirms max limits are preserved. The comparison logic and assertions are thorough.

plugins/governance/advanced_scenarios_test.go (1)

1-6: LGTM on test organization.

The file is well-organized with clear section headers for each scenario, proper parallel execution, and comprehensive cleanup via defer testData.Cleanup(t).

plugins/governance/e2e_test.go (3)

14-181: Comprehensive shared budget fairness test.

This test properly verifies that when multiple VKs share a team budget:

  1. VK1 can exhaust the shared budget
  2. VK2 is correctly rejected when the shared budget is exhausted

The test includes proper cost tracking and clear assertions.


360-495: Well-designed failed request budget verification.

This test correctly verifies that:

  1. Failed requests (invalid model) don't consume budget
  2. Successful requests do consume budget

The tolerance value (0.0001) is appropriate for floating-point comparisons.
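
For context on why a tolerance is needed, the sketch below shows a tolerance-based comparison. approxEqual is a hypothetical helper and the values are illustrative, not taken from the test.

```go
package main

import (
	"fmt"
	"math"
)

// approxEqual reports whether a and b differ by at most tol.
func approxEqual(a, b, tol float64) bool {
	return math.Abs(a-b) <= tol
}

func main() {
	// Accumulated float64 costs rarely compare exactly equal.
	a, b := 0.1, 0.2
	sum := a + b
	fmt.Println("exact:", sum == 0.3)
	fmt.Println("within tolerance:", approxEqual(sum, 0.3, 0.0001))
}
```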


860-1025: Good lazy reset verification approach.

The test correctly handles the lazy reset mechanism by:

  1. Consuming budget with an initial request
  2. Waiting for the reset duration
  3. Making another request to trigger the lazy reset
  4. Verifying usage dropped to approximately one request's cost

The comment at Lines 961-964 explaining the lazy reset behavior is helpful for understanding the test design.
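
The four steps above can be illustrated with a minimal sketch of a lazy reset, assuming the reset is applied only when a request arrives after the period has elapsed. All names here (fakeClock, lazyBudget, newLazyBudget, charge) are hypothetical and do not reflect the plugin's actual types; the injected clock stands in for the real wait.

```go
package main

import "time"

// fakeClock is a hand-advanced clock so the sketch needs no real sleeping.
type fakeClock struct{ now time.Time }

// Now returns the fake clock's current time.
func (c *fakeClock) Now() time.Time { return c.now }

// AdvanceSeconds moves the fake clock forward by s seconds.
func (c *fakeClock) AdvanceSeconds(s int) {
	c.now = c.now.Add(time.Duration(s) * time.Second)
}

// lazyBudget is a hypothetical budget that resets on access, not on a timer.
type lazyBudget struct {
	usage     float64
	lastReset time.Time
	period    time.Duration
	clock     func() time.Time
}

// newLazyBudget creates a budget whose period starts at the clock's current time.
func newLazyBudget(periodSeconds int, clock func() time.Time) *lazyBudget {
	return &lazyBudget{
		lastReset: clock(),
		period:    time.Duration(periodSeconds) * time.Second,
		clock:     clock,
	}
}

// charge first resets usage if the period has elapsed, then records the cost.
// Nothing happens between requests, which is why the test needs a second
// request to observe the reset.
func (b *lazyBudget) charge(cost float64) {
	now := b.clock()
	if now.Sub(b.lastReset) >= b.period {
		b.usage = 0
		b.lastReset = now
	}
	b.usage += cost
}
```

Under this model, usage right after the triggering request equals that request's cost, which matches the "approximately one request's cost" assertion in step 4.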

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (3)
plugins/governance/resolver_test.go (1)

331-345: Direct manipulation of internal store state breaks encapsulation.

This concern was raised in a previous review. The test directly accesses store.budgets.Store() which couples the test to implementation details.

plugins/governance/config_update_sync_test.go (1)

54-66: Unsafe type assertions may cause test panics.

This concern was raised in a previous review. Multiple chained type assertions without ok checks appear throughout this file and could cause panics if the response structure is unexpected.

plugins/governance/rate_limit_enforcement_test.go (1)

602-611: Incomplete verification of usage tracking.

This concern was raised in a previous review. The test extracts rate_limit_id but doesn't verify that usage counters were actually incremented after the request. Consider verifying the actual usage values.

🧹 Nitpick comments (18)
plugins/governance/team_budget_test.go (2)

60-61: Unchecked type assertions may cause test panics.

The type assertions on the response body could panic if the API response format is unexpected, making debugging harder.

-		vk := createVKResp.Body["virtual_key"].(map[string]interface{})
-		vkValues = append(vkValues, vk["value"].(string))
+		vkData, ok := createVKResp.Body["virtual_key"].(map[string]interface{})
+		if !ok {
+			t.Fatalf("Failed to extract virtual_key from response: %v", createVKResp.Body)
+		}
+		vkValue, ok := vkData["value"].(string)
+		if !ok {
+			t.Fatalf("Failed to extract VK value from response: %v", vkData)
+		}
+		vkValues = append(vkValues, vkValue)

149-156: Minor: shouldStop flag logic is slightly confusing.

The flag is checked at the start of loop body but set at the end, meaning one extra iteration runs after the budget threshold is crossed. This appears intentional (to trigger the rejection), but consider adding a brief comment to clarify the intent.

+		// Allow one more request after budget threshold to verify rejection
 		if shouldStop {
 			break
 		}
plugins/governance/vk_budget_test.go (2)

36-37: Unchecked type assertions may cause test panics.

Same issue as in team_budget_test.go - consider defensive type assertions with proper error messages.

-	vk := createVKResp.Body["virtual_key"].(map[string]interface{})
-	vkValue := vk["value"].(string)
+	vkData, ok := createVKResp.Body["virtual_key"].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Failed to extract virtual_key from response: %v", createVKResp.Body)
+	}
+	vkValue, ok := vkData["value"].(string)
+	if !ok {
+		t.Fatalf("Failed to extract VK value from response: %v", vkData)
+	}

48-127: Consider extracting shared budget exhaustion test logic.

The budget exhaustion loop pattern (prompt generation, request execution, token extraction, cost calculation, budget tracking) is nearly identical across vk_budget_test.go, team_budget_test.go, and customer_budget_test.go. Extracting this into a shared helper in test_utils.go would reduce ~80 lines of duplication per file and improve maintainability.

Example helper signature:

func RunBudgetExhaustionTest(t *testing.T, vkValues []string, budgetLimit float64, entityType string) {
    // Shared loop logic here
}
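
One possible shape for the shared loop, as a rough sketch only: the request itself is injected as a function so the helper does not depend on the HTTP utilities. The names sendFunc and runUntilRejected are hypothetical and not part of test_utils.go.

```go
package main

// sendFunc is a hypothetical stand-in for one chat-completion call made with
// the given virtual key. It returns the request's cost and whether the API
// rejected it for budget reasons.
type sendFunc func(vk string) (cost float64, rejected bool)

// runUntilRejected rotates through vks for at most maxRequests requests.
// It returns the total cost of accepted requests and whether a rejection
// was observed before the request cap was reached.
func runUntilRejected(vks []string, maxRequests int, send sendFunc) (consumed float64, rejected bool) {
	if len(vks) == 0 {
		return 0, false
	}
	for i := 0; i < maxRequests; i++ {
		cost, rej := send(vks[i%len(vks)])
		if rej {
			return consumed, true
		}
		consumed += cost
	}
	return consumed, false
}
```

A caller would then fail the test when rejected is false, since that means the budget was never enforced within the request cap.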
plugins/governance/fixtures_test.go (2)

38-66: MockLogger discards format arguments and Fatal doesn't terminate.

Two potential issues with the MockLogger:

  1. Format arguments are ignored - only the format string is stored, so inspecting logs won't show the actual formatted message.
  2. Fatal() appends to errors but doesn't panic/terminate, which differs from real logger behavior and could mask issues in tests.
+import "fmt"
+
 func (ml *MockLogger) Error(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.errors = append(ml.errors, format)
+	ml.errors = append(ml.errors, fmt.Sprintf(format, args...))
 }

 // ... similar for other methods ...

 func (ml *MockLogger) Fatal(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.errors = append(ml.errors, format)
+	ml.errors = append(ml.errors, fmt.Sprintf(format, args...))
+	panic("fatal: " + fmt.Sprintf(format, args...)) // Or use t.Fatal if testing.T is available
 }

213-221: Redundant wrapper functions.

requireNoError and requireError are thin wrappers around require.NoError and require.Error that don't add additional functionality. Consider using the require functions directly to reduce indirection.

plugins/governance/store_test.go (1)

198-204: Direct mutation of budget object before store update.

The test mutates vkBudget.CurrentUsage directly (line 199) and then reassigns vk.Budget (line 200). This works because UpdateVirtualKeyInMemory is called afterward, but the mutation pattern is fragile - if the store internally copies budgets, this mutation might not take effect. Consider creating a new budget object with the updated usage instead.

-	// Test: If VK budget exceeds limit, should fail
-	vkBudget.CurrentUsage = 100.0
-	vk.Budget = vkBudget
+	// Test: If VK budget exceeds limit, should fail
+	updatedVkBudget := buildBudgetWithUsage("vk-budget", 100.0, 100.0, "1d")
+	vk.Budget = updatedVkBudget
+	vk.BudgetID = &updatedVkBudget.ID
 	store.UpdateVirtualKeyInMemory(vk)
plugins/governance/customer_budget_test.go (2)

60-61: Unchecked type assertions - same pattern as other budget tests.

Apply defensive type assertions as suggested for team_budget_test.go and vk_budget_test.go.


235-236: Unchecked type assertions in second test.

Same issue in TestCustomerBudgetExceededWithMultipleTeams.

plugins/governance/edge_cases_test.go (2)

91-92: Unsafe type assertions may cause test panics.

Chained type assertions without ok checks can panic if the response structure is unexpected (e.g., if the API response format changes or an error occurs).

-	vk := createVKResp.Body["virtual_key"].(map[string]interface{})
-	vkValue := vk["value"].(string)
+	vkRaw, ok := createVKResp.Body["virtual_key"].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Failed to extract virtual_key from response")
+	}
+	vkValue, ok := vkRaw["value"].(string)
+	if !ok {
+		t.Fatalf("Failed to extract VK value from response")
+	}

101-101: shouldStop delays the break by one iteration; the intent should be documented.

shouldStop is set to true on Line 178 once consumedBudget >= 0.08, but the check on Line 173 (if shouldStop { break }) runs earlier in the loop body, after requestNum is incremented. The loop therefore runs one more iteration after the threshold is crossed and only then breaks. This is likely intentional for post-hoc enforcement, but a short comment would make the logic clearer.

plugins/governance/usage_tracking_test.go (2)

113-129: Long sleep duration will slow CI significantly.

The 35-second sleeps in this test (and similar ones at Lines 225 and 430) will cause the test suite to take several minutes. Consider:

  1. Making these tests opt-in via build tags (e.g., //go:build integration)
  2. Reducing the reset duration in tests to a shorter interval (e.g., 5s with 7s wait)
  3. Exposing a test hook to manually trigger resets
-	tokenResetDuration := "30s"
+	tokenResetDuration := "5s" // Shorter duration for faster tests
 ...
-	t.Logf("Waiting 35 seconds for rate limit ticker to reset...")
-	time.Sleep(35 * time.Second)
+	t.Logf("Waiting 7 seconds for rate limit ticker to reset...")
+	time.Sleep(7 * time.Second)

53-59: Unsafe type assertions may cause test panics.

Multiple chained type assertions without ok checks appear throughout this file (Lines 53-56, 172-183, 413-424, etc.). Consider adding defensive checks or using a shared extraction helper.

-	data1 := getDataResp1.Body["data"].(map[string]interface{})
-	virtualKeysMap1 := data1["virtual_keys"].(map[string]interface{})
-	vkData1 := virtualKeysMap1[vkValue].(map[string]interface{})
-	rateLimitID, _ := vkData1["rate_limit_id"].(string)
+	data1, ok := getDataResp1.Body["data"].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Failed to extract data from response")
+	}
+	virtualKeysMap1, ok := data1["virtual_keys"].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Failed to extract virtual_keys")
+	}
+	vkData1, ok := virtualKeysMap1[vkValue].(map[string]interface{})
+	if !ok {
+		t.Fatalf("VK %s not found in response", vkValue)
+	}
+	rateLimitID, _ := vkData1["rate_limit_id"].(string)
plugins/governance/advanced_scenarios_test.go (1)

1209-1214: Inconsistent cascade deletion assertion for rate limits.

Budget cascade uses t.Fatalf (lines 1204-1206) but rate limit cascade only logs with t.Logf. This inconsistency suggests either the rate limit cascade behavior is intentionally different (should be documented) or this should also use t.Fatalf.

Consider aligning the assertion behavior:

 	// Rate limits should be gone
 	_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
 	_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
 	if vkRateLimitStillExists || pcRateLimitStillExists {
-		t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
+		t.Fatalf("Rate limits should be cascade-deleted: VK rate limit exists=%v, PC rate limit exists=%v",
+			vkRateLimitStillExists, pcRateLimitStillExists)
 	}

Or if orphaning is intentional for rate limits but not budgets, add a comment explaining the design rationale.

plugins/governance/e2e_test.go (1)

642-755: Rate limit reset test has long wait time.

The 16-second wait for rate limit reset is necessary for correctness but will slow down test execution. Consider adding a comment explaining this is intentional or using a build tag for slow tests.

 // TestRateLimitResetBoundaryConditions verifies rate limit resets at exact boundaries
+// Note: This test uses a 15s reset duration and waits ~16s, making it slower than typical unit tests.
+// This is intentional to test actual time-based reset behavior.
 func TestRateLimitResetBoundaryConditions(t *testing.T) {
plugins/governance/test_utils.go (3)

90-141: Add timeout to HTTP client to prevent test hangs.

The http.Client has no timeout configured, which could cause tests to hang indefinitely if the server becomes unresponsive.

 // MakeRequest makes an HTTP request to the Bifrost API
 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-	client := &http.Client{}
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}
 	url := fmt.Sprintf("http://localhost:8080%s", req.Path)

143-152: rand.Seed is deprecated in Go 1.20+.

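rand.Seed is deprecated as of Go 1.20, and the global random source is seeded automatically from that release on. Re-seeding on every call is redundant, and when tests run in parallel it repeatedly resets the shared generator, which can produce duplicate IDs if two calls seed with the same timestamp.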

 // generateRandomID generates a random ID for test resources
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }

Since Go 1.20, the global random source is automatically seeded, so the rand.Seed call can simply be removed.


247-268: Unused keyPath parameter in ExtractIDFromResponse.

The keyPath parameter is declared but never used - the function always searches through a hardcoded list of keys ("virtual_key", "team", "customer").

Either remove the unused parameter:

-func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+func ExtractIDFromResponse(t *testing.T, resp *APIResponse) string {

Or use it as intended:

 func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
 	if resp.StatusCode >= 400 {
 		t.Fatalf("Request failed with status %d: %v", resp.StatusCode, resp.Body)
 	}

-	// Navigate through the response to find the ID
-	data := resp.Body
-	parts := []string{"virtual_key", "team", "customer"}
-	for _, part := range parts {
-		if val, ok := data[part]; ok {
-			if nested, ok := val.(map[string]interface{}); ok {
-				if id, ok := nested["id"].(string); ok {
-					return id
-				}
-			}
+	// Navigate through the response using keyPath to find the ID
+	if val, ok := resp.Body[keyPath]; ok {
+		if nested, ok := val.(map[string]interface{}); ok {
+			if id, ok := nested["id"].(string); ok {
+				return id
+			}
 		}
 	}

Note: Changing the signature would require updating all call sites.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 51b5738 and da59879.

📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go (1 hunks)
  • plugins/governance/changelog.md (1 hunks)
  • plugins/governance/config_update_sync_test.go (1 hunks)
  • plugins/governance/customer_budget_test.go (1 hunks)
  • plugins/governance/e2e_test.go (1 hunks)
  • plugins/governance/edge_cases_test.go (1 hunks)
  • plugins/governance/fixtures_test.go (1 hunks)
  • plugins/governance/go.mod (3 hunks)
  • plugins/governance/in_memory_sync_test.go (1 hunks)
  • plugins/governance/provider_budget_test.go (1 hunks)
  • plugins/governance/rate_limit_enforcement_test.go (1 hunks)
  • plugins/governance/rate_limit_test.go (1 hunks)
  • plugins/governance/resolver_test.go (1 hunks)
  • plugins/governance/store_test.go (1 hunks)
  • plugins/governance/team_budget_test.go (1 hunks)
  • plugins/governance/test_utils.go (1 hunks)
  • plugins/governance/tracker_test.go (1 hunks)
  • plugins/governance/usage_tracking_test.go (1 hunks)
  • plugins/governance/vk_budget_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • plugins/governance/provider_budget_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/tracker_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/edge_cases_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/store_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/changelog.md
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/e2e_test.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/edge_cases_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/store_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/e2e_test.go
🧬 Code graph analysis (7)
plugins/governance/resolver_test.go (5)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (67-74)
plugins/governance/resolver.go (1)
  • NewBudgetResolver (70-75)
core/schemas/context.go (1)
  • BifrostContext (30-40)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostContextKey (101-101)
plugins/governance/store_test.go (5)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (67-86)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (67-74)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
ui/lib/types/governance.ts (4)
  • Customer (37-44)
  • Team (27-35)
  • Budget (5-11)
  • RateLimit (13-25)
plugins/governance/customer_budget_test.go (5)
plugins/governance/test_utils.go (8)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • CalculateCost (64-73)
  • CreateTeamRequest (183-187)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (1)
  • Content (922-930)
plugins/governance/vk_budget_test.go (1)
plugins/governance/test_utils.go (10)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/usage_tracking_test.go (5)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • BudgetRequest (177-180)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (13-13)
  • Content (922-930)
plugins/governance/team_budget_test.go (1)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (43)
plugins/governance/changelog.md (1)

1-3: LGTM! Changelog entries are clean and grammatically correct.

The previously flagged grammatical issues have been properly resolved ("an interface" and "e2e tests"). The three entries clearly document the changes: governance store refactoring for extensibility, rate-limits handling improvements, and the addition of e2e tests.

plugins/governance/store_test.go (2)

17-66: LGTM - Comprehensive VK retrieval tests.

The TestGovernanceStore_GetVirtualKey test covers active VKs, inactive VKs, and non-existent VKs with clear table-driven test structure.


68-101: LGTM - Well-designed concurrent read test.

The concurrent read test properly uses sync.WaitGroup and atomic.Int64 counters to verify thread-safe behavior under 10,000 concurrent reads.
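
The pattern named above can be sketched in isolation as follows. This is an illustration under assumptions, not the PR's test: newStore and countConcurrentReads are hypothetical names, and a plain sync.Map stands in for the governance store. atomic.Int64 requires Go 1.19 or later.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// newStore builds a sync.Map containing the given keys (stand-in for the store).
func newStore(keys ...string) *sync.Map {
	m := &sync.Map{}
	for _, k := range keys {
		m.Store(k, true)
	}
	return m
}

// countConcurrentReads starts one goroutine per reader, each loading key
// from m, and returns how many loads found a value.
func countConcurrentReads(m *sync.Map, key string, readers int) int64 {
	var hits atomic.Int64 // safe to increment from many goroutines
	var wg sync.WaitGroup
	for i := 0; i < readers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, ok := m.Load(key); ok {
				hits.Add(1)
			}
		}()
	}
	wg.Wait() // all readers are done before the counter is read
	return hits.Load()
}
```

Running such a test with go test -race is what actually surfaces unsafe access; the counters only confirm that every read returned the expected result.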

plugins/governance/customer_budget_test.go (1)

8-160: LGTM - Comprehensive customer budget enforcement tests.

Both tests correctly validate that customer-level budgets are enforced across:

  1. Multiple VKs directly under a customer
  2. Multiple VKs across different teams under a customer

The test structure properly tracks budget consumption, verifies successful requests before exhaustion, and confirms budget-related error messages on rejection.

Also applies to: 162-335

plugins/governance/resolver_test.go (1)

15-550: LGTM - Comprehensive unit test coverage for BudgetResolver.

The test file provides thorough coverage of the BudgetResolver.EvaluateRequest workflow including:

  • Happy path and VK not found/blocked scenarios
  • Provider and model filtering
  • Rate limit enforcement (token and request limits)
  • Budget enforcement with expiration/reset handling
  • Multi-level budget hierarchy
  • Context population verification
  • Table-driven tests for isProviderAllowed and isModelAllowed helpers

The tests use appropriate fixtures and builders for test data setup.

plugins/governance/config_update_sync_test.go (1)

1-1062: LGTM - Comprehensive sync tests for config updates.

The test file provides thorough coverage of in-memory synchronization after updates:

  • VK rate limit updates with usage reset verification
  • VK budget updates with reset when new limit < current usage
  • Provider config rate limit and budget updates
  • Team and customer budget updates

Each test follows a clear pattern: create → consume → update → verify sync. The tests correctly verify that usage resets to 0 when the new limit is lower than current usage.
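
The reset rule these tests check can be stated as a small function. This is a sketch only: usageAfterLimitUpdate is a hypothetical name, and preserving usage when the new limit is not lower is an assumption rather than something the tests above confirm.

```go
package main

// usageAfterLimitUpdate returns the usage to keep after a limit change.
// Usage resets to 0 when the new limit is lower than current usage;
// otherwise it is assumed to be preserved.
func usageAfterLimitUpdate(currentUsage, newLimit float64) float64 {
	if newLimit < currentUsage {
		return 0
	}
	return currentUsage
}
```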

plugins/governance/rate_limit_enforcement_test.go (2)

8-130: Well-documented post-hoc enforcement test.

The comment on Lines 8-10 clearly explains the post-hoc enforcement behavior (request that exceeds is allowed, subsequent requests are blocked). The test logic correctly verifies this behavior by tracking consumed tokens and asserting rejection happens only after exceeding the limit.
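
As an illustration of post-hoc enforcement, a minimal sketch: the limit is checked before a request starts, but tokens are only recorded after it completes, so the request that crosses the limit still succeeds. The type postHocLimiter and its methods are hypothetical, and the strict "less than" check is an assumption.

```go
package main

// postHocLimiter is a hypothetical token limiter with post-hoc accounting.
type postHocLimiter struct {
	maxTokens  int
	usedTokens int
}

// allow reports whether a new request may start. Only usage recorded so far
// is considered; the size of the upcoming request is unknown at this point.
func (l *postHocLimiter) allow() bool {
	return l.usedTokens < l.maxTokens
}

// record adds the tokens a completed request actually consumed.
func (l *postHocLimiter) record(tokens int) {
	l.usedTokens += tokens
}
```

With a limit of 100 tokens and 80 already used, a 50-token request is allowed and pushes usage to 130; only the following request is rejected.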


132-217: Good request limit enforcement test.

The test correctly verifies that with a limit of 1 request, the first request succeeds and the second is rejected with a 429 status code. The error message validation ensures the rejection is due to rate limiting.

plugins/governance/rate_limit_test.go (10)

1-79: LGTM - Well-structured VK token rate limit test.

The test correctly creates a VK with a restrictive token limit and verifies that requests are eventually rejected. Good use of t.Parallel(), proper cleanup via defer testData.Cleanup(t), and appropriate logging.


81-152: LGTM - VK request rate limit test is correct.

The test properly validates request-based rate limiting with a low limit (3 requests/minute) and verifies rejection behavior.


154-231: LGTM - Provider-level token rate limit test.

Good coverage of provider-specific rate limiting with proper setup and assertions.


233-310: LGTM - Provider request rate limit test.

Correctly tests provider-level request rate limiting.


312-378: LGTM - Multiple providers with separate rate limits.

Good test for verifying that different providers maintain independent rate limits.


380-454: LGTM - Combined VK and provider rate limits.

Properly tests that both VK-level and provider-level rate limits can coexist.


456-590: LGTM - In-memory sync test for rate limits.

Comprehensive test verifying that rate limit updates propagate correctly to the in-memory store, including validation of the rate limit ID persistence and value updates.


592-663: LGTM - Combined token and request rate limit test.

Good coverage of scenarios where both token and request limits are configured together.


665-813: LGTM - Rate limit usage tracking in memory.

Thorough test that verifies usage counters are properly updated in the in-memory store after successful requests, with appropriate warnings when usage doesn't increase as expected.


815-979: LGTM - Provider-level rate limit usage tracking.

Well-structured test that validates independent usage tracking for multiple providers, correctly querying the rate limits map to verify per-provider state.

plugins/governance/advanced_scenarios_test.go (12)

1-171: LGTM - VK switch team after budget exhaustion.

Well-designed test that verifies the critical behavior of switching a VK to a different team after the original team's budget is exhausted. Good use of the 150-request upper bound to prevent infinite loops.


173-334: LGTM - VK switch customer after budget exhaustion.

Correctly tests the analogous customer-switching scenario.


336-528: LGTM - Hierarchical chain budget switching.

Comprehensive test of the full VK→Team→Customer hierarchy, correctly setting up the customer as the limiting budget.


530-645: LGTM - VK budget update after exhaustion.

Properly tests that updating a VK's budget after exhaustion allows subsequent requests.


647-780: LGTM - Team budget update after exhaustion.

Good test coverage for team-level budget updates.


782-933: LGTM - Customer budget update after exhaustion.

Correctly tests customer budget updates in the context of a hierarchical setup (customer→team→VK).


935-1075: LGTM - Provider config budget update after exhaustion.

Good test for provider-specific budget updates using the provider config ID.


1077-1218: LGTM - VK deletion cascade test (aside from rate limit assertion).

Good comprehensive test for cascade deletion of VK and associated resources.


1220-1303: LGTM - Team deletion deletes budget.

Properly verifies team deletion removes the team from memory.


1305-1384: LGTM - Customer deletion deletes budget.

Correctly tests customer deletion behavior.


1386-1488: LGTM - Team deletion sets VK team_id to nil.

Good test for verifying referential integrity after team deletion.


1490-1588: LGTM - Customer deletion sets VK customer_id to nil.

Correctly tests the analogous customer deletion scenario.

plugins/governance/e2e_test.go (10)

1-181: LGTM - Multiple VKs sharing team budget fairness test.

Well-structured test that verifies the critical behavior where multiple VKs share a team budget. Good documentation of the POST-HOC budget enforcement semantics.


183-352: LGTM - Full budget hierarchy enforcement.

Comprehensive test of all 4 budget hierarchy levels (Provider→VK→Team→Customer) with proper verification that the most restrictive limit (provider) is enforced.


354-495: LGTM - Failed requests do not consume budget.

Good test verifying that failed requests (4xx/5xx) don't consume budget while successful requests do.


497-640: LGTM - Inactive virtual key blocking.

Comprehensive test covering the full lifecycle: active VK works → deactivate → blocked → reactivate → works again.


757-852: LGTM - Concurrent requests to same VK.

The test correctly handles concurrent requests by collecting success counts via mutex-protected increments rather than calling t.Fatalf from goroutines. Good handling of the concurrency testing pattern.


854-1025: LGTM - Budget state after reset.

Comprehensive test of lazy budget reset mechanism with good documentation explaining the expected behavior (reset happens when triggered by a new request, usage after reset equals cost of the triggering request).


1027-1140: LGTM - Team deletion cascade.

Good test covering team deletion behavior with associated VKs.


1142-1255: LGTM - VK deletion cascade.

Properly tests VK deletion and removal from in-memory store.


1257-1382: LGTM - Weighted provider load balancing.

Good test for weighted distribution between providers, with appropriate tolerance for the 99%/1% split in a 10-request sample.
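
For context, weighted selection is commonly implemented by mapping a random number onto cumulative weights. The sketch below assumes that approach; it is not the plugin's routing code, and weightedProvider and pickProvider are hypothetical names. The random value is passed in so the behavior is deterministic.

```go
package main

// weightedProvider pairs a provider name with its routing weight.
type weightedProvider struct {
	name   string
	weight float64
}

// pickProvider maps r, expected in [0, 1), onto the cumulative weights and
// returns the selected provider's name. It returns "" when there are no
// providers or the total weight is not positive.
func pickProvider(providers []weightedProvider, r float64) string {
	total := 0.0
	for _, p := range providers {
		total += p.weight
	}
	if total <= 0 {
		return ""
	}
	target := r * total
	cumulative := 0.0
	for _, p := range providers {
		cumulative += p.weight
		if target < cumulative {
			return p.name
		}
	}
	// Guard against floating-point rounding at the upper edge.
	return providers[len(providers)-1].name
}
```

With a 99%/1% split, ten independent picks all land on the heavier provider about 90% of the time (0.99^10 ≈ 0.904), which is why a 10-request sample needs a tolerant assertion.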


1384-1487: LGTM - Provider fallback mechanism.

Well-designed test that verifies fallback routing when the primary provider (Anthropic at 99% weight) doesn't support the requested model (gpt-4o).

plugins/governance/test_utils.go (3)

1-73: LGTM - Well-structured model cost definitions.

Good centralization of model pricing data for test cost calculations. The CalculateCost function provides a clean interface for budget testing.


270-297: LGTM - Error checking utilities.

The CheckErrorMessage and contains functions are well-implemented with proper case-insensitive matching.


299-367: LGTM - GlobalTestData cleanup mechanism.

Good pattern for tracking and cleaning up test resources with proper tolerance for 404 responses (resources already deleted or never created).

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-09-chore_governance_tests_added branch from da59879 to bc8095a Compare December 11, 2025 07:25
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-08-feat_governance_plugin_refactor branch from 687d705 to 909ae57 Compare December 11, 2025 07:25
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (12)
plugins/governance/team_budget_test.go (1)

66-156: Loop termination logic prevents testing the actual budget enforcement.

The shouldStop flag causes the loop to exit after setting shouldStop = true when consumedBudget >= teamBudget, but before issuing the next request that would actually be rejected. This means the test may never observe the expected 4xx "budget exceeded" response.

Remove the shouldStop logic and rely on the 50-request cap and API enforcement to terminate the loop:

-	var shouldStop = false
 	vkIndex := 0

 	for requestNum <= 50 {
@@ -146,13 +145,6 @@

 		requestNum++
 		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= teamBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/resolver_test.go (1)

331-345: Direct manipulation of internal store state bypasses encapsulation.

Line 337 directly calls store.budgets.Store(), coupling the test to the LocalGovernanceStore implementation details. This makes the test fragile if the internal storage mechanism changes.

Consider adding a test helper method to the store for setting budget usage in test fixtures, as suggested in the prior review.

-	// Get the governance data to update the budget directly
-	governanceData := store.GetGovernanceData()
-	vkBudgetToUpdate := governanceData.Budgets["vk-budget"]
-	if vkBudgetToUpdate != nil {
-		vkBudgetToUpdate.CurrentUsage = 100.0
-		store.budgets.Store("vk-budget", vkBudgetToUpdate)
-	}
+	// Use a test helper to set budget usage for test fixture
+	err = store.SetBudgetCurrentUsageForTest("vk-budget", 100.0)
+	require.NoError(t, err)

Add to LocalGovernanceStore:

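// SetBudgetCurrentUsageForTest is a test-only helper that overwrites a budget's current usage.
func (gs *LocalGovernanceStore) SetBudgetCurrentUsageForTest(budgetID string, usage float64) error {
    val, ok := gs.budgets.Load(budgetID)
    if !ok {
        return fmt.Errorf("budget not found: %s", budgetID)
    }
    // Checked assertion so an unexpected stored type returns an error instead of panicking.
    budget, ok := val.(*configstoreTables.TableBudget)
    if !ok {
        return fmt.Errorf("unexpected type %T for budget %s", val, budgetID)
    }
    budget.CurrentUsage = usage
    gs.budgets.Store(budgetID, budget)
    return nil
}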
plugins/governance/usage_tracking_test.go (2)

393-394: Fragile string conversion for request numbering.

string(rune('0'+i)) only produces correct digit characters for i in range 0-9. While the current loop (0-2) is safe, this pattern is non-obvious and fragile if the loop count changes.

+import "fmt"
+
 // In the loop:
-			Content: "Test request " + string(rune('0'+i)) + " for reset ticker test.",
+			Content: fmt.Sprintf("Test request %d for reset ticker test.", i),

250-340: Test creates VK without rate limit but checks for rate_limit_id.

TestInMemoryUsageUpdateOnRequest creates a VK with no rate limit or budget configuration (Lines 258-264), then checks rate_limit_id at Line 332. Since no rate limit was configured, rateLimitID will always be empty, and the conditional at Lines 333-337 will always take the "No rate limit on VK" branch.

This test doesn't meaningfully verify in-memory usage tracking. Consider either:

  1. Adding a rate limit to the VK configuration and asserting usage is tracked, or
  2. Adding a budget and verifying budget usage tracking
plugins/governance/config_update_sync_test.go (2)

54-66: Unsafe chained type assertions may cause test panics.

Multiple chained type assertions without ok checks (e.g., Line 55: data1["virtual_keys"].(map[string]interface{})[vkValue].(map[string]interface{})) can cause panics if the response structure is unexpected. This pattern is repeated throughout the file.

Consider extracting a helper function for safe nested map access that fails the test gracefully instead of panicking.

// Helper for safe nested extraction
func extractNestedMap(t *testing.T, data map[string]interface{}, keys ...string) map[string]interface{} {
    current := data
    for _, key := range keys {
        val, ok := current[key]
        if !ok {
            t.Fatalf("Key %q not found in map", key)
        }
        nested, ok := val.(map[string]interface{})
        if !ok {
            t.Fatalf("Value at %q is not a map", key)
        }
        current = nested
    }
    return current
}

166-210: "Higher limit" test branch doesn't verify usage preservation.

The comment at Line 166 states "usage should NOT reset", but at this point usage is already 0 from the previous lower-limit update (Line 164). The test only verifies that the new limit is reflected, not that non-zero usage would be preserved.

Either:

  1. Make usage non-zero before raising the limit (e.g., make another request) and assert usage is preserved, or
  2. Update the comment to reflect that this just verifies limit propagation when usage is already 0
-	// Test UPDATE with higher limit (usage should NOT reset)
+	// Test UPDATE with higher limit (verifies limit propagation; usage already 0 from prior reset)
plugins/governance/edge_cases_test.go (1)

59-79: VK budget is more restrictive than provider budget - test expectation is incorrect.

The comment on Line 97 states "Provider budget should be the limiting factor (most restrictive)", but the budget hierarchy is:

  • VK budget: $0.01 (Line 60)
  • Provider budget: $0.08 (Line 76)
  • Team budget: $0.12 (Line 38)
  • Customer budget: $0.15 (Line 16)

The VK-level budget ($0.01) is actually the most restrictive and will be exhausted first. Either swap the budget values to make the provider budget genuinely the most restrictive, or update the test expectation/comment.

-	// Create a VK with even tighter budget and provider-specific budgets
-	vkBudget := 0.01
+	// Create a VK with moderate budget; provider budget is tighter
+	vkBudget := 0.15

Or make provider budget the tightest:

 				Budget: &BudgetRequest{
-					MaxLimit:      0.08, // Even tighter provider budget
+					MaxLimit:      0.005, // Most restrictive - provider budget
plugins/governance/rate_limit_enforcement_test.go (2)

489-489: Fragile string conversion still present.

This was flagged in a previous review but the pattern string(rune('0'+i)) remains. It only works correctly for i in range 0-9.

-					Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+					Content: fmt.Sprintf("Request %d to test both limits.", i+1),

602-611: Incomplete verification of usage tracking.

The test name is TestRateLimitInMemoryUsageTracking but it only verifies that rate_limit_id exists, not that usage counters were actually incremented after the request. This was flagged in a previous review.

Consider adding verification:

rateLimitsMap := data["rate_limits"].(map[string]interface{})
if rateLimitID != "" {
    rateLimitData := rateLimitsMap[rateLimitID].(map[string]interface{})
    tokenUsage, _ := rateLimitData["token_current_usage"].(float64)
    if tokenUsage <= 0 {
        t.Fatalf("Token usage not tracked: expected > 0, got %.0f", tokenUsage)
    }
    t.Logf("Rate limit %s tracked %.0f tokens ✓", rateLimitID, tokenUsage)
}
plugins/governance/advanced_scenarios_test.go (1)

1209-1214: Inconsistent cascade deletion assertion for rate limits vs budgets.

The test now correctly asserts that budgets are cascade-deleted (lines 1204-1207), but rate limits are still allowed to be orphaned with just a log message. This inconsistency was flagged in a previous review.

Either enforce the same assertion for rate limits:

 	// Rate limits should be gone
 	_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
 	_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
-	if vkRateLimitStillExists || pcRateLimitStillExists {
-		t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
-	}
+	if vkRateLimitStillExists || pcRateLimitStillExists {
+		t.Fatalf("Rate limits should be cascade-deleted: VK rate limit exists=%v, PC rate limit exists=%v",
+			vkRateLimitStillExists, pcRateLimitStillExists)
+	}

Or add a comment explaining why orphaning is intentional for rate limits but not budgets.

plugins/governance/e2e_test.go (1)

1354-1363: Provider detection still relies on fragile string matching.

The test uses strings.Contains(model, "openai") and strings.Contains(model, "gpt-4o") to detect which provider handled the request. This approach was flagged in a previous review as fragile since it only handles specific providers and assumes naming conventions won't change.

Consider parsing the model field more robustly (e.g., splitting on "/" to extract the provider prefix) or using explicit provider information from response fields if available.
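
As a rough illustration of the prefix-splitting idea, here is a minimal sketch. It assumes the response's model field has the form "provider/model"; the helper name providerFromModel and the sample model strings are hypothetical and not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// providerFromModel splits a "provider/model" string at the first "/".
// The "provider/model" layout is an assumption for this sketch; ok is false
// when there is no non-empty provider prefix.
func providerFromModel(model string) (provider string, ok bool) {
	prefix, _, found := strings.Cut(model, "/")
	if !found || prefix == "" {
		return "", false
	}
	return strings.ToLower(prefix), true
}

func main() {
	// Sample inputs are illustrative only.
	for _, m := range []string{"openai/gpt-4o", "anthropic/claude-3", "gpt-4o"} {
		p, ok := providerFromModel(m)
		fmt.Printf("%q -> provider=%q ok=%v\n", m, p, ok)
	}
}
```

Returning an explicit ok flag lets the test fail with a clear message when the prefix is absent, instead of silently matching on a substring.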

plugins/governance/test_utils.go (1)

91-92: Add HTTP client timeout and t.Helper() to MakeRequest.

This was flagged in a previous review but not addressed. The bare http.Client{} with no timeout can cause test hangs. Also, t.Helper() should be called for proper test failure attribution.

 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
+	t.Helper()
-	client := &http.Client{}
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}
 	url := fmt.Sprintf("http://localhost:8080%s", req.Path)
🧹 Nitpick comments (14)
plugins/governance/tracker_test.go (1)

76-82: Replace no-op assertion with meaningful verification.

assert.True(t, true) provides no value. Consider checking the mock logger for expected warning/error messages, or simply remove this assertion since the test's purpose is to verify no panic occurs.

 	// Should not panic or error
 	tracker.UpdateUsage(context.Background(), update)

 	time.Sleep(100 * time.Millisecond)
-	// Just verify it doesn't crash
-	assert.True(t, true)
+	// Verify warning was logged for missing VK
+	assert.True(t, len(logger.warnings) > 0 || len(logger.errors) > 0, "Expected warning or error for missing VK")

Alternatively, if no log is expected, simply remove the assertion entirely since test completion without panic is sufficient.

plugins/governance/provider_budget_test.go (1)

58-145: Consider extracting the shared budget exhaustion loop into a helper.

Both subtests have nearly identical structure. Extract a helper function to reduce duplication:

func runProviderBudgetExhaustionTest(t *testing.T, vkValue, model string, providerBudget float64) {
    consumedBudget := 0.0
    requestNum := 1
    var lastSuccessfulCost float64
    
    for requestNum <= 50 {
        // ... shared loop logic with model parameter
    }
}

Then call it from each subtest with the appropriate model string.

Also applies to: 148-235

plugins/governance/fixtures_test.go (1)

62-66: Fatal method doesn't terminate execution.

The Fatal method only appends to the errors slice but doesn't panic or terminate. This may be intentional for test isolation, but it differs from typical logger behavior where Fatal would cause program termination.

If tests expect fatal calls to be captured without termination, document this behavior:

+// Fatal appends to errors but does NOT terminate - allows tests to verify fatal conditions
 func (ml *MockLogger) Fatal(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
 	ml.errors = append(ml.errors, format)
 }
plugins/governance/in_memory_sync_test.go (2)

51-52: Unsafe type assertions may panic on unexpected responses.

Direct type assertions like data["virtual_keys"].(map[string]interface{}) will panic if the key is missing or has a different type. Consider using the comma-ok idiom for robustness:

virtualKeysMap, ok := data["virtual_keys"].(map[string]interface{})
if !ok {
    t.Fatalf("virtual_keys not found or invalid type in response")
}

This pattern appears throughout the file (lines 51-52, 60-61, 98-100, 168-169, etc.).

Also applies to: 60-61


520-524: Count mismatch warning without test failure may mask issues.

The test logs a warning when VK counts differ between endpoints but doesn't fail. While the comment explains this can happen due to parallel tests, consider either:

  1. Making this test non-parallel to get deterministic counts
  2. Adding a threshold or retry logic
  3. At minimum, using t.Logf with [WARN] prefix consistently (add space after [WARN])
-		t.Logf("[WARN]VK count mismatch between /data endpoint and /virtual-keys endpoint: %d vs %d (this can happen because of parallel tests)", vkCount, int(vksFromEndpoint))
+		t.Logf("[WARN] VK count mismatch between /data endpoint and /virtual-keys endpoint: %d vs %d (this can happen due to parallel tests)", vkCount, int(vksFromEndpoint))
plugins/governance/customer_budget_test.go (3)

149-156: shouldStop flag checked after incrementing requestNum, causing an extra loop iteration.

When consumedBudget >= customerBudget at Line 153, shouldStop is set to true, but the shouldStop check at Line 149 has already run for that iteration. The break therefore only fires at the end of the next iteration, after that iteration's request has been sent, so one additional request is attempted after the budget threshold is reached locally.

This doesn't affect test correctness since the server enforces the budget, but the local tracking logic is misleading.

 		requestNum++
 		vkIndex++
 
-		if shouldStop {
-			break
-		}
-
 		if consumedBudget >= customerBudget {
-			shouldStop = true
+			break
 		}
 	}
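
To make the off-by-one concrete, the following sketch simulates both loop shapes with a fixed per-request cost. It assumes every request succeeds (no server-side rejection), and the function names and numbers are illustrative rather than taken from the test file.

```go
package main

import "fmt"

// requestsWithFlag mirrors the shouldStop pattern: the flag is checked
// before it is set, so the break takes effect one iteration late.
func requestsWithFlag(budget, costPerRequest float64) int {
	consumed, sent, shouldStop := 0.0, 0, false
	for requestNum := 1; requestNum <= 50; requestNum++ {
		consumed += costPerRequest // stand-in for one successful request
		sent++
		if shouldStop {
			break
		}
		if consumed >= budget {
			shouldStop = true
		}
	}
	return sent
}

// requestsWithBreak breaks as soon as the local estimate reaches the budget.
func requestsWithBreak(budget, costPerRequest float64) int {
	consumed, sent := 0.0, 0
	for requestNum := 1; requestNum <= 50; requestNum++ {
		consumed += costPerRequest
		sent++
		if consumed >= budget {
			break
		}
	}
	return sent
}

func main() {
	fmt.Println(requestsWithFlag(1.0, 0.25), requestsWithBreak(1.0, 0.25))
}
```

With a budget of 1.0 and a cost of 0.25 per request, the flag variant sends one more request than the direct-break variant.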

324-330: Same shouldStop pattern issue as the first test.

The same unnecessary complexity with shouldStop exists here. Consider simplifying to a direct break when consumedBudget >= customerBudget.


78-95: Duplicated long prompt string across both tests.

The identical long prompt construction is repeated in both test functions. Consider extracting to a package-level constant or helper function in test_utils.go to reduce duplication.

Also applies to: 253-270

plugins/governance/usage_tracking_test.go (1)

512-512: Same fragile string conversion pattern.

The same string(rune('0'+i)) pattern is used here. Consider using fmt.Sprintf for consistency and safety.

plugins/governance/rate_limit_test.go (1)

54-54: Fragile string conversion for request numbering.

string(rune('0'+i)) only produces correct digits for i in range 0-9. For i >= 10, this produces unexpected characters. Use fmt.Sprintf or strconv.Itoa instead.

-					Content: "Short test request " + string(rune('0'+i)) + " for token limit.",
+					Content: fmt.Sprintf("Short test request %d for token limit.", i+1),

This same pattern appears at lines 127, 206, 285, and 642 in this file - consider fixing all instances.

plugins/governance/e2e_test.go (1)

646-755: Long sleep durations may slow test suite.

TestRateLimitResetBoundaryConditions and TestBudgetStateAfterReset include 15-16 second waits to test reset behavior. While necessary for testing time-based functionality, this significantly impacts test suite execution time.

Consider adding a skip for short test runs:

func TestRateLimitResetBoundaryConditions(t *testing.T) {
	if testing.Short() {
		t.Skip("Skipping reset boundary test in short mode")
	}
	t.Parallel()
	// ...
}
plugins/governance/test_utils.go (3)

143-152: Deprecated rand.Seed usage.

rand.Seed is deprecated since Go 1.20. The global random source is now automatically seeded. You can either remove the Seed call entirely or use a local random source.

 // generateRandomID generates a random ID for test resources
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }

Or, to avoid the shared global source, use a local random source (it is still time-seeded, so IDs are not deterministic):

func generateRandomID() string {
	r := rand.New(rand.NewSource(time.Now().UnixNano()))
	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
	b := make([]byte, 8)
	for i := range b {
		b[i] = letters[r.Intn(len(letters))]
	}
	return string(b)
}

330-367: Add t.Helper() to Cleanup method.

Since Cleanup uses MakeRequest which may call t.Fatalf, adding t.Helper() will ensure proper test failure attribution to the calling test rather than the utility function.

 // Cleanup deletes all created resources
 func (g *GlobalTestData) Cleanup(t *testing.T) {
+	t.Helper()
 	// Delete virtual keys

247-268: Unused keyPath parameter in ExtractIDFromResponse.

The keyPath parameter is accepted but never used - the function always searches through hardcoded keys ["virtual_key", "team", "customer"]. Consider either using the parameter or removing it to avoid confusion.

-func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+func ExtractIDFromResponse(t *testing.T, resp *APIResponse) string {
+	t.Helper()
 	if resp.StatusCode >= 400 {

Note: This would require updating all call sites, so evaluate if the parameter was intended for future use.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between da59879 and bc8095a.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go (1 hunks)
  • plugins/governance/changelog.md (1 hunks)
  • plugins/governance/config_update_sync_test.go (1 hunks)
  • plugins/governance/customer_budget_test.go (1 hunks)
  • plugins/governance/e2e_test.go (1 hunks)
  • plugins/governance/edge_cases_test.go (1 hunks)
  • plugins/governance/fixtures_test.go (1 hunks)
  • plugins/governance/go.mod (3 hunks)
  • plugins/governance/in_memory_sync_test.go (1 hunks)
  • plugins/governance/provider_budget_test.go (1 hunks)
  • plugins/governance/rate_limit_enforcement_test.go (1 hunks)
  • plugins/governance/rate_limit_test.go (1 hunks)
  • plugins/governance/resolver_test.go (1 hunks)
  • plugins/governance/store_test.go (1 hunks)
  • plugins/governance/team_budget_test.go (1 hunks)
  • plugins/governance/test_utils.go (1 hunks)
  • plugins/governance/tracker_test.go (1 hunks)
  • plugins/governance/usage_tracking_test.go (1 hunks)
  • plugins/governance/vk_budget_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • plugins/governance/changelog.md
  • plugins/governance/vk_budget_test.go
  • plugins/governance/store_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/resolver_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/go.mod
  • plugins/governance/edge_cases_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/usage_tracking_test.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/resolver_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/usage_tracking_test.go
🧬 Code graph analysis (9)
plugins/governance/resolver_test.go (5)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (355-362)
plugins/governance/resolver.go (11)
  • NewBudgetResolver (70-75)
  • DecisionAllow (18-18)
  • DecisionVirtualKeyNotFound (19-19)
  • DecisionVirtualKeyBlocked (20-20)
  • DecisionProviderBlocked (26-26)
  • DecisionModelBlocked (25-25)
  • DecisionTokenLimited (23-23)
  • DecisionRequestLimited (24-24)
  • DecisionBudgetExceeded (22-22)
  • DecisionRateLimited (21-21)
  • Decision (15-15)
core/schemas/context.go (1)
  • BifrostContext (30-40)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostContextKey (101-101)
plugins/governance/tracker_test.go (7)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (66-85)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (355-362)
plugins/governance/resolver.go (1)
  • NewBudgetResolver (70-75)
plugins/governance/tracker.go (2)
  • NewUsageTracker (51-66)
  • UsageUpdate (16-29)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
core/utils.go (1)
  • IsFinalChunk (198-213)
plugins/governance/edge_cases_test.go (1)
plugins/governance/test_utils.go (12)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateTeamRequest (183-187)
  • CreateVirtualKeyRequest (155-164)
  • ProviderConfigRequest (167-174)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/advanced_scenarios_test.go (1)
plugins/governance/test_utils.go (17)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • UpdateTeamRequest (221-224)
  • UpdateCustomerRequest (227-230)
  • ProviderConfigRequest (167-174)
  • CreateRateLimitRequest (202-207)
plugins/governance/e2e_test.go (1)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • ProviderConfigRequest (167-174)
  • CreateRateLimitRequest (202-207)
plugins/governance/rate_limit_test.go (3)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
plugins/governance/provider_budget_test.go (1)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ProviderConfigRequest (167-174)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/usage_tracking_test.go (3)
plugins/governance/test_utils.go (8)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • BudgetRequest (177-180)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
core/providers/gemini/types.go (2)
  • Role (13-13)
  • Content (922-930)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (33)
plugins/governance/go.mod (3)

1-112: Inconsistency between AI summary and manifest changes.

The AI-generated summary states "All changes are test-only," but the go.mod file contains dependency version changes (downgrades on lines 8–9), which are manifest-level modifications affecting the entire module, not just tests. While adding testify for test support is aligned with adding tests, the version downgrades of core dependencies should be called out explicitly in commit messages and PR descriptions to clarify intent.


10-10: Testify addition is appropriate for test suite.

Adding github.com/stretchr/testify v1.11.1 is a sensible choice to support the new governance test suite. The version is current and stable for Go 1.24.3.


8-9: Cannot verify dependency versions against target branch—target branch does not exist.

The target branch 12-08-feat_governance_plugin_refactor referenced in the stack context is not accessible in this repository (only main branch exists). Git history shows this file was added only in the current commit bc8095a chore: governance tests added, so there are no prior versions to compare against. If this PR is part of a stacked change, provide the correct target branch name or PR reference for verification. Otherwise, confirm whether v1.2.34 (core) and v1.1.43 (framework) are the intended versions for this test addition.

plugins/governance/team_budget_test.go (2)

1-6: LGTM on imports and package declaration.

Clean minimal imports for the test file.


10-35: Test setup looks correct.

Proper use of t.Parallel(), test data initialization with cleanup via defer, and team creation with budget configuration.

plugins/governance/tracker_test.go (2)

15-54: Test for failed request budget handling looks good.

Correctly verifies that failed requests do not update budget usage. The async sleep and assertion pattern is appropriate for this scenario.


84-152: Streaming optimization test is well-structured.

Good coverage of the streaming chunk handling: verifying tokens are updated on non-final chunks and request counter incremented on final chunk.

plugins/governance/provider_budget_test.go (1)

9-12: Test setup is correct.

Proper parallel execution and cleanup handling.

plugins/governance/fixtures_test.go (3)

14-66: Well-implemented MockLogger with thread-safety.

Good use of mutex for concurrent access protection. The logging methods correctly capture the format string for later inspection.


68-184: Clean test data builders.

Well-structured builder functions with clear naming conventions. The composition pattern (e.g., buildVirtualKeyWithBudget calling buildVirtualKey) is good for maintainability.


195-221: Good assertion helpers.

The t.Helper() calls ensure proper stack traces on failure. The wrappers around testify's require functions add clarity.

plugins/governance/in_memory_sync_test.go (2)

9-126: VK update sync test is comprehensive.

Good coverage of create → verify in-memory → update → verify in-memory updated flow. The 500ms sleep for async updates is reasonable.


128-234: Team and Customer update tests follow consistent patterns.

Good consistency with the VK update test structure.

plugins/governance/customer_budget_test.go (1)

1-160: Overall test logic is sound for validating customer budget enforcement.

The tests correctly verify that customer-level budgets are enforced as a shared limit across multiple VKs (direct and through teams). The cleanup via testData.Cleanup(t) ensures proper resource management.

plugins/governance/edge_cases_test.go (1)

10-183: Test structure and cleanup are correctly implemented.

Apart from the budget value mismatch, the test properly validates the criss-cross budget hierarchy scenario with appropriate resource cleanup and logging.

plugins/governance/resolver_test.go (5)

1-78: Happy path and error case tests are well-structured.

The tests for allowed requests, VK not found, and VK blocked scenarios follow good patterns with clear setup, execution, and assertion phases.


79-141: Provider and model filtering tests provide good coverage.

Tests correctly verify that provider and model allowlists are enforced, with appropriate decision assertions.


143-236: Rate limit tests cover token limits, request limits, and expiry scenarios.

The tests properly validate rate limit enforcement including the reset behavior for expired limits.


402-510: Table-driven tests for isProviderAllowed and isModelAllowed are well-organized.

Good use of table-driven tests with clear test case names and expected outcomes.


512-550: Context population test verifies governance IDs are correctly set.

The test properly validates that VK, team, and customer IDs are populated in the BifrostContext after evaluation.

plugins/governance/usage_tracking_test.go (2)

449-451: Reset verification logic has been corrected.

The assertion now properly checks usageAfterReset >= usageBeforeReset to verify the budget was reset, addressing the prior review feedback.


8-129: Rate limit reset test is thorough but has long wait time.

The 35-second sleep makes this test slow. Consider if this can be parameterized or if there's a way to trigger reset more directly in tests.

plugins/governance/config_update_sync_test.go (6)

14-165: VK rate limit update sync test covers the core scenarios well.

The test properly validates that:

  1. Initial state is correctly reflected in memory
  2. Usage is tracked after requests
  3. Lowering the limit below current usage triggers a reset

215-362: VK budget update sync test follows the same solid pattern.

Correctly validates budget synchronization to memory and usage reset behavior.


368-539: Provider rate limit update sync test is comprehensive.

Good coverage of provider-level rate limit configuration updates and memory synchronization.


545-709: Team budget update sync test correctly validates team-level budget hierarchy.

Properly tests budget updates at the team level with VK consumption.


715-896: Customer budget update sync test validates customer-level hierarchy.

Good coverage of the full customer -> team -> VK hierarchy with budget synchronization.


902-1062: Provider budget update sync test completes the provider-level coverage.

Validates provider config budget updates and memory synchronization.

plugins/governance/rate_limit_test.go (1)

1-79: Well-structured rate limit test suite.

The tests are comprehensive, covering VK-level and provider-level rate limits with proper cleanup via defer testData.Cleanup(t). The use of t.Parallel() enables concurrent test execution.

plugins/governance/rate_limit_enforcement_test.go (1)

8-130: Good enforcement verification pattern.

The TestVirtualKeyTokenRateLimitEnforcement test correctly implements post-hoc rate limit verification - it tracks consumed tokens and verifies rejection happens only after the limit is exceeded, not before. The clear documentation of the post-hoc behavior at lines 8-10 is helpful.

plugins/governance/advanced_scenarios_test.go (1)

1-171: Comprehensive budget exhaustion and switching tests.

The test structure is well-designed with clear scenario documentation, proper resource cleanup, and thorough verification of post-hoc budget enforcement. The hierarchical budget testing covers VK→Team→Customer chains effectively.

plugins/governance/e2e_test.go (1)

809-836: Concurrent request handling improved.

The goroutine safety issue from the previous review has been addressed - the code now uses mutex-protected counters and avoids calling t.Fatalf directly for assertion failures within goroutines. Note that MakeRequest can still call t.Fatalf internally for setup failures, but those are exceptional cases.

plugins/governance/test_utils.go (1)

15-61: Well-designed test model cost definitions.

The ModelCost struct and TestModels map provide a clean way to calculate expected costs in tests. The pricing data aligns with the model schemas defined in core/schemas/models.go.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (14)
plugins/governance/vk_budget_test.go (1)

42-47: Simplify loop termination; drop shouldStop to avoid fragile early-exit logic.

You already cap the loop at 50 iterations and rely on the backend to signal budget exhaustion. The extra shouldStop flag (set when consumedBudget >= vkBudget and checked on the next iteration) adds brittle client-side stopping logic and can cause the test to stop before the server actually returns a budget error if your cost estimate overshoots.

Rely on the API’s own enforcement and keep consumedBudget purely diagnostic:

-	consumedBudget := 0.0
-	requestNum := 1
-	var lastSuccessfulCost float64
-
-	var shouldStop = false
+	consumedBudget := 0.0
+	requestNum := 1
+	var lastSuccessfulCost float64
@@
-		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= vkBudget {
-			shouldStop = true
-		}
+		requestNum++
+		// Let governance enforcement (or the 50‑request cap) terminate the loop;
+		// `consumedBudget` is used only for logging/diagnostics.

Also applies to: 48-67, 118-127
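The loop shape suggested above can be sketched in isolation. This is a minimal illustration, not the suite's code: `response`, `drainBudget`, and the fake `send` function (including its 402 status) are hypothetical stand-ins for the real `MakeRequest` helpers, and the only assumption about the server is that it rejects with a 4xx once the budget is exhausted.

```go
package main

import "fmt"

// response is a hypothetical stand-in for the suite's API response type.
type response struct {
	StatusCode int
	Cost       float64
}

// drainBudget sends requests until the server rejects one or maxRequests is
// reached. It returns the 1-based index of the rejected request (0 if none)
// and the cost summed over successful requests, which is diagnostic only.
func drainBudget(send func(n int) response, maxRequests int) (rejectedAt int, consumed float64) {
	for n := 1; n <= maxRequests; n++ {
		resp := send(n)
		if resp.StatusCode >= 400 {
			return n, consumed // the server decided, not the client-side estimate
		}
		consumed += resp.Cost
	}
	return 0, consumed
}

func main() {
	// Fake server with post-hoc enforcement: a request is rejected only after
	// recorded spend has already reached the budget.
	budget, spent := 0.01, 0.0
	send := func(n int) response {
		if spent >= budget {
			return response{StatusCode: 402} // hypothetical status code
		}
		spent += 0.004
		return response{StatusCode: 200, Cost: 0.004}
	}
	rejectedAt, consumed := drainBudget(send, 50)
	fmt.Printf("rejected at request %d after $%.3f\n", rejectedAt, consumed)
}
```

Because the client never compares its own estimate against the budget, an overshooting cost estimate cannot end the loop before the rejection is observed.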

plugins/governance/team_budget_test.go (1)

66-72: Remove shouldStop and let the API enforce the team budget.

This test still uses the fragile shouldStop pattern previously flagged in team_budget_test.go. Once consumedBudget >= teamBudget, it runs only a single further request before unconditionally breaking, so the test can fail without ever observing the expected team‑budget 4xx if the cost estimate overshoots or rounding differs from the server's logic.

You already have a 50‑request cap; drop the manual stop and keep consumedBudget for logging:

-	consumedBudget := 0.0
-	requestNum := 1
-	var lastSuccessfulCost float64
-	var shouldStop = false
-	vkIndex := 0
+	consumedBudget := 0.0
+	requestNum := 1
+	var lastSuccessfulCost float64
+	vkIndex := 0
@@
-		requestNum++
-		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= teamBudget {
-			shouldStop = true
-		}
+		requestNum++
+		vkIndex++
+		// Let governance enforcement (or the 50‑request cap) terminate the loop;
+		// `consumedBudget` is used only for logging/diagnostics.

Also applies to: 73-96, 146-156

plugins/governance/customer_budget_test.go (2)

66-72: Avoid shouldStop in VK-based customer budget test; rely on backend enforcement.

The shouldStop flag adds fragile client-side stopping logic and can prevent you from ever seeing the “customer budget exceeded” error if cost estimates overshoot or drift from server accounting. The 50‑iteration bound already prevents runaway tests.

You can safely remove shouldStop and just keep consumedBudget for logging:

-	consumedBudget := 0.0
-	requestNum := 1
-	var lastSuccessfulCost float64
-	var shouldStop = false
-	vkIndex := 0
+	consumedBudget := 0.0
+	requestNum := 1
+	var lastSuccessfulCost float64
+	vkIndex := 0
@@
-		requestNum++
-		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= customerBudget {
-			shouldStop = true
-		}
+		requestNum++
+		vkIndex++
+		// Let governance enforcement (or the 50‑request cap) terminate the loop;
+		// `consumedBudget` is used only for logging/diagnostics.

Also applies to: 73-96, 146-156


241-247: Apply same loop simplification to multi‑team customer budget test.

The second test repeats the same shouldStop pattern; it has the same risk profile and is unnecessary given the 50‑request cap.

-	consumedBudget := 0.0
-	requestNum := 1
-	var lastSuccessfulCost float64
-	var shouldStop = false
-	vkIndex := 0
+	consumedBudget := 0.0
+	requestNum := 1
+	var lastSuccessfulCost float64
+	vkIndex := 0
@@
-		requestNum++
-		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= customerBudget {
-			shouldStop = true
-		}
+		requestNum++
+		vkIndex++
+		// Let governance enforcement (or the 50‑request cap) terminate the loop;
+		// `consumedBudget` is used only for logging/diagnostics.

Also applies to: 248-271, 321-331

plugins/governance/provider_budget_test.go (2)

57-63: Drop shouldStop in OpenAI provider budget loop.

As with the other budget exhaustion tests, shouldStop is redundant with the requestNum <= 50 bound and can cause premature termination if the approximate consumedBudget crosses providerBudget before the backend signals a budget error.

Simplify the loop:

-		providerBudget := 0.01
-		consumedBudget := 0.0
-		requestNum := 1
-		var lastSuccessfulCost float64
-		var shouldStop = false
+		providerBudget := 0.01
+		consumedBudget := 0.0
+		requestNum := 1
+		var lastSuccessfulCost float64
@@
-			requestNum++
-
-			if shouldStop {
-				break
-			}
-
-			if consumedBudget >= providerBudget {
-				shouldStop = true
-			}
+			requestNum++
+			// Let governance enforcement (or the 50‑request cap) terminate the loop;
+			// `consumedBudget` is used only for logging/diagnostics.

Also applies to: 65-84, 132-141


147-153: Apply same loop simplification to Anthropic provider budget test.

Mirror the fix in the Anthropic subtest to avoid divergent behavior between providers:

-		providerBudget := 0.01
-		consumedBudget := 0.0
-		requestNum := 1
-		var lastSuccessfulCost float64
-		var shouldStop = false
+		providerBudget := 0.01
+		consumedBudget := 0.0
+		requestNum := 1
+		var lastSuccessfulCost float64
@@
-			requestNum++
-
-			if shouldStop {
-				break
-			}
-
-			if consumedBudget >= providerBudget {
-				shouldStop = true
-			}
+			requestNum++
+			// Let governance enforcement (or the 50‑request cap) terminate the loop;
+			// `consumedBudget` is used only for logging/diagnostics.

Also applies to: 155-174, 222-231

plugins/governance/usage_tracking_test.go (2)

256-264: TestInMemoryUsageUpdateOnRequest doesn’t actually verify usage changes.

This test currently:

  • Creates a VK with no explicit budget/rate limit.
  • Sends a request and logs tokensUsed.
  • Fetches /api/governance/data and only logs whether rate_limit_id is present.

It never asserts that any in‑memory counter (budget usage or rate‑limit usage) changed, so it will pass even if usage tracking is broken.

Consider either:

  • Attaching a rate limit (or budget) to the VK, then asserting the corresponding usage field increases after the request, or
  • Narrowing the test’s stated purpose if it’s only meant to sanity‑check that the VK appears in /data.

For example, to assert rate‑limit usage:

-	// Rate limit should exist and be updated
-	rateLimitID, _ := vkData["rate_limit_id"].(string)
-	if rateLimitID != "" {
-		t.Logf("Rate limit tracking is enabled for VK ✓")
-	} else {
-		t.Logf("No rate limit on VK (optional)")
-	}
+	rateLimitID, _ := vkData["rate_limit_id"].(string)
+	if rateLimitID == "" {
+		t.Skip("No rate limit attached to VK; nothing to assert for usage counters")
+	}
+
+	rateLimitsMap := data["rate_limits"].(map[string]interface{})
+	rlData, ok := rateLimitsMap[rateLimitID].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Rate limit %s not found in in‑memory store", rateLimitID)
+	}
+	tokenUsage, _ := rlData["current_token_usage"].(float64)
+	if tokenUsage <= 0 {
+		t.Fatalf("Token usage not tracked: expected > 0, got %.0f", tokenUsage)
+	}
+	t.Logf("Rate limit %s tracked %.0f tokens ✓", rateLimitID, tokenUsage)

Also applies to: 278-340


383-395: Use safe integer→string conversion instead of string(rune('0'+i)).

string(rune('0'+i)) only works for i in 0–9; for larger i it produces non‑digit characters. Even though these loops are short today, this pattern is brittle and was previously flagged.

Prefer fmt.Sprintf (or strconv.Itoa) for clarity:

-import (
-	"testing"
-	"time"
-)
+import (
+	"fmt"
+	"testing"
+	"time"
+)
@@
-						Content: "Test request " + string(rune('0'+i)) + " for reset ticker test.",
+						Content: fmt.Sprintf("Test request %d for reset ticker test.", i+1),
@@
-						Content: "Persistence test request " + string(rune('0'+i)) + ".",
+						Content: fmt.Sprintf("Persistence test request %d.", i+1),

Also applies to: 501-513
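A small standalone comparison shows where the rune-based conversion stops producing digits. The helper names `digitViaRune` and `digitViaItoa` are illustrative only and do not exist in the PR.

```go
package main

import (
	"fmt"
	"strconv"
)

// digitViaRune mirrors the flagged pattern: it offsets the code point of '0'.
func digitViaRune(i int) string { return string(rune('0' + i)) }

// digitViaItoa is the conventional decimal conversion.
func digitViaItoa(i int) string { return strconv.Itoa(i) }

func main() {
	for _, i := range []int{3, 9, 10, 12} {
		// For i >= 10 the rune version yields the ASCII characters after '9'.
		fmt.Printf("i=%d rune=%q itoa=%q\n", i, digitViaRune(i), digitViaItoa(i))
	}
}
```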

plugins/governance/rate_limit_enforcement_test.go (2)

479-489: Replace fragile string(rune('0'+i)) with proper formatting.

Same issue as in usage_tracking_test.go: string(rune('0'+i)) assumes i ∈ [0,9] and will produce nonsense for larger values. Even if the loop bound is currently 5, this pattern is error‑prone and inconsistent.

Use a standard formatter instead:

-import (
-	"testing"
-	"time"
-)
+import (
+	"fmt"
+	"testing"
+	"time"
+)
@@
-						Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+						Content: fmt.Sprintf("Request %d to test both limits.", i+1),

520-525: TestRateLimitInMemoryUsageTracking should assert usage counters, not just presence.

This test:

  • Sends one request, extracts tokensUsed, waits 1s, then
  • Only checks whether rate_limit_id is non‑empty and logs a message.

It doesn’t actually verify that any rate‑limit usage field increased, so it can pass even if tracking is broken.

Consider asserting that the relevant usage metric increased, similar to the suggestion in the earlier bot comment on this file:

 	data := getDataResp.Body["data"].(map[string]interface{})
 	virtualKeysMap := data["virtual_keys"].(map[string]interface{})
 	vkData := virtualKeysMap[vkValue].(map[string]interface{})
 	rateLimitID, _ := vkData["rate_limit_id"].(string)
 
-	if rateLimitID != "" {
-		t.Logf("Rate limit %s is configured and tracking usage ✓", rateLimitID)
-	} else {
-		t.Logf("Rate limit is configured ✓")
-	}
+	if rateLimitID == "" {
+		t.Skip("No rate limit ID found for VK; cannot assert usage tracking")
+	}
+
+	rateLimitsMap := data["rate_limits"].(map[string]interface{})
+	rlData, ok := rateLimitsMap[rateLimitID].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Rate limit %s not found in in‑memory store", rateLimitID)
+	}
+	tokenUsage, _ := rlData["current_token_usage"].(float64)
+	if tokenUsage < float64(tokensUsed) {
+		t.Fatalf("Token usage not tracked correctly: expected >= %d, got %.0f", tokensUsed, tokenUsage)
+	}
+	t.Logf("Rate limit %s tracked %.0f tokens ✓", rateLimitID, tokenUsage)

Also applies to: 575-585, 592-612

plugins/governance/e2e_test.go (1)

809-836: Verify goroutine error handling is safe.

A past review flagged that calling t.Fatalf from within goroutines is unsafe (it only exits the goroutine, not the test). The review was marked as addressed in commits 687d705 to da59879. Please confirm that the goroutine implementation now properly collects errors and reports them from the main goroutine, rather than calling t.Fatalf directly from within goroutines.
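One way to satisfy this, sketched under the assumption that each concurrent request can be expressed as a function returning an error: workers record failures under a mutex, and only the calling goroutine decides how to report them. `runConcurrently` and `errWorker` are hypothetical names, not helpers from e2e_test.go.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errWorker is a placeholder failure used by the demo below.
var errWorker = errors.New("worker failed")

// runConcurrently runs fn n times in parallel and returns every non-nil
// error. No worker calls a test-fatal function; the caller reports instead.
func runConcurrently(n int, fn func(i int) error) []error {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if err := fn(i); err != nil {
				mu.Lock()
				errs = append(errs, err)
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return errs
}

func main() {
	errs := runConcurrently(5, func(i int) error {
		if i%2 == 1 {
			return errWorker
		}
		return nil
	})
	// In a test, the test goroutine would now call t.Fatalf or t.Errorf.
	fmt.Println("failures:", len(errs))
}
```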

plugins/governance/test_utils.go (1)

91-141: Verify MakeRequest has timeout and t.Helper().

A past review flagged that MakeRequest creates a bare http.Client without a timeout and doesn't call t.Helper(), which can cause test hangs and obscure failure attribution. The review was marked as addressed in earlier commits. Please confirm the current implementation includes both a 30-second timeout and t.Helper() call.
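The effect of the client timeout can be checked in isolation with the standard library's httptest server. This sketch covers only the timeout half of the comment; `getWithTimeout` and `newSlowServer` are hypothetical helpers, and the durations are arbitrary.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// getWithTimeout issues a GET through a client bounded by timeout.
func getWithTimeout(url string, timeout time.Duration) (int, error) {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// newSlowServer returns a test server whose handler waits for delay (or for
// the client to give up) before responding with 200.
func newSlowServer(delay time.Duration) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(delay):
		case <-r.Context().Done():
		}
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := newSlowServer(500 * time.Millisecond)
	defer srv.Close()

	// A short timeout turns a stalled server into an error instead of a hang.
	_, err := getWithTimeout(srv.URL, 50*time.Millisecond)
	fmt.Println("timed out:", err != nil)

	// A generous timeout lets the same request complete.
	status, err := getWithTimeout(srv.URL, 2*time.Second)
	fmt.Println(status, err)
}
```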

plugins/governance/rate_limit_test.go (1)

38-106: Unsafe type assertions may cause test panics.

Multiple chained type assertions without ok checks appear throughout this file (lines 38-39, 54-59, 99-106, and many more). This pattern can cause panics if the response structure is unexpected. A past review on config_update_sync_test.go:54-66 identified this same issue and recommended adding defensive checks or using helper functions.

Consider creating a shared helper function to safely extract nested values with proper error handling, then apply it consistently across all rate limit tests.
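A possible shape for such a helper, assuming response bodies are decoded into map[string]interface{} as the assertions in this file suggest. `getMap` and `getString` are hypothetical names; a test would pass the returned error to t.Fatalf rather than panicking on a failed assertion.

```go
package main

import "fmt"

// getMap walks nested JSON objects along path and returns the final object,
// or an error naming the first key that is missing or not an object.
func getMap(root map[string]interface{}, path ...string) (map[string]interface{}, error) {
	cur := root
	for depth, key := range path {
		next, ok := cur[key].(map[string]interface{})
		if !ok {
			return nil, fmt.Errorf("key %q (depth %d): expected object, got %T", key, depth, cur[key])
		}
		cur = next
	}
	return cur, nil
}

// getString reads a string field, reporting the actual type on mismatch.
func getString(m map[string]interface{}, key string) (string, error) {
	s, ok := m[key].(string)
	if !ok {
		return "", fmt.Errorf("key %q: expected string, got %T", key, m[key])
	}
	return s, nil
}

func main() {
	// Illustrative body; the key names follow the snippets quoted in this review.
	body := map[string]interface{}{
		"data": map[string]interface{}{
			"virtual_keys": map[string]interface{}{
				"vk-1": map[string]interface{}{"rate_limit_id": "rl-9"},
			},
		},
	}
	vk, err := getMap(body, "data", "virtual_keys", "vk-1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	id, err := getString(vk, "rate_limit_id")
	fmt.Println(id, err)

	// An unexpected shape yields an error instead of a panic.
	_, err = getMap(body, "data", "rate_limits")
	fmt.Println(err)
}
```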

plugins/governance/advanced_scenarios_test.go (1)

1212-1214: Inconsistent cascade deletion handling for rate limits.

The test asserts that budgets are cascade-deleted (Lines 1204-1207) but only logs a note for rate limits (Lines 1212-1214), which is inconsistent. A past review comment flagged this exact issue, noting that if cascade deletion should remove budgets, it should likely remove rate limits too, or there should be a clear explanation for the different behavior.

🧹 Nitpick comments (3)
plugins/governance/in_memory_sync_test.go (1)

86-88: Long sleeps make this suite slow; consider configurable tick durations.

Several tests rely on fixed time.Sleep values (500ms–2s, plus 1s around /data checks). With t.Parallel, these add up and will slow CI, especially as more governance tests are added.

If possible, consider wiring the in‑memory sync ticker interval via config/env for tests so you can:

  • Use much shorter durations (e.g., 50–100ms) in tests, or
  • Poll with a bounded retry loop instead of fixed sleeps.

This keeps the tests deterministic while reducing wall‑clock time.

Also applies to: 196-198, 305-306, 402-404, 476-477
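The bounded retry loop mentioned above could look roughly like this. `waitFor` is a hypothetical helper, and the timeout and interval values are placeholders; in the tests the condition would re-fetch /data and check the expected field.

```go
package main

import (
	"fmt"
	"time"
)

// waitFor polls cond every interval until it returns true or timeout has
// elapsed. It reports whether cond became true in time.
func waitFor(timeout, interval time.Duration, cond func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	start := time.Now()
	// Simulated async update that becomes visible after about 120ms.
	ready := func() bool { return time.Since(start) > 120*time.Millisecond }

	ok := waitFor(2*time.Second, 20*time.Millisecond, ready)
	fmt.Println("synced:", ok)
}
```

The test returns as soon as the condition holds, so the common case is fast while slow CI machines still get the full timeout.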

plugins/governance/usage_tracking_test.go (1)

113-129: Long 35s sleeps will significantly slow the suite; consider configurable reset intervals.

The reset‑ticker tests (TestUsageTrackingRateLimitReset, TestUsageTrackingBudgetReset, TestResetTickerBothBudgetAndRateLimit) each sleep for 35 seconds to outwait a 30s ticker. With several such tests (and others using 30s+ waits), this can quickly dominate CI time.

If feasible, make the ticker interval configurable in tests so you can:

  • Run the ticker at e.g. 500ms or 1s in test mode, and
  • Reduce these sleeps proportionally (e.g., 2–3× the interval), or use a small polling loop with a timeout.

This keeps behavior the same while making the governance test suite much faster and less flaky.

Also applies to: 223-247, 383-431

plugins/governance/test_utils.go (1)

144-152: Inefficient random seeding on every ID generation.

Calling rand.Seed(time.Now().UnixNano()) on every invocation of generateRandomID() is inefficient and can produce duplicate IDs if called in quick succession. The global random source should be seeded once (e.g., in an init() function) or use a local rand.Source with rand.New().

Apply this diff:

+func init() {
+	rand.Seed(time.Now().UnixNano())
+}
+
 // generateRandomID generates a random ID for test resources
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between da59879 and bc8095a.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go (1 hunks)
  • plugins/governance/changelog.md (1 hunks)
  • plugins/governance/config_update_sync_test.go (1 hunks)
  • plugins/governance/customer_budget_test.go (1 hunks)
  • plugins/governance/e2e_test.go (1 hunks)
  • plugins/governance/edge_cases_test.go (1 hunks)
  • plugins/governance/fixtures_test.go (1 hunks)
  • plugins/governance/go.mod (3 hunks)
  • plugins/governance/in_memory_sync_test.go (1 hunks)
  • plugins/governance/provider_budget_test.go (1 hunks)
  • plugins/governance/rate_limit_enforcement_test.go (1 hunks)
  • plugins/governance/rate_limit_test.go (1 hunks)
  • plugins/governance/resolver_test.go (1 hunks)
  • plugins/governance/store_test.go (1 hunks)
  • plugins/governance/team_budget_test.go (1 hunks)
  • plugins/governance/test_utils.go (1 hunks)
  • plugins/governance/tracker_test.go (1 hunks)
  • plugins/governance/usage_tracking_test.go (1 hunks)
  • plugins/governance/vk_budget_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • plugins/governance/edge_cases_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/go.mod
  • plugins/governance/fixtures_test.go
  • plugins/governance/tracker_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/team_budget_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/changelog.md
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/team_budget_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/test_utils.go
🧬 Code graph analysis (8)
plugins/governance/store_test.go (5)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (66-85)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (355-362)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
ui/lib/types/governance.ts (4)
  • Customer (37-44)
  • Team (27-35)
  • Budget (5-11)
  • RateLimit (13-25)
plugins/governance/rate_limit_enforcement_test.go (1)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
plugins/governance/usage_tracking_test.go (1)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • BudgetRequest (177-180)
plugins/governance/rate_limit_test.go (3)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (13-13)
  • Content (922-930)
plugins/governance/e2e_test.go (5)
plugins/governance/test_utils.go (11)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CalculateCost (64-73)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
  • CreateRateLimitRequest (202-207)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (13-13)
  • Content (922-930)
plugins/governance/in_memory_sync_test.go (1)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
plugins/governance/customer_budget_test.go (1)
plugins/governance/test_utils.go (11)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • CreateTeamRequest (183-187)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Graphite / mergeability_check (×10)
🔇 Additional comments (1)
plugins/governance/changelog.md (1)

1-3: Changelog entries look good.

The previous issues (grammatical error on line 1 and typo on line 3) have been successfully addressed. All three entries are now correct and properly formatted.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-09-chore_governance_tests_added branch from bc8095a to 06fbe4a Compare December 12, 2025 19:58
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-08-feat_governance_plugin_refactor branch from 909ae57 to b1a0d9c Compare December 12, 2025 19:58
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (8)
plugins/governance/test_utils.go (1)

90-141: Add t.Helper() + HTTP timeouts (tests can hang indefinitely).
MakeRequest uses an http.Client with no timeout and doesn’t call t.Helper(), so a stalled server can hang the test indefinitely and failures are reported at the helper’s line instead of the caller’s.

 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-	client := &http.Client{}
+	t.Helper()
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}
 	url := fmt.Sprintf("http://localhost:8080%s", req.Path)
plugins/governance/provider_budget_test.go (1)

58-145: Remove shouldStop: it can prevent the “budget exceeded” request from ever being sent.
The loop breaks on the iteration after consumedBudget >= providerBudget, potentially exiting before seeing the API rejection that the test expects.

 	t.Run("OpenAIProviderBudgetExceeded", func(t *testing.T) {
@@
-		var shouldStop = false
-
 		for requestNum <= 50 {
@@
 			requestNum++
-
-			if shouldStop {
-				break
-			}
-
-			if consumedBudget >= providerBudget {
-				shouldStop = true
-			}
 		}
@@
 	t.Run("AnthropicProviderBudgetExceeded", func(t *testing.T) {
@@
-		var shouldStop = false
-
 		for requestNum <= 50 {
@@
 			requestNum++
-
-			if shouldStop {
-				break
-			}
-
-			if consumedBudget >= providerBudget {
-				shouldStop = true
-			}
 		}

Also applies to: 148-235

plugins/governance/config_update_sync_test.go (2)

54-63: Replace chained type assertions with safe extraction helpers (tests can currently panic).
Many traversals like resp.Body["data"].(map[string]interface{})... will panic if the response shape changes or the body contains an error instead.

Also applies to: 99-107, 142-150, 190-199


166-210: The “higher limit (usage should NOT reset)” section doesn’t assert the behavior.
At this point usage was already reset, and the code only logs; it won’t fail if behavior regresses. Either make usage non-zero before raising the limit and assert preservation, or remove/rename the branch.

plugins/governance/edge_cases_test.go (1)

59-98: Budget hierarchy expectation is inconsistent: VK budget ($0.01) is tighter than provider budget ($0.08).
If the VK budget is smaller, it should be exhausted first—contradicting “provider budget should be the limiting factor”. Fix either the numbers or the expectation/logging.

plugins/governance/store_test.go (1)

144-155: Don’t ignore NewLocalGovernanceStore errors in the per-case store creation.
If store creation fails, the test can panic or misreport failures.

-			testStore, _ := NewLocalGovernanceStore(context.Background(), logger, nil, &configstore.GovernanceConfig{
+			testStore, err := NewLocalGovernanceStore(context.Background(), logger, nil, &configstore.GovernanceConfig{
 				VirtualKeys: []configstoreTables.TableVirtualKey{*testVK},
 				Budgets:     []configstoreTables.TableBudget{*testBudget},
 			})
+			require.NoError(t, err)
plugins/governance/in_memory_sync_test.go (1)

368-398: Potential double-delete on cleanup.

The VK is added to testData at line 369, then manually deleted at lines 391-398. When testData.Cleanup(t) runs via defer at line 348, it may attempt to delete the already-deleted VK, potentially causing test noise or failures.

Either remove from testData after manual deletion, or skip adding it since manual deletion is the test's purpose.

plugins/governance/advanced_scenarios_test.go (1)

1212-1214: Inconsistent cascade deletion for rate limits.

The test correctly asserts that budgets are cascade-deleted (lines 1204-1207), but rate limits are still allowed to be orphaned with just a log message. This inconsistency should be resolved - either assert cascade deletion for both or document why rate limits may be orphaned.

 	_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
 	_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
-	if vkRateLimitStillExists || pcRateLimitStillExists {
-		t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
-	}
+	if vkRateLimitStillExists || pcRateLimitStillExists {
+		t.Fatalf("Rate limits should be cascade-deleted: VK rate limit exists=%v, PC rate limit exists=%v",
+			vkRateLimitStillExists, pcRateLimitStillExists)
+	}

If orphaning is intentional for rate limits, add a clear comment explaining the design decision.

🧹 Nitpick comments (11)
plugins/governance/test_utils.go (2)

143-152: Avoid reseeding global math/rand per call; reduce collision risk for parallel tests.
Seeding inside generateRandomID() is unnecessary and makes uniqueness more time-dependent; with many parallel tests, a stronger/monotonic source is safer.

 import (
 	"bytes"
+	"crypto/rand"
 	"encoding/json"
+	"encoding/hex"
 	"fmt"
 	"io"
-	"math/rand"
 	"net/http"
 	"strings"
 	"testing"
 	"time"
 )
@@
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
-	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
-	b := make([]byte, 8)
-	for i := range b {
-		b[i] = letters[rand.Intn(len(letters))]
-	}
-	return string(b)
+	b := make([]byte, 4) // 8 hex chars
+	if _, err := rand.Read(b); err != nil {
+		// last-resort fallback: use the low-order (fastest-changing) hex digits of the timestamp
+		ts := fmt.Sprintf("%016x", time.Now().UnixNano())
+		return ts[len(ts)-8:]
+	}
+	return hex.EncodeToString(b)
 }

330-367: Cleanup should be “best effort” (avoid masking the real failure).
If the server is down/hung, MakeRequest currently t.Fatalfs during cleanup, which can obscure the original test failure. Consider a non-fatal cleanup path (e.g., a MakeRequestNoFail used only from Cleanup).
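A non-fatal cleanup path might be structured as follows. This is only a sketch: `cleanupAll`, the `del` callback, and `errUnreachable` are hypothetical, and in the suite `logf` would be t.Logf while `del` would wrap a request helper that returns an error instead of calling t.Fatalf.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnreachable is a placeholder error for the demo below.
var errUnreachable = errors.New("server unreachable")

// cleanupAll attempts every deletion, logs failures, and returns the IDs that
// could not be deleted. It never aborts, so the original test failure stays
// the first thing reported.
func cleanupAll(ids []string, del func(id string) error, logf func(format string, args ...interface{})) []string {
	var failed []string
	for _, id := range ids {
		if err := del(id); err != nil {
			logf("cleanup: failed to delete %s: %v", id, err)
			failed = append(failed, id)
		}
	}
	return failed
}

func main() {
	del := func(id string) error {
		if id == "vk-2" {
			return errUnreachable
		}
		return nil
	}
	logf := func(format string, args ...interface{}) {
		fmt.Printf(format+"\n", args...)
	}
	failed := cleanupAll([]string{"vk-1", "vk-2", "vk-3"}, del, logf)
	fmt.Println("left behind:", failed)
}
```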

plugins/governance/provider_budget_test.go (2)

45-54: Harden JSON shape parsing to fail cleanly (avoid panics on unexpected responses).
There are multiple unchecked type assertions (.(map[string]interface{}), .(string)) and ignored CalculateCost errors; a non-JSON error body or schema change will panic instead of failing with context.

Also applies to: 115-131, 205-221


9-13: t.Parallel() + shared localhost:8080 can make these tests flaky.
If other governance tests run in parallel against the same server/state, budget consumption and cleanup can interfere. Consider disabling parallelism for e2e-style tests or isolating per-test server/state.

plugins/governance/config_update_sync_test.go (1)

90-92: Avoid fixed time.Sleep(500ms); poll with timeout instead.
Hard sleeps are a common source of CI flakes; prefer a small polling loop with an overall deadline and a clear failure message.

Also applies to: 133-135, 183-184, 292-294, 454-455

plugins/governance/in_memory_sync_test.go (2)

87-87: Consider increasing sleep duration for async updates.

The 500ms sleep may be insufficient for async in-memory updates in a loaded test environment. Consider increasing to at least 1 second or implementing a retry loop to verify the update occurred.


36-64: Missing ok checks on type assertions.

Lines 36, 51-52, 55, 60-61 perform type assertions without checking the ok return value. If the response structure is unexpected, these will panic instead of providing useful error messages.

-	vk := createVKResp.Body["virtual_key"].(map[string]interface{})
-	vkValue := vk["value"].(string)
+	vk, ok := createVKResp.Body["virtual_key"].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Failed to extract virtual_key from response")
+	}
+	vkValue, ok := vk["value"].(string)
+	if !ok {
+		t.Fatalf("Failed to extract VK value")
+	}

Apply similar checks throughout the file where type assertions are used.

plugins/governance/usage_tracking_test.go (2)

393-393: String conversion only works for i < 10.

string(rune('0'+i)) produces incorrect characters when i >= 10. For the loop at line 384 with i from 0-2, this is safe, but the pattern is fragile if the loop bounds change.

-						Content: "Test request " + string(rune('0'+i)) + " for reset ticker test.",
+						Content: fmt.Sprintf("Test request %d for reset ticker test.", i),

This requires adding "fmt" to imports.


512-512: Fragile string conversion in loop.

Same issue as above - string(rune('0'+i)) will break for i >= 10. The loop runs from 0-1, so currently safe but fragile.

-						Content: "Persistence test request " + string(rune('0'+i)) + ".",
+						Content: fmt.Sprintf("Persistence test request %d.", i),
plugins/governance/advanced_scenarios_test.go (1)

92-145: Complex loop control with redundant flag.

The loop uses both requestNum <= 150 and a shouldStop flag that's set when consumedBudget >= teamBudget. The flag is checked after incrementing requestNum, so it only allows one further request before forcing an exit. The loop can instead rely on the API's budget rejection (or the 150‑request cap) to terminate.

-	shouldStop := false
-
 	for requestNum <= 150 {
 		resp := MakeRequest(t, APIRequest{...})
 
 		if resp.StatusCode >= 400 {
 			if CheckErrorMessage(t, resp, "budget") {
 				t.Logf("VK1 request %d rejected: team budget exceeded at $%.6f/$%.4f", requestNum, consumedBudget, teamBudget)
 				break
 			}
 			// ...
 		}
 		
 		// Extract cost...
 		requestNum++
 
-		if shouldStop {
-			break
-		}
-
 		if consumedBudget >= teamBudget {
-			shouldStop = true
+			// Make one more request to trigger rejection
+			continue
 		}
 	}

This pattern repeats in multiple tests. Consider simplifying.

plugins/governance/e2e_test.go (1)

299-299: Fragile string conversion for request numbering.

string(rune('0'+requestNum%10)) only produces correct digit characters for values 0-9. While the modulo 10 keeps it in range, this is unnecessarily complex.

-						Content: "Test hierarchy enforcement request " + string(rune('0'+requestNum%10)),
+						Content: fmt.Sprintf("Test hierarchy enforcement request %d", requestNum),

"fmt" is already imported at line 4, so no import change is needed.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc8095a and 06fbe4a.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go (1 hunks)
  • plugins/governance/changelog.md (1 hunks)
  • plugins/governance/config_update_sync_test.go (1 hunks)
  • plugins/governance/customer_budget_test.go (1 hunks)
  • plugins/governance/e2e_test.go (1 hunks)
  • plugins/governance/edge_cases_test.go (1 hunks)
  • plugins/governance/fixtures_test.go (1 hunks)
  • plugins/governance/go.mod (3 hunks)
  • plugins/governance/in_memory_sync_test.go (1 hunks)
  • plugins/governance/provider_budget_test.go (1 hunks)
  • plugins/governance/rate_limit_enforcement_test.go (1 hunks)
  • plugins/governance/rate_limit_test.go (1 hunks)
  • plugins/governance/resolver_test.go (1 hunks)
  • plugins/governance/store_test.go (1 hunks)
  • plugins/governance/team_budget_test.go (1 hunks)
  • plugins/governance/test_utils.go (1 hunks)
  • plugins/governance/tracker_test.go (1 hunks)
  • plugins/governance/usage_tracking_test.go (1 hunks)
  • plugins/governance/vk_budget_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • plugins/governance/changelog.md
  • plugins/governance/vk_budget_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/resolver_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/config_update_sync_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/store_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/go.mod
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/test_utils.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/config_update_sync_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/store_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/test_utils.go
🧬 Code graph analysis (8)
plugins/governance/config_update_sync_test.go (1)
plugins/governance/test_utils.go (13)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
  • CreateTeamRequest (183-187)
  • CreateCustomerRequest (190-193)
plugins/governance/provider_budget_test.go (1)
plugins/governance/test_utils.go (10)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ProviderConfigRequest (167-174)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/in_memory_sync_test.go (1)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
plugins/governance/store_test.go (5)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (66-85)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (705-712)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
ui/lib/types/governance.ts (4)
  • Customer (37-44)
  • Team (27-35)
  • Budget (5-11)
  • RateLimit (13-25)
plugins/governance/advanced_scenarios_test.go (4)
plugins/governance/test_utils.go (13)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CreateCustomerRequest (190-193)
  • UpdateBudgetRequest (196-199)
  • UpdateTeamRequest (221-224)
  • UpdateCustomerRequest (227-230)
  • ProviderConfigRequest (167-174)
  • CreateRateLimitRequest (202-207)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (15-15)
  • Content (975-983)
plugins/governance/e2e_test.go (5)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
  • CreateRateLimitRequest (202-207)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (15-15)
  • Content (975-983)
plugins/governance/usage_tracking_test.go (1)
plugins/governance/test_utils.go (8)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • BudgetRequest (177-180)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (5)
plugins/governance/go.mod (2)

7-11: testify v1.11.1 addition is appropriate and aligns with existing patterns.

All test files in plugins/governance already import and use testify (assert and require exclusively); the go.mod change formalizes this as an explicit dependency. No inconsistency exists.


38-38: Indirect deps are transitive from testify; keep them // indirect and let go mod tidy own them.

Lines 38 and 81 correctly mark go-spew and go-difflib as indirect dependencies. The go mod why output confirms these are pulled in via stretchr/testify/assert with clean dependency chains, indicating they came from a standard go mod tidy operation without manual pinning.

plugins/governance/store_test.go (1)

69-101: Concurrent read test looks solid for the intended lock-free guarantee.
Good use of WaitGroup + atomics to validate concurrent reads without false positives.

plugins/governance/e2e_test.go (2)

809-836: Goroutine logging may cause test noise.

While the past issue about t.Fatalf in goroutines was addressed, the current implementation doesn't report failures from goroutines. If MakeRequest encounters errors in the request setup (not just non-200 responses), those will call t.Fatalf from within a goroutine, which only exits that goroutine and may leave the test hanging.

Consider either:

  1. Making the test more resilient to partial failures (current approach is acceptable for e2e tests)
  2. Documenting that this test verifies "no crashes under concurrent load" rather than "all requests succeed"

Current implementation appears acceptable for its purpose.


1356-1363: Model detection logic is fragile but acceptable for testing.

The test checks for "openai" or "gpt-4o" in the model string to detect which provider was used. While this is fragile and relies on string matching, it's acceptable for an e2e test that verifies weighted distribution behavior.

For production code or more robust testing, consider using explicit provider information from the response metadata (like ExtraFields.Provider if available), but for this e2e scenario the current approach suffices.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 10

♻️ Duplicate comments (16)
plugins/governance/resolver_test.go (1)

331-345: Direct manipulation of internal store state bypasses encapsulation.

This code directly accesses store.budgets.Store() to modify budget usage, coupling the test to implementation details. A test helper method would be cleaner.

This was flagged in a previous review. Consider adding a test-only helper like SetBudgetCurrentUsageForTest(budgetID string, usage float64) to the LocalGovernanceStore to avoid directly accessing internal fields.
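
A rough sketch of what such a helper could look like. The `store` and `budget` types below are simplified, hypothetical stand-ins; the real LocalGovernanceStore fields and budget type are not shown in this review, so the shape is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// budget is a hypothetical, trimmed-down budget record.
type budget struct {
	ID           string
	CurrentUsage float64
}

// store is a hypothetical stand-in for a store that keeps budgets in a sync.Map.
type store struct {
	budgets sync.Map // budget ID -> *budget
}

// SetBudgetCurrentUsageForTest overwrites the usage of an existing budget.
// It stores a copy so readers holding the old pointer never see a partial update.
func (s *store) SetBudgetCurrentUsageForTest(id string, usage float64) error {
	v, ok := s.budgets.Load(id)
	if !ok {
		return fmt.Errorf("budget %q not found", id)
	}
	b, ok := v.(*budget)
	if !ok {
		return fmt.Errorf("budget %q has unexpected type %T", id, v)
	}
	updated := *b
	updated.CurrentUsage = usage
	s.budgets.Store(id, &updated)
	return nil
}

// usage returns the current usage for a budget, if it exists.
func (s *store) usage(id string) (float64, bool) {
	v, ok := s.budgets.Load(id)
	if !ok {
		return 0, false
	}
	b, ok := v.(*budget)
	if !ok {
		return 0, false
	}
	return b.CurrentUsage, true
}

func main() {
	s := &store{}
	s.budgets.Store("vk-budget", &budget{ID: "vk-budget"})
	if err := s.SetBudgetCurrentUsageForTest("vk-budget", 100.0); err != nil {
		fmt.Println("error:", err)
		return
	}
	u, _ := s.usage("vk-budget")
	fmt.Println("usage:", u)
}
```

Keeping the helper on the store means the test no longer needs to know that budgets live in a sync.Map.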

plugins/governance/team_budget_test.go (1)

70-156: Remove shouldStop flag - same issue as flagged in previous review.

The shouldStop logic causes the loop to exit before sending the over-budget request that would verify enforcement. This was flagged in a previous review on this file. Remove the shouldStop variable and related checks (lines 70, 149-155) to let the API enforcement terminate the loop.

plugins/governance/provider_budget_test.go (2)

63-141: Remove shouldStop flag in OpenAI subtest - same issue as flagged in previous review.

The shouldStop logic (lines 63, 134-140) causes early exit before the budget-exceeded request is sent. This was flagged in a previous review. Remove the shouldStop variable and related checks.


153-231: Same shouldStop issue in Anthropic subtest.

Apply the same fix to remove shouldStop logic at lines 153, 224-230 as flagged in the previous review.

plugins/governance/edge_cases_test.go (1)

59-79: VK budget ($0.01) is more restrictive than provider budget ($0.08) - test expectation incorrect.

This was flagged in a previous review. The test comment on line 97 states "Provider budget should be the limiting factor (most restrictive)", but the VK budget ($0.01) is actually smaller than the provider budget ($0.08). The VK budget would be exhausted first.

Either raise the VK budget so that the provider budget is genuinely the most restrictive:

-	vkBudget := 0.01
+	vkBudget := 0.15

Or make the provider budget smaller:

 					Budget: &BudgetRequest{
-						MaxLimit:      0.08, // Even tighter provider budget
+						MaxLimit:      0.005, // Most restrictive - provider budget

plugins/governance/usage_tracking_test.go (2)

250-340: “In-memory usage update” test still doesn’t assert anything meaningful.
This can pass even if no in-memory counter changes (it logs “optional”). Consider creating the VK with a known rate limit (or budget) and asserting the corresponding token_current_usage/request_current_usage (or current_usage) increases after the request (use bounded polling to wait for async update).


383-405: Replace fragile string(rune('0'+i)) request numbering.
Use fmt.Sprintf("... %d ...", i+1) (and add fmt import).

Also applies to: 501-525

plugins/governance/store_test.go (1)

144-162: Don’t ignore NewLocalGovernanceStore errors in the table test.
Capture and require.NoError(t, err) before using testStore.

plugins/governance/in_memory_sync_test.go (1)

345-419: Avoid double-delete of VK during cleanup.
The test adds vkID to testData and then deletes the VK; testData.Cleanup(t) may delete again. Either don’t add it, or remove it from the cleanup list, or make cleanup ignore 404/not-found.

plugins/governance/rate_limit_enforcement_test.go (2)

431-517: Replace fragile string(rune('0'+i)) request numbering.
Use fmt.Sprintf/strconv.Itoa.


520-612: Assert in-memory usage counters increased (not just rate_limit_id).
Fetch data["rate_limits"][rateLimitID] and assert token_current_usage >= tokensUsed (or at least > 0).

plugins/governance/rate_limit_test.go (1)

45-59: Replace fragile string(rune('0'+i)) request numbering.
Prefer fmt.Sprintf or strconv.Itoa.

Also applies to: 118-133, 197-212, 276-291, 633-647

plugins/governance/advanced_scenarios_test.go (1)

1209-1214: Make cascade-deletion expectations consistent for rate limits (don’t just log).
If rate limits are expected to be deleted, assert they’re gone; if orphaning is expected, document that explicitly and adjust the test name/expectation.

plugins/governance/config_update_sync_test.go (2)

48-60: Avoid chained type assertions; fail cleanly with helpers.
Add a small extractor helper (map navigation + typed reads) so these tests don’t panic when the response shape changes or async state hasn’t landed yet.

Also applies to: 93-106, 136-148, 190-199, 251-265, 409-424, 456-473, 506-523, 598-613, 642-653, 681-693, 785-806, 829-841, 869-880, 943-966, 989-1004, 1034-1046
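
One possible shape for such an extractor, shown as a standalone sketch. `getPath`, `getString`, and `getFloat` are hypothetical helper names; the `virtual_key`, `value`, and `current_usage` keys are taken from the snippets quoted in this review, and the nested `budget` key is an assumption made for the demo.

```go
package main

import "fmt"

// getPath walks nested JSON objects (as decoded by encoding/json) and reports
// whether every key along the path exists. It never panics on a shape mismatch.
func getPath(body map[string]interface{}, path ...string) (interface{}, bool) {
	var cur interface{} = body
	for _, key := range path {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		next, ok := obj[key]
		if !ok {
			return nil, false
		}
		cur = next
	}
	return cur, true
}

// getString reads a string at the given path.
func getString(body map[string]interface{}, path ...string) (string, bool) {
	v, ok := getPath(body, path...)
	if !ok {
		return "", false
	}
	s, ok := v.(string)
	return s, ok
}

// getFloat reads a JSON number at the given path.
func getFloat(body map[string]interface{}, path ...string) (float64, bool) {
	v, ok := getPath(body, path...)
	if !ok {
		return 0, false
	}
	f, ok := v.(float64)
	return f, ok
}

func main() {
	body := map[string]interface{}{
		"virtual_key": map[string]interface{}{
			"value":  "sk-example",
			"budget": map[string]interface{}{"current_usage": 0.25},
		},
	}
	v, ok := getString(body, "virtual_key", "value")
	fmt.Println(v, ok)
	u, ok := getFloat(body, "virtual_key", "budget", "current_usage")
	fmt.Println(u, ok)
	_, ok = getString(body, "virtual_key", "missing")
	fmt.Println(ok)
}
```

A test can then turn a false result into a single `t.Fatalf` with the full response body, rather than panicking somewhere in a chain of assertions.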


166-213: “Higher limit shouldn’t reset usage” isn’t asserted.
Right now it mostly logs and the state is already reset to 0. Make usage non-zero before raising the limit, then assert usage is preserved (or update the comment/branch to match what you’re actually testing).

plugins/governance/test_utils.go (1)

91-92: Add HTTP client timeout and t.Helper() to MakeRequest.

The previous review flagged that the bare http.Client has no timeout, which can cause tests to hang indefinitely if the server is unresponsive. This issue appears unaddressed. Also missing t.Helper() for proper test failure attribution.

 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
+	t.Helper()
-	client := &http.Client{}
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}
 	url := fmt.Sprintf("http://localhost:8080%s", req.Path)
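
To show the effect of the timeout in isolation, here is a small sketch against a deliberately slow local server. `newSlowServer` and `getWithTimeout` are hypothetical helpers for the demo and are not part of test_utils.go.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// newSlowServer starts a local server that waits delayMs before answering.
func newSlowServer(delayMs int) *httptest.Server {
	delay := time.Duration(delayMs) * time.Millisecond
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(delay):
			w.WriteHeader(http.StatusOK)
		case <-r.Context().Done():
			// The client gave up; there is nothing left to write.
		}
	}))
}

// getWithTimeout issues a GET with a client-level timeout, so a stalled
// server produces an error instead of blocking forever.
func getWithTimeout(url string, timeoutMs int) (int, error) {
	client := &http.Client{Timeout: time.Duration(timeoutMs) * time.Millisecond}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	srv := newSlowServer(500)
	defer srv.Close()

	_, err := getWithTimeout(srv.URL, 50)
	fmt.Println("short timeout returned an error:", err != nil)

	status, err := getWithTimeout(srv.URL, 5000)
	fmt.Println("long timeout:", status, err)
}
```

Without the `Timeout` field the first call would simply block for as long as the server takes, which is the hang described above.
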
🧹 Nitpick comments (9)
plugins/governance/usage_tracking_test.go (1)

9-12: These integration tests + long sleeps shouldn’t be t.Parallel() by default.
Running many 35s-sleep tests in parallel against a shared server/db is likely to be flaky/slow. Consider gating with an env var (or build tag) and/or removing t.Parallel() for the long-sleep ticker tests.

Also applies to: 113-116, 223-226, 314-316, 428-431

plugins/governance/fixtures_test.go (1)

14-67: MockLogger should format args (currently drops them).
Store fmt.Sprintf(format, args...) (and consider keeping level/output fields if interface expects them).
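
A minimal sketch of a logger that records the formatted message. `recordingLogger` and its single `Info` method are hypothetical; the real MockLogger has to satisfy whatever logger interface the plugin expects, which is not reproduced here.

```go
package main

import (
	"fmt"
	"sync"
)

// recordingLogger is a hypothetical mock that keeps every formatted message
// so tests can assert on what was actually logged.
type recordingLogger struct {
	mu       sync.Mutex
	messages []string
}

// Info formats the message before storing it. With no args the format string
// is stored verbatim so that it is not run through the formatter needlessly.
func (l *recordingLogger) Info(format string, args ...any) {
	msg := format
	if len(args) > 0 {
		msg = fmt.Sprintf(format, args...)
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	l.messages = append(l.messages, msg)
}

// Messages returns a copy of the recorded messages.
func (l *recordingLogger) Messages() []string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return append([]string(nil), l.messages...)
}

func main() {
	l := &recordingLogger{}
	l.Info("budget %s at %.2f", "vk-budget", 0.5)
	l.Info("sync complete")
	for _, m := range l.Messages() {
		fmt.Println(m)
	}
}
```

The mutex matters because the governance tests log from several goroutines.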

plugins/governance/in_memory_sync_test.go (1)

9-127: Reduce flakiness: prefer bounded polling over fixed sleeps for async in-memory sync.
Instead of time.Sleep(500ms/2s), poll /api/governance/data until the expected condition (or timeout) to avoid timing flakes.

Also applies to: 129-234, 237-343, 421-527
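
A bounded polling helper could look roughly like the sketch below. `waitFor` is a hypothetical name; in the real tests the condition would fetch /api/governance/data and inspect the returned state.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// waitFor calls cond every interval until it returns true or the timeout
// elapses. It returns whether the condition was met in time.
func waitFor(timeout, interval time.Duration, cond func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if !time.Now().Before(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulate an asynchronous in-memory sync that lands after 30ms.
	var synced int32
	go func() {
		time.Sleep(30 * time.Millisecond)
		atomic.StoreInt32(&synced, 1)
	}()

	ok := waitFor(2*time.Second, 5*time.Millisecond, func() bool {
		return atomic.LoadInt32(&synced) == 1
	})
	fmt.Println("synced in time:", ok)
}
```

Compared with a fixed sleep, the test returns as soon as the state is visible and only pays the full timeout when something is actually wrong.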

plugins/governance/test_utils.go (2)

143-152: Remove deprecated rand.Seed call.

rand.Seed is deprecated since Go 1.20; the global math/rand source is now seeded automatically. Reseeding on every call is also counterproductive in parallel tests: two calls that observe the same time.Now().UnixNano() value reseed the generator to the same state and can produce identical IDs.

 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }

247-268: Consider adding t.Helper() to test helper functions.

ExtractIDFromResponse and CheckErrorMessage call t.Fatalf but lack t.Helper(), causing failure line numbers to point to the helper instead of the calling test.

 func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+	t.Helper()
 	if resp.StatusCode >= 400 {

 func CheckErrorMessage(t *testing.T, resp *APIResponse, expectedText string) bool {
+	t.Helper()
 	if resp.StatusCode < 400 {

plugins/governance/e2e_test.go (4)

67-68: Unchecked type assertions may cause test panics.

If the response structure differs from expected, these assertions will panic instead of providing a clear test failure message.

-	vk1 := createVK1Resp.Body["virtual_key"].(map[string]interface{})
-	vk1Value := vk1["value"].(string)
+	vk1Data, ok := createVK1Resp.Body["virtual_key"].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Expected virtual_key in response, got: %v", createVK1Resp.Body)
+	}
+	vk1Value, ok := vk1Data["value"].(string)
+	if !ok {
+		t.Fatalf("Expected value in virtual_key, got: %v", vk1Data)
+	}

This pattern repeats at lines 88-89, 274-275, 387-388, 527-528, 676-677, 792-793, 889-890, 1078-1079, 1180-1181, 1300-1301, 1430-1431. Consider extracting a helper like ExtractVKValue(t, resp) string to reduce duplication.


297-302: Minor: Prefer fmt.Sprintf over rune arithmetic for readability.

The expression string(rune('0'+requestNum%10)) works but is less readable than a direct format string.

-				Content: "Test hierarchy enforcement request " + string(rune('0'+requestNum%10)),
+				Content: fmt.Sprintf("Test hierarchy enforcement request %d", requestNum),

1232-1255: Consider asserting that deleted VK requests are rejected.

The test logs both success and rejection cases without asserting expected behavior. If the expected behavior is rejection, the assertion at line 1248 should fail the test on success.

 	if resp.StatusCode < 400 {
-		t.Logf("Deleted VK still accepts requests (status=%d) - may be cached in SDK", resp.StatusCode)
+		t.Errorf("Deleted VK should reject requests but got status %d", resp.StatusCode)
 	} else {
 		t.Logf("Deleted VK request rejected (status=%d) ✓", resp.StatusCode)
 	}

1353-1363: Provider detection logic remains fragile.

The detection at line 1356 counts any model containing "gpt-4o" as OpenAI, but Azure also returns "gpt-4o" in the model field. This conflates the providers. Consider using ExtraFields.Provider from the response if available, or parsing the model prefix more precisely.

 		if model, ok := resp.Body["model"].(string); ok {
-			if strings.Contains(model, "openai") || strings.Contains(model, "gpt-4o") {
+			if strings.HasPrefix(model, "openai/") {
 				openaiCount++
 				t.Logf("Request %d routed to OpenAI (model: %s)", i+1, model)
-			} else if strings.Contains(model, "azure") {
+			} else if strings.HasPrefix(model, "azure/") {
 				azureCount++
 				t.Logf("Request %d routed to Azure (model: %s)", i+1, model)
+			} else {
+				t.Logf("Request %d routed to unknown provider (model: %s)", i+1, model)
 			}
 		}

Alternatively, check if the response includes provider information in a separate field.
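
The prefix-based classification can be pulled into a small pure function, sketched below. `providerOf` is a hypothetical helper, and the "provider/model" format of the model field is an assumption carried over from the suggested diff rather than something confirmed by the API.

```go
package main

import (
	"fmt"
	"strings"
)

// providerOf classifies a model string by its provider prefix. A bare model
// name such as "gpt-4o" is reported as "unknown" rather than guessed, since
// both OpenAI and Azure can serve it.
func providerOf(model string) string {
	switch {
	case strings.HasPrefix(model, "openai/"):
		return "openai"
	case strings.HasPrefix(model, "azure/"):
		return "azure"
	default:
		return "unknown"
	}
}

func main() {
	for _, m := range []string{"openai/gpt-4o", "azure/gpt-4o", "gpt-4o"} {
		fmt.Printf("%s -> %s\n", m, providerOf(m))
	}
}
```

Counting the "unknown" bucket separately makes it obvious when the response format does not match the assumption, instead of skewing the OpenAI count.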

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc8095a and 06fbe4a.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go (1 hunks)
  • plugins/governance/changelog.md (1 hunks)
  • plugins/governance/config_update_sync_test.go (1 hunks)
  • plugins/governance/customer_budget_test.go (1 hunks)
  • plugins/governance/e2e_test.go (1 hunks)
  • plugins/governance/edge_cases_test.go (1 hunks)
  • plugins/governance/fixtures_test.go (1 hunks)
  • plugins/governance/go.mod (3 hunks)
  • plugins/governance/in_memory_sync_test.go (1 hunks)
  • plugins/governance/provider_budget_test.go (1 hunks)
  • plugins/governance/rate_limit_enforcement_test.go (1 hunks)
  • plugins/governance/rate_limit_test.go (1 hunks)
  • plugins/governance/resolver_test.go (1 hunks)
  • plugins/governance/store_test.go (1 hunks)
  • plugins/governance/team_budget_test.go (1 hunks)
  • plugins/governance/test_utils.go (1 hunks)
  • plugins/governance/tracker_test.go (1 hunks)
  • plugins/governance/usage_tracking_test.go (1 hunks)
  • plugins/governance/vk_budget_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/governance/changelog.md
  • plugins/governance/tracker_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/provider_budget_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/go.mod
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/e2e_test.go
  • plugins/governance/usage_tracking_test.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/provider_budget_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/e2e_test.go
  • plugins/governance/usage_tracking_test.go
🧬 Code graph analysis (9)
plugins/governance/resolver_test.go (6)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (705-712)
plugins/governance/resolver.go (11)
  • NewBudgetResolver (70-75)
  • DecisionAllow (18-18)
  • DecisionVirtualKeyNotFound (19-19)
  • DecisionVirtualKeyBlocked (20-20)
  • DecisionProviderBlocked (26-26)
  • DecisionModelBlocked (25-25)
  • DecisionTokenLimited (23-23)
  • DecisionRequestLimited (24-24)
  • DecisionBudgetExceeded (22-22)
  • DecisionRateLimited (21-21)
  • Decision (15-15)
core/schemas/context.go (1)
  • BifrostContext (30-40)
ui/lib/types/governance.ts (5)
  • VirtualKey (61-80)
  • RateLimit (13-25)
  • Budget (5-11)
  • Customer (37-44)
  • Team (27-35)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostContextKey (101-101)
core/schemas/models.go (1)
  • Model (109-129)
plugins/governance/edge_cases_test.go (1)
plugins/governance/test_utils.go (13)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateTeamRequest (183-187)
  • CreateVirtualKeyRequest (155-164)
  • ProviderConfigRequest (167-174)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/team_budget_test.go (3)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (15-15)
  • Content (975-983)
plugins/governance/rate_limit_enforcement_test.go (1)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
plugins/governance/fixtures_test.go (3)
core/schemas/logger.go (2)
  • LogLevel (6-6)
  • LoggerOutputType (17-17)
ui/lib/types/governance.ts (3)
  • Budget (5-11)
  • RateLimit (13-25)
  • VirtualKey (61-80)
plugins/governance/resolver.go (1)
  • Decision (15-15)
plugins/governance/in_memory_sync_test.go (2)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • UpdateCustomerRequest (227-230)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
plugins/governance/rate_limit_test.go (3)
plugins/governance/test_utils.go (10)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (15-15)
  • Content (975-983)
plugins/governance/config_update_sync_test.go (2)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • CreateVirtualKeyRequest (155-164)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
plugins/governance/e2e_test.go (2)
plugins/governance/test_utils.go (4)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • ExtractIDFromResponse (248-268)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (10)
plugins/governance/resolver_test.go (1)

1-550: Comprehensive resolver test coverage.

The test suite provides thorough coverage of the BudgetResolver.EvaluateRequest workflow including:

  • Happy path and various denial scenarios (VK not found, blocked, provider/model filtering)
  • Rate limit enforcement (token, request, and expiration/reset)
  • Budget enforcement (single-level and multi-level hierarchy)
  • Context population verification

The use of builder helpers (buildVirtualKey, buildBudgetWithUsage, etc.) keeps tests readable and maintainable.

plugins/governance/go.mod (1)

8-10: Remove reference to non-existent base branch.

The base branch 12-08-feat_governance_plugin_refactor does not exist in this repository. The review comment cannot be verified as written since it assumes this PR targets that branch as part of a stack. Confirm the actual target branch for this PR before reviewing the version changes.

The addition of testify v1.11.1 for test assertions is appropriate.

Likely an incorrect or invalid review comment.

plugins/governance/test_utils.go (1)

330-367: LGTM! Cleanup handles deletion order and 404 tolerance correctly.

The cleanup deletes VKs first, then teams, then customers—matching the dependency hierarchy. The 404 tolerance prevents failures when resources are already deleted by cascade.
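
The 404 tolerance can be sketched in isolation as follows. `newFakeAPI` and `deleteIgnoringNotFound` are hypothetical names and the /items/ path is invented for the demo; the actual Cleanup implementation is not reproduced in this review.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// newFakeAPI starts a local server where DELETE /items/{id} succeeds once
// and returns 404 on every later attempt.
func newFakeAPI(ids ...string) *httptest.Server {
	var mu sync.Mutex
	live := make(map[string]bool)
	for _, id := range ids {
		live["/items/"+id] = true
	}
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodDelete {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		mu.Lock()
		defer mu.Unlock()
		if !live[r.URL.Path] {
			w.WriteHeader(http.StatusNotFound)
			return
		}
		delete(live, r.URL.Path)
		w.WriteHeader(http.StatusNoContent)
	}))
}

// deleteIgnoringNotFound treats "already gone" as success, so a resource
// removed earlier (explicitly or by cascade) does not fail cleanup.
func deleteIgnoringNotFound(client *http.Client, url string) error {
	req, err := http.NewRequest(http.MethodDelete, url, nil)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusNotFound {
		return nil
	}
	if resp.StatusCode >= 200 && resp.StatusCode < 300 {
		return nil
	}
	return fmt.Errorf("DELETE %s: unexpected status %d", url, resp.StatusCode)
}

func main() {
	srv := newFakeAPI("vk-1")
	defer srv.Close()
	fmt.Println(deleteIgnoringNotFound(srv.Client(), srv.URL+"/items/vk-1"))
	fmt.Println(deleteIgnoringNotFound(srv.Client(), srv.URL+"/items/vk-1"))
}
```

Both calls return nil: the first deletes the resource and the second sees a 404 that is deliberately ignored.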

plugins/governance/e2e_test.go (7)

360-495: LGTM! Test correctly verifies failed requests don't consume budget.

The test properly uses the in-memory data endpoint to verify budget state before and after failed/successful requests, with appropriate tolerance for floating-point comparison.
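
For reference, a tolerance-based comparison of the kind described here can be as small as the sketch below. `approxEqual` and the 1e-9 tolerance are illustrative choices, not values taken from the test.

```go
package main

import (
	"fmt"
	"math"
)

// approxEqual reports whether a and b differ by at most tol.
func approxEqual(a, b, tol float64) bool {
	return math.Abs(a-b) <= tol
}

func main() {
	before := 0.1
	cost := 0.2
	after := before + cost // accumulated in floating point, like a running budget

	fmt.Println("exact match:", after == 0.3)
	fmt.Println("within tolerance:", approxEqual(after, 0.3, 1e-9))
}
```

Budget usage is accumulated from many small costs, so comparing with a tolerance avoids failures caused purely by rounding.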


503-640: LGTM! Test covers the full VK activation lifecycle.

The test properly verifies active → inactive → active state transitions with appropriate wait times for in-memory synchronization.


647-755: LGTM! Rate limit boundary test is well-structured.

The test correctly validates rate limit enforcement and reset timing. The 16-second wait ensures the reset occurs.


809-836: MakeRequest can still call t.Fatalf from within goroutines.

While the test no longer calls t.Fatalf directly in the goroutine, MakeRequest uses t.Fatalf for connection failures (line 118 in test_utils.go). If the server becomes unavailable during the test, t.Fatalf from a goroutine will only terminate that goroutine, potentially causing the test to hang or behave unexpectedly.

Consider using a version of MakeRequest that returns an error for concurrent usage, or handle the potential hang with a test timeout.

// Option: Add test timeout
func TestConcurrentRequestsToSameVK(t *testing.T) {
	t.Parallel()
	// Add a deadline to prevent a hang if MakeRequest fails in a goroutine.
	// The deadline only helps if ctx is passed to the request helper or
	// selected on while waiting for the goroutines.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
	defer cancel()
	_ = ctx // ... rest of test
}
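
The other option, an error-returning request function for concurrent use, could be sketched like this. `doRequest` and `runConcurrent` are hypothetical; `doRequest` stands in for a variant of MakeRequest that returns an error instead of calling t.Fatalf.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errUnavailable = errors.New("server unavailable")

// doRequest is a hypothetical request function that reports failure through
// its return value. Here every fourth request fails, for demonstration.
func doRequest(i int) error {
	if i%4 == 3 {
		return fmt.Errorf("request %d: %w", i, errUnavailable)
	}
	return nil
}

// runConcurrent runs n requests in parallel and collects their errors, so the
// caller (the test goroutine) is the only place that decides to fail.
func runConcurrent(n int, do func(int) error) []error {
	var wg sync.WaitGroup
	errCh := make(chan error, n) // buffered so no goroutine blocks on send
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if err := do(i); err != nil {
				errCh <- err
			}
		}(i)
	}
	wg.Wait()
	close(errCh)

	var errs []error
	for err := range errCh {
		errs = append(errs, err)
	}
	return errs
}

func main() {
	errs := runConcurrent(8, doRequest)
	fmt.Println("failed requests:", len(errs))
}
```

In a test, the returned slice would be checked after `wg.Wait()` on the test goroutine, where calling `t.Fatalf` is safe.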

860-1025: LGTM! Budget reset test properly handles lazy reset mechanism.

The test correctly documents and validates the lazy budget reset behavior, including making a request to trigger the reset and verifying the usage drops appropriately.
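
As an illustration of what "lazy reset" means here, the following sketch resets usage only when a request arrives after the period has elapsed. `lazyBudget` and its fields are hypothetical and use plain Unix seconds; the plugin's real reset logic is not shown in this review and may differ.

```go
package main

import "fmt"

// lazyBudget is a hypothetical budget that is never reset by a background
// timer. The reset happens on the first request after the period has elapsed.
type lazyBudget struct {
	limit     float64
	usage     float64
	periodSec int64
	lastReset int64 // Unix seconds of the last reset
}

func newLazyBudget(limit float64, periodSec, startSec int64) *lazyBudget {
	return &lazyBudget{limit: limit, periodSec: periodSec, lastReset: startSec}
}

// spend applies a lazy reset if due, then accepts the request only while
// usage is still below the limit.
func (b *lazyBudget) spend(nowSec int64, cost float64) bool {
	if nowSec-b.lastReset >= b.periodSec {
		b.usage = 0
		b.lastReset = nowSec
	}
	if b.usage >= b.limit {
		return false
	}
	b.usage += cost
	return true
}

func main() {
	b := newLazyBudget(1.0, 10, 0)
	fmt.Println(b.spend(1, 1.0))  // accepted, budget now exhausted
	fmt.Println(b.spend(2, 0.1))  // rejected, still inside the period
	fmt.Println(b.spend(11, 0.1)) // accepted, this request triggered the reset
	fmt.Println(b.usage)
}
```

This is why the test has to send a request after the reset duration: until one arrives, the stored usage still shows the old value.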


1120-1139: Test doesn't assert specific expected behavior after team deletion.

The comment acknowledges uncertainty about the expected behavior ("VK may or may not work depending on implementation"). Consider documenting the intended behavior and asserting it to prevent regressions.

What is the expected behavior when a VK's parent team is deleted? Should the VK:

  1. Be deleted as part of cascade?
  2. Continue working without team budget constraints?
  3. Be blocked until reassigned?

Once clarified, update the test to assert the expected behavior.


1388-1487: LGTM! Fallback mechanism test is well-designed.

The test correctly verifies that requests for a model not supported by the primary provider (Anthropic) fall back to a secondary provider (OpenAI) that supports it.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-09-chore_governance_tests_added branch from 06fbe4a to be3d417 Compare December 14, 2025 19:19
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-08-feat_governance_plugin_refactor branch 2 times, most recently from 22e0c14 to 3544b92 Compare December 14, 2025 19:26
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-09-chore_governance_tests_added branch from be3d417 to 4325d29 Compare December 14, 2025 19:26

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (25)
plugins/governance/resolver_test.go (1)

331-345: Encapsulation bypass when setting budget usage for test fixture.

This directly manipulates the internal store.budgets sync.Map, coupling the test to implementation details. Consider adding a test helper method to LocalGovernanceStore for setting budget usage in tests.

-	// Get the governance data to update the budget directly
-	governanceData := store.GetGovernanceData()
-	vkBudgetToUpdate := governanceData.Budgets["vk-budget"]
-	if vkBudgetToUpdate != nil {
-		vkBudgetToUpdate.CurrentUsage = 100.0
-		store.budgets.Store("vk-budget", vkBudgetToUpdate)
-	}
+	// Use a test helper to set budget usage for test fixture
+	err = store.SetBudgetCurrentUsageForTest("vk-budget", 100.0)
+	require.NoError(t, err)
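A minimal sketch of what such a test helper could look like. The `Budget` and `LocalGovernanceStore` types here are simplified stand-ins, not the plugin's real definitions, and the method name simply follows the one proposed in the diff above; the actual store may need different locking or field names.

```go
package main

import (
	"fmt"
	"sync"
)

// Budget is a simplified stand-in for the plugin's budget record (assumed shape).
type Budget struct {
	ID           string
	CurrentUsage float64
}

// LocalGovernanceStore is a simplified stand-in that keeps budgets in a sync.Map.
type LocalGovernanceStore struct {
	budgets sync.Map // budget ID -> *Budget
}

// SetBudgetCurrentUsageForTest overwrites the usage of an existing budget.
// It stores an updated copy so callers holding the old pointer are not mutated.
func (s *LocalGovernanceStore) SetBudgetCurrentUsageForTest(id string, usage float64) error {
	v, ok := s.budgets.Load(id)
	if !ok {
		return fmt.Errorf("budget %q not found", id)
	}
	b, ok := v.(*Budget)
	if !ok || b == nil {
		return fmt.Errorf("budget %q has unexpected type %T", id, v)
	}
	updated := *b
	updated.CurrentUsage = usage
	s.budgets.Store(id, &updated)
	return nil
}

func main() {
	store := &LocalGovernanceStore{}
	store.budgets.Store("vk-budget", &Budget{ID: "vk-budget"})
	if err := store.SetBudgetCurrentUsageForTest("vk-budget", 100.0); err != nil {
		panic(err)
	}
	v, _ := store.budgets.Load("vk-budget")
	fmt.Println(v.(*Budget).CurrentUsage)
}
```

Keeping the helper on the store means the test no longer needs to know that budgets live in a sync.Map, so the storage can change without touching the test.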
plugins/governance/team_budget_test.go (1)

66-156: Remove shouldStop flag to ensure budget enforcement is actually tested.

The shouldStop logic causes the loop to exit before sending a request that would trigger server-side budget rejection. When consumedBudget >= teamBudget, shouldStop is set, and on the next iteration the loop breaks—but the over-budget request is never sent to verify enforcement.

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false
 	vkIndex := 0

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
 		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= teamBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/vk_budget_test.go (1)

46-127: Remove shouldStop flag to ensure VK budget enforcement is tested.

Same issue as other budget tests: the shouldStop logic (lines 46, 120-126) causes the loop to exit before sending the request that would trigger server-side budget rejection. Let the API's enforcement (or the 50-request cap) terminate the loop.

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= vkBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/customer_budget_test.go (2)

70-156: Remove shouldStop flag in TestCustomerBudgetExceededWithMultipleVKs.

Same issue as other budget tests: the shouldStop logic causes the loop to exit before sending the request that would trigger server-side budget rejection.

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false
 	vkIndex := 0

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
 		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= customerBudget {
-			shouldStop = true
-		}
 	}

245-331: Same shouldStop issue in TestCustomerBudgetExceededWithMultipleTeams.

Apply the same fix to remove the shouldStop logic at lines 245, 324-330.

plugins/governance/provider_budget_test.go (2)

63-140: Remove shouldStop flag in OpenAI subtest.

Same loop termination issue as other budget tests. The shouldStop flag causes early exit before the budget-exceeded request is sent.

 		providerBudget := 0.01
 		consumedBudget := 0.0
 		requestNum := 1
 		var lastSuccessfulCost float64
-		var shouldStop = false

 		for requestNum <= 50 {
 			// ... request logic ...

 			requestNum++
-
-			if shouldStop {
-				break
-			}
-
-			if consumedBudget >= providerBudget {
-				shouldStop = true
-			}
 		}

153-230: Same shouldStop issue in Anthropic subtest.

Apply the same fix to remove the shouldStop logic at lines 153, 224-230.

plugins/governance/usage_tracking_test.go (3)

392-394: Fragile string conversion for request numbering.

string(rune('0'+i)) only produces correct digit characters for i in range 0-9. While the current loop bounds (i < 3) are safe, this is a fragile pattern.

-					Content: "Test request " + string(rune('0'+i)) + " for reset ticker test.",
+					Content: fmt.Sprintf("Test request %d for reset ticker test.", i),

Note: Requires adding "fmt" to imports.
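
To illustrate why the rune arithmetic is fragile, here is a small standalone comparison. The helper names `runeLabel` and `sprintfLabel` are made up for this sketch and do not exist in the test file.

```go
package main

import (
	"fmt"
	"strconv"
)

// runeLabel reproduces the flagged pattern: it offsets the code point of '0'.
func runeLabel(i int) string {
	return string(rune('0' + i))
}

// sprintfLabel formats the integer as decimal text, whatever its size.
func sprintfLabel(i int) string {
	return fmt.Sprintf("%d", i)
}

func main() {
	for _, i := range []int{3, 10, 12} {
		fmt.Printf("i=%d rune=%q sprintf=%q itoa=%q\n",
			i, runeLabel(i), sprintfLabel(i), strconv.Itoa(i))
	}
}
```

For i = 10 the rune version yields ":" (the character after '9' in ASCII) rather than "10", which is why the pattern only holds for single digits.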


8-129: Test lacks actual assertion for rate-limit reset.

TestUsageTrackingRateLimitReset waits 35 seconds but only logs "Rate limit reset should have occurred" (Line 128) without verifying that token_current_usage actually reset to 0. The test passes regardless of whether the reset happened.

Add an assertion that verifies the reset:

// After line 125, add:
data3 := getDataResp3.Body["data"].(map[string]interface{})
rateLimitsMap3 := data3["rate_limits"].(map[string]interface{})
rateLimit3 := rateLimitsMap3[rateLimitID].(map[string]interface{})
tokenUsageAfterReset, _ := rateLimit3["token_current_usage"].(float64)

if tokenUsageAfterReset > 0 {
    t.Fatalf("Rate limit not reset: token_current_usage is %.0f (expected 0)", tokenUsageAfterReset)
}
t.Logf("Rate limit reset verified: token_current_usage=%.0f ✓", tokenUsageAfterReset)

250-340: Test creates VK without limits but checks for rate_limit_id.

TestInMemoryUsageUpdateOnRequest creates a VK with no rate limit configuration (Lines 258-264), then checks rate_limit_id at Line 332. This will always be empty, making the conditional branch at Lines 333-337 always take the "No rate limit on VK" path. The test doesn't actually verify any meaningful usage tracking behavior.

Either add a rate limit to the VK:

 createVKResp := MakeRequest(t, APIRequest{
     Method: "POST",
     Path:   "/api/governance/virtual-keys",
     Body: CreateVirtualKeyRequest{
         Name: vkName,
+        RateLimit: &CreateRateLimitRequest{
+            TokenMaxLimit:      &tokenLimit,
+            TokenResetDuration: &tokenResetDuration,
+        },
     },
 })

Or add a budget and verify budget usage tracking instead.

plugins/governance/rate_limit_enforcement_test.go (3)

488-489: Fragile string conversion for request numbering.

Same issue flagged elsewhere: string(rune('0'+i)) only works correctly for i in range 0-9.

-					Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+					Content: fmt.Sprintf("Request %d to test both limits.", i+1),

602-611: Incomplete verification of usage tracking.

The test extracts rate_limit_id but doesn't verify that usage counters were actually incremented. The assertion at Line 607-611 only confirms the rate limit exists, not that it's tracking usage.

+rateLimitsMap := data["rate_limits"].(map[string]interface{})
 if rateLimitID != "" {
+    rateLimitData := rateLimitsMap[rateLimitID].(map[string]interface{})
+    tokenUsage, _ := rateLimitData["token_current_usage"].(float64)
+    if tokenUsage < float64(tokensUsed) {
+        t.Fatalf("Token usage not tracked: expected >= %d, got %.0f", tokensUsed, tokenUsage)
+    }
+    t.Logf("Rate limit %s tracked %.0f tokens (used %d) ✓", rateLimitID, tokenUsage, tokensUsed)
-    t.Logf("Rate limit %s is configured and tracking usage ✓", rateLimitID)
 } else {
-    t.Logf("Rate limit is configured ✓")
+    t.Fatalf("Rate limit ID not found for VK - cannot verify usage tracking")
 }

205-213: Rate-limit rejection should assert specific status code.

The test checks resp2.StatusCode < 400 (Line 205) but then only logs when status is 429 (Lines 209-213). If the rejection is due to a 500 or other error, the test could pass incorrectly. Assert 429 explicitly or fail on unexpected error codes.

-if resp2.StatusCode < 400 {
-    t.Fatalf("Second request should be rejected (429) but got status %d", resp2.StatusCode)
-}
-
-if resp2.StatusCode == 429 {
+if resp2.StatusCode == 429 {
     if !CheckErrorMessage(t, resp2, "request") && !CheckErrorMessage(t, resp2, "rate") {
         t.Fatalf("Rate limit error message missing expected keywords: %v", resp2.Body)
     }
+} else if resp2.StatusCode >= 400 {
+    // Check if it's a rate-limit error with different status code
+    if !CheckErrorMessage(t, resp2, "request") && !CheckErrorMessage(t, resp2, "rate") {
+        t.Fatalf("Second request failed with non-rate-limit error (status %d): %v", resp2.StatusCode, resp2.Body)
+    }
+} else {
+    t.Fatalf("Second request should be rejected but got status %d", resp2.StatusCode)
 }
plugins/governance/edge_cases_test.go (2)

59-96: Budget hierarchy values contradict the test expectation.

The test comment on Line 97 states "Provider budget should be the limiting factor (most restrictive)", but the actual budget values are:

  • VK budget: $0.01 (Line 60)
  • Provider budget: $0.08 (Line 76)

The VK-level budget ($0.01) is actually more restrictive than the provider-level budget ($0.08). Either adjust the budget values to make the provider budget the most restrictive, or update the test expectation.

-	// Create a VK with even tighter budget and provider-specific budgets
-	vkBudget := 0.01
+	// Create a VK with moderate budget; provider budget is tighter
+	vkBudget := 0.15

Or make the provider budget more restrictive:

 				Budget: &BudgetRequest{
-					MaxLimit:      0.08, // Even tighter provider budget
+					MaxLimit:      0.005, // Most restrictive - provider budget
 					ResetDuration: "1h",
 				},

101-180: Remove shouldStop flag - early exit prevents observing budget enforcement.

The shouldStop logic (Lines 101, 173-179) causes the loop to exit immediately after consumption reaches the threshold, before sending the request that would actually be rejected by the server. This means the test may never observe the actual budget enforcement error.

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= 0.08 { // Provider budget
-			shouldStop = true
-		}
 	}
plugins/governance/tracker_test.go (1)

154-155: Mismatched comment and function name.

The comment says TestUsageTracker_UpdateBudgetHierarchy but the function is named TestUsageTracker_Cleanup.

-// TestUsageTracker_UpdateBudgetHierarchy tests multi-level budget updates
+// TestUsageTracker_Cleanup tests cleanup behavior
 func TestUsageTracker_Cleanup(t *testing.T) {
plugins/governance/config_update_sync_test.go (2)

48-66: Missing HTTP status check and unsafe type assertions.

The code accesses response body fields without first verifying the status code (Line 49-52) and uses chained type assertions without ok checks (Lines 54-59). If the server returns an error, the test will panic rather than fail gracefully.

 getDataResp1 := MakeRequest(t, APIRequest{
     Method: "GET",
     Path:   "/api/governance/data",
 })

+if getDataResp1.StatusCode != 200 {
+    t.Fatalf("Failed to get governance data: status %d, body: %v", getDataResp1.StatusCode, getDataResp1.Body)
+}
+
-data1 := getDataResp1.Body["data"].(map[string]interface{})
-vkData1 := data1["virtual_keys"].(map[string]interface{})[vkValue].(map[string]interface{})
+data1, ok := getDataResp1.Body["data"].(map[string]interface{})
+if !ok {
+    t.Fatalf("Failed to extract data from response")
+}

This pattern appears throughout the file (~20 locations). Consider creating a helper function for safe extraction.
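
One possible shape for such a helper, as a sketch only: `nestedMap` is a hypothetical name, and the response layout (`data` → `virtual_keys` → key) is taken from the snippet above rather than from the actual API contract.

```go
package main

import "fmt"

// nestedMap walks body along keys and returns the JSON object found there.
// It returns an error instead of panicking when a key is missing or the
// value at that key is not an object.
func nestedMap(body map[string]interface{}, keys ...string) (map[string]interface{}, error) {
	cur := body
	for depth, k := range keys {
		v, ok := cur[k]
		if !ok {
			return nil, fmt.Errorf("key %q missing at depth %d", k, depth)
		}
		next, ok := v.(map[string]interface{})
		if !ok {
			return nil, fmt.Errorf("key %q holds %T, not an object", k, v)
		}
		cur = next
	}
	return cur, nil
}

func main() {
	body := map[string]interface{}{
		"data": map[string]interface{}{
			"virtual_keys": map[string]interface{}{
				"vk-123": map[string]interface{}{"rate_limit_id": "rl-1"},
			},
		},
	}
	vk, err := nestedMap(body, "data", "virtual_keys", "vk-123")
	fmt.Println(vk["rate_limit_id"], err)

	// An error response has no "data" object, so the helper reports it.
	_, err = nestedMap(map[string]interface{}{"error": "boom"}, "data", "virtual_keys")
	fmt.Println(err)
}
```

In a test, the returned error can be passed to t.Fatalf once, which replaces each chain of unchecked assertions with a single failure path.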


166-210: "Higher limit" test branch doesn't verify preservation behavior.

The comment at Line 166 says "usage should NOT reset" when updating to a higher limit, but by this point usage was already reset to 0 by the previous lower-limit update. The test only logs tokenUsageAfterSecondUpdate (Line 209) without asserting anything meaningful.

To actually test preservation behavior, consume tokens after the first update (before raising the limit), then verify usage is preserved after the higher-limit update:

// After line 164, add a request to consume tokens before raising the limit:
resp2 := MakeRequest(t, APIRequest{...}) // consume some tokens
// Then verify usage is preserved after the higher-limit update
plugins/governance/rate_limit_test.go (1)

43-79: Rate-limit tests can pass without ever observing enforcement — make “no rejection observed” a hard failure.
Currently the loops can finish without hitting a limit and the test still logs/skips. Switch to t.Fatalf(...) when the loop ends without observing the expected rate-limit rejection (and avoid t.Skip unless you intentionally gate integration tests).

@@
- if successCount > 0 {
-     t.Logf("Made %d successful requests before hitting token limit ✓", successCount)
- } else {
-     t.Skip("Could not make requests to test token limit")
- }
+ t.Fatalf("rate limit not enforced: no rejection observed (successCount=%d)", successCount)

Also applies to: 116-152, 195-231, 274-310
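
A rough sketch of the control flow this implies, using a fake sender in place of real HTTP calls. `sendUntilRejected` is a hypothetical helper, and treating 429 as the rejection status is an assumption about the server's behavior.

```go
package main

import "fmt"

// sendUntilRejected calls send up to maxRequests times and stops at the first
// response with status 429. It reports how many requests succeeded and
// whether a rejection was observed at all.
func sendUntilRejected(maxRequests int, send func() int) (successes int, rejected bool) {
	for i := 0; i < maxRequests; i++ {
		status := send()
		if status == 429 {
			return successes, true
		}
		if status < 400 {
			successes++
		}
	}
	return successes, false
}

func main() {
	// Fake server: allow three requests, then reject.
	remaining := 3
	send := func() int {
		if remaining == 0 {
			return 429
		}
		remaining--
		return 200
	}
	successes, rejected := sendUntilRejected(10, send)
	fmt.Println(successes, rejected)
}
```

In the real tests the caller would fail hard when `rejected` is false, for example with t.Fatalf, so a loop that never hits the limit can no longer pass silently.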

plugins/governance/advanced_scenarios_test.go (3)

14-17: These scenarios are too heavy/flaky to run with t.Parallel() by default — gate or serialize.
150-request loops + sleeps across many parallel tests will be slow and can fail due to external/provider variability. Prefer //go:build integration and/or remove t.Parallel() for these long-running scenarios.

Also applies to: 179-183, 341-345, 536-539, 652-656, 787-791, 940-944, 1082-1086


1209-1214: Deletion cascade: rate-limits shouldn’t be “acceptable orphaning” unless that’s the explicit contract.
Budgets are strictly asserted as cascade-deleted, but rate limits are allowed to remain with only a log. Make this consistent (either fail if present, or document why rate-limits are intentionally orphaned).

@@
- if vkRateLimitStillExists || pcRateLimitStillExists {
-     t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
- }
+ if vkRateLimitStillExists || pcRateLimitStillExists {
+     t.Fatalf("Rate limits should be cascade-deleted: VK=%v PC=%v",
+         vkRateLimitStillExists, pcRateLimitStillExists)
+ }

1224-1303: “Deletes budget” tests don’t assert the budget is actually removed post-delete.
You capture budget_id pre-delete but post-delete you only check the team/customer map. Also assert budgetsMap2[budgetID] is absent after deletion.

Also applies to: 1305-1384

plugins/governance/e2e_test.go (1)

809-839: Avoid t.Fatalf from goroutines (currently possible via MakeRequest).
MakeRequest calls t.Fatalf, which is unsafe when invoked from spawned goroutines. Prefer returning errors from the goroutine and failing once in the main goroutine after wg.Wait(), or provide a non-fatal request helper for concurrent tests.
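
The "collect errors, fail once" approach could look roughly like this. `runConcurrently` and `errRefused` are illustrative names, and the `do` callback stands in for an error-returning variant of MakeRequest that does not exist in the PR yet.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errRefused is a placeholder error used only by this sketch.
var errRefused = errors.New("connection refused")

// runConcurrently starts n goroutines, each calling do with its index, waits
// for all of them, and returns the non-nil errors.
func runConcurrently(n int, do func(i int) error) []error {
	errs := make([]error, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			errs[i] = do(i) // each goroutine writes only its own slot
		}(i)
	}
	wg.Wait()

	var failed []error
	for _, err := range errs {
		if err != nil {
			failed = append(failed, err)
		}
	}
	return failed
}

func main() {
	failed := runConcurrently(5, func(i int) error {
		if i == 2 {
			return errRefused
		}
		return nil
	})
	fmt.Println(len(failed), failed)
}
```

The test's main goroutine can then call t.Fatalf once if the returned slice is non-empty, which keeps all fatal calls on the goroutine that runs the test.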

plugins/governance/test_utils.go (2)

90-141: Add t.Helper() + an HTTP timeout to prevent hangs (and improve failure attribution).

@@
 import (
 	"bytes"
 	"encoding/json"
 	"fmt"
 	"io"
 	"math/rand"
 	"net/http"
+	"os"
 	"strings"
 	"testing"
 	"time"
 )
@@
 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-	client := &http.Client{}
-	url := fmt.Sprintf("http://localhost:8080%s", req.Path)
+	t.Helper()
+	client := &http.Client{Timeout: 30 * time.Second}
+	baseURL := os.Getenv("BIFROST_TEST_BASE_URL")
+	if baseURL == "" {
+		baseURL = "http://localhost:8080"
+	}
+	url := fmt.Sprintf("%s%s", baseURL, req.Path)
@@
-	var responseBody map[string]interface{}
+	responseBody := map[string]interface{}{}
 	if len(rawBody) > 0 {
 		err = json.Unmarshal(rawBody, &responseBody)
 		if err != nil {
 			// If unmarshaling fails, store the raw response
 			responseBody = map[string]interface{}{"raw": string(rawBody)}
 		}
 	}

247-268: ExtractIDFromResponse ignores keyPath (misleading) — implement traversal or drop the parameter.

@@
 func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+	t.Helper()
 	if resp.StatusCode >= 400 {
 		t.Fatalf("Request failed with status %d: %v", resp.StatusCode, resp.Body)
 	}
 
-	// Navigate through the response to find the ID
-	data := resp.Body
-	parts := []string{"virtual_key", "team", "customer"}
-	for _, part := range parts {
-		if val, ok := data[part]; ok {
-			if nested, ok := val.(map[string]interface{}); ok {
-				if id, ok := nested["id"].(string); ok {
-					return id
-				}
-			}
-		}
-	}
+	if keyPath != "" {
+		var cur any = resp.Body
+		for _, p := range strings.Split(keyPath, ".") {
+			m, ok := cur.(map[string]interface{})
+			if !ok {
+				t.Fatalf("Could not extract %q (non-object at %q): %v", keyPath, p, resp.Body)
+			}
+			cur = m[p]
+		}
+		if id, ok := cur.(string); ok && id != "" {
+			return id
+		}
+	}
+
+	for _, wrapper := range []string{"virtual_key", "team", "customer"} {
+		if nested, ok := resp.Body[wrapper].(map[string]interface{}); ok {
+			if id, ok := nested["id"].(string); ok && id != "" {
+				return id
+			}
+		}
+	}
 
 	t.Fatalf("Could not extract ID from response: %v", resp.Body)
 	return ""
 }
🧹 Nitpick comments (7)
plugins/governance/usage_tracking_test.go (1)

510-513: Same fragile string conversion pattern.

Same issue as Line 393 - use fmt.Sprintf or strconv.Itoa instead of string(rune('0'+i)).

-					Content: "Persistence test request " + string(rune('0'+i)) + ".",
+					Content: fmt.Sprintf("Persistence test request %d.", i),
plugins/governance/tracker_test.go (1)

76-81: Assertion assert.True(t, true) provides no verification.

This assertion always passes and doesn't verify anything about the function's behavior. Consider checking the logger for expected log messages or verifying the store state wasn't corrupted.

-	// Just verify it doesn't crash
-	assert.True(t, true)
+	// Verify store wasn't corrupted by the missing VK update
+	data := store.GetGovernanceData()
+	assert.NotNil(t, data, "Store should remain valid after update with missing VK")
plugins/governance/rate_limit_enforcement_test.go (1)

70-126: shouldStop pattern may cause early exit before observing rate-limit rejection.

Similar to the issue in edge_cases_test.go, the shouldStop logic (Lines 70, 119-126) sets the flag when the threshold is reached and exits on the next iteration. However, for token-based limits where the exceeding request is allowed post-hoc, this could exit before observing the actual rejection.

The current test logic handles this better than the budget test because it checks for rejection inside the loop (Lines 88-104) before incrementing, but the shouldStop check after the increment (Lines 119-126) is still unnecessary and potentially confusing.

-	shouldStop := false
 
 	for requestNum <= 20 {
 		// ... request logic that returns on rejection ...
 
 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedTokens >= tokenLimit {
-			shouldStop = true
-		}
 	}
plugins/governance/rate_limit_test.go (2)

1-7: Gate these as integration tests (real HTTP + provider dependency), and align across the Graphite stack (#1020/#1049).
Right now these will hard-fail/hang in unit CI if localhost:8080 isn’t running; consider //go:build integration or an env gate at test start (and keep it consistent across the stack).


14-28: Make hitting token limits deterministic (current limits are likely too high for the request payload).
With small prompts and ~10 requests, 300–500 tokens/hour may never be exceeded. Consider either (a) drastically lowering limits (e.g., 1–10 tokens) or (b) forcing token usage via max_tokens and a longer prompt, and/or asserting based on returned usage rather than loop count.

Also applies to: 160-179

plugins/governance/e2e_test.go (1)

1-8: These are integration/e2e tests (localhost:8080 + provider behavior) — ensure they’re not in default unit CI.
Recommend //go:build integration (or an env gate) for this file and the whole governance e2e suite.
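
As a sketch of the env-gate option: the variable name `GOVERNANCE_INTEGRATION_TESTS` and the helper `integrationEnabledFrom` are hypothetical and not defined anywhere in the repository.

```go
package main

import (
	"fmt"
	"os"
)

// integrationEnvVar is a hypothetical opt-in variable for the e2e suite.
const integrationEnvVar = "GOVERNANCE_INTEGRATION_TESTS"

// integrationEnabledFrom reports whether the opt-in variable is set to "1".
// The lookup function is injected so the check can be exercised without
// touching the real process environment.
func integrationEnabledFrom(getenv func(string) string) bool {
	return getenv(integrationEnvVar) == "1"
}

func main() {
	fmt.Println("integration tests enabled:", integrationEnabledFrom(os.Getenv))
}
```

Each e2e test would start with something like `if !integrationEnabledFrom(os.Getenv) { t.Skip("set GOVERNANCE_INTEGRATION_TESTS=1 to run") }`. The build-tag alternative is to put `//go:build integration` at the top of each file and run the suite with `go test -tags=integration ./...`; whichever gate is chosen should be the same across the stack.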

plugins/governance/test_utils.go (1)

143-152: generateRandomID shouldn’t reseed global math/rand per call (collision/race risk under t.Parallel).
Prefer crypto/rand for test resource IDs.

@@
-import (
+import (
 	"bytes"
+	crand "crypto/rand"
 	"encoding/json"
 	"fmt"
 	"io"
-	"math/rand"
 	"net/http"
 	"strings"
 	"testing"
 	"time"
 )
@@
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
-	for i := range b {
-		b[i] = letters[rand.Intn(len(letters))]
-	}
+	if _, err := crand.Read(b); err != nil {
+		panic(fmt.Errorf("generateRandomID: %w", err))
+	}
+	for i := range b {
+		b[i] = letters[int(b[i])%len(letters)]
+	}
 	return string(b)
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06fbe4a and 4325d29.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go (1 hunks)
  • plugins/governance/changelog.md (1 hunks)
  • plugins/governance/config_update_sync_test.go (1 hunks)
  • plugins/governance/customer_budget_test.go (1 hunks)
  • plugins/governance/e2e_test.go (1 hunks)
  • plugins/governance/edge_cases_test.go (1 hunks)
  • plugins/governance/fixtures_test.go (1 hunks)
  • plugins/governance/go.mod (3 hunks)
  • plugins/governance/in_memory_sync_test.go (1 hunks)
  • plugins/governance/provider_budget_test.go (1 hunks)
  • plugins/governance/rate_limit_enforcement_test.go (1 hunks)
  • plugins/governance/rate_limit_test.go (1 hunks)
  • plugins/governance/resolver_test.go (1 hunks)
  • plugins/governance/store_test.go (1 hunks)
  • plugins/governance/team_budget_test.go (1 hunks)
  • plugins/governance/test_utils.go (1 hunks)
  • plugins/governance/tracker_test.go (1 hunks)
  • plugins/governance/usage_tracking_test.go (1 hunks)
  • plugins/governance/vk_budget_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/changelog.md
  • plugins/governance/store_test.go
  • plugins/governance/go.mod
  • plugins/governance/fixtures_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/team_budget_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/e2e_test.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/team_budget_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/e2e_test.go
🧬 Code graph analysis (6)
plugins/governance/vk_budget_test.go (1)
plugins/governance/test_utils.go (10)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/customer_budget_test.go (1)
plugins/governance/test_utils.go (7)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • CreateTeamRequest (183-187)
plugins/governance/config_update_sync_test.go (1)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
plugins/governance/tracker_test.go (7)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (64-83)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (705-712)
plugins/governance/resolver.go (1)
  • NewBudgetResolver (70-75)
plugins/governance/tracker.go (2)
  • NewUsageTracker (52-67)
  • UsageUpdate (17-30)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
core/utils.go (1)
  • IsFinalChunk (198-213)
plugins/governance/rate_limit_enforcement_test.go (1)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
plugins/governance/e2e_test.go (1)
plugins/governance/test_utils.go (14)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • CreateCustomerRequest (190-193)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
  • CreateRateLimitRequest (202-207)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (2)
plugins/governance/resolver_test.go (2)

1-37: Well-structured test suite with good coverage.

The test file demonstrates thorough coverage of the BudgetResolver.EvaluateRequest workflow including happy paths, error conditions, and edge cases. The use of helper builders and assertion functions keeps tests readable and maintainable.


402-510: Table-driven tests for provider and model filtering are well designed.

The TestBudgetResolver_IsProviderAllowed and TestBudgetResolver_IsModelAllowed functions use idiomatic table-driven test patterns, covering key scenarios including empty allowlists and specific filtering rules.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (23)
plugins/governance/team_budget_test.go (1)

66-156: Remove shouldStop flag to correctly test budget enforcement.

The shouldStop logic (lines 70, 149-155) causes the loop to exit before sending the request that would trigger server-side budget rejection. The loop increments requestNum, sets shouldStop = true when budget threshold is reached, then breaks on the next iteration—before actually sending an over-budget request.

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false
 	vkIndex := 0

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
 		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= teamBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/resolver_test.go (1)

331-345: Avoid directly manipulating internal store state.

Line 337 bypasses encapsulation by directly calling store.budgets.Store(), coupling the test to implementation details. Consider adding a test helper method to LocalGovernanceStore for setting budget usage in tests.

-	// Get the governance data to update the budget directly
-	governanceData := store.GetGovernanceData()
-	vkBudgetToUpdate := governanceData.Budgets["vk-budget"]
-	if vkBudgetToUpdate != nil {
-		vkBudgetToUpdate.CurrentUsage = 100.0
-		store.budgets.Store("vk-budget", vkBudgetToUpdate)
-	}
+	// Use a test helper to set budget usage
+	err = store.SetBudgetCurrentUsageForTest("vk-budget", 100.0)
+	require.NoError(t, err)
plugins/governance/customer_budget_test.go (2)

66-156: Remove shouldStop flag to ensure customer budget enforcement is tested.

Same issue as team budget tests: the shouldStop logic (lines 70, 149-155) causes the loop to exit before sending the request that would trigger server-side budget rejection.

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false
 	vkIndex := 0

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
 		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= customerBudget {
-			shouldStop = true
-		}
 	}

241-331: Same shouldStop issue in TestCustomerBudgetExceededWithMultipleTeams.

Lines 245 and 324-330 have the same early-exit problem. Apply the same fix to remove the shouldStop logic.

plugins/governance/edge_cases_test.go (2)

59-79: VK budget is more restrictive than provider budget - test expectation is incorrect.

The test comment on Line 97 states "Provider budget should be the limiting factor (most restrictive)", but:

  • VK budget: $0.01 (Line 60)
  • Provider budget: $0.08 (Line 76)

The VK-level budget ($0.01) is actually more restrictive. Either swap the budget values or update the test expectation.

-	// Create a VK with even tighter budget and provider-specific budgets
-	vkBudget := 0.01
+	// Create a VK with moderate budget (provider budget is tighter)
+	vkBudget := 0.15

Or make provider budget genuinely the most restrictive:

 				Budget: &BudgetRequest{
-					MaxLimit:      0.08, // Even tighter provider budget
+					MaxLimit:      0.005, // Most restrictive - provider budget

97-180: Remove shouldStop flag - same pattern issue as other budget tests.

The shouldStop logic (lines 101, 173-179) causes early loop exit before the budget-exceeded request is sent.

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= 0.08 { // Provider budget
-			shouldStop = true
-		}
 	}
plugins/governance/usage_tracking_test.go (3)

250-340: Test creates VK without rate limit but checks for rate_limit_id.

TestInMemoryUsageUpdateOnRequest creates a VK with no rate limit (lines 258-264), then checks for rate_limit_id at line 332. This will always be empty, making the conditional branch meaningless. Either add a rate limit to the VK or verify budget usage instead.


391-394: Fragile string conversion for request numbering.

string(rune('0'+i)) only produces correct digit characters for i in range 0-9. For i >= 10, this produces unexpected characters.

-			Content: "Test request " + string(rune('0'+i)) + " for reset ticker test.",
+			Content: fmt.Sprintf("Test request %d for reset ticker test.", i),

Note: This requires adding "fmt" to the imports. The same issue exists at line 512.
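
A short standalone comparison of the two approaches, for reference. The function names are illustrative only:

```go
package main

import "fmt"

// labelWithRune mirrors the fragile pattern: it is only correct for 0 <= i <= 9.
func labelWithRune(i int) string {
	return "Test request " + string(rune('0'+i))
}

// labelWithSprintf formats any integer as decimal digits.
func labelWithSprintf(i int) string {
	return fmt.Sprintf("Test request %d", i)
}

func main() {
	for _, i := range []int{3, 10, 12} {
		fmt.Printf("%d: rune=%q sprintf=%q\n", i, labelWithRune(i), labelWithSprintf(i))
	}
}
```

For i = 10 the rune-based version yields ':' (the code point after '9') rather than "10".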


8-129: Add real assertions for rate-limit reset verification.

TestUsageTrackingRateLimitReset doesn't verify that token usage actually drops after the 30s reset. It only sleeps and logs at line 128. The test should read the rate limit data before and after reset and assert that token_current_usage decreased.

+	// Get rate limit usage before reset
+	rateLimitsMap1 := data1["rate_limits"].(map[string]interface{})
+	rlData1 := rateLimitsMap1[rateLimitID].(map[string]interface{})
+	usageBeforeReset, _ := rlData1["token_current_usage"].(float64)
+	t.Logf("Token usage before reset: %d", int(usageBeforeReset))

 	// Wait for more than 30 seconds for the rate limit to reset
 	t.Logf("Waiting 35 seconds for rate limit ticker to reset...")
 	time.Sleep(35 * time.Second)

 	// Get rate limit data after reset
 	getDataResp3 := MakeRequest(t, APIRequest{...})

-	// Verify rate limit has been reset (usage should be 0 or close to it)
-	t.Logf("Rate limit reset should have occurred after 30s timeout ✓")
+	data3 := getDataResp3.Body["data"].(map[string]interface{})
+	rateLimitsMap3 := data3["rate_limits"].(map[string]interface{})
+	rlData3 := rateLimitsMap3[rateLimitID].(map[string]interface{})
+	usageAfterReset, _ := rlData3["token_current_usage"].(float64)
+
+	if usageBeforeReset > 0 && usageAfterReset >= usageBeforeReset {
+		t.Fatalf("Rate limit not reset: before=%d, after=%d", int(usageBeforeReset), int(usageAfterReset))
+	}
+	t.Logf("Rate limit reset correctly after 30s timeout ✓")
plugins/governance/store_test.go (1)

144-153: Don't ignore NewLocalGovernanceStore errors in the single-budget test.

The error from NewLocalGovernanceStore is still being discarded on line 149. If store construction fails, the test will proceed with a nil/invalid store and may panic or produce misleading results.

-		testStore, _ := NewLocalGovernanceStore(context.Background(), logger, nil, &configstore.GovernanceConfig{
+		testStore, err := NewLocalGovernanceStore(context.Background(), logger, nil, &configstore.GovernanceConfig{
 			VirtualKeys: []configstoreTables.TableVirtualKey{*testVK},
 			Budgets:     []configstoreTables.TableBudget{*testBudget},
 		})
+		require.NoError(t, err)
plugins/governance/in_memory_sync_test.go (1)

368-398: Potential double-delete on cleanup.

The VK is added to testData at line 369, then manually deleted at lines 391-398. When testData.Cleanup(t) runs via defer, it may attempt to delete the already-deleted VK, potentially causing test noise or failures.

Either remove from testData after manual deletion, or skip adding it since manual deletion is the test's purpose:

 	vkID := ExtractIDFromResponse(t, createVKResp, "id")
-	testData.AddVirtualKey(vkID)
+	// Don't add to testData since we manually delete as part of the test
plugins/governance/tracker_test.go (1)

154-155: Mismatched comment and function name.

The comment says TestUsageTracker_UpdateBudgetHierarchy but the function is named TestUsageTracker_Cleanup.

-// TestUsageTracker_UpdateBudgetHierarchy tests multi-level budget updates
+// TestUsageTracker_Cleanup tests cleanup behavior
 func TestUsageTracker_Cleanup(t *testing.T) {
plugins/governance/rate_limit_test.go (4)

8-79: Test can pass without verifying rate limit enforcement.

If the loop ends without seeing a rate-limit rejection, the test only logs success or skips (lines 74-78). This doesn't verify that rate limits are actually enforced - it could pass even if rate limiting is completely broken.

Consider failing the test if no rate-limit rejection was observed:

 	if successCount > 0 {
-		t.Logf("Made %d successful requests before hitting token limit ✓", successCount)
-	} else {
-		t.Skip("Could not make requests to test token limit")
+		t.Fatalf("Made %d requests without hitting token limit (limit: %d) - rate limit not enforced", successCount, tokenLimit)
 	}
+	t.Skip("Could not make requests to test token limit")

81-152: Same issue: test can pass without enforcement verification.

Lines 147-151 follow the same pattern where the test logs success even if no rate limit rejection occurred.


154-231: Same issue: test can pass without enforcement verification.

Lines 226-230 follow the same pattern where the test logs success even if no provider token limit rejection occurred.


233-310: Same issue: test can pass without enforcement verification.

Lines 305-309 follow the same pattern where the test logs success even if no provider request limit rejection occurred.

plugins/governance/rate_limit_enforcement_test.go (3)

132-217: Strengthen the rate-limit rejection assertion.

Lines 205-213 accept any 4xx/5xx as valid, then only check for keywords if status is exactly 429. A 500 error would pass the test without verifying it's actually a rate-limit rejection.

Consider requiring 429 status or failing on unexpected error codes:

 	if resp2.StatusCode < 400 {
 		t.Fatalf("Second request should be rejected (429) but got status %d", resp2.StatusCode)
 	}
 
-	if resp2.StatusCode == 429 {
-		if !CheckErrorMessage(t, resp2, "request") && !CheckErrorMessage(t, resp2, "rate") {
-			t.Fatalf("Rate limit error message missing expected keywords: %v", resp2.Body)
-		}
+	if resp2.StatusCode != 429 {
+		if !CheckErrorMessage(t, resp2, "request") && !CheckErrorMessage(t, resp2, "rate") {
+			t.Fatalf("Expected 429 or rate-limit error, got status %d: %v", resp2.StatusCode, resp2.Body)
+		}
 	}

489-489: Fragile string conversion for request numbering.

string(rune('0'+i)) only works correctly for i in range 0-9. For i >= 10, this produces unexpected Unicode characters.

-				Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+				Content: fmt.Sprintf("Request %d to test both limits.", i+1),

Note: This requires adding "fmt" to the imports.


602-611: Incomplete verification of usage tracking.

The test extracts rate_limit_id but doesn't verify that usage counters were actually incremented after the request. The assertion only confirms the rate limit exists.

Consider verifying actual usage values from the rate_limits map:

+	rateLimitsMap := data["rate_limits"].(map[string]interface{})
 	if rateLimitID != "" {
-		t.Logf("Rate limit %s is configured and tracking usage ✓", rateLimitID)
+		rateLimitData := rateLimitsMap[rateLimitID].(map[string]interface{})
+		tokenUsage, _ := rateLimitData["token_current_usage"].(float64)
+		if tokenUsage < float64(tokensUsed) {
+			t.Logf("Warning: Token usage not fully tracked: expected >= %d, got %.0f", tokensUsed, tokenUsage)
+		} else {
+			t.Logf("Rate limit %s tracked %d tokens ✓", rateLimitID, int(tokenUsage))
+		}
 	} else {
 		t.Logf("Rate limit is configured ✓")
 	}
plugins/governance/advanced_scenarios_test.go (2)

14-172: Heavy/slow tests running in parallel may cause CI flakiness.

These advanced scenario tests run 150-request loops with t.Parallel() across many tests. This can be slow and may fail due to provider variability or DB locking. Consider gating behind an integration build tag or environment variable.

Options:

  1. Add a build tag: //go:build integration at the top of the file
  2. Add a runtime check: if os.Getenv("RUN_INTEGRATION_TESTS") != "1" { t.Skip("skipping integration test") }
  3. Remove t.Parallel() for these long-running scenarios
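
Option 2 could be sketched roughly as follows. The variable name `RUN_INTEGRATION_TESTS` comes from the suggestion above; the helper name `integrationEnabled` is hypothetical. Taking the lookup function as a parameter keeps the check itself easy to test:

```go
package main

import (
	"fmt"
	"os"
)

// integrationEnabled reports whether the opt-in variable is set to "1".
// The getenv parameter is normally os.Getenv.
func integrationEnabled(getenv func(string) string) bool {
	return getenv("RUN_INTEGRATION_TESTS") == "1"
}

func main() {
	// In a test, the equivalent guard would be:
	//   if !integrationEnabled(os.Getenv) { t.Skip("skipping integration test") }
	fmt.Println("integration tests enabled:", integrationEnabled(os.Getenv))
}
```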

1209-1214: Inconsistent cascade deletion assertion for rate limits vs budgets.

Budgets are now correctly asserted to be cascade-deleted (lines 1204-1207), but rate limits still get permissive treatment with just a log message (lines 1212-1214). This is inconsistent - if budgets must be deleted, rate limits should follow the same policy.

If cascade deletion should apply to rate limits:

 	// Rate limits should be gone
 	_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
 	_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
-	if vkRateLimitStillExists || pcRateLimitStillExists {
-		t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
-	}
+	if vkRateLimitStillExists || pcRateLimitStillExists {
+		t.Fatalf("Rate limits should be cascade-deleted: VK rate limit exists=%v, PC rate limit exists=%v",
+			vkRateLimitStillExists, pcRateLimitStillExists)
+	}

If orphaning rate limits is intentional while deleting budgets, add a comment explaining the design rationale.

plugins/governance/test_utils.go (2)

91-141: Add t.Helper() and HTTP client timeout to MakeRequest.

The function is missing t.Helper(), so failures are reported with a file and line inside this helper rather than at the calling test. Additionally, the http.Client has no timeout, so a test can hang indefinitely if the server is unresponsive.

Apply this diff:

 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-	client := &http.Client{}
+	t.Helper()
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}
 	url := fmt.Sprintf("http://localhost:8080%s", req.Path)

247-268: ExtractIDFromResponse still ignores the keyPath parameter.

The keyPath parameter is declared but never used. The function only checks hardcoded wrapper keys. This creates a misleading API where callers might expect keyPath to work.

Either implement keyPath traversal or remove the unused parameter:

-func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+func ExtractIDFromResponse(t *testing.T, resp *APIResponse) string {
+	t.Helper()
 	if resp.StatusCode >= 400 {
 		t.Fatalf("Request failed with status %d: %v", resp.StatusCode, resp.Body)
 	}

Note: If removing keyPath, update all callers accordingly.
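
If implementing traversal is preferred, one possible shape is sketched below. It assumes keyPath is a dot-separated path into the decoded JSON body (for example "virtual_key.id"); that convention and the function name `extractByKeyPath` are assumptions, not something the PR defines:

```go
package main

import (
	"fmt"
	"strings"
)

// extractByKeyPath walks a decoded JSON object along a dot-separated path
// and returns the string found at the end of it, if any.
func extractByKeyPath(body map[string]interface{}, keyPath string) (string, bool) {
	var current interface{} = body
	for _, key := range strings.Split(keyPath, ".") {
		obj, ok := current.(map[string]interface{})
		if !ok {
			return "", false // path goes through a non-object value
		}
		current, ok = obj[key]
		if !ok {
			return "", false // key missing at this level
		}
	}
	id, ok := current.(string)
	return id, ok
}

func main() {
	body := map[string]interface{}{
		"virtual_key": map[string]interface{}{"id": "vk-123"},
	}
	id, ok := extractByKeyPath(body, "virtual_key.id")
	fmt.Println(id, ok)
}
```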

🧹 Nitpick comments (5)
plugins/governance/tracker_test.go (1)

56-82: Consider adding a more meaningful assertion.

The test verifies that updating usage for a non-existent VK doesn't panic, but the assertion assert.True(t, true) is a no-op placeholder. While the test does verify the code doesn't crash, consider checking for expected log messages or verifying no state changes occurred.

plugins/governance/test_utils.go (3)

143-152: Remove deprecated rand.Seed call.

rand.Seed has been deprecated since Go 1.20. The global random source is now seeded automatically, so calling Seed manually is unnecessary. Additionally, reseeding on every invocation can produce identical IDs when two calls observe the same nanosecond timestamp.

 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }

270-292: Add t.Helper() for better test failure attribution.

Test helper functions that call t.Fatalf should call t.Helper() so that the reported file and line point to the calling test rather than to a line inside this utility function.

 func CheckErrorMessage(t *testing.T, resp *APIResponse, expectedText string) bool {
+	t.Helper()
 	if resp.StatusCode < 400 {

330-367: Add t.Helper() and consider non-fatal error handling during cleanup.

The Cleanup method should call t.Helper(). Additionally, since MakeRequest can call t.Fatalf on network errors, cleanup from a defer could fail fatally before all resources are deleted. Consider using a non-fatal approach for cleanup.

 func (g *GlobalTestData) Cleanup(t *testing.T) {
+	t.Helper()
 	// Delete virtual keys
 	for _, vkID := range g.VirtualKeys {
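
One non-fatal approach is to attempt every deletion and collect the failures, then report them once at the end (for example with t.Logf or t.Errorf). The sketch below is standalone; `cleanupAll`, `deleteFn`, and `errAlreadyDeleted` are hypothetical names, not helpers from this PR:

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyDeleted is a sample error used only for this illustration.
var errAlreadyDeleted = errors.New("already deleted")

// cleanupAll attempts every deletion and collects failures instead of
// stopping at the first one, so later resources are still cleaned up.
func cleanupAll(ids []string, deleteFn func(id string) error) []error {
	var errs []error
	for _, id := range ids {
		if err := deleteFn(id); err != nil {
			errs = append(errs, fmt.Errorf("delete %s: %w", id, err))
		}
	}
	return errs
}

func main() {
	var deleted []string
	errs := cleanupAll([]string{"vk-1", "vk-2", "vk-3"}, func(id string) error {
		if id == "vk-2" {
			return errAlreadyDeleted
		}
		deleted = append(deleted, id)
		return nil
	})
	fmt.Println(deleted, len(errs))
}
```

This shape also tolerates the double-delete case raised for in_memory_sync_test.go, since an already-deleted resource produces a logged error rather than aborting the remaining cleanup.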
plugins/governance/e2e_test.go (1)

1032-1140: Team deletion cascade test could benefit from clearer expected behavior.

The test creates and deletes a team with an associated VK but doesn't assert specific post-deletion behavior for the VK. Consider documenting the expected cascade behavior (nullification vs. deletion vs. blocking) and adding appropriate assertions.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06fbe4a and 4325d29.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go (1 hunks)
  • plugins/governance/changelog.md (1 hunks)
  • plugins/governance/config_update_sync_test.go (1 hunks)
  • plugins/governance/customer_budget_test.go (1 hunks)
  • plugins/governance/e2e_test.go (1 hunks)
  • plugins/governance/edge_cases_test.go (1 hunks)
  • plugins/governance/fixtures_test.go (1 hunks)
  • plugins/governance/go.mod (3 hunks)
  • plugins/governance/in_memory_sync_test.go (1 hunks)
  • plugins/governance/provider_budget_test.go (1 hunks)
  • plugins/governance/rate_limit_enforcement_test.go (1 hunks)
  • plugins/governance/rate_limit_test.go (1 hunks)
  • plugins/governance/resolver_test.go (1 hunks)
  • plugins/governance/store_test.go (1 hunks)
  • plugins/governance/team_budget_test.go (1 hunks)
  • plugins/governance/test_utils.go (1 hunks)
  • plugins/governance/tracker_test.go (1 hunks)
  • plugins/governance/usage_tracking_test.go (1 hunks)
  • plugins/governance/vk_budget_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • plugins/governance/fixtures_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/go.mod
  • plugins/governance/provider_budget_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/changelog.md
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/team_budget_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/store_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/rate_limit_enforcement_test.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/team_budget_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/store_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/rate_limit_enforcement_test.go
🧬 Code graph analysis (10)
plugins/governance/resolver_test.go (4)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/resolver.go (1)
  • NewBudgetResolver (70-75)
core/schemas/context.go (1)
  • BifrostContext (30-40)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostContextKey (101-101)
plugins/governance/rate_limit_test.go (2)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
plugins/governance/usage_tracking_test.go (1)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • BudgetRequest (177-180)
plugins/governance/store_test.go (4)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (64-83)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (705-712)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
plugins/governance/edge_cases_test.go (1)
plugins/governance/test_utils.go (12)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateTeamRequest (183-187)
  • CreateVirtualKeyRequest (155-164)
  • ProviderConfigRequest (167-174)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/advanced_scenarios_test.go (3)
plugins/governance/test_utils.go (15)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatMessage (242-245)
  • CalculateCost (64-73)
  • UpdateVirtualKeyRequest (210-218)
  • CreateCustomerRequest (190-193)
  • UpdateBudgetRequest (196-199)
  • UpdateTeamRequest (221-224)
  • UpdateCustomerRequest (227-230)
  • ProviderConfigRequest (167-174)
  • CreateRateLimitRequest (202-207)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/providers/gemini/types.go (1)
  • Role (15-15)
plugins/governance/customer_budget_test.go (1)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • CreateTeamRequest (183-187)
plugins/governance/in_memory_sync_test.go (1)
plugins/governance/test_utils.go (10)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
plugins/governance/e2e_test.go (4)
plugins/governance/test_utils.go (15)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • CreateCustomerRequest (190-193)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
  • CreateRateLimitRequest (202-207)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (15-15)
  • Content (975-983)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: Graphite / mergeability_check (listed 16 times)
🔇 Additional comments (35)
plugins/governance/resolver_test.go (2)

1-37: Well-structured happy path test.

Good coverage of the basic allowed request scenario with proper setup, assertion helpers, and cleanup.


402-510: Well-designed table-driven tests for filtering logic.

Good coverage of provider and model allowlist scenarios including edge cases (empty allowlist, no configs). The table-driven approach makes these tests readable and maintainable.

plugins/governance/usage_tracking_test.go (2)

131-248: Good budget reset test with proper assertions.

This test correctly verifies that budget usage resets after the timeout by comparing usageAfterReset against a threshold (line 243-245).


449-451: Reset verification logic is now correct.

The assertion now fails the test when usageAfterReset >= usageBeforeReset, which ensures that budget usage actually decreased after the reset.

plugins/governance/store_test.go (4)

1-16: LGTM!

Clean imports and package declaration. The test file properly imports required dependencies from the bifrost framework and testify assertion libraries.


17-101: LGTM!

TestGovernanceStore_GetVirtualKey and TestGovernanceStore_ConcurrentReads are well-structured:

  • Table-driven tests for VK retrieval cover found/not-found cases
  • Concurrent reads test properly uses sync.WaitGroup and atomic counters to verify thread-safety

206-244: LGTM!

The comment mismatch from the previous review has been addressed. The test correctly validates both token and request usage updates through the rate limit system.


246-343: LGTM!

Reset tests and utility functions are well-implemented:

  • TestGovernanceStore_ResetExpiredRateLimits properly sets up expired rate limits and verifies reset behavior
  • TestGovernanceStore_ResetExpiredBudgets follows the same solid pattern
  • TestGovernanceStore_GetAllBudgets validates budget retrieval
  • ptrInt64 helper is clean and necessary for pointer initialization
plugins/governance/in_memory_sync_test.go (4)

1-6: LGTM!

Clean package declaration with minimal required imports.


8-126: LGTM!

TestInMemorySyncVirtualKeyUpdate thoroughly validates in-memory synchronization:

  • Creates VK with budget, verifies in-memory presence
  • Updates budget and verifies sync propagation
  • Appropriate sleep delays for async updates

128-342: LGTM!

Team and Customer update sync tests follow the same robust pattern as the VK test, properly verifying budget updates propagate to the in-memory store.


421-527: LGTM!

TestDataEndpointConsistency properly validates that the governance data endpoint returns consistent data across VKs, teams, and customers. The [WARN] log for count mismatches due to parallel tests is an appropriate handling of potential race conditions.

plugins/governance/tracker_test.go (2)

1-13: LGTM!

Clean package declaration and imports for tracker tests.


84-152: LGTM!

TestUsageTracker_UpdateUsage_StreamingOptimization properly validates streaming behavior:

  • Non-final chunks update token usage but not request counts
  • Final chunks increment request counters
  • Correct assertions on rate limit state after each update
plugins/governance/rate_limit_test.go (4)

1-6: LGTM!

Clean package declaration with minimal imports.


312-454: LGTM!

TestMultipleProvidersSeparateRateLimits and TestProviderAndVKRateLimitTogether properly verify configuration without relying on enforcement loops - they validate the in-memory state which is more reliable.


456-663: LGTM!

TestRateLimitInMemorySync and TestRateLimitTokenAndRequestTogether properly test configuration and in-memory state synchronization. The combined limits test appropriately skips if requests fail rather than falsely passing.


665-979: LGTM!

TestRateLimitUsageTrackedInMemory and TestProviderLevelRateLimitUsageTracking properly verify that usage counters are tracked in the in-memory store. They use warnings for non-increased usage rather than false positives, which is appropriate given async update timing.

plugins/governance/rate_limit_enforcement_test.go (3)

1-6: LGTM!

Clean package declaration with minimal imports.


8-130: LGTM!

TestVirtualKeyTokenRateLimitEnforcement is well-designed:

  • Properly fails if rate limit is never hit (line 128-129)
  • Validates rejection is due to rate limit (line 90)
  • Verifies rejection happens after exceeding limit (lines 94-96)

509-517: LGTM!

The enforcement check has been properly addressed. The test now fails if all requests succeed without hitting the rate limit (lines 510-513).

plugins/governance/advanced_scenarios_test.go (4)

1-6: LGTM!

Clean package declaration with minimal imports.


1224-1303: LGTM!

TestTeamDeletionDeletesBudget properly verifies team deletion removes the team from memory. The budget deletion assertion was addressed in a previous review.


1305-1384: LGTM!

TestCustomerDeletionDeletesBudget follows the same solid pattern as the team deletion test.


1386-1588: LGTM!

TestTeamDeletionSetsVKTeamIDToNil and TestCustomerDeletionSetsVKCustomerIDToNil properly verify that deleting a team/customer nullifies the foreign key reference on associated VKs while preserving the VK itself.

plugins/governance/e2e_test.go (10)

18-181: Test logic is well-structured and correctly validates shared team budget enforcement.

The test properly verifies POST-HOC budget enforcement semantics where the request that exceeds the budget is allowed but subsequent requests are blocked. The loop has appropriate bounds and exit conditions.


191-352: Comprehensive test for budget hierarchy enforcement.

The test correctly validates that the most restrictive budget in the hierarchy (provider budget at $0.01) is enforced first, regardless of higher limits at VK, team, and customer levels.


360-495: Good verification that failed requests don't consume budget.

The test properly validates that 4xx/5xx responses don't affect budget consumption while successful requests do. The float comparison tolerance (0.0001) is appropriate for budget amounts.
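
For readers unfamiliar with the pattern, a tolerance-based comparison can be sketched as below. The helper name `approxEqual` is illustrative; only the 0.0001 tolerance is taken from the test:

```go
package main

import (
	"fmt"
	"math"
)

// approxEqual reports whether a and b differ by at most tolerance.
func approxEqual(a, b, tolerance float64) bool {
	return math.Abs(a-b) <= tolerance
}

func main() {
	before := 0.0123
	after := 0.0123 + 1e-9 // tiny drift, e.g. from accumulated float additions
	fmt.Println(before == after, approxEqual(before, after, 0.0001))
}
```

Exact equality on accumulated float costs is brittle, while a tolerance well below the smallest expected request cost still detects real consumption.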


503-640: Complete lifecycle test for VK activation state.

The test properly validates active → inactive → active transitions and verifies request blocking/allowing behavior at each state.


647-755: Rate limit reset boundary test is correct but slow.

The test properly validates rate limit reset behavior. Note that this test takes ~16 seconds due to the wait for reset. Consider if this should be tagged for slow test runs.


763-852: Concurrent request handling test is correctly implemented.

The test properly uses sync.WaitGroup and mutex for synchronization. Success counting without hard failures per goroutine is appropriate for testing concurrent behavior where some requests may fail due to timing.


860-1025: Budget reset state test properly validates lazy reset mechanism.

The test correctly documents and verifies the lazy budget reset behavior where reset happens on the next request after the duration expires. The usage threshold (0.01) appropriately accounts for variable request costs.


1147-1255: VK deletion cascade test properly verifies in-memory store cleanup.

The test correctly validates that deleted VKs are removed from the in-memory store. The soft logging for post-deletion request behavior is appropriate given potential SDK caching.


1354-1363: Provider detection logic relies on string matching.

The model string matching (strings.Contains(model, "openai") || strings.Contains(model, "gpt-4o")) works for current test cases but may need updates if model naming conventions change. This is acceptable for test code but noted for maintainability.


1390-1487: Provider fallback test correctly validates routing when primary provider doesn't support the model.

The test properly configures Anthropic with Claude models only, then requests gpt-4o to verify fallback to OpenAI. The success rate check appropriately handles potential transient failures.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

♻️ Duplicate comments (19)
plugins/governance/team_budget_test.go (1)

66-156: Remove shouldStop flag to ensure budget enforcement is actually tested.

The shouldStop logic (lines 70, 149-155) causes the loop to exit before sending an over-budget request. When consumedBudget >= teamBudget, shouldStop is set, and on the next iteration the loop breaks—but the over-budget request is never sent to verify enforcement.

Apply this diff:

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false
 	vkIndex := 0

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
 		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= teamBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/vk_budget_test.go (1)

46-127: Remove shouldStop flag - same pattern issue as team_budget_test.go.

The shouldStop logic (lines 46, 120-126) causes early loop exit before the budget-exceeded request is sent. This prevents the test from observing the API's budget-enforcement error response.

Apply this diff:

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= vkBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/provider_budget_test.go (1)

63-64: Remove shouldStop flag from both subtests.

Both OpenAI (lines 63, 134-140) and Anthropic (lines 153, 224-230) subtests have the same early-exit issue that prevents observing budget enforcement.

Apply the fix to both subtests by removing:

  • var shouldStop = false declaration
  • if shouldStop { break } check
  • if consumedBudget >= providerBudget { shouldStop = true } assignment

Also applies to: 134-140, 153-154, 224-230

plugins/governance/store_test.go (1)

149-153: Don't ignore NewLocalGovernanceStore error.

The error is discarded with testStore, _ := NewLocalGovernanceStore(...). If store construction fails, the test will proceed with a nil/invalid store and may panic or produce misleading results.

-			testStore, _ := NewLocalGovernanceStore(context.Background(), logger, nil, &configstore.GovernanceConfig{
+			testStore, err := NewLocalGovernanceStore(context.Background(), logger, nil, &configstore.GovernanceConfig{
 				VirtualKeys: []configstoreTables.TableVirtualKey{*testVK},
 				Budgets:     []configstoreTables.TableBudget{*testBudget},
 			})
+			require.NoError(t, err)
plugins/governance/edge_cases_test.go (2)

59-79: Budget hierarchy values contradict the test expectation.

The test comment on line 97 states "Provider budget should be the limiting factor (most restrictive)", but the actual values are:

  • VK budget: $0.01 (line 60) ← actually most restrictive
  • Provider budget: $0.08 (line 76)
  • Team budget: $0.12
  • Customer budget: $0.15

The VK budget will be exhausted first, not the provider budget. Either swap the values or update the expectation.

Option 1 - Make provider budget most restrictive:

-	vkBudget := 0.01
+	vkBudget := 0.15  // Moderate VK budget
 	// ...
 					Budget: &BudgetRequest{
-						MaxLimit:      0.08, // Even tighter provider budget
+						MaxLimit:      0.005, // Most restrictive - provider budget

Option 2 - Update expectation to match VK being most restrictive:

-	// Test: Provider budget should be the limiting factor (most restrictive)
+	// Test: VK budget should be the limiting factor (most restrictive at $0.01)

Also applies to: 97-97


101-101: Remove shouldStop flag - same pattern issue as other budget tests.

The shouldStop logic prevents observing the actual budget enforcement error.

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= 0.08 { // Provider budget
-			shouldStop = true
-		}
 	}

Also applies to: 173-179

plugins/governance/fixtures_test.go (1)

38-66: Format arguments are ignored in MockLogger methods.

The logging methods accept args ...interface{} but discard them, so formatted messages like logger.Error("failed to process %s", item) will only log "failed to process %s" without substitution. This can obscure debugging information in tests.
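
One possible shape for a mock logger that keeps the substituted message is sketched below. `recordingLogger` and its single `Error` method are hypothetical; the real logger interface in this repository may have different method names and signatures.

```go
package main

import (
	"fmt"
	"sync"
)

// recordingLogger is a hypothetical test logger that stores formatted messages.
type recordingLogger struct {
	mu       sync.Mutex
	messages []string
}

// Error applies Printf-style formatting before recording the message,
// so "failed to process %s" is stored with the argument substituted.
func (l *recordingLogger) Error(msg string, args ...interface{}) {
	if len(args) > 0 {
		msg = fmt.Sprintf(msg, args...)
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	l.messages = append(l.messages, "ERROR: "+msg)
}

// Messages returns a copy of everything recorded so far.
func (l *recordingLogger) Messages() []string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return append([]string(nil), l.messages...)
}

func main() {
	logger := &recordingLogger{}
	logger.Error("failed to process %s", "item-1")
	fmt.Println(logger.Messages()[0])
}
```

Skipping `fmt.Sprintf` when no arguments are passed avoids mangling messages that happen to contain a literal `%`.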

plugins/governance/in_memory_sync_test.go (1)

368-398: Potential double-delete on cleanup.

The VK is added to testData at line 369, then manually deleted at lines 391-398. When testData.Cleanup(t) runs via defer, it may attempt to delete the already-deleted VK, potentially causing test noise or failures.

Either remove the VK from testData after manual deletion, or don't add it since manual deletion is the test's purpose.

plugins/governance/tracker_test.go (1)

154-155: Mismatched comment and function name.

The comment says TestUsageTracker_UpdateBudgetHierarchy but the function is named TestUsageTracker_Cleanup.

plugins/governance/config_update_sync_test.go (2)

48-66: Check HTTP status codes before indexing into response bodies.

Several /api/governance/data GETs assume 200 and immediately index into the response body. If the server errors, subsequent assertions will panic and mask the real issue. This pattern appears throughout the file.


54-66: Unsafe type assertions may cause test panics.

Multiple chained type assertions without ok checks can cause panics if the response structure is unexpected. This is the same pattern noted in in_memory_sync_test.go and would benefit from a shared extraction helper.

plugins/governance/rate_limit_test.go (1)

74-79: Tests can pass without ever verifying rate limit enforcement.

If the loop completes without seeing a rate-limit rejection, the test either logs success or skips. This means a broken rate-limit implementation would not be caught. The test should fail if no rejection is observed after exhausting attempts.

 	if successCount > 0 {
-		t.Logf("Made %d successful requests before hitting token limit ✓", successCount)
+		t.Fatalf("Rate limit not enforced: made %d requests without hitting token limit of %d", successCount, tokenLimit)
 	} else {
 		t.Skip("Could not make requests to test token limit")
 	}

This pattern applies to TestVirtualKeyRequestRateLimit, TestProviderConfigTokenRateLimit, and TestProviderConfigRequestRateLimit as well.

plugins/governance/rate_limit_enforcement_test.go (3)

489-489: Fragile string conversion for request numbering.

string(rune('0'+i)) only works correctly for i in range 0-9. Use fmt.Sprintf or strconv.Itoa instead.

-					Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+					Content: fmt.Sprintf("Request %d to test both limits.", i+1),

205-217: Test accepts any 4xx/5xx as valid rate-limit rejection.

The test only requires resp2.StatusCode >= 400 (line 205), but a rate-limit rejection should specifically return 429. If the second request fails with a 500 or any other error, the test incorrectly passes. The error-message validation at lines 209-213 runs only when the status is exactly 429; any other 4xx/5xx status slips through without a failure.

-	if resp2.StatusCode < 400 {
-		t.Fatalf("Second request should be rejected (429) but got status %d", resp2.StatusCode)
-	}
-
-	if resp2.StatusCode == 429 {
-		if !CheckErrorMessage(t, resp2, "request") && !CheckErrorMessage(t, resp2, "rate") {
-			t.Fatalf("Rate limit error message missing expected keywords: %v", resp2.Body)
-		}
+	if resp2.StatusCode != 429 {
+		t.Fatalf("Second request should be rejected with 429 but got status %d: %v", resp2.StatusCode, resp2.Body)
 	}
+
+	if !CheckErrorMessage(t, resp2, "request") && !CheckErrorMessage(t, resp2, "rate") {
+		t.Fatalf("Rate limit error message missing expected keywords: %v", resp2.Body)
+	}

602-611: Test verifies rate limit exists but not that usage was tracked.

The test extracts rate_limit_id but doesn't verify that usage counters (current_token_usage) were incremented after the request. The test name suggests usage tracking verification but the assertion only confirms the rate limit configuration exists.

Consider verifying actual usage:

rateLimitsMap, ok := data["rate_limits"].(map[string]interface{})
if !ok {
    t.Fatalf("rate_limits missing or malformed in response: %v", data)
}
if rateLimitID != "" {
    rateLimitData, ok := rateLimitsMap[rateLimitID].(map[string]interface{})
    if !ok {
        t.Fatalf("Rate limit %s not found in memory", rateLimitID)
    }
    // JSON numbers decode as float64
    tokenUsage, _ := rateLimitData["current_token_usage"].(float64)
    if tokenUsage < float64(tokensUsed) {
        t.Fatalf("Token usage not tracked: expected >= %d, got %.0f", tokensUsed, tokenUsage)
    }
    t.Logf("Rate limit %s tracked %d tokens ✓", rateLimitID, int(tokenUsage))
}
plugins/governance/advanced_scenarios_test.go (2)

14-171: Consider gating heavy exhaustion tests behind a build tag.

This test (and similar ones below) runs up to 150 requests in a loop with t.Parallel(). When multiple such tests run concurrently, this can cause CI timeouts or flakiness. Consider:

  1. Adding //go:build integration build tag, or
  2. Checking an environment variable like os.Getenv("INTEGRATION_TESTS") != "1" to skip in regular CI runs

1209-1214: Inconsistent cascade deletion assertion for rate limits vs budgets.

Budgets are correctly asserted to be cascade-deleted (Lines 1204-1207), but rate limits only log a note if they still exist (Lines 1212-1214). This is inconsistent - if cascade deletion should remove budgets, it should likely remove rate limits too.

 	// Rate limits should be gone
 	_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
 	_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
-	if vkRateLimitStillExists || pcRateLimitStillExists {
-		t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
-	}
+	if vkRateLimitStillExists || pcRateLimitStillExists {
+		t.Fatalf("Rate limits should be cascade-deleted: VK rate limit exists=%v, PC rate limit exists=%v",
+			vkRateLimitStillExists, pcRateLimitStillExists)
+	}

If orphaning rate limits is intentional (different from budgets), add a comment explaining the design decision.

plugins/governance/test_utils.go (2)

247-268: ExtractIDFromResponse ignores the keyPath parameter.

The keyPath parameter is never used - the function only checks hardcoded wrappers (virtual_key, team, customer). This is a misleading API.

Either implement keyPath traversal or remove the parameter:

-func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+func ExtractIDFromResponse(t *testing.T, resp *APIResponse, _ string) string {
+	t.Helper()
 	if resp.StatusCode >= 400 {
 		t.Fatalf("Request failed with status %d: %v", resp.StatusCode, resp.Body)
 	}

Or implement actual keyPath support for flexibility.
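
If keyPath support is the preferred route, a dot-separated path is one simple convention. The sketch below is only an assumption about what such a traversal might look like; `lookupString` and the `"virtual_key.id"` path format are hypothetical and not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupString walks a dot-separated keyPath through nested JSON objects
// and returns the string stored at the final key, if there is one.
func lookupString(body map[string]interface{}, keyPath string) (string, bool) {
	keys := strings.Split(keyPath, ".")
	current := body
	// Descend through every key except the last; each must be an object.
	for _, key := range keys[:len(keys)-1] {
		next, ok := current[key].(map[string]interface{})
		if !ok {
			return "", false
		}
		current = next
	}
	id, ok := current[keys[len(keys)-1]].(string)
	return id, ok
}

func main() {
	body := map[string]interface{}{
		"virtual_key": map[string]interface{}{"id": "vk-123"},
	}
	id, ok := lookupString(body, "virtual_key.id")
	fmt.Println(id, ok)
}
```

Returning a boolean instead of calling `t.Fatalf` keeps the traversal reusable; the test helper can decide how to report a missing key.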


91-92: Add HTTP client timeout and t.Helper() to prevent test hangs.

The bare http.Client{} has no timeout, which can cause tests to hang indefinitely if the server is unresponsive. Other HTTP clients in the codebase use timeouts.

 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-	client := &http.Client{}
+	t.Helper()
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}
🧹 Nitpick comments (5)
plugins/governance/in_memory_sync_test.go (2)

51-64: Consider defensive type assertions to avoid test panics.

Multiple chained type assertions without ok checks (e.g., lines 51-52, 60-61) can cause panics if the response structure is unexpected due to server errors. A shared extraction helper would make tests more robust:

func extractMapFromResponse(t *testing.T, resp *APIResponse, keys ...string) map[string]interface{} {
    t.Helper()
    current := resp.Body
    for _, key := range keys {
        val, ok := current[key].(map[string]interface{})
        if !ok {
            t.Fatalf("Failed to extract %q from response: %v", key, current)
        }
        current = val
    }
    return current
}

This pattern appears throughout this file and other test files in the PR.


520-524: Clarify the flaky assertion with a TODO or follow-up.

Logging a warning instead of failing at lines 521-524 acknowledges that SQLite locking can skew the counts when tests run in parallel, but it also means the assertion can never fail. Consider adding a TODO comment to address this properly, or use a retry/polling mechanism to make the assertion reliable.

 	if int(vksFromEndpoint) != vkCount {
-		// Can fail because sqlite db might get locked because of all parallel tests
-		t.Logf("[WARN]VK count mismatch between /data endpoint and /virtual-keys endpoint: %d vs %d (this can happen because of parallel tests)", vkCount, int(vksFromEndpoint))
+		// TODO: Consider using polling/retry or a non-parallel test mode to make this assertion reliable
+		// Can fail because SQLite DB might get locked due to parallel tests
+		t.Logf("[WARN] VK count mismatch between /data endpoint and /virtual-keys endpoint: %d vs %d (can happen due to parallel tests)", vkCount, int(vksFromEndpoint))
 	}
plugins/governance/tracker_test.go (1)

76-82: No-op assertion provides no test value.

assert.True(t, true) at line 81 always passes and doesn't verify anything meaningful. The comment "Just verify it doesn't crash" is the actual intent, but this assertion is unnecessary.

 	// Should not panic or error
 	tracker.UpdateUsage(context.Background(), update)

 	time.Sleep(100 * time.Millisecond)
-	// Just verify it doesn't crash
-	assert.True(t, true)
+	// If we reach here without panic, the test passes
plugins/governance/rate_limit_test.go (1)

312-378: Configuration-only test - consider adding enforcement verification.

This test verifies that multiple provider configs with separate rate limits can be created and stored, but doesn't verify that the limits are actually enforced independently. Consider adding a follow-up test that exhausts one provider's limit and verifies the other provider remains usable.

plugins/governance/test_utils.go (1)

144-152: rand.Seed is deprecated since Go 1.20.

The global rand functions are automatically seeded since Go 1.20. Remove the deprecated rand.Seed call.

 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06fbe4a and 4325d29.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go (1 hunks)
  • plugins/governance/changelog.md (1 hunks)
  • plugins/governance/config_update_sync_test.go (1 hunks)
  • plugins/governance/customer_budget_test.go (1 hunks)
  • plugins/governance/e2e_test.go (1 hunks)
  • plugins/governance/edge_cases_test.go (1 hunks)
  • plugins/governance/fixtures_test.go (1 hunks)
  • plugins/governance/go.mod (3 hunks)
  • plugins/governance/in_memory_sync_test.go (1 hunks)
  • plugins/governance/provider_budget_test.go (1 hunks)
  • plugins/governance/rate_limit_enforcement_test.go (1 hunks)
  • plugins/governance/rate_limit_test.go (1 hunks)
  • plugins/governance/resolver_test.go (1 hunks)
  • plugins/governance/store_test.go (1 hunks)
  • plugins/governance/team_budget_test.go (1 hunks)
  • plugins/governance/test_utils.go (1 hunks)
  • plugins/governance/tracker_test.go (1 hunks)
  • plugins/governance/usage_tracking_test.go (1 hunks)
  • plugins/governance/vk_budget_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • plugins/governance/go.mod
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/changelog.md
  • plugins/governance/e2e_test.go
  • plugins/governance/resolver_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/store_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/test_utils.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/store_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/test_utils.go
🧬 Code graph analysis (7)
plugins/governance/in_memory_sync_test.go (2)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
plugins/governance/rate_limit_test.go (3)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
plugins/governance/tracker_test.go (6)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (64-83)
plugins/governance/resolver.go (1)
  • NewBudgetResolver (70-75)
plugins/governance/tracker.go (2)
  • NewUsageTracker (52-67)
  • UsageUpdate (17-30)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
core/utils.go (1)
  • IsFinalChunk (198-213)
plugins/governance/edge_cases_test.go (1)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateTeamRequest (183-187)
  • CreateVirtualKeyRequest (155-164)
  • ProviderConfigRequest (167-174)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/fixtures_test.go (3)
core/schemas/logger.go (2)
  • LogLevel (6-6)
  • LoggerOutputType (17-17)
ui/lib/types/governance.ts (3)
  • Budget (5-11)
  • RateLimit (13-25)
  • VirtualKey (61-80)
plugins/governance/resolver.go (1)
  • Decision (15-15)
plugins/governance/config_update_sync_test.go (4)
plugins/governance/test_utils.go (16)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (15-15)
  • Content (975-983)
plugins/governance/store_test.go (4)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (64-83)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (705-712)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: Graphite / mergeability_check (19 identical entries)
🔇 Additional comments (25)
plugins/governance/team_budget_test.go (1)

8-65: Test setup and resource management look good.

The test properly:

  • Uses t.Parallel() for concurrent execution
  • Sets up cleanup with defer testData.Cleanup(t)
  • Creates team with budget and 2 VKs with higher individual budgets
  • Validates response status codes and extracts IDs
plugins/governance/vk_budget_test.go (1)

8-45: Test setup is well-structured.

Proper resource creation with cleanup, status validation, and ID extraction.

plugins/governance/provider_budget_test.go (1)

8-56: Test structure and provider-specific budget setup look good.

The test properly creates a VK with distinct provider budgets and runs subtests for each provider. The subtests correctly share the same VK to test provider-level budget isolation.

plugins/governance/store_test.go (4)

17-101: VK retrieval and concurrent read tests are well-designed.

The tests properly cover:

  • Active/inactive VK retrieval
  • Non-existent VK handling
  • Lock-free concurrent reads with 100 goroutines × 100 iterations

Good use of atomic counters for thread-safe counting.


165-204: Hierarchy validation test correctly exercises multi-level budgets.

The test builds a proper VK → Team → Customer hierarchy and validates that:

  1. All budgets under limit pass
  2. VK budget exceeding limit fails

The UpdateVirtualKeyInMemory call properly updates the in-memory state for the second assertion.


206-283: Rate limit and reset tests are thorough.

The tests correctly:

  • Update token/request usage atomically
  • Verify expired rate limits are reset to zero
  • Use appropriate timestamps for expired scenarios

285-338: Budget reset and retrieval tests look good.

Proper coverage for:

  • Expired budget reset (48h past)
  • Multiple budget retrieval by ID
plugins/governance/edge_cases_test.go (1)

8-58: Hierarchy setup is well-structured.

The test properly creates a Customer → Team → VK hierarchy with cleanup and ID tracking. Once the budget values are corrected, this will be a good test of hierarchical budget enforcement.

plugins/governance/fixtures_test.go (2)

70-193: Well-structured test data builders.

The builder functions provide good test scaffolding with consistent patterns, proper nil handling for optional fields, and clear composition (e.g., buildVirtualKeyWithBudget delegates to buildVirtualKey). This promotes DRY test code.


197-221: LGTM - assertion helpers are well-implemented.

The helpers correctly use t.Helper() for proper test failure line reporting. The assertion functions provide clear, focused checks for common test patterns.

plugins/governance/in_memory_sync_test.go (1)

9-126: LGTM - VK update sync test is comprehensive.

The test properly validates the full lifecycle: create VK with budget, verify in-memory state, update budget, and verify synchronization. The use of time.Sleep(500ms) for async updates is acceptable for integration tests.

plugins/governance/tracker_test.go (1)

84-152: LGTM - Streaming optimization test is well-structured.

The test properly validates streaming behavior: tokens accumulate on non-final chunks, and request counts increment only on final chunks. Clear assertions with meaningful checks.

plugins/governance/config_update_sync_test.go (2)

166-213: LGTM - Higher limit update branch now has proper assertions.

The test now verifies that the new higher limit is correctly reflected in memory (lines 201-205). The note at lines 207-210 appropriately acknowledges that usage remains 0 since it was reset by the previous lower-limit update.


545-709: LGTM - Team budget update sync test provides good hierarchical coverage.

The test properly creates a team with budget, then a VK under that team, consumes budget through the VK, updates the team budget to a lower value, and verifies the reset behavior. This validates the hierarchical budget propagation.

plugins/governance/rate_limit_test.go (2)

456-590: LGTM - Rate limit in-memory sync test is well-implemented.

The test properly verifies that rate limit updates sync to in-memory store, including checking that the rate limit ID persists, the max limit updates, and the reset duration is maintained. Good defensive checks with ok patterns at lines 558-580.


665-813: LGTM - Usage tracking test provides valuable verification.

The test validates that both token and request usage counters increase after making requests. The assertions properly check the state changes and verify limits are preserved.

plugins/governance/rate_limit_enforcement_test.go (2)

1-6: LGTM! Well-structured test file with proper imports.

The test file correctly imports testing and time packages needed for the rate limit enforcement tests.


11-130: LGTM! Comprehensive token rate limit enforcement test.

The test properly validates the post-hoc enforcement model where the request exceeding the limit is allowed but subsequent requests are blocked. Good assertions verify:

  • Rate limit rejection happens only after token limit is exceeded
  • Test fails if all 20 requests succeed without hitting limit
plugins/governance/advanced_scenarios_test.go (3)

1-6: LGTM! Well-structured test file with proper imports.


179-1075: Same pattern applies to all budget exhaustion tests.

All these tests follow the same 150-request loop pattern. The build tag recommendation applies to all of them.


1390-1588: LGTM! Tests properly verify entity linkage nullification.

The tests correctly verify that deleting a team/customer sets the associated VK's team_id/customer_id to nil while keeping the VK itself intact.

plugins/governance/test_utils.go (4)

15-73: LGTM! Well-defined cost model for test calculations.

The ModelCost struct and TestModels map provide clear pricing data for various models. CalculateCost correctly computes costs based on input/output tokens.


154-245: LGTM! Well-structured request/response types.

The types are properly defined with appropriate JSON tags and pointer types for optional fields.


270-297: LGTM! Error checking utilities with proper case-insensitive matching.

The contains helper now correctly implements case-insensitive comparison using strings.ToLower.


299-367: LGTM! Well-designed test data management with proper cleanup.

The GlobalTestData type provides clean resource tracking with appropriate cleanup order (VKs → Teams → Customers) and 404 tolerance for already-deleted resources.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-09-chore_governance_tests_added branch from 4325d29 to 4830360 Compare December 15, 2025 06:39

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (47)
plugins/governance/go.mod (1)

24-25: Duplicate dependency entry still present.

The github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.16 dependency appears twice. Run go mod tidy to clean up the module file.

cd plugins/governance && go mod tidy
plugins/governance/test_utils.go (5)

91-92: Add t.Helper() and HTTP client timeout to prevent test hangs.

MakeRequest is missing t.Helper() (failures report wrong line numbers) and creates a bare http.Client with no timeout, which can cause tests to hang indefinitely.

🔎 Suggested fix
 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
+	t.Helper()
-	client := &http.Client{}
+	client := &http.Client{Timeout: 30 * time.Second}
 	url := fmt.Sprintf("http://localhost:8080%s", req.Path)

144-145: Remove deprecated rand.Seed call.

rand.Seed has been deprecated since Go 1.20, and the global random source is now seeded automatically. Since go.mod specifies Go 1.25.5, this call should be removed. Additionally, reseeding on every invocation is harmful in parallel tests: two calls that land on the same UnixNano timestamp restart the generator from the same state and can return identical IDs.

🔎 Suggested fix
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"

299-328: GlobalTestData is not thread-safe for parallel tests.

Since tests use t.Parallel(), multiple goroutines may call AddVirtualKey, AddTeam, or AddCustomer concurrently. Appending to slices without synchronization is a data race.

🔎 Suggested fix
+import "sync"

 type GlobalTestData struct {
+	mu          sync.Mutex
 	VirtualKeys []string
 	Teams       []string
 	Customers   []string
 }

 func (g *GlobalTestData) AddVirtualKey(id string) {
+	g.mu.Lock()
+	defer g.mu.Unlock()
 	g.VirtualKeys = append(g.VirtualKeys, id)
 }

 func (g *GlobalTestData) AddTeam(id string) {
+	g.mu.Lock()
+	defer g.mu.Unlock()
 	g.Teams = append(g.Teams, id)
 }

 func (g *GlobalTestData) AddCustomer(id string) {
+	g.mu.Lock()
+	defer g.mu.Unlock()
 	g.Customers = append(g.Customers, id)
 }

330-331: Cleanup should also be protected by mutex and call t.Helper().

If Cleanup runs while another goroutine is adding IDs, there's a race.

🔎 Suggested fix
 func (g *GlobalTestData) Cleanup(t *testing.T) {
+	t.Helper()
+	g.mu.Lock()
+	defer g.mu.Unlock()
+
 	// Delete virtual keys

247-268: ExtractIDFromResponse ignores keyPath parameter.

The function accepts keyPath but never uses it, only checking hardcoded wrappers. This creates a misleading API.

plugins/governance/advanced_scenarios_test.go (1)

1230-1235: Inconsistent cascade deletion assertion for rate limits vs budgets.

Budgets now correctly use t.Fatalf (lines 1225-1228), but rate limits still use t.Logf (lines 1233-1234), which is inconsistent. Based on learnings, budgets and rate limits have 1:1 relationships with parent entities, so rate limits should also be cascade-deleted.

🔎 Suggested fix
 	// Rate limits should be gone
 	_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
 	_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
-	if vkRateLimitStillExists || pcRateLimitStillExists {
-		t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
-	}
+	if vkRateLimitStillExists || pcRateLimitStillExists {
+		t.Fatalf("Rate limits should be cascade-deleted: VK rate limit exists=%v, PC rate limit exists=%v",
+			vkRateLimitStillExists, pcRateLimitStillExists)
+	}
plugins/governance/edge_cases_test.go (2)

59-79: Budget hierarchy doesn't match test expectations.

The test comment states "Provider budget should be the limiting factor (most restrictive)", but:

  • VK budget: $0.01 (line 60)
  • Provider budget: $0.08 (line 76)

The VK-level budget ($0.01) is actually more restrictive than the provider-level budget ($0.08). The VK budget will be exhausted first, not the provider budget. Either swap the values or update the test expectation.

🔎 Suggested fix - make provider budget most restrictive
-	vkBudget := 0.01
+	vkBudget := 0.15  // Higher than provider budget
 	createVKResp := MakeRequest(t, APIRequest{
 		// ...
 		Budget: &BudgetRequest{
 			MaxLimit:      vkBudget,
 			// ...
 		},
 		ProviderConfigs: []ProviderConfigRequest{
 			{
 				Provider: "openai",
 				Weight:   1.0,
 				Budget: &BudgetRequest{
-					MaxLimit:      0.08, // Even tighter provider budget
+					MaxLimit:      0.005, // Most restrictive
 					ResetDuration: "1h",
 				},
 			},
 		},

101-101: Remove shouldStop flag - causes early loop exit before rejection is observed.

The shouldStop pattern (lines 101, 173-179) causes the loop to exit right after crossing the budget threshold, potentially skipping the request that would trigger the budget-exceeded rejection response. This undermines the test's ability to validate enforcement.

🔎 Suggested fix
 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= 0.08 { // Provider budget
-			shouldStop = true
-		}
 	}

Also applies to: 173-179
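
As a rough illustration of the loop shape without the client-side early exit, the sketch below lets the server's rejection (or the request cap) end the loop. `fakeGateway` is a hypothetical in-process stand-in for the governance API, assumed to reject once recorded spend reaches the budget; it is not the real enforcement logic.

```go
package main

import "fmt"

// fakeGateway is a hypothetical stand-in for the governance API: it accepts
// requests until recorded spend reaches the budget, then rejects them.
type fakeGateway struct {
	budget float64
	spent  float64
}

// send returns the cost of an accepted request, or ok=false when rejected.
func (g *fakeGateway) send(cost float64) (charged float64, ok bool) {
	if g.spent >= g.budget {
		return 0, false
	}
	g.spent += cost
	return cost, true
}

// exhaustBudget sends up to maxRequests requests and reports the 1-based
// request number at which the first rejection was seen (0 if none was seen).
// There is no client-side stop flag: only a rejection or the cap ends the loop.
func exhaustBudget(g *fakeGateway, cost float64, maxRequests int) (rejectedAt int, consumed float64) {
	for n := 1; n <= maxRequests; n++ {
		charged, ok := g.send(cost)
		if !ok {
			return n, consumed
		}
		consumed += charged
	}
	return 0, consumed
}

func main() {
	g := &fakeGateway{budget: 0.01}
	rejectedAt, consumed := exhaustBudget(g, 0.004, 50)
	if rejectedAt == 0 {
		fmt.Println("budget enforcement was never observed")
		return
	}
	fmt.Printf("rejected at request %d after consuming $%.3f\n", rejectedAt, consumed)
}
```

A test built this way can fail explicitly when `rejectedAt` is 0, instead of passing without ever seeing a rejection.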

plugins/governance/vk_budget_test.go (1)

46-127: Remove shouldStop flag to ensure budget enforcement is actually tested.

The shouldStop logic causes the loop to exit before sending a request that would trigger the server-side budget rejection. When consumedBudget >= vkBudget, shouldStop is set, and on the next iteration the loop breaks—but the over-budget request is never sent to verify enforcement.

Remove the client-side early exit and let the API's enforcement (or the 50-request cap) terminate the loop:

Proposed fix
 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= vkBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/team_budget_test.go (1)

70-156: Remove shouldStop flag to ensure team budget enforcement is actually tested.

Same issue as in vk_budget_test.go: the shouldStop flag causes the loop to exit before sending an over-budget request, so the test never verifies that the API actually rejects requests when the team budget is exceeded.

Proposed fix
 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false
 	vkIndex := 0

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
 		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= teamBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/provider_budget_test.go (2)

63-141: Remove shouldStop flag in OpenAI subtest.

Same issue as other budget tests: the shouldStop flag causes the loop to exit before sending an over-budget request. Remove it to actually verify provider budget enforcement.

Proposed fix
 		consumedBudget := 0.0
 		requestNum := 1
 		var lastSuccessfulCost float64
-		var shouldStop = false

 		for requestNum <= 50 {
 			// ... request logic ...

 			requestNum++
-
-			if shouldStop {
-				break
-			}
-
-			if consumedBudget >= providerBudget {
-				shouldStop = true
-			}
 		}

153-231: Remove shouldStop flag in Anthropic subtest.

Same issue as the OpenAI subtest and other budget tests.

plugins/governance/resolver_test.go (1)

332-346: Direct manipulation of store.budgets bypasses encapsulation.

Line 338 directly accesses store.budgets.Store(), coupling the test to implementation details. Consider adding a test helper method to the store for setting budget usage in tests.

Proposed approach
-	// Get the governance data to update the budget directly
-	governanceData := store.GetGovernanceData()
-	vkBudgetToUpdate := governanceData.Budgets["vk-budget"]
-	if vkBudgetToUpdate != nil {
-		vkBudgetToUpdate.CurrentUsage = 100.0
-		store.budgets.Store("vk-budget", vkBudgetToUpdate)
-	}
+	// Use a test helper to set budget usage
+	err = store.SetBudgetCurrentUsageForTest("vk-budget", 100.0)
+	require.NoError(t, err)
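
For illustration only, such a helper might look like the sketch below. The `budget` and `store` types here are simplified, hypothetical stand-ins (the real store's fields and types are not shown in this review); the only assumption carried over is that budgets live in a `sync.Map` keyed by ID.

```go
package main

import (
	"fmt"
	"sync"
)

// budget is a simplified, hypothetical stand-in for the real budget record.
type budget struct {
	ID           string
	CurrentUsage float64
}

// store is a hypothetical stand-in that keeps budgets in a sync.Map.
type store struct {
	budgets sync.Map
}

// SetBudgetCurrentUsageForTest overwrites the usage of an existing budget.
// It stores a copy so callers holding the old pointer are not mutated.
func (s *store) SetBudgetCurrentUsageForTest(id string, usage float64) error {
	v, ok := s.budgets.Load(id)
	if !ok {
		return fmt.Errorf("budget %q not found", id)
	}
	b, ok := v.(*budget)
	if !ok {
		return fmt.Errorf("budget %q has unexpected type %T", id, v)
	}
	clone := *b
	clone.CurrentUsage = usage
	s.budgets.Store(id, &clone)
	return nil
}

func main() {
	s := &store{}
	s.budgets.Store("vk-budget", &budget{ID: "vk-budget"})

	if err := s.SetBudgetCurrentUsageForTest("vk-budget", 100.0); err != nil {
		panic(err)
	}
	v, _ := s.budgets.Load("vk-budget")
	fmt.Println(v.(*budget).CurrentUsage)
}
```

Keeping the helper on the store means tests no longer need to know how budgets are held internally.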
plugins/governance/usage_tracking_test.go (3)

112-127: Rate limit reset test lacks assertion on actual reset.

The test waits 35 seconds for the rate limit to reset but only logs that reset "should have occurred" without verifying the values. Consider asserting that token_current_usage actually decreased to ~0.

Proposed fix
// After getting getDataResp3, verify that the reset actually happened
getRateLimitsResp := MakeRequest(t, APIRequest{
    Method: "GET",
    Path:   "/api/governance/rate-limits?from_memory=true",
})
rateLimitsMap, ok := getRateLimitsResp.Body["rate_limits"].(map[string]interface{})
if !ok {
    t.Fatalf("Response missing rate_limits object")
}
rl, ok := rateLimitsMap[rateLimitID].(map[string]interface{})
if !ok {
    // Fail instead of silently skipping the check when the entry is absent
    t.Fatalf("Rate limit %s not found in memory", rateLimitID)
}
if tokenUsage, _ := rl["token_current_usage"].(float64); tokenUsage > 0 {
    t.Fatalf("Rate limit not reset: token_current_usage is %.0f (expected 0)", tokenUsage)
}
t.Logf("Rate limit correctly reset to 0 ✓")

399-399: Fragile string conversion for request numbering.

string(rune('0'+i)) only produces correct digit characters for i in range 0-9. Use strconv.Itoa(i) instead.

-					Content: "Test request " + string(rune('0'+i)) + " for reset ticker test.",
+					Content: "Test request " + strconv.Itoa(i) + " for reset ticker test.",

Add "strconv" to the imports.
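
A minimal sketch comparing the two conversions; `runeLabel` and `itoaLabel` are hypothetical names used only for this comparison.

```go
package main

import (
	"fmt"
	"strconv"
)

// runeLabel reproduces the fragile conversion: it maps i to the single
// character whose code point is '0'+i.
func runeLabel(i int) string {
	return string(rune('0' + i))
}

// itoaLabel formats i as a base-10 string.
func itoaLabel(i int) string {
	return strconv.Itoa(i)
}

func main() {
	for _, i := range []int{3, 9, 10, 12} {
		fmt.Printf("i=%d rune=%q itoa=%q\n", i, runeLabel(i), itoaLabel(i))
	}
}
```

For values of 10 and above the rune-based version yields punctuation characters that follow '9' in ASCII, while `strconv.Itoa` keeps producing the decimal representation.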


522-522: Same fragile string conversion issue.

-					Content: "Persistence test request " + string(rune('0'+i)) + ".",
+					Content: "Persistence test request " + strconv.Itoa(i) + ".",
plugins/governance/in_memory_sync_test.go (6)

99-104: Missing status code check before accessing budgets response body.

This issue was previously flagged. Add a status check before accessing getBudgetsResp2.Body["budgets"] to prevent panics on non-200 responses.


387-392: Missing status code check before accessing response body.

Previously flagged - add status check for getDataResp before accessing the body.


416-421: Missing status code check after VK deletion.

Previously flagged - add status check for getDataResp2 before accessing the body.


380-381: Potential double-delete on cleanup.

Previously flagged - the VK is added to testData at line 381, then manually deleted at lines 402-409. When testData.Cleanup(t) runs via defer, it may attempt to delete the already-deleted VK.
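
One way to avoid the second delete is to stop tracking an entity once the test removes it by hand. The sketch below assumes a tracker that keeps VK IDs in a slice; the `testData` type, `AddVirtualKey`, and `RemoveVirtualKey` here are hypothetical and are not the helpers from test_utils.go.

```go
package main

import "fmt"

// testData is a hypothetical stand-in for the suite's cleanup tracker.
type testData struct {
	vkIDs []string
}

// AddVirtualKey registers a VK for deletion during Cleanup.
func (d *testData) AddVirtualKey(id string) {
	d.vkIDs = append(d.vkIDs, id)
}

// RemoveVirtualKey stops tracking id, for use after a manual delete.
func (d *testData) RemoveVirtualKey(id string) {
	kept := d.vkIDs[:0]
	for _, v := range d.vkIDs {
		if v != id {
			kept = append(kept, v)
		}
	}
	d.vkIDs = kept
}

// Cleanup calls del once for every tracked ID and returns the number of calls.
func (d *testData) Cleanup(del func(id string)) int {
	n := 0
	for _, id := range d.vkIDs {
		del(id)
		n++
	}
	d.vkIDs = nil
	return n
}

func main() {
	d := &testData{}
	d.AddVirtualKey("vk-1")
	d.AddVirtualKey("vk-2")

	// The test deletes vk-1 itself, then tells the tracker about it.
	d.RemoveVirtualKey("vk-1")

	n := d.Cleanup(func(id string) { fmt.Println("deleting", id) })
	fmt.Println("cleanup deletes:", n)
}
```

An alternative with the same effect is to have cleanup treat a 404 on delete as success.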


213-218: Missing status code check for budgets endpoint.

Same pattern as in TestInMemorySyncVirtualKeyUpdate - the code accesses getBudgetsResp2.Body["budgets"] without verifying the response status first.

 	getBudgetsResp2 := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/budgets?from_memory=true",
 	})
+	if getBudgetsResp2.StatusCode != 200 {
+		t.Fatalf("Failed to get budgets: status %d, body: %v", getBudgetsResp2.StatusCode, getBudgetsResp2.Body)
+	}
 
 	budgetsMap2 := getBudgetsResp2.Body["budgets"].(map[string]interface{})

325-330: Missing status code check for budgets endpoint.

Same pattern - add status verification before accessing the response body.

 	getBudgetsResp2 := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/budgets?from_memory=true",
 	})
+	if getBudgetsResp2.StatusCode != 200 {
+		t.Fatalf("Failed to get budgets: status %d, body: %v", getBudgetsResp2.StatusCode, getBudgetsResp2.Body)
+	}
 
 	budgetsMap2 := getBudgetsResp2.Body["budgets"].(map[string]interface{})
plugins/governance/store_test.go (5)

284-286: Unchecked exists return from GetVirtualKey.

The second return value is discarded. Add verification to prevent nil pointer access if the VK doesn't exist.

-	updatedVK, _ := store.GetVirtualKey("sk-bf-test")
+	updatedVK, exists := store.GetVirtualKey("sk-bf-test")
+	require.True(t, exists, "VK should exist after reset")
 	require.NotNil(t, updatedVK)

319-321: Unchecked exists return from GetVirtualKey.

Same pattern - add exists check for consistency and safety.

-	updatedVK, _ := store.GetVirtualKey("sk-bf-test")
+	updatedVK, exists := store.GetVirtualKey("sk-bf-test")
+	require.True(t, exists, "VK should exist after budget reset")
 	require.NotNil(t, updatedVK)

116-116: Unchecked exists return from GetVirtualKey.

Previously flagged - verify the exists return before using vk.


192-192: Unchecked exists return from GetVirtualKey.

Previously flagged (marked as addressed) but still present in code - verify the exists return.


149-154: Ignoring error and exists returns from store operations.

Both NewLocalGovernanceStore and GetVirtualKey return values that are being discarded. If store construction fails or the VK doesn't exist, the test will proceed with invalid state and produce misleading results.

Suggested fix
-			testStore, _ := NewLocalGovernanceStore(context.Background(), logger, nil, &configstore.GovernanceConfig{
+			testStore, err := NewLocalGovernanceStore(context.Background(), logger, nil, &configstore.GovernanceConfig{
 				VirtualKeys: []configstoreTables.TableVirtualKey{*testVK},
 				Budgets:     []configstoreTables.TableBudget{*testBudget},
 			})
+			require.NoError(t, err)
 
-			testVK, _ = testStore.GetVirtualKey("sk-bf-test")
+			testVK, exists := testStore.GetVirtualKey("sk-bf-test")
+			require.True(t, exists, "VK should exist")
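-			err := testStore.CheckBudget(context.Background(), testVK, &EvaluationRequest{Provider: schemas.OpenAI}, nil)
+			err = testStore.CheckBudget(context.Background(), testVK, &EvaluationRequest{Provider: schemas.OpenAI}, nil)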
plugins/governance/customer_budget_test.go (2)

67-156: shouldStop early-exit pattern may prevent budget enforcement verification.

Previously flagged - the shouldStop flag (line 70) causes the loop to exit (lines 149-155) before sending the request that would trigger server-side budget rejection. The test may pass without actually observing the enforcement error.


242-331: Same shouldStop early-exit issue in second test function.

Previously flagged - TestCustomerBudgetExceededWithMultipleTeams has the same early-exit problem at lines 245, 324-330.

plugins/governance/config_update_sync_test.go (2)

54-66: Unsafe chained type assertions may cause test panics.

Previously flagged - multiple chained type assertions without ok checks can cause panics if response structure is unexpected. This pattern repeats throughout the file.


48-53: Missing HTTP status code check before indexing into response body.

Previously flagged - several GET endpoints assume 200 status; if server errors, subsequent assertions will panic.

plugins/governance/rate_limit_enforcement_test.go (4)

67-125: shouldStop early-exit pattern still present.

Previously flagged as addressed, but the pattern remains in TestVirtualKeyTokenRateLimitEnforcement. The shouldStop flag (line 69) and early break (lines 118-124) could cause the test to exit before verifying rate limit enforcement if the flag is set on the same iteration that exceeds the limit.


278-336: Same shouldStop early-exit pattern in provider token test.

TestProviderConfigTokenRateLimitEnforcement has the same pattern at lines 280, 329-335.


489-491: Fragile string conversion for request numbering.

Previously flagged - string(rune('0'+i)) only works correctly for i in range 0-9. For i >= 10, this produces incorrect characters.

-					Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+					Content: fmt.Sprintf("Request %d to test both limits.", i+1),

603-614: Incomplete verification of usage tracking.

Previously flagged - the test extracts rate_limit_id but doesn't verify that usage counters were actually incremented after the request. The test name suggests tracking verification, but the assertion only confirms the rate limit exists.

Consider fetching the rate limit data and asserting token_current_usage >= tokensUsed.
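
A sketch of such an assertion, written against a decoded JSON body. The field name `token_current_usage` comes from this review; the `checkTokenUsage` helper, the `rl-1` ID, and the sample payload are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checkTokenUsage verifies that the rate limit with the given ID reports a
// token_current_usage of at least minTokens. JSON numbers decoded into
// interface{} arrive as float64, hence the float64 assertion.
func checkTokenUsage(rateLimits map[string]interface{}, id string, minTokens float64) error {
	rl, ok := rateLimits[id].(map[string]interface{})
	if !ok {
		return fmt.Errorf("rate limit %q not found", id)
	}
	usage, ok := rl["token_current_usage"].(float64)
	if !ok {
		return fmt.Errorf("rate limit %q has no numeric token_current_usage", id)
	}
	if usage < minTokens {
		return fmt.Errorf("rate limit %q: token_current_usage=%.0f, want >= %.0f", id, usage, minTokens)
	}
	return nil
}

func main() {
	// Hypothetical response fragment.
	raw := `{"rl-1": {"token_current_usage": 42}}`
	var rateLimits map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &rateLimits); err != nil {
		panic(err)
	}
	fmt.Println(checkTokenUsage(rateLimits, "rl-1", 30))
	fmt.Println(checkTokenUsage(rateLimits, "rl-1", 50))
}
```

In a test, a non-nil error from this check would be passed to t.Fatalf so that missing or stale counters fail the test.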

plugins/governance/e2e_test.go (7)

100-148: shouldStop flag can cause early loop exit before budget rejection is observed.

The shouldStop flag (lines 100, 141-147) causes the loop to exit after setting shouldStop = true when consumedBudget >= teamBudget, but before the next request that would actually be rejected. This means the test may exit without ever observing the budget enforcement rejection.

🔎 Proposed fix
 	consumedBudget := 0.0
 	requestNum := 1
-	shouldStop := false

 	for requestNum <= 150 { // Need many requests since each costs ~$0.0001
 		resp := MakeRequest(t, APIRequest{
 			// ... request logic ...
 		})

 		// ... success/error handling ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= teamBudget {
-			shouldStop = true
-		}
 	}

70-71: Unchecked type assertions may panic on unexpected response structure.

Direct type assertions like vk1["value"].(string) will panic if the key is missing or value is not a string, causing unclear test failures.

🔎 Proposed fix
-	vk1 := createVK1Resp.Body["virtual_key"].(map[string]interface{})
-	vk1Value := vk1["value"].(string)
+	vk1, ok := createVK1Resp.Body["virtual_key"].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Response missing virtual_key object")
+	}
+	vk1Value, ok := vk1["value"].(string)
+	if !ok {
+		t.Fatalf("VK missing value field")
+	}

Also applies to: 88-92
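
Because this pattern repeats across the suite, a pair of small accessors could centralize the comma-ok checks. The sketch below returns errors rather than calling t.Fatalf so that it stays self-contained; `getMap` and `getString` are hypothetical helpers, not part of test_utils.go.

```go
package main

import "fmt"

// getMap returns body[key] as a JSON object, or an error that names the key
// and the type that was actually found.
func getMap(body map[string]interface{}, key string) (map[string]interface{}, error) {
	v, ok := body[key].(map[string]interface{})
	if !ok {
		return nil, fmt.Errorf("field %q: expected object, got %T", key, body[key])
	}
	return v, nil
}

// getString returns body[key] as a string, or a descriptive error.
func getString(body map[string]interface{}, key string) (string, error) {
	v, ok := body[key].(string)
	if !ok {
		return "", fmt.Errorf("field %q: expected string, got %T", key, body[key])
	}
	return v, nil
}

func main() {
	// Hypothetical response body.
	body := map[string]interface{}{
		"virtual_key": map[string]interface{}{"value": "sk-bf-test"},
	}

	vk, err := getMap(body, "virtual_key")
	if err != nil {
		panic(err)
	}
	value, err := getString(vk, "value")
	fmt.Println(value, err)

	_, err = getString(vk, "id")
	fmt.Println(err)
}
```

In the tests, each returned error would be handed to t.Fatalf, which gives a readable failure instead of a panic from a bare type assertion.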


291-351: Same shouldStop pattern issue in TestFullBudgetHierarchyEnforcement.

Lines 291, 344-350 have the same early-exit problem that can cause the loop to terminate before observing budget rejection.

🔎 Proposed fix
 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	shouldStop := false

 	for requestNum <= 20 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= providerBudget {
-			shouldStop = true
-		}
 	}

302-302: Fragile string conversion for request numbering.

string(rune('0'+requestNum%10)) emits only the last digit of requestNum, so request 12 is labeled "2" and cannot be told apart from request 2. Use fmt.Sprintf or strconv.Itoa to print the full number.

🔎 Proposed fix
-					Content: "Test hierarchy enforcement request " + string(rune('0'+requestNum%10)),
+					Content: fmt.Sprintf("Test hierarchy enforcement request %d", requestNum),

Add "fmt" to imports if not already present.


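815-842: Goroutines call MakeRequest, which invokes t.Fatalf off the test goroutine.

MakeRequest calls t.Fatalf on errors. t.Fatalf ends the calling goroutine through runtime.Goexit, so deferred calls such as wg.Done() still run, but the testing package requires FailNow (and therefore Fatalf) to be called only from the goroutine running the test. Called from a worker goroutine, it stops only that worker while the test function and the other workers keep running. This was flagged in previous reviews but remains unaddressed.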

🔎 Suggested approach
var errs []error // named errs to avoid shadowing the errors package
var errsMu sync.Mutex

for i := 0; i < numGoroutines; i++ {
    wg.Add(1)
    go func(goID int) {
        defer wg.Done()
        defer func() {
            if r := recover(); r != nil {
                errsMu.Lock()
                errs = append(errs, fmt.Errorf("goroutine %d panicked: %v", goID, r))
                errsMu.Unlock()
            }
        }()
        // ... existing logic using a non-fatal request helper ...
    }(i)
}
wg.Wait()

if len(errs) > 0 {
    t.Fatalf("Concurrent test failed with errors: %v", errs)
}

1167-1186: Cascade test only logs VK status without asserting expected behavior.

The test logs "VK request after team deletion: status=%d" but doesn't assert whether the VK should work or be rejected. A cascade test should verify deterministic behavior.

Define and assert the expected cascade behavior: either the VK should be rejected (preferred for referential integrity) or explicitly document that it continues working without team budget.


1397-1407: Provider detection logic remains fragile.

Line 1400 checks strings.Contains(model, "openai") || strings.Contains(model, "gpt-4o"), which misclassifies "azure/gpt-4o" as OpenAI since it contains "gpt-4o".

🔎 Proposed fix
 		if model, ok := resp.Body["model"].(string); ok {
-			if strings.Contains(model, "openai") || strings.Contains(model, "gpt-4o") {
+			parts := strings.SplitN(model, "/", 2)
+			provider := parts[0]
+			if provider == "openai" {
 				openaiCount++
 				t.Logf("Request %d routed to OpenAI (model: %s)", i+1, model)
-			} else if strings.Contains(model, "azure") {
+			} else if provider == "azure" {
 				azureCount++
 				t.Logf("Request %d routed to Azure (model: %s)", i+1, model)
 			}
 		}
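
The prefix-based check can be exercised in isolation. This sketch assumes the response's model field uses a "provider/model" form, as the "azure/gpt-4o" example above suggests; `providerOf` is a hypothetical helper. Unlike the SplitN version in the proposed fix, it returns an empty string when there is no prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// providerOf returns the text before the first "/" in model, or "" when the
// model name carries no provider prefix.
func providerOf(model string) string {
	provider, _, found := strings.Cut(model, "/")
	if !found {
		return ""
	}
	return provider
}

func main() {
	for _, model := range []string{"openai/gpt-4o", "azure/gpt-4o", "gpt-4o"} {
		fmt.Printf("model=%q provider=%q\n", model, providerOf(model))
	}
}
```

Matching on the exact prefix keeps "azure/gpt-4o" out of the OpenAI count, and an empty result makes unprefixed model names easy to flag.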
plugins/governance/rate_limit_test.go (4)

74-78: Test can pass without verifying rate limit enforcement.

If all 10 requests succeed (e.g., token limit is high enough), the test passes with just a log message. This doesn't verify that rate limiting actually works.

🔎 Proposed fix
-	if successCount > 0 {
-		t.Logf("Made %d successful requests before hitting token limit ✓", successCount)
-	} else {
-		t.Skip("Could not make requests to test token limit")
-	}
+	// Reaching this point means no request was rejected (assuming the loop returns on the first rejection), so enforcement was never verified
+	t.Fatalf("Rate limit not enforced: made %d requests without hitting token limit (limit: %d tokens)", successCount, tokenLimit)

The test should use a limit low enough to guarantee hitting it within the loop, or fail if enforcement was never observed.
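
A sketch of a loop that records whether a rejection was actually seen. `runUntilLimited` and the fake `send` function are hypothetical; any status of 400 or above is treated as a rejection purely for illustration.

```go
package main

import "fmt"

// runUntilLimited calls send up to attempts times and stops at the first
// response with status >= 400. It reports how many calls succeeded and
// whether a rejection was observed at all.
func runUntilLimited(send func() int, attempts int) (successes int, limited bool) {
	for i := 0; i < attempts; i++ {
		if status := send(); status >= 400 {
			return successes, true
		}
		successes++
	}
	return successes, false
}

func main() {
	// Hypothetical endpoint that allows three requests and then returns 429.
	calls := 0
	send := func() int {
		calls++
		if calls > 3 {
			return 429
		}
		return 200
	}

	successes, limited := runUntilLimited(send, 10)
	if !limited {
		fmt.Printf("rate limit not enforced after %d requests\n", successes)
		return
	}
	fmt.Printf("limit hit after %d successful requests\n", successes)
}
```

A test using this shape would call t.Fatalf in the `!limited` branch, so a run where every request succeeds can no longer pass.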


147-151: Same issue: TestVirtualKeyRequestRateLimit can pass without enforcement.

With a 3-request limit and 5 attempts, the test should hit the limit. However, if it doesn't (due to timing or other issues), it passes silently.

Apply the same fix: fail if all requests succeed without observing rate limit rejection.


226-230: Provider rate limit tests have the same weak assertion pattern.

TestProviderConfigTokenRateLimit (lines 226-230) and TestProviderConfigRequestRateLimit (lines 305-309) both allow passing without verifying enforcement.

Also applies to: 305-309


54-54: Fragile string conversion at multiple locations.

string(rune('0'+i)) only works for i in range 0-9. This pattern appears at lines 54, 127, 206, 285, and 643.

🔎 Proposed fix
-				Content: "Short test request " + string(rune('0'+i)) + " for token limit.",
+				Content: "Short test request " + strconv.Itoa(i) + " for token limit.",

Add "strconv" to imports and apply to all occurrences.

🧹 Nitpick comments (10)
plugins/governance/advanced_scenarios_test.go (1)

14-15: Consider gating heavy tests behind a build tag for CI stability.

Tests with 150-request loops combined with t.Parallel() can be slow and flaky in CI due to provider variability and resource contention. Consider adding a build tag like //go:build integration or an environment check to run these selectively.

🔎 Example: Environment-based gating
func TestVKSwitchTeamAfterBudgetExhaustion(t *testing.T) {
	if os.Getenv("RUN_INTEGRATION_TESTS") != "1" {
		t.Skip("Skipping integration test; set RUN_INTEGRATION_TESTS=1 to run")
	}
	t.Parallel()
	// ...
}
plugins/governance/provider_budget_test.go (1)

57-235: Consider extracting shared budget-exhaustion loop into a helper.

Both OpenAIProviderBudgetExceeded and AnthropicProviderBudgetExceeded subtests have nearly identical structure—only the model name and cost calculation differ. A helper function would reduce ~160 lines of duplication.

func runProviderBudgetTest(t *testing.T, vkValue string, model string, providerBudget float64) {
    // shared loop logic
}
plugins/governance/usage_tracking_test.go (1)

257-346: Test name suggests usage tracking but doesn't verify meaningful tracking.

TestInMemoryUsageUpdateOnRequest creates a VK without rate limit or budget configuration. The test only verifies the VK exists in memory but doesn't assert any actual usage tracking occurred. Consider either:

  1. Adding a rate limit or budget to the VK, then asserting usage values increased after the request
  2. Renaming the test to reflect what it actually verifies
plugins/governance/in_memory_sync_test.go (1)

547-551: Inconsistency check logs warning but test continues without failure.

The count mismatch between /data endpoint and /virtual-keys endpoint is logged as a warning but the test still passes. If consistency is the goal of this test, consider whether this should be a failure or at minimum document why the inconsistency is acceptable.

if int(vksFromEndpoint) != vkCount {
    // Can fail because sqlite db might get locked because of all parallel tests
    t.Logf("[WARN]VK count mismatch between /data endpoint and /virtual-keys endpoint: %d vs %d (this can happen because of parallel tests)", vkCount, int(vksFromEndpoint))
}

The comment explains the reason, but if this is expected behavior, consider renaming the test or adjusting assertions to reflect what is actually being verified.

plugins/governance/e2e_test.go (2)

1502-1515: Fallback test assertions are weak—only logs without verifying provider.

Similar to the weighted LB test, this test logs "likely via OpenAI fallback" but doesn't verify the response actually came from OpenAI. Parse the model field to confirm requests were handled by the fallback provider.

🔎 Proposed improvement
 		if resp.StatusCode == 200 {
 			successCount++

-			// Try to detect which provider actually handled it
-			model := ""
-			if m, ok := resp.Body["model"].(string); ok {
-				model = m
+			if model, ok := resp.Body["model"].(string); ok {
+				parts := strings.SplitN(model, "/", 2)
+				if len(parts) > 0 && parts[0] == "openai" {
+					t.Logf("Request %d correctly routed to OpenAI fallback (model: %s)", i+1, model)
+				} else if strings.Contains(model, "anthropic") || strings.Contains(model, "claude") {
+					t.Fatalf("Request %d unexpectedly handled by Anthropic (model: %s)", i+1, model)
+				}
+			} else {
+				t.Logf("Request %d succeeded but model field missing", i+1)
 			}
-
-			t.Logf("Request %d succeeded (model: %s) - likely via OpenAI fallback", i+1, model)
 		}

1293-1297: VK deletion test doesn't assert rejection of deleted VK.

Lines 1293-1297 only log whether the deleted VK was rejected or not, without asserting expected behavior. After deletion, the VK should be rejected—make this a hard assertion.

🔎 Proposed fix
 	if resp.StatusCode < 400 {
-		t.Logf("Deleted VK still accepts requests (status=%d) - may be cached in SDK", resp.StatusCode)
+		t.Fatalf("Deleted VK should be rejected but got status %d", resp.StatusCode)
-	} else {
-		t.Logf("Deleted VK request rejected (status=%d) ✓", resp.StatusCode)
 	}
+	t.Logf("Deleted VK request correctly rejected (status=%d) ✓", resp.StatusCode)
plugins/governance/rate_limit_test.go (4)

38-39: Unchecked type assertions throughout the file.

Multiple locations use direct type assertions like vk["value"].(string) without checking ok. This can cause panics with unclear error messages.

Consider adding a helper function or using comma-ok pattern consistently:

vkValue, ok := vk["value"].(string)
if !ok {
    t.Fatalf("VK response missing 'value' field")
}

Also applies to: 111-112, 190-191, 269-270


659-663: TestRateLimitTokenAndRequestTogether allows passing without enforcement.

Similar to other tests, this can pass if all 3 requests succeed without hitting any limit. With high limits (5000 tokens, 100 requests), enforcement won't be tested.

Either lower the limits to guarantee hitting them, or add explicit assertions that enforcement was observed.


790-794: Usage tracking test uses warnings instead of assertions.

Lines 790-794 and 802-806 log warnings when usage doesn't increase, but the test still passes. If usage tracking is the purpose of the test, these should be failures.

 	if tokenUsage2 <= initialTokenUsage {
-		t.Logf("Warning: Token usage did not increase (before: %d, after: %d)", int64(initialTokenUsage), int64(tokenUsage2))
+		t.Fatalf("Token usage did not increase after request: before=%d, after=%d", int64(initialTokenUsage), int64(tokenUsage2))
-	} else {
-		t.Logf("Token usage increased from %d to %d ✓", int64(initialTokenUsage), int64(tokenUsage2))
 	}
+	t.Logf("Token usage increased from %d to %d ✓", int64(initialTokenUsage), int64(tokenUsage2))

Also applies to: 802-806


1-6: Missing strconv import for proposed string conversion fixes.

When addressing the fragile string conversion issues, you'll need to add "strconv" to the imports.

 import (
+	"strconv"
 	"testing"
 	"time"
 )
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c49485 and b15b300.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/changelog.md
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/tracker_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/governance/tracker_test.go
  • plugins/governance/fixtures_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/changelog.md
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/go.mod
  • plugins/governance/resolver_test.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/usage_tracking_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/resolver_test.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/usage_tracking_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/resolver_test.go
🧬 Code graph analysis (12)
plugins/governance/edge_cases_test.go (1)
plugins/governance/test_utils.go (13)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateTeamRequest (183-187)
  • CreateVirtualKeyRequest (155-164)
  • ProviderConfigRequest (167-174)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/customer_budget_test.go (1)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • CreateTeamRequest (183-187)
plugins/governance/store_test.go (4)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (75-94)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (720-727)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
plugins/governance/provider_budget_test.go (1)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
plugins/governance/team_budget_test.go (1)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/rate_limit_enforcement_test.go (1)
plugins/governance/test_utils.go (10)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
plugins/governance/advanced_scenarios_test.go (1)
plugins/governance/test_utils.go (3)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
plugins/governance/in_memory_sync_test.go (2)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
plugins/governance/rate_limit_test.go (2)
plugins/governance/test_utils.go (7)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ProviderConfigRequest (167-174)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
plugins/governance/config_update_sync_test.go (1)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
plugins/governance/vk_budget_test.go (1)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (12)
plugins/governance/changelog.md (1)

1-3: LGTM!

The changelog entries correctly document the refactoring and test additions. The previously flagged typos ("and interface" → "an interface" and "e2e ends" → "e2e tests") have been addressed.

plugins/governance/go.mod (1)

10-10: Testify dependency addition looks good.

The github.com/stretchr/testify v1.11.1 addition appropriately supports the new governance test suite.

plugins/governance/test_utils.go (2)

294-297: Case-insensitive matching is now correctly implemented.

The contains helper now properly uses strings.ToLower for case-insensitive comparison, matching its documentation.


369-424: Well-designed polling utilities with progressive backoff.

WaitForCondition and WaitForAPICondition provide useful async testing patterns with reasonable timeouts and backoff strategies.

plugins/governance/advanced_scenarios_test.go (1)

1-6: Comprehensive governance test coverage.

The test file provides thorough coverage of complex governance scenarios including:

  • VK switching between teams/customers after budget exhaustion
  • Hierarchical budget chain switching
  • Budget updates after exhaustion (VK, team, customer, provider config)
  • Cascade deletion verification for VKs, teams, and customers
  • Foreign key nullification on parent deletion

This aligns well with the PR's goal of adding governance tests.

plugins/governance/edge_cases_test.go (1)

10-13: Good test structure with proper cleanup.

The test correctly uses t.Parallel() and defer testData.Cleanup(t) for resource management.

plugins/governance/resolver_test.go (1)

1-551: Comprehensive test coverage for BudgetResolver.

The test file provides thorough coverage of the resolver's evaluation logic including VK states, provider/model filtering, rate limits, budgets, and context population. The table-driven tests for isProviderAllowed and isModelAllowed are well-structured.

plugins/governance/store_test.go (1)

1-15: Well-structured test suite with good coverage.

The test file covers key governance store operations including lock-free retrieval, concurrent reads, budget validation, rate limit updates, and reset behaviors. Good use of table-driven tests and parallel execution.

plugins/governance/customer_budget_test.go (1)

1-6: Good test coverage for customer-level budget enforcement.

The tests appropriately validate that customer budgets are enforced across multiple VKs and multiple teams, with proper hierarchy setup and cost tracking.

plugins/governance/config_update_sync_test.go (2)

1-6: Comprehensive config update sync test suite.

The tests cover VK rate limit, VK budget, provider rate limit, team budget, customer budget, and provider budget update synchronization to in-memory store. Good coverage of the reset behavior when new limits are lower than current usage.


686-702: Good use of WaitForCondition for async verification.

Unlike fixed time.Sleep delays used elsewhere, TestTeamBudgetUpdateSyncToMemory uses WaitForCondition helper to poll for the expected state. This is more robust and should be the preferred pattern across all async sync tests.

plugins/governance/rate_limit_enforcement_test.go (1)

8-129: Good documentation of POST-HOC enforcement semantics.

The comments clearly explain that rate limit enforcement is post-hoc (the exceeding request is allowed, subsequent ones are blocked). This is important context for understanding test behavior.
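
The post-hoc semantics can be illustrated with a toy limiter. This is a sketch under assumed names (`postHocLimiter`, `Allow`), not the plugin's implementation: usage is recorded after admission, so the request that crosses the limit still goes through.

```go
package main

import "fmt"

// postHocLimiter is a hypothetical illustration of post-hoc enforcement.
type postHocLimiter struct {
	limit int
	used  int
}

// Allow admits a request while recorded usage is still below the limit,
// then records the request's cost. The request that crosses the limit is
// therefore admitted; only later requests are rejected.
func (l *postHocLimiter) Allow(cost int) bool {
	if l.used >= l.limit {
		return false
	}
	l.used += cost
	return true
}

func main() {
	l := &postHocLimiter{limit: 100}
	for i, cost := range []int{60, 60, 10} {
		fmt.Printf("request %d allowed=%v\n", i+1, l.Allow(cost))
	}
}
```

With a limit of 100, the second request (total 120) is allowed and the third is blocked, which is why tests for this behavior need at least one request beyond the one that exceeds the limit.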

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (24)
plugins/governance/edge_cases_test.go (2)

97-180: Remove shouldStop and assert provider‑budget exhaustion based only on API responses.

The loop uses a local shouldStop flag and a consumedBudget >= 0.08 heuristic to break out. This:

  • Couples the test to approximate cost math rather than actual budget enforcement.
  • Can stop after a single extra request past the threshold even if the API still accepts traffic, instead of continuing until either a budget error appears or the iteration cap is hit.

You can simplify and make the test more robust by:

  • Dropping shouldStop entirely.
  • Letting the loop run until:
    • A 4xx response with a “budget”/“provider” message is observed (success), or
    • requestNum exceeds the cap (hard failure).

Optionally also assert that the observed failure clearly corresponds to the provider budget (e.g., by checking specific error wording or by asserting consumedBudget is within a tight window of the provider limit).
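
A sketch of a loop whose only exits are an observed rejection or the iteration cap. The `response` type and `runUntilBudgetRejection` function are hypothetical stand-ins; in the real tests the `send` callback would wrap MakeRequest.

```go
package main

import (
	"fmt"
	"strings"
)

// response is a hypothetical stand-in for the test helper's API response.
type response struct {
	StatusCode int
	Message    string
}

// runUntilBudgetRejection sends up to maxRequests requests and returns the
// 1-based number of the first request rejected with a budget/provider error.
// Any other error response, or reaching the cap, is reported as an error.
func runUntilBudgetRejection(maxRequests int, send func(n int) response) (int, error) {
	for n := 1; n <= maxRequests; n++ {
		resp := send(n)
		if resp.StatusCode < 400 {
			continue
		}
		msg := strings.ToLower(resp.Message)
		if strings.Contains(msg, "budget") || strings.Contains(msg, "provider") {
			return n, nil
		}
		return n, fmt.Errorf("request %d failed with unexpected error: %s", n, resp.Message)
	}
	return 0, fmt.Errorf("no budget rejection within %d requests", maxRequests)
}

func main() {
	// Fake server: accepts four requests, then rejects on budget.
	send := func(n int) response {
		if n > 4 {
			return response{StatusCode: 402, Message: "Budget exceeded"}
		}
		return response{StatusCode: 200}
	}
	n, err := runUntilBudgetRejection(50, send)
	fmt.Println(n, err)
}
```

No client-side cost estimate appears in the termination logic, so the outcome depends only on what the server actually enforces.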


59-79: Provider budget vs VK budget mismatch – test description doesn’t match data.

VK budget is 0.01 while the provider budget is 0.08, but the comments and line 97 explicitly state that the provider budget is the “most restrictive” and “limiting factor.” With these values the VK budget will be tighter than the provider budget.

Either:

  • Make the provider budget genuinely tighter than the VK budget (e.g., VK=0.15, provider=0.005), or
  • Update the comments and assertions to reflect that the VK budget is the limiting budget in this scenario.

Also applies to: 94-99

plugins/governance/test_utils.go (3)

143-152: Remove per‑call rand.Seed in generateRandomID.

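Calling rand.Seed(time.Now().UnixNano()) on every invocation is unnecessary and fragile: rand.Seed is deprecated as of Go 1.20, and because it resets the shared global source, parallel tests that seed with the same nanosecond timestamp can generate identical IDs.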

Given newer Go versions auto‑seed the global source, just drop the rand.Seed line and keep using rand.Intn.
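
A minimal sketch of an ID generator with no explicit seeding, assuming Go 1.20 or later (where the global source is randomly seeded at startup). The `randomID` name and the alphabet are illustrative and are not taken from test_utils.go.

```go
package main

import (
	"fmt"
	"math/rand"
)

// idAlphabet is an assumed character set for illustration.
const idAlphabet = "abcdefghijklmnopqrstuvwxyz0123456789"

// randomID returns an n-character ID drawn from idAlphabet using the
// global math/rand source. No rand.Seed call is needed on Go 1.20+.
func randomID(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = idAlphabet[rand.Intn(len(idAlphabet))]
	}
	return string(b)
}

func main() {
	fmt.Println(randomID(8))
}
```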


247-268: Either use keyPath in ExtractIDFromResponse or remove it.

The keyPath parameter is currently ignored; the function only looks for hardcoded wrappers (virtual_key, team, customer). This is misleading for callers and makes the signature noisier than necessary.

Two reasonable options:

  • Implement simple keyPath traversal (e.g., split on "." and walk nested maps), or
  • Remove keyPath from the signature and all call sites, documenting that the helper only understands the standard wrappers.

Also consider adding t.Helper() inside this function for clearer failure locations.
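
The first option could look roughly like this. `lookupPath` is a hypothetical name, and the sketch assumes the target value is a string and that every intermediate level decodes as a map[string]interface{}.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupPath walks body along a dot-separated keyPath (for example
// "virtual_key.id") and returns the string stored there. It reports false
// if any segment is missing or an intermediate value is not an object.
func lookupPath(body map[string]interface{}, keyPath string) (string, bool) {
	var cur interface{} = body
	for _, key := range strings.Split(keyPath, ".") {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return "", false
		}
		cur, ok = m[key]
		if !ok {
			return "", false
		}
	}
	s, ok := cur.(string)
	return s, ok
}

func main() {
	body := map[string]interface{}{
		"virtual_key": map[string]interface{}{"id": "vk-123"},
	}
	fmt.Println(lookupPath(body, "virtual_key.id"))
	fmt.Println(lookupPath(body, "team.id"))
}
```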


90-141: Add t.Helper() and an HTTP timeout to MakeRequest to avoid hanging tests.

MakeRequest creates a bare http.Client with no timeout and doesn’t mark itself as a test helper. If the server on localhost:8080 is slow or unavailable, tests can hang indefinitely, and failures are reported from inside this helper instead of the actual test.

Suggested adjustments:

  • Call t.Helper() at the start of the function.
  • Construct the client with a reasonable timeout, e.g.:
client := &http.Client{
    Timeout: 30 * time.Second,
}
plugins/governance/resolver_test.go (1)

332-345: Avoid mutating store.budgets directly in tests; add a test helper instead.

This test reaches into store.budgets via GetGovernanceData() and store.budgets.Store(...), tightly coupling the test to internal implementation details and bypassing any invariants enforced by store methods.

Prefer a dedicated, test‑only helper on LocalGovernanceStore (e.g. SetBudgetCurrentUsageForTest(id, usage float64)) that:

  • Locates the budget by ID,
  • Updates CurrentUsage, and
  • Stores it back, returning an error if not found.

This keeps the test intent clear while allowing the store internals to evolve.
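
A sketch of such a helper under simplified assumptions: the `budget` and `store` types below are hypothetical stand-ins (the real LocalGovernanceStore and its budget type have more fields), and the only detail taken from the review is that budgets live in a map with a Store method.

```go
package main

import (
	"fmt"
	"sync"
)

// budget and store are simplified, hypothetical stand-ins for the
// governance types.
type budget struct {
	ID           string
	CurrentUsage float64
}

type store struct {
	budgets sync.Map // budget ID -> *budget
}

// SetBudgetCurrentUsageForTest replaces the stored budget with a copy whose
// CurrentUsage is set to usage. It returns an error if the ID is unknown.
func (s *store) SetBudgetCurrentUsageForTest(id string, usage float64) error {
	v, ok := s.budgets.Load(id)
	if !ok {
		return fmt.Errorf("budget %q not found", id)
	}
	updated := *v.(*budget) // copy so holders of the old pointer are unaffected
	updated.CurrentUsage = usage
	s.budgets.Store(id, &updated)
	return nil
}

func main() {
	s := &store{}
	s.budgets.Store("b-1", &budget{ID: "b-1"})
	fmt.Println(s.SetBudgetCurrentUsageForTest("b-1", 0.75))
	fmt.Println(s.SetBudgetCurrentUsageForTest("missing", 1))
}
```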

plugins/governance/vk_budget_test.go (1)

41-48: Simplify VK budget exhaustion loop by removing shouldStop.

Here the test already:

  • Treats any 4xx with a “budget” error as success and returns, and
  • Fails at the end if no such error was observed.

The additional shouldStop flag based on consumedBudget >= vkBudget adds complexity without improving correctness, and can prematurely cap attempts if pricing or token usage changes.

You can drop shouldStop and its checks, and rely solely on:

  • Observing a budget‑related 4xx response, or
  • Hitting the requestNum <= 50 cap and failing.

This keeps the loop’s termination conditions aligned strictly with API behavior.

Also applies to: 84-127

plugins/governance/provider_budget_test.go (1)

57-64: Remove shouldStop in provider budget tests; rely on actual budget/provider errors.

Both subtests:

  • Mark the test as passed only when a 4xx with “budget”/“provider” is seen, and
  • Otherwise fail at the end if no such error occurred.

The extra shouldStop flag based on consumedBudget >= providerBudget is unnecessary and can stop the loop after a single additional request, even if the API would only reject slightly later (e.g., due to pricing or token‑tracking changes).

You can safely:

  • Delete shouldStop and its checks, and
  • Let the loops run until they either see the expected error or exhaust the 50‑request cap.

This keeps the tests driven purely by server‑side budget enforcement.

Also applies to: 100-141, 147-154, 190-231

plugins/governance/store_test.go (1)

115-117: Don’t ignore NewLocalGovernanceStore errors or exists from GetVirtualKey in tests.

Across these tests you:

  • Discard the error from NewLocalGovernanceStore (e.g., testStore, _ := NewLocalGovernanceStore(...)).
  • Ignore the exists boolean from GetVirtualKey and proceed to use the returned pointer.

If store construction or VK lookup ever fails, the tests will continue with a nil/invalid store or VK, leading to panics or misleading failures.

Recommended pattern:

testStore, err := NewLocalGovernanceStore(...)
require.NoError(t, err)

vk, exists := testStore.GetVirtualKey("sk-bf-test")
require.True(t, exists, "VK should exist for this test scenario")
require.NotNil(t, vk)

Apply similar checks in the hierarchy test and the reset tests so failures surface immediately at setup time.

Also applies to: 144-155, 192-196, 283-290, 319-324

plugins/governance/advanced_scenarios_test.go (3)

88-133: Budget “exhaustion” scenarios don’t assert that a rejection was actually observed.

In all of these loops you:

  • Accumulate consumedBudget and/or rely on hitting a loop bound, and
  • Break on a budget error if it appears,

but several scenarios either:

  • Only check consumedBudget >= <limit> (Scenarios 1 & 2), or
  • Don’t check anything at all after the loop (Scenarios 3–7),

before proceeding to “switch” or “update” behavior. This means the tests can still pass when server-side budget enforcement is broken, as long as enough cost accumulates or the loop simply exits after 150 iterations.

Pattern to harden each test:

sawBudgetRejection := false

for requestNum <= 150 {
    // ... MakeRequest call ...

    if resp.StatusCode >= 400 {
        if CheckErrorMessage(t, resp, "budget") {
            sawBudgetRejection = true
            t.Logf("... budget exhausted at request %d (consumed: $%.6f)", requestNum, consumedBudget)
            break
        }
        t.Fatalf("Request %d failed with unexpected error: %v", requestNum, resp.Body)
    }

    // ... usage / cost accumulation ...
    requestNum++
}

if !sawBudgetRejection {
    t.Fatalf("Did not observe budget exhaustion before %s (consumed: $%.6f)",
        "<switch/update operation>", consumedBudget)
}

Without this, the “after exhaustion” behaviors you’re testing aren’t actually gated on having seen a budget-induced rejection.

Also applies to: 253-297, 455-491, 571-603, 702-738, 855-891, 990-1026


15-16: 150‑request t.Parallel() E2E tests are heavy and likely flaky; consider gating as integration tests.

Each of these scenarios runs up to 150 HTTP requests to /v1/chat/completions and uses t.Parallel(). Across 10+ such tests that’s thousands of requests per CI run, all depending on a live localhost server and external pricing behavior. This is slow and prone to intermittent failures (e.g., transient DB locks, rate limiting, server restarts).

Consider one or both of:

  • Removing t.Parallel() from these long‑running E2E tests, and/or
  • Gating them behind an explicit integration mechanism:
    • build tag like //go:build integration, or
    • runtime check such as if os.Getenv("INTEGRATION_TESTS") != "1" { t.Skip("integration-only scenario") }.

That keeps unit tests fast and reliable while still allowing rich scenarios to run in dedicated jobs.

Also applies to: 180-181, 342-343, 536-537, 652-653, 788-789, 941-942


1080-1235: Cascade deletion tests should assert budgets and rate limits are removed, and rely on testData for cleanup.

Given the stated invariants that budgets and rate limits have 1:1 relationships with their parent entities and should cascade-delete with them, these tests are currently too permissive:

  • TestVKDeletionCascadeComplete:

    • Correctly asserts that VK‑ and provider‑config budgets are removed from memory.
    • But only logs when rate limits remain (t.Logf("... rate limits may still exist ...")) instead of failing. With 1:1 rate‑limit ownership, leftover rate limits should be treated as a failure, not “acceptable.”
  • TestTeamDeletionDeletesBudget and TestCustomerDeletionDeletesBudget:

    • Capture budgetID before delete and verify it exists pre‑delete.
    • After deletion they only assert the team/customer is gone; they do not assert that budgetID is absent from the budgets map, even though the function names explicitly claim the budget should be deleted.

Additionally, these tests intentionally avoid registering some IDs in testData (“Don’t add to testData since we’ll delete manually”). If a test fails before the delete call, those resources will leak and won’t be cleaned up by Cleanup.

Recommended changes:

  • In TestVKDeletionCascadeComplete, treat leftover rate limits the same as budgets: t.Fatalf(...) if either VK or provider‑config rate limit still exists.
  • In the team/customer deletion tests, after the second GET to budgets, assert that budgetID no longer exists and fail if it does.
  • Register teams/customers/VKs with testData even when you delete them manually; Cleanup already tolerates 404s, so double‑deletes are safe and you avoid leaks on early failures.

Based on learnings, budgets and rate limits are 1:1 with their parents and should not remain orphaned after cascade deletions.

Also applies to: 1245-1328, 1330-1413

plugins/governance/in_memory_sync_test.go (1)

99-105: Add HTTP status checks before casting response bodies in in‑memory sync tests.

Several GET calls assume success and immediately cast Body fields to maps:

  • Budgets in TestInMemorySyncVirtualKeyUpdate, TestInMemorySyncTeamUpdate, and TestInMemorySyncCustomerUpdate.
  • Virtual keys in TestInMemorySyncVirtualKeyDelete (both the pre‑delete and post‑delete reads).

If any of these endpoints returns a non‑200 status (e.g., transient DB or server error), the tests will panic on the type assertion instead of failing with a clear message.

Mirror the pattern you already use elsewhere in this file:

resp := MakeRequest(t, APIRequest{ Method: "GET", Path: "..." })
if resp.StatusCode != 200 {
    t.Fatalf("Failed to get %s: status %d, body: %v", "<resource>", resp.StatusCode, resp.Body)
}
m := resp.Body["<key>"].(map[string]interface{})

This makes failures explicit and easier to diagnose.

Also applies to: 213-219, 325-331, 387-393, 416-422

plugins/governance/customer_budget_test.go (2)

70-156: Remove shouldStop flag to ensure customer budget enforcement is tested.

The shouldStop logic (lines 70, 149-155) causes the loop to exit before sending the request that would trigger server-side budget rejection. This means the test may pass without actually verifying enforcement.

Based on past review comments.


245-331: Same shouldStop issue in second test function.

TestCustomerBudgetExceededWithMultipleTeams has the same early-exit problem at lines 245, 324-330.

Based on past review comments.

plugins/governance/config_update_sync_test.go (2)

54-66: Unsafe type assertions may cause test panics.

Multiple chained type assertions without ok checks (lines 54-66) will panic if the response structure is unexpected, causing unclear test failures.

Consider adding defensive checks or using a helper function to safely extract nested values.

Based on past review comments.


48-53: Check HTTP status codes before indexing into response bodies.

The /api/governance/virtual-keys?from_memory=true GET at lines 49-52 assumes 200 status. If the server errors, subsequent type assertions will panic and mask the real issue.

This pattern appears throughout the file. Add status code checks before accessing response bodies.

Based on past review comments.
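
Both concerns in this file (status checks and unchecked assertions) could be handled by one small accessor. The `apiResponse` type and `objectField` function are hypothetical; the real response type returned by MakeRequest may differ.

```go
package main

import "fmt"

// apiResponse is a hypothetical stand-in for the helper's response type.
type apiResponse struct {
	StatusCode int
	Body       map[string]interface{}
}

// objectField verifies the status code and returns the JSON object stored
// under key, with an error instead of a panic when either check fails.
func objectField(resp apiResponse, key string) (map[string]interface{}, error) {
	if resp.StatusCode != 200 {
		return nil, fmt.Errorf("unexpected status %d, body: %v", resp.StatusCode, resp.Body)
	}
	obj, ok := resp.Body[key].(map[string]interface{})
	if !ok {
		return nil, fmt.Errorf("response field %q is missing or not an object", key)
	}
	return obj, nil
}

func main() {
	good := apiResponse{StatusCode: 200, Body: map[string]interface{}{
		"virtual_keys": map[string]interface{}{"sk-bf-test": map[string]interface{}{}},
	}}
	vks, err := objectField(good, "virtual_keys")
	fmt.Println(len(vks), err)

	_, err = objectField(apiResponse{StatusCode: 500}, "virtual_keys")
	fmt.Println(err)
}
```

A test would call `t.Fatalf` on the returned error, which reports the actual server problem rather than a type-assertion panic.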

plugins/governance/rate_limit_enforcement_test.go (1)

490-490: Fragile string conversion for request numbering.

string(rune('0'+i)) only works correctly for i in range 0-9. If the loop runs more iterations, this will produce incorrect characters.

🔎 Suggested fix
-				Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+				Content: fmt.Sprintf("Request %d to test both limits.", i+1),

Add "fmt" to the imports (or "strconv" if you use strconv.Itoa instead).

Based on past review comments.
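
A short demonstration of why the conversion breaks past 9: adding the index to '0' selects whichever character follows the digits in the code table. The function names are illustrative only.

```go
package main

import (
	"fmt"
	"strconv"
)

// runeLabel reproduces the fragile pattern: correct only for 0 through 9.
func runeLabel(i int) string { return string(rune('0' + i)) }

// itoaLabel formats any integer as its decimal representation.
func itoaLabel(i int) string { return strconv.Itoa(i) }

func main() {
	for _, i := range []int{7, 12} {
		fmt.Printf("%d: rune=%q itoa=%q\n", i, runeLabel(i), itoaLabel(i))
	}
	// Prints:
	// 7: rune="7" itoa="7"
	// 12: rune="<" itoa="12"
}
```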

plugins/governance/e2e_test.go (5)

70-71: Unchecked type assertions may panic.

The type assertions vk["value"].(string) will panic if the key is missing or the value is not a string, causing unclear test failures.

🔎 Suggested fix
-	vk1 := createVK1Resp.Body["virtual_key"].(map[string]interface{})
-	vk1Value := vk1["value"].(string)
+	vk1, ok := createVK1Resp.Body["virtual_key"].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Response missing virtual_key object")
+	}
+	vk1Value, ok := vk1["value"].(string)
+	if !ok {
+		t.Fatalf("VK missing value field")
+	}

Also applies to lines 91-92 and similar patterns throughout.

Based on past review comments.


100-148: Remove the shouldStop flag: the loop can exit before observing a budget enforcement rejection.

The shouldStop flag (lines 100, 141-147) causes early loop exit before the budget-exceeded request is sent, allowing the test to pass without verifying enforcement.

🔎 Suggested fix
 	consumedBudget := 0.0
 	requestNum := 1
-	shouldStop := false

 	for requestNum <= 150 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= teamBudget {
-			shouldStop = true
-		}
 	}

Based on past review comments.


302-302: Fragile string conversion.

string(rune('0'+requestNum%10)) only works correctly for single digits.

🔎 Suggested fix
-				Content: "Test hierarchy enforcement request " + string(rune('0'+requestNum%10)),
+				Content: fmt.Sprintf("Test hierarchy enforcement request %d", requestNum),

Based on past review comments.


815-842: Goroutine calling MakeRequest which uses t.Fatalf internally.

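MakeRequest calls t.Fatalf on errors, and these goroutines call MakeRequest. t.Fatalf (via FailNow) must be called from the goroutine running the test function. Called from a spawned goroutine, it ends only that goroutine (deferred calls such as wg.Done() still run) and does not stop the test, so the remaining workers and the assertions after wg.Wait() continue against incomplete results.

Suggested approach
var errs []error // not named "errors", which would shadow the standard package
var errsMu sync.Mutex

for i := 0; i < numGoroutines; i++ {
    wg.Add(1)
    go func(goID int) {
        defer wg.Done()
        defer func() {
            // Catches panics (e.g. a failed type assertion); it does not intercept t.Fatalf.
            if r := recover(); r != nil {
                errsMu.Lock()
                errs = append(errs, fmt.Errorf("goroutine %d panicked: %v", goID, r))
                errsMu.Unlock()
            }
        }()
        // ... existing logic, appending failures to errs instead of calling t.Fatalf ...
    }(i)
}
wg.Wait()

if len(errs) > 0 {
    t.Fatalf("Concurrent test failed with errors: %v", errs)
}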

Based on past review comments.
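
One way to keep fatal assertions on the test goroutine is to have workers return errors and join them after the wait. This is a sketch: `doRequest` is a hypothetical error-returning variant of a request helper, and errors.Join requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// doRequest is a hypothetical, error-returning request helper. It never
// calls t.Fatalf, so it is safe to call from any goroutine.
func doRequest(id int) error {
	if id%2 == 1 {
		return fmt.Errorf("goroutine %d: request failed", id)
	}
	return nil
}

// runConcurrently runs n workers and returns all of their errors joined,
// or nil if every worker succeeded.
func runConcurrently(n int, work func(id int) error) error {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if err := work(id); err != nil {
				mu.Lock()
				errs = append(errs, err)
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	err := runConcurrently(4, doRequest)
	fmt.Println(err != nil)
}
```

In the test itself, the result would be checked on the test goroutine, for example `if err := runConcurrently(numGoroutines, work); err != nil { t.Fatalf("%v", err) }`.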


1397-1407: Provider detection logic is incorrect.

Line 1400 checks strings.Contains(model, "openai") || strings.Contains(model, "gpt-4o"), which means any model containing "gpt-4o" (including "azure/gpt-4o") will be counted as OpenAI, leading to incorrect Azure count.

🔎 Suggested fix
 		if model, ok := resp.Body["model"].(string); ok {
-			if strings.Contains(model, "openai") || strings.Contains(model, "gpt-4o") {
+			provider := strings.Split(model, "/")[0]
+			if provider == "openai" {
 				openaiCount++
 				t.Logf("Request %d routed to OpenAI (model: %s)", i+1, model)
-			} else if strings.Contains(model, "azure") {
+			} else if provider == "azure" {
 				azureCount++
 				t.Logf("Request %d routed to Azure (model: %s)", i+1, model)
 			}
 		}

Based on past review comments.
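
The prefix check from the suggested fix can be isolated in a helper. This sketch assumes the response's model field has the form "provider/model", as the fix above does; `providerOf` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// providerOf returns the provider prefix of a "provider/model" string,
// or "" when the model carries no prefix.
func providerOf(model string) string {
	provider, _, found := strings.Cut(model, "/")
	if !found {
		return ""
	}
	return provider
}

func main() {
	for _, m := range []string{"openai/gpt-4o", "azure/gpt-4o", "gpt-4o"} {
		fmt.Printf("%q -> %q\n", m, providerOf(m))
	}
}
```

Because the comparison is on the prefix only, "azure/gpt-4o" is no longer counted as OpenAI, and an unprefixed model is counted for neither provider.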

plugins/governance/rate_limit_test.go (1)

54-54: Fragile string conversion for request numbering at multiple locations.

string(rune('0'+i)) only produces correct digit characters for i in range 0-9. This pattern appears at lines 54, 127, 206, 285, and 643.

🔎 Suggested fix
-				Content: "Short test request " + string(rune('0'+i)) + " for token limit.",
+				Content: "Short test request " + strconv.Itoa(i) + " for token limit.",

Add "strconv" to the imports. Apply the same fix to lines 127, 206, 285, and 643.

Based on past review comments.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c49485 and b15b300.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/changelog.md
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/tracker_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • plugins/governance/team_budget_test.go
  • plugins/governance/changelog.md
  • plugins/governance/go.mod
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/fixtures_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/provider_budget_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/store_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/config_update_sync_test.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/provider_budget_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/store_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/config_update_sync_test.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/provider_budget_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/store_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/config_update_sync_test.go
🧬 Code graph analysis (9)
plugins/governance/provider_budget_test.go (1)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ProviderConfigRequest (167-174)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/resolver_test.go (7)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (75-94)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (720-727)
plugins/governance/resolver.go (1)
  • NewBudgetResolver (70-75)
core/schemas/context.go (1)
  • BifrostContext (31-41)
ui/lib/types/governance.ts (5)
  • VirtualKey (61-80)
  • RateLimit (13-25)
  • Budget (5-11)
  • Customer (37-44)
  • Team (27-35)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostContextKey (117-117)
core/schemas/models.go (1)
  • Model (109-129)
plugins/governance/edge_cases_test.go (5)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/store_test.go (5)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (75-94)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (720-727)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
ui/lib/types/governance.ts (4)
  • Customer (37-44)
  • Team (27-35)
  • RateLimit (13-25)
  • Budget (5-11)
plugins/governance/vk_budget_test.go (1)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/in_memory_sync_test.go (1)
plugins/governance/test_utils.go (11)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
plugins/governance/rate_limit_test.go (2)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
plugins/governance/test_utils.go (1)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
plugins/governance/config_update_sync_test.go (2)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
  • CreateTeamRequest (183-187)
  • WaitForCondition (371-394)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: Graphite / mergeability_check

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-09-chore_governance_tests_added branch from b15b300 to cd49e9e Compare December 22, 2025 15:28
@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from graphite-base/1041 to 12-08-feat_governance_plugin_refactors December 22, 2025 15:28
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

♻️ Duplicate comments (46)
plugins/governance/fixtures_test.go (1)

38-66: Format arguments are ignored in logging methods.

The logging methods accept args ...interface{} but never use them for formatting. This was flagged in a previous review.
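
A minimal sketch of a mock logger that applies its format arguments. The `MockLogger` struct, its field, and the `Info` signature here are assumptions for illustration and may not match the fixture in fixtures_test.go.

```go
package main

import (
	"fmt"
	"sync"
)

// MockLogger is a hypothetical stand-in that records formatted messages.
type MockLogger struct {
	mu    sync.Mutex
	infos []string
}

// Info formats msg with args before recording it, so assertions on the
// captured log see the interpolated values rather than the raw template.
func (l *MockLogger) Info(msg string, args ...interface{}) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.infos = append(l.infos, fmt.Sprintf(msg, args...))
}

func main() {
	logger := &MockLogger{}
	logger.Info("budget %s reset to %.2f", "b-1", 0.5)
	fmt.Println(logger.infos[0])
}
```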

plugins/governance/resolver_test.go (1)

332-346: Direct store manipulation bypasses encapsulation.

This was flagged in a previous review. The test directly accesses store.budgets.Store() which couples the test to implementation details.

plugins/governance/team_budget_test.go (1)

66-156: Loop stop condition may prevent observing budget enforcement.

This was flagged in a previous review. The shouldStop flag is set when consumedBudget >= teamBudget and checked at the end of the loop, but this can cause the loop to exit before sending an over-budget request. The API should return the 4xx response, not the client stopping early.

plugins/governance/in_memory_sync_test.go (5)

99-104: Missing status code check for budgets endpoint.

This was flagged in a previous review. getBudgetsResp2.Body["budgets"] is accessed without verifying the response status, which could cause a panic if the request fails.


386-398: Missing status code check before accessing response body.

This was flagged in a previous review. The response body is accessed without first checking that getDataResp.StatusCode == 200.


416-421: Missing status code check after VK deletion.

This was flagged in a previous review. Same pattern - add a status check before accessing the body.


380-381: Potential double-delete on cleanup.

This was flagged in a previous review. The VK is added to testData at line 381, then manually deleted later. When testData.Cleanup(t) runs via defer, it may attempt to delete the already-deleted VK.


325-330: Missing status code check for budgets endpoint in customer update test.

Same pattern as above - the budgets response body is accessed without status verification.

Proposed fix
 	getBudgetsResp2 := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/budgets?from_memory=true",
 	})
+	if getBudgetsResp2.StatusCode != 200 {
+		t.Fatalf("Failed to get budgets: status %d, body: %v", getBudgetsResp2.StatusCode, getBudgetsResp2.Body)
+	}
 
 	budgetsMap2 := getBudgetsResp2.Body["budgets"].(map[string]interface{})
plugins/governance/store_test.go (5)

116-116: Unchecked exists return from GetVirtualKey.

This was flagged in a previous review but marked as addressed. The current code still ignores the exists return value.


149-153: Don't ignore NewLocalGovernanceStore error.

This was flagged in a previous review. The error from store construction is discarded, which could cause misleading test failures if the store fails to initialize.


192-192: Unchecked exists return from GetVirtualKey.

This was flagged in a previous review but marked as addressed. The current code still ignores the exists return value.


284-286: Unchecked exists return from GetVirtualKey.

This was flagged in a previous review but marked as addressed. The current code still ignores the exists return value.


319-321: Unchecked exists return from GetVirtualKey.

This was flagged in a previous review. The exists return value is ignored, which could lead to nil pointer access if the VK doesn't exist.

plugins/governance/customer_budget_test.go (2)

70-156: shouldStop early-exit pattern prevents budget enforcement verification.

This issue was previously flagged. The shouldStop flag causes the loop to exit immediately after setting the flag (lines 149-155), before sending the additional request that would trigger the server-side budget rejection. The test may pass without actually observing budget enforcement.


245-331: Same shouldStop early-exit issue in second test function.

This issue was previously flagged. TestCustomerBudgetExceededWithMultipleTeams has the identical early-exit problem at lines 245, 324-330.

plugins/governance/advanced_scenarios_test.go (2)

1124-1126: Unsafe manual deletion pattern risks resource leaks.

The comment "Don't add to testData since we'll delete manually" means if the test fails or panics before reaching the deletion code, this VK will leak and pollute subsequent test runs. The Cleanup method handles 404 responses gracefully, so double-deletion is safe.

🔎 Proposed fix
 	vkID := ExtractIDFromResponse(t, createVKResp, "id")
-	// Don't add to testData since we'll delete manually
+	testData.AddVirtualKey(vkID) // Safe to add - Cleanup handles 404 gracefully

Apply the same fix at lines 1269-1270, 1354-1355, 1439-1440, and 1537-1538.


1230-1235: Inconsistent cascade deletion assertion for rate limits.

The test correctly asserts that budgets must be cascade-deleted (lines 1225-1228), but rate limits are allowed to be orphaned with just a log message (lines 1233-1235). This inconsistency is confusing - if cascade deletion should remove budgets, it should likely also remove rate limits. Based on learnings, budgets and rate limits have 1:1 relationships with parent entities.

🔎 Suggested fix
 	// Rate limits should be gone
 	_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
 	_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
-	if vkRateLimitStillExists || pcRateLimitStillExists {
-		t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
-	}
+	if vkRateLimitStillExists || pcRateLimitStillExists {
+		t.Fatalf("Rate limits should be cascade-deleted: VK rate limit exists=%v, PC rate limit exists=%v",
+			vkRateLimitStillExists, pcRateLimitStillExists)
+	}

If orphaning is intentional for rate limits but not budgets, add a comment explaining the design decision.

plugins/governance/usage_tracking_test.go (3)

166-178: Missing status checks before type assertions.

getVKResp and getBudgetsResp are used without checking StatusCode before type-asserting Body, which could panic on non-200 responses.

🔎 Proposed fix
 	getVKResp := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/virtual-keys?from_memory=true",
 	})

+	if getVKResp.StatusCode != 200 {
+		t.Fatalf("Failed to get virtual keys: status %d", getVKResp.StatusCode)
+	}

 	virtualKeysMap := getVKResp.Body["virtual_keys"].(map[string]interface{})

 	getBudgetsResp := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/budgets?from_memory=true",
 	})

+	if getBudgetsResp.StatusCode != 200 {
+		t.Fatalf("Failed to get budgets: status %d", getBudgetsResp.StatusCode)
+	}

 	budgetsMap := getBudgetsResp.Body["budgets"].(map[string]interface{})

521-523: Same fragile string conversion issue.

string(rune('0'+i)) at line 522 has the same issue.

🔎 Proposed fix
-					Content: "Persistence test request " + string(rune('0'+i)) + ".",
+					Content: "Persistence test request " + strconv.Itoa(i) + ".",

398-400: Fragile string conversion for request numbering.

string(rune('0'+i)) only produces correct digit characters for i in range 0-9.

🔎 Proposed fix
-					Content: "Test request " + string(rune('0'+i)) + " for reset ticker test.",
+					Content: "Test request " + strconv.Itoa(i) + " for reset ticker test.",

Add "strconv" to the imports.

plugins/governance/rate_limit_enforcement_test.go (1)

489-491: Fragile string conversion for request numbering.

string(rune('0'+i)) only produces correct digit characters for i in range 0-9. For i >= 10, it yields the characters that follow '9' in the code table (':' for 10, ';' for 11) rather than the number.

🔎 Proposed fix
-					Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+					Content: "Request " + strconv.Itoa(i+1) + " to test both limits.",

Add "strconv" to the imports.

plugins/governance/config_update_sync_test.go (1)

49-66: Missing status check before chained type assertions.

Lines 54-66 perform chained type assertions on getVKResp1.Body and getRateLimitsResp1.Body without checking status codes first. If either request fails, this will panic.

🔎 Proposed fix
 	getVKResp1 := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/virtual-keys?from_memory=true",
 	})

+	if getVKResp1.StatusCode != 200 {
+		t.Fatalf("Failed to get virtual keys: status %d", getVKResp1.StatusCode)
+	}

 	vkData1 := getVKResp1.Body["virtual_keys"].(map[string]interface{})[vkValue].(map[string]interface{})

Apply the same pattern to all GET requests throughout the file (lines 98-112, 151-165, 203-217, etc.).

plugins/governance/e2e_test.go (18)

70-71: Unchecked type assertions may panic.

The type assertions on lines 70-71 and 88-92 will panic if the response structure is unexpected, causing unclear test failures.

Also applies to: 88-92


100-148: Remove shouldStop flag—can exit before observing budget enforcement rejection.

The shouldStop flag (lines 100, 141-147) causes early loop exit before the budget-exceeded request is sent, as flagged in prior reviews.


277-278: Unchecked type assertions may panic.

Same pattern as line 70—will panic on unexpected response structure.


288-351: Remove shouldStop flag in TestFullBudgetHierarchyEnforcement.

Same pattern as line 100—can cause early exit before budget rejection is observed.


302-302: Fragile string conversion.

The expression string(rune('0'+requestNum%10)) keeps only the last digit of requestNum, so requests 1 and 11 receive the same label. Use fmt.Sprintf instead.
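
To make the difference concrete, here is a small standalone sketch. The function names are illustrative only and do not exist in the test suite; they mirror the two flagged expressions and the suggested replacement.

```go
package main

import "fmt"

// offsetLabel mirrors string(rune('0'+i)): correct only for 0 through 9.
func offsetLabel(i int) string { return string(rune('0' + i)) }

// lastDigitLabel mirrors string(rune('0'+n%10)): keeps only the last digit.
func lastDigitLabel(n int) string { return string(rune('0' + n%10)) }

// label formats the whole number, which is what the request text needs.
func label(n int) string { return fmt.Sprintf("%d", n) }

func main() {
	for _, n := range []int{7, 10, 12} {
		fmt.Printf("n=%d offset=%q lastDigit=%q sprintf=%q\n",
			n, offsetLabel(n), lastDigitLabel(n), label(n))
	}
}
```

strconv.Itoa(n) is an equivalent choice when no other formatting is needed.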


390-391: Unchecked type assertions chain may panic.

Multiple chained type assertions without checking ok can cause panics if the response structure is unexpected, as previously flagged.

Also applies to: 401-414


533-534: Unchecked type assertions may panic.

Same pattern—will panic on unexpected response structure.


682-683: Unchecked type assertions may panic.

Same pattern—will panic on unexpected response structure.


798-799: Unchecked type assertions may panic.

Same pattern—will panic on unexpected response structure.


815-842: Goroutine calling MakeRequest which uses t.Fatalf internally.

MakeRequest calls t.Fatalf on errors. Fatalf must be called from the goroutine running the test; when it is called from a spawned goroutine, only that goroutine exits and the test keeps running, so the failure is not handled reliably. This issue was flagged in past reviews.


895-896: Unchecked type assertions chain may panic.

Multiple chained type assertions without checking ok, as previously flagged.

Also applies to: 906-920


1125-1126: Unchecked type assertions may panic.

Same pattern—will panic on unexpected response structure.


1167-1186: Strengthen assertions for team deletion cascade behavior.

The test logs the VK status after team deletion but doesn't assert expected behavior. Comment on line 1168 states "VK may or may not work depending on implementation," which suggests unclear requirements. A cascade test should verify deterministic behavior.


1227-1228: Unchecked type assertions may panic.

Same pattern—will panic on unexpected response structure.


1345-1346: Unchecked type assertions may panic.

Multiple unchecked type assertions that can panic on unexpected response structure.

Also applies to: 1356-1362


1399-1407: Provider detection logic is still incorrect.

Line 1400 checks strings.Contains(model, "openai") || strings.Contains(model, "gpt-4o"), which means any model containing "gpt-4o" (including "azure/gpt-4o") will be counted as OpenAI. The Azure count will likely be incorrect.

Parse the provider prefix correctly by splitting on "/" and checking the first element.
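
A minimal sketch of prefix-based detection, assuming model strings have the form "provider/model" as in the "azure/gpt-4o" example above. `providerOf` is an illustrative name, not a function in the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// providerOf returns the lowercased text before the first "/" in model,
// or "" when the model string carries no provider prefix.
func providerOf(model string) string {
	provider, _, found := strings.Cut(model, "/")
	if !found {
		return ""
	}
	return strings.ToLower(provider)
}

func main() {
	counts := map[string]int{}
	for _, m := range []string{"openai/gpt-4o", "azure/gpt-4o", "azure/gpt-4o"} {
		counts[providerOf(m)]++
	}
	fmt.Println("openai:", counts["openai"], "azure:", counts["azure"])
}
```

With this approach "azure/gpt-4o" is attributed to Azure even though the model name contains "gpt-4o".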


1474-1475: Unchecked type assertions may panic.

Same pattern—will panic on unexpected response structure.


1486-1515: Add assertions to verify fallback mechanism with provider detection.

The test only counts successful responses and logs "likely via OpenAI fallback" without verifying which provider actually handled the requests. Extract and parse the model field from responses to detect which provider handled each request, as previously suggested.

plugins/governance/test_utils.go (6)

63-73: Make CalculateCost fail fast instead of returning an ignored error.

All call sites use cost, _ := CalculateCost(...), so an unknown model silently returns 0 and tests can appear to pass while budgets are never consumed. This was flagged in a prior review.


90-141: Add an HTTP client timeout and t.Helper() to MakeRequest.

The function creates a bare http.Client with no timeout, which can cause test hangs. Add a 30-second timeout and t.Helper() for proper test failure attribution, as previously recommended.


144-152: Remove deprecated rand.Seed call.

rand.Seed is deprecated since Go 1.20 and the global random source is automatically seeded. Calling Seed on every invocation in parallel tests repeatedly resets the shared generator, so tests that seed with the same value can receive identical sequences.
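
For illustration, a helper that relies on the automatically seeded global source might look like the sketch below. `randomSuffix` and the `letters` alphabet are assumptions made for this example; the actual helper at lines 144-152 may differ. It requires Go 1.20 or later for the automatic seeding.

```go
package main

import (
	"fmt"
	"math/rand"
)

// letters is an illustrative alphabet for generated test names.
const letters = "abcdefghijklmnopqrstuvwxyz0123456789"

// randomSuffix returns n random characters drawn from letters.
// It never calls rand.Seed: the top-level math/rand functions are safe for
// concurrent use and, from Go 1.20, are seeded randomly at program start.
func randomSuffix(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	fmt.Println("test-vk-" + randomSuffix(8))
}
```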


247-268: ExtractIDFromResponse ignores keyPath parameter (misleading API).

The function accepts keyPath but never uses it, only checking hardcoded wrappers. This creates a misleading API where callers might think they can specify custom paths.


299-328: GlobalTestData is not thread-safe for parallel tests.

Since tests use t.Parallel(), multiple goroutines may call AddVirtualKey, AddTeam, or AddCustomer concurrently. Appending to slices without synchronization is a data race.


330-367: Cleanup should also be protected by mutex.

If Cleanup runs while another goroutine is adding IDs, there's a race. Also consider adding t.Helper() for better stack traces.

🧹 Nitpick comments (12)
plugins/governance/store_test.go (1)

198-210: Direct store manipulation for test setup.

Lines 201-207 directly access store.budgets to modify budget usage for testing. While this bypasses encapsulation, it's a pragmatic approach for unit tests that need to simulate specific budget states. Consider adding a test helper method if this pattern is used frequently.

plugins/governance/customer_budget_test.go (1)

30-34: Add status check before type assertion on response body.

The existing check already rejects any non-200 status, but a 200 response with an unexpected body could still yield an empty ID. Verify the extracted ID before adding it to testData so the failure surfaces here rather than in later requests.

🔎 Suggested improvement
 	if createCustomerResp.StatusCode != 200 {
 		t.Fatalf("Failed to create customer: status %d", createCustomerResp.StatusCode)
 	}

 	customerID := ExtractIDFromResponse(t, createCustomerResp, "id")
+	if customerID == "" {
+		t.Fatalf("Failed to extract customer ID from response")
+	}
 	testData.AddCustomer(customerID)
plugins/governance/rate_limit_enforcement_test.go (3)

69-125: shouldStop early-exit pattern may skip enforcement verification.

Similar to budget tests, the shouldStop flag (lines 69, 118-124) can cause the loop to exit before a subsequent request triggers the actual rate-limit rejection. However, the test does have a final t.Fatalf at line 127-128 that catches if enforcement was never observed.

The current logic is: if shouldStop is set after consumedTokens >= tokenLimit, the loop breaks on the next iteration before making another request. This means the test relies on the limit being exceeded mid-request rather than verifying a subsequent request is blocked.

🔎 Suggested fix

Remove the shouldStop pattern to ensure the loop continues until an actual rejection is observed:

 	consumedTokens := int64(0)
 	requestNum := 1
-	shouldStop := false

 	for requestNum <= 20 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedTokens >= tokenLimit {
-			shouldStop = true
-		}
 	}

280-336: Same shouldStop early-exit pattern in provider token rate limit test.

Apply the same fix to TestProviderConfigTokenRateLimitEnforcement at lines 280, 329-335.


610-614: Test verifies rate_limit_id existence but doesn't assert meaningful usage tracking.

The test logs whether rateLimitID is empty or not, but doesn't actually verify that usage counters were incremented. Consider asserting that the rate limit's token_current_usage field is greater than zero after the request.

plugins/governance/usage_tracking_test.go (1)

112-128: Rate limit reset test lacks assertion on actual reset behavior.

The test waits 35 seconds for the reset but only logs "Rate limit reset should have occurred" without verifying that token_current_usage actually decreased to 0. Consider reading the rate limit data after reset and asserting usage is near zero.

🔎 Suggested assertion
// After getting getDataResp3
virtualKeysMap3 := getDataResp3.Body["virtual_keys"].(map[string]interface{})
vkData3 := virtualKeysMap3[vkValue].(map[string]interface{})
rateLimitID3, _ := vkData3["rate_limit_id"].(string)

getRateLimitsResp3 := MakeRequest(t, APIRequest{
    Method: "GET",
    Path:   "/api/governance/rate-limits?from_memory=true",
})

if getRateLimitsResp3.StatusCode == 200 {
    rateLimitsMap3 := getRateLimitsResp3.Body["rate_limits"].(map[string]interface{})
    if rateLimit3, ok := rateLimitsMap3[rateLimitID3].(map[string]interface{}); ok {
        tokenUsage, _ := rateLimit3["token_current_usage"].(float64)
        if tokenUsage > 0 {
            t.Fatalf("Rate limit not reset: token_current_usage is %d (expected 0)", int64(tokenUsage))
        }
        t.Logf("Rate limit correctly reset to 0 ✓")
    }
}
plugins/governance/config_update_sync_test.go (2)

54-66: Unsafe chained type assertions may cause test panics.

Multiple chained type assertions without ok checks (e.g., getVKResp1.Body["virtual_keys"].(map[string]interface{})[vkValue].(map[string]interface{})) can panic if the response structure is unexpected. Consider using defensive extraction with ok checks or a helper function.

🔎 Suggested pattern
virtualKeysMap, ok := getVKResp1.Body["virtual_keys"].(map[string]interface{})
if !ok {
    t.Fatalf("Failed to extract virtual_keys from response")
}
vkData1, ok := virtualKeysMap[vkValue].(map[string]interface{})
if !ok {
    t.Fatalf("VK %s not found in response", vkValue)
}

This pattern appears throughout the file and would benefit from a shared extraction helper.
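
One way such a shared helper might look, as a sketch only: `nestedMap` is a hypothetical name, and it returns an error rather than calling t.Fatalf so it can be shown standalone. A test wrapper could call t.Helper() and fail on the returned error.

```go
package main

import "fmt"

// nestedMap walks body through keys, expecting a JSON object at every step.
// It reports the first key that is missing or has the wrong type instead of
// panicking the way a chained type assertion would.
func nestedMap(body map[string]interface{}, keys ...string) (map[string]interface{}, error) {
	cur := body
	for _, k := range keys {
		next, ok := cur[k].(map[string]interface{})
		if !ok {
			return nil, fmt.Errorf("key %q is missing or not an object", k)
		}
		cur = next
	}
	return cur, nil
}

func main() {
	body := map[string]interface{}{
		"virtual_keys": map[string]interface{}{
			"vk-1": map[string]interface{}{"rate_limit_id": "rl-1"},
		},
	}
	vk, err := nestedMap(body, "virtual_keys", "vk-1")
	fmt.Println(vk["rate_limit_id"], err)

	_, err = nestedMap(body, "budgets")
	fmt.Println(err)
}
```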


229-233: Clarify or strengthen the "higher limit" branch assertion.

The comment says "usage should NOT reset" when updating to a higher limit, but by this point usage was already reset to 0 by the previous lower-limit update. The assertion at lines 230-232 only logs if usage is non-zero rather than failing. This doesn't verify the intended "preservation" behavior.

plugins/governance/advanced_scenarios_test.go (2)

14-172: Heavy/long-running test with t.Parallel() may cause CI flakiness.

This test loops up to 150 requests to exhaust a budget. Combined with t.Parallel() across many similar tests, this can lead to slow, flaky CI runs due to provider variability or resource contention. Consider gating these behind an integration build tag or env var.

🔎 Suggested approach

Add a build tag or runtime check:

//go:build integration
// +build integration

// Or at runtime:
func TestVKSwitchTeamAfterBudgetExhaustion(t *testing.T) {
    if os.Getenv("RUN_INTEGRATION_TESTS") != "1" {
        t.Skip("Skipping integration test; set RUN_INTEGRATION_TESTS=1 to enable")
    }
    // ...
}

92-133: Budget exhaustion loop may complete without observing rejection.

The loop (lines 92-129) breaks when CheckErrorMessage(t, resp, "budget") returns true, but if the loop completes all 150 iterations without a rejection, the test only fails at line 131-132 based on consumedBudget < team1Budget. This doesn't confirm a budget rejection was actually observed.

Consider tracking sawBudgetRejection to ensure enforcement was witnessed:

🔎 Suggested pattern
sawBudgetRejection := false
for requestNum <= 150 {
    // ... request logic ...
    if resp.StatusCode >= 400 {
        if CheckErrorMessage(t, resp, "budget") {
            sawBudgetRejection = true
            t.Logf("Team1 budget exhausted at request %d", requestNum)
            break
        }
        // ...
    }
    // ...
}

if !sawBudgetRejection {
    t.Fatalf("Did not observe budget rejection after %d requests (consumed: $%.6f)", requestNum-1, consumedBudget)
}
plugins/governance/e2e_test.go (1)

1277-1297: Strengthen assertions for VK deletion behavior.

The test correctly verifies the VK is removed from in-memory store (lines 1262-1275) but then only logs the result when trying to use the deleted VK (lines 1277-1297). Assert that the request is rejected with an appropriate error.

plugins/governance/test_utils.go (1)

369-424: Add t.Helper() to polling utilities for better stack traces.

Both WaitForCondition and WaitForAPICondition are test helpers that log and should call t.Helper() at the start so failures are attributed to the calling test code.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b15b300 and cd49e9e.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/changelog.md
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/tracker_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • plugins/governance/changelog.md
  • plugins/governance/edge_cases_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/go.mod
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/store_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/resolver_test.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/store_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/resolver_test.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/store_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/resolver_test.go
🧬 Code graph analysis (10)
plugins/governance/in_memory_sync_test.go (1)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
plugins/governance/customer_budget_test.go (1)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • CreateTeamRequest (183-187)
plugins/governance/usage_tracking_test.go (1)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • BudgetRequest (177-180)
plugins/governance/advanced_scenarios_test.go (5)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • UpdateVirtualKeyRequest (210-218)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/store_test.go (5)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (75-94)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (720-727)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
ui/lib/types/governance.ts (4)
  • Customer (37-44)
  • Team (27-35)
  • RateLimit (13-25)
  • Budget (5-11)
plugins/governance/fixtures_test.go (3)
core/schemas/logger.go (2)
  • LogLevel (6-6)
  • LoggerOutputType (17-17)
ui/lib/types/governance.ts (3)
  • Budget (5-11)
  • RateLimit (13-25)
  • VirtualKey (61-80)
plugins/governance/resolver.go (1)
  • Decision (15-15)
plugins/governance/config_update_sync_test.go (3)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
  • CreateTeamRequest (183-187)
  • WaitForCondition (371-394)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
plugins/governance/test_utils.go (1)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
plugins/governance/rate_limit_enforcement_test.go (1)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
plugins/governance/team_budget_test.go (2)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (17)
plugins/governance/fixtures_test.go (1)

70-221: Well-structured test fixtures.

The test data builders follow consistent patterns and provide good abstractions for creating governance entities with various configurations. The assertion helpers appropriately use t.Helper() for cleaner stack traces.

plugins/governance/resolver_test.go (3)

15-37: Good coverage of the happy path.

The test properly initializes the store, creates a virtual key, and validates that an allowed request returns DecisionAllow with the virtual key populated.


39-294: Comprehensive resolver test coverage.

The tests cover a good range of scenarios: VK not found, VK blocked, provider/model filtering, rate limit exceeded (tokens and requests), rate limit expiration/reset, budget exceeded, and budget expiration. The tests are well-organized and use consistent patterns.


403-551: Well-structured table-driven tests and context verification.

The IsProviderAllowed and IsModelAllowed tests use idiomatic table-driven patterns. The context population test properly verifies that governance context values are set correctly after request evaluation.

plugins/governance/team_budget_test.go (1)

8-65: Good test setup for shared budget enforcement.

The test properly creates a team with a small budget ($0.01) and two VKs under that team, then alternates requests between them. The setup includes proper cleanup and resource tracking.

plugins/governance/in_memory_sync_test.go (2)

8-130: Solid in-memory synchronization test for VK updates.

The test properly creates a VK, verifies it exists in-memory, updates the budget, waits for async propagation, and verifies the in-memory store reflects the change. Status checks are in place for most API calls.


431-554: Good endpoint consistency verification with graceful SQLite handling.

The test properly acknowledges that count mismatches can occur due to parallel tests causing SQLite locking, logging a warning instead of failing. This is pragmatic for parallel integration tests.

plugins/governance/store_test.go (4)

17-66: Well-structured table-driven test for VK retrieval.

The test covers active VKs, inactive VKs, and non-existent VKs with proper assertions for both the exists return value and the VK data.


68-101: Good concurrent read stress test.

The test launches 100 goroutines each performing 100 reads, validating that the store handles concurrent access correctly with zero errors.


212-250: Good atomic rate limit update test.

The test properly verifies that token and request usage can be updated independently and that the in-memory state reflects the changes correctly.


326-351: Good budget retrieval and utility tests.

The GetAllBudgets test verifies that multiple budgets can be stored and retrieved. The ptrInt64 utility is a common and useful pattern for creating pointer values in tests.

plugins/governance/customer_budget_test.go (1)

1-6: LGTM on package and imports.

The package declaration and imports are appropriate for this test file.

plugins/governance/rate_limit_enforcement_test.go (1)

1-6: LGTM on package and imports.

The package declaration and imports are appropriate.

plugins/governance/usage_tracking_test.go (1)

1-6: LGTM on package and imports.

The package declaration and imports are appropriate.

plugins/governance/config_update_sync_test.go (1)

1-6: LGTM on package and imports.

The package declaration and imports are appropriate.

plugins/governance/advanced_scenarios_test.go (1)

1-6: LGTM on package and imports.

The package declaration and imports are appropriate.

plugins/governance/test_utils.go (1)

294-297: LGTM - case-insensitive matching now implemented correctly.

The contains function now correctly implements case-insensitive matching using strings.ToLower on both strings, which aligns with the comment.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (34)
plugins/governance/resolver_test.go (1)

332-346: Add a test helper method to set budget usage instead of directly manipulating internal state.

Line 337 bypasses encapsulation by directly calling store.budgets.Store(), coupling the test to implementation details. This issue was previously flagged.

🔎 Recommended fix: Add test helper method

Add this method to LocalGovernanceStore:

// SetBudgetCurrentUsageForTest is a test-only helper to set budget usage without tracking actual cost
func (gs *LocalGovernanceStore) SetBudgetCurrentUsageForTest(budgetID string, usage float64) error {
	if val, ok := gs.budgets.Load(budgetID); ok {
		budget := val.(*configstoreTables.TableBudget)
		budget.CurrentUsage = usage
		gs.budgets.Store(budgetID, budget)
		return nil
	}
	return fmt.Errorf("budget not found: %s", budgetID)
}

Then update the test:

-	// Get the governance data to update the budget directly
-	governanceData := store.GetGovernanceData()
-	vkBudgetToUpdate := governanceData.Budgets["vk-budget"]
-	if vkBudgetToUpdate != nil {
-		vkBudgetToUpdate.CurrentUsage = 100.0
-		store.budgets.Store("vk-budget", vkBudgetToUpdate)
-	}
+	// Update budget usage for test scenario
+	err = store.SetBudgetCurrentUsageForTest("vk-budget", 100.0)
+	require.NoError(t, err)
plugins/governance/test_utils.go (6)

247-268: keyPath parameter is ignored (misleading API).

The function accepts keyPath but never uses it—only checking hardcoded wrappers. This creates a misleading API where callers think they can specify custom paths.

Either implement keyPath traversal or remove the unused parameter:

-func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+func ExtractIDFromResponse(t *testing.T, resp *APIResponse) string {
+	t.Helper()

90-141: Add t.Helper() and HTTP client timeout to prevent test hangs.

MakeRequest is missing two standard test helper patterns:

  1. No t.Helper(): failures report line numbers inside the helper instead of the calling test
  2. No timeout on http.Client: tests hang indefinitely if the server is unavailable

 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
+	t.Helper()
-	client := &http.Client{}
+	client := &http.Client{Timeout: 30 * time.Second}
 	url := fmt.Sprintf("http://localhost:8080%s", req.Path)

143-152: Remove deprecated rand.Seed call.

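rand.Seed has been deprecated since Go 1.20, and the global random source is now seeded automatically. Calling Seed on every invocation from parallel tests also lets goroutines reseed the shared source in between each other's Intn calls, so two calls that seed with the same timestamp can produce the same ID.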

 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }

299-328: GlobalTestData is not thread-safe for parallel tests.

Since tests use t.Parallel(), multiple goroutines may call AddVirtualKey, AddTeam, or AddCustomer concurrently. Appending to slices without synchronization is a data race.

+import "sync"

 type GlobalTestData struct {
+	mu          sync.Mutex
 	VirtualKeys []string
 	Teams       []string
 	Customers   []string
 }

 func (g *GlobalTestData) AddVirtualKey(id string) {
+	g.mu.Lock()
+	defer g.mu.Unlock()
 	g.VirtualKeys = append(g.VirtualKeys, id)
 }

Apply the same pattern to AddTeam and AddCustomer.
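Applied to all three methods, the pattern might look like the sketch below. The struct fields and the Add* method names come from the diff above; Counts and addFromGoroutines are hypothetical helpers added only so the sketch can be exercised.

```go
package main

import "sync"

// GlobalTestData mirrors the struct in the diff above; mu guards all slices.
type GlobalTestData struct {
	mu          sync.Mutex
	VirtualKeys []string
	Teams       []string
	Customers   []string
}

func (g *GlobalTestData) AddVirtualKey(id string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.VirtualKeys = append(g.VirtualKeys, id)
}

func (g *GlobalTestData) AddTeam(id string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.Teams = append(g.Teams, id)
}

func (g *GlobalTestData) AddCustomer(id string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.Customers = append(g.Customers, id)
}

// Counts is a hypothetical helper that reads the slice lengths under the lock.
func (g *GlobalTestData) Counts() (vks, teams, customers int) {
	g.mu.Lock()
	defer g.mu.Unlock()
	return len(g.VirtualKeys), len(g.Teams), len(g.Customers)
}

// addFromGoroutines is a hypothetical driver that registers one ID of each
// kind from n goroutines, imitating parallel tests sharing the tracker.
func addFromGoroutines(g *GlobalTestData, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.AddVirtualKey("vk")
			g.AddTeam("team")
			g.AddCustomer("customer")
		}()
	}
	wg.Wait()
}
```

With the mutex in place, every append is retained regardless of scheduling. Without it, concurrent appends may be lost and the race detector would report the access.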


330-367: Cleanup should also be protected by mutex and use t.Helper().

If Cleanup runs while another goroutine is adding IDs, there's a race. Add mutex protection and t.Helper() for better stack traces.

 func (g *GlobalTestData) Cleanup(t *testing.T) {
+	t.Helper()
+	g.mu.Lock()
+	defer g.mu.Unlock()
+
 	// Delete virtual keys

63-73: CalculateCost errors are silently ignored at all call sites.

All call sites use cost, _ := CalculateCost(...), so an unknown model silently returns 0 and tests appear to pass while budgets are never consumed. Consider either panicking on unknown models or updating call sites to require.NoError(t, err).

plugins/governance/usage_tracking_test.go (6)

337-346: Test verifies rate_limit_id existence but VK was created without rate limit.

TestInMemoryUsageUpdateOnRequest creates a VK with no rate limit configuration (Lines 264-271), so rate_limit_id will always be empty. The assertion at Lines 339-343 is effectively a no-op. Consider adding a rate limit to the VK or verifying budget tracking instead.


112-128: Rate limit reset test lacks assertion on actual reset behavior.

The test waits 35 seconds for the reset but only logs "Rate limit reset should have occurred" without verifying actual values. Add an assertion that token_current_usage has been reset to 0:

// After getDataResp3
virtualKeysMap3 := getDataResp3.Body["virtual_keys"].(map[string]interface{})
vkData3 := virtualKeysMap3[vkValue].(map[string]interface{})
rateLimitID3, _ := vkData3["rate_limit_id"].(string)

getRateLimitsResp := MakeRequest(t, APIRequest{
    Method: "GET",
    Path:   "/api/governance/rate-limits?from_memory=true",
})
rateLimitsMap := getRateLimitsResp.Body["rate_limits"].(map[string]interface{})
if rl, ok := rateLimitsMap[rateLimitID3].(map[string]interface{}); ok {
    tokenUsage, _ := rl["token_current_usage"].(float64)
    if tokenUsage > 0 {
        t.Fatalf("Rate limit not reset: token_current_usage is %.0f (expected 0)", tokenUsage)
    }
}
t.Logf("Rate limit correctly reset to 0 ✓")

165-178: Missing status checks before type assertions.

getVKResp and getBudgetsResp are accessed without checking StatusCode, which could panic on non-200 responses.

 	getVKResp := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/virtual-keys?from_memory=true",
 	})
+
+	if getVKResp.StatusCode != 200 {
+		t.Fatalf("Failed to get virtual keys: status %d", getVKResp.StatusCode)
+	}

 	virtualKeysMap := getVKResp.Body["virtual_keys"].(map[string]interface{})

 	getBudgetsResp := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/budgets?from_memory=true",
 	})
+
+	if getBudgetsResp.StatusCode != 200 {
+		t.Fatalf("Failed to get budgets: status %d", getBudgetsResp.StatusCode)
+	}

 	budgetsMap := getBudgetsResp.Body["budgets"].(map[string]interface{})

396-401: Fragile string conversion for request numbering.

string(rune('0'+i)) only produces correct digit characters for i in the range 0-9. For i >= 10 it yields the ASCII characters that follow '9' (':' for 10, ';' for 11, and so on) rather than a decimal number.

+import "strconv"
+
 // in the loop:
-			Content: "Test request " + string(rune('0'+i)) + " for reset ticker test.",
+			Content: "Test request " + strconv.Itoa(i) + " for reset ticker test.",

519-524: Same fragile string conversion issue.

-			Content: "Persistence test request " + string(rune('0'+i)) + ".",
+			Content: "Persistence test request " + strconv.Itoa(i) + ".",

Add "strconv" to the imports.
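To make the failure mode concrete, the following minimal sketch compares the flagged conversion with strconv.Itoa. The function names fragileLabel and safeLabel are hypothetical and exist only for this illustration.

```go
package main

import "strconv"

// fragileLabel reproduces the flagged pattern: it offsets the code point of
// '0' by i, which is a digit only while i stays within 0..9.
func fragileLabel(i int) string {
	return string(rune('0' + i))
}

// safeLabel formats any int as a decimal string.
func safeLabel(i int) string {
	return strconv.Itoa(i)
}
```

For i = 9 both functions return "9". For i = 10, fragileLabel returns ":" (the character after '9' in ASCII) while safeLabel returns "10".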


413-426: Missing status checks before type assertions.

Same issue: getVKResp and getBudgetsResp lack status checks before accessing body fields, which could cause panics.

plugins/governance/tracker_test.go (2)

15-16: Mismatched comment and function name.

The comment says TestUsageTracker_UpdateUsage_Successful but the function is TestUsageTracker_UpdateUsage_FailedRequest and tests that failed requests don't update budgets (Line 36: Success: false).

-// TestUsageTracker_UpdateUsage_Successful tests successful usage tracking
+// TestUsageTracker_UpdateUsage_FailedRequest tests that failed requests don't update budgets
 func TestUsageTracker_UpdateUsage_FailedRequest(t *testing.T) {

154-155: Mismatched comment and function name.

The comment says TestUsageTracker_UpdateBudgetHierarchy but the function is named TestUsageTracker_Cleanup.

-// TestUsageTracker_UpdateBudgetHierarchy tests multi-level budget updates
+// TestUsageTracker_Cleanup tests cleanup behavior
 func TestUsageTracker_Cleanup(t *testing.T) {
plugins/governance/vk_budget_test.go (1)

46-127: Remove shouldStop flag to ensure budget enforcement is actually tested.

The shouldStop logic causes the loop to exit before sending a request that would trigger server-side budget rejection. When consumedBudget >= vkBudget, shouldStop is set, and on the next iteration the loop breaks—but the over-budget request is never sent to verify enforcement.

Let the API's enforcement (or the 50-request cap) terminate the loop:

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= vkBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/team_budget_test.go (1)

66-156: Remove shouldStop flag so the test actually exercises team-budget enforcement.

Same issue as in vk_budget_test.go: the shouldStop flag can end the loop before a request is sent that would trigger the expected 4xx "budget exceeded" response.

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false
 	vkIndex := 0

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
 		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= teamBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/rate_limit_enforcement_test.go (3)

481-505: Replace fragile string(rune('0'+i)) with proper integer formatting.

In TestProviderAndVKRateLimitBothEnforced, the request content uses:

Content: "Request " + string(rune('0'+i)) + " to test both limits.",

This only produces correct digits for i in [0,9] and is generally unclear.

Use fmt.Sprintf or strconv.Itoa instead (you already import fmt in other files):

-						Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+						Content: fmt.Sprintf("Request %d to test both limits.", i+1),

(plus import "fmt" at the top if not already present).


521-615: Strengthen TestRateLimitInMemoryUsageTracking by asserting usage counters, not just ID presence.

This test currently only asserts that rate_limit_id exists and logs that it is “configured and tracking usage”, but it never verifies:

  • That any token/request usage increased after the request, or
  • That the usage values in /api/governance/rate-limits?from_memory=true match expectations.

To make this a real guardrail:

  • Fetch the rate-limit entry from the rate-limits map by rateLimitID.
  • Read token_current_usage and/or request_current_usage as float64.
  • Assert they are > 0 (or >= tokensUsed from the response), failing the test if not.

Sketch:

rateLimitsResp := MakeRequest(t, APIRequest{Method: "GET", Path: "/api/governance/rate-limits?from_memory=true"})
if rateLimitsResp.StatusCode != 200 {
	t.Fatalf("Failed to get rate limits: status=%d", rateLimitsResp.StatusCode)
}
rateLimits, ok := rateLimitsResp.Body["rate_limits"].(map[string]interface{})
if !ok {
	t.Fatalf("rate_limits missing or wrong type in response")
}
rl, ok := rateLimits[rateLimitID].(map[string]interface{})
if !ok {
	t.Fatalf("rate limit %s not found in map", rateLimitID)
}
usage, _ := rl["token_current_usage"].(float64)
if usage <= 0 {
	t.Fatalf("Token usage not tracked: expected > 0, got %.0f", usage)
}

That way, the test fails if in-memory usage tracking stops working instead of silently passing.


67-125: Drop shouldStop so token-limit tests actually exercise the POST‑HOC blocked request.

Both TestVirtualKeyTokenRateLimitEnforcement and TestProviderConfigTokenRateLimitEnforcement describe POST‑HOC behavior (“request that exceeds is allowed, next is blocked”) but the loops use shouldStop to break as soon as consumedTokens >= tokenLimit / providerTokenLimit (Lines 69-124 and 280-335), before necessarily sending that next request.

This means the tests:

  • May never send the request that should be rejected after the limit is exceeded, and
  • Rely on the excess request itself being rejected, which contradicts the documented semantics.

Simplify the loops to:

  • Remove the shouldStop flag entirely.
  • Keep the existing early return when a rate-limit rejection is actually observed.
  • Let the bounded for terminate naturally and hit the final t.Fatalf(...) if no rate-limit response is ever seen.

Conceptually:

-	consumedTokens := int64(0)
-	requestNum := 1
-	shouldStop := false
+	consumedTokens := int64(0)
+	requestNum := 1
@@
-		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedTokens >= tokenLimit {
-			shouldStop = true
-		}
+		requestNum++

Apply the same shape in the provider-token enforcement test.

Also applies to: 278-336
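The POST-HOC semantics described in this comment can be illustrated with a small simulation. This is a sketch under assumptions, not the server's implementation: postHocLimiter and firstRejection are hypothetical names, and the limiter is assumed to check usage before a request and record tokens only afterwards.

```go
package main

// postHocLimiter is a hypothetical stand-in for a token rate limiter that
// enforces after the fact: the request that crosses the limit is still
// served, and only later requests are rejected.
type postHocLimiter struct {
	limit int64
	used  int64
}

// serve reports whether the request was accepted and records its tokens
// when it was.
func (l *postHocLimiter) serve(tokens int64) bool {
	if l.used >= l.limit {
		return false // earlier requests already reached the limit
	}
	l.used += tokens
	return true
}

// firstRejection sends up to maxRequests requests and returns the 1-based
// number of the first rejected one, or 0 if the bound was reached without
// observing a rejection (the case where a test should fail).
func firstRejection(limit, tokensPerRequest int64, maxRequests int) int {
	l := &postHocLimiter{limit: limit}
	for n := 1; n <= maxRequests; n++ {
		if !l.serve(tokensPerRequest) {
			return n
		}
	}
	return 0
}
```

With a limit of 100 tokens and 40 tokens per request, request 3 crosses the limit and is still served, and request 4 is the first rejection. A loop bounded at 3 requests never observes a rejection, which is why the bound must leave room for at least one request after the limit is crossed.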

plugins/governance/advanced_scenarios_test.go (5)

1245-1271: Avoid manual-only deletion; always register created entities with testData for fail-safe cleanup.

Several tests deliberately skip testData.Add*:

  • TestTeamDeletionDeletesBudget (Line 1270: “Don’t add to testData since we’ll delete manually”)
  • TestCustomerDeletionDeletesBudget (Line 1355)
  • TestTeamDeletionSetsVKTeamIDToNil (Line 1440)
  • TestCustomerDeletionSetsVKCustomerIDToNil (Line 1538)

If any of these tests fail or panic before reaching their explicit DELETE calls, the created teams/customers will leak and can pollute subsequent tests.

The GlobalTestData cleanup already tolerates 404s, so double-deletion is safe. Prefer always registering created entities:

 teamID := ExtractIDFromResponse(t, createTeamResp, "id")
-// Don't add to testData since we'll delete manually
+testData.AddTeam(teamID)

(and similarly for customers).

This keeps resource lifecycle consistent and makes cleanup robust even under early failures.

Also applies to: 1330-1356, 1419-1441, 1517-1539


15-171: Gate or de‑parallelize these long‑running 150‑request scenarios to avoid flaky, slow CI.

Scenarios 1–7 (VK/Team/Customer/provider budget switching and “update after exhaustion” tests) all:

  • Use t.Parallel(), and
  • Can send up to 150 real /v1/chat/completions calls each, often with long prompts.

Running all of these in parallel in default CI is:

  • Slow and costly, and
  • Prone to flakes due to provider variability, network issues, or rate/budget limits outside the code under test.

Consider one (or both) of:

  • Remove t.Parallel() from these heavy tests so they run serially within the package; or

  • Gate them behind an integration flag/build tag, e.g.:

    if os.Getenv("INTEGRATION_TESTS") != "1" {
        t.Skip("Skipping heavy governance integration scenario; enable with INTEGRATION_TESTS=1")
    }

and/or a //go:build integration tag at the top of the file.

This keeps the core unit/fast integration tests reliable while still allowing full scenarios to run in specialized pipelines.

Also applies to: 180-334, 342-528, 535-933, 940-1074


92-133: Add explicit sawBudgetRejection assertions so “after exhaustion” scenarios don’t pass without ever seeing a budget error.

Across scenarios:

  • TestVKSwitchTeamAfterBudgetExhaustion (Lines 92-133)
  • TestVKSwitchCustomerAfterBudgetExhaustion (257-297)
  • TestHierarchicalChainBudgetSwitch (455-491)
  • TestVKBudgetUpdateAfterExhaustion (571-604)
  • TestTeamBudgetUpdateAfterExhaustion (706-738)
  • TestCustomerBudgetUpdateAfterExhaustion (859-891)
  • TestProviderConfigBudgetUpdateAfterExhaustion (994-1026)

the pattern is:

  • Loop up to 150 times trying to “exhaust” a budget.
  • Break on a budget error, otherwise just track consumedBudget.
  • After the loop, either check only consumedBudget or proceed directly to a budget update/switch without asserting that a budget‑rejection response was ever observed.

If the API never returns a budget error (e.g., regression in enforcement or pricing changes), these tests can still pass after 150 successful requests, especially in the “update after exhaustion” tests (Scenarios 4–7).

Strengthen all of these loops by:

  1. Introducing a sawBudgetRejection := false flag before the loop.

  2. Setting it to true when CheckErrorMessage(t, resp, "budget") is hit and then break.

  3. After the loop, asserting:

    if !sawBudgetRejection {
        t.Fatalf("Did not observe budget exhaustion before %s step (requests=%d, consumed=$%.6f)",
            "<switch / update description>", requestNum-1, consumedBudget)
    }

Only then proceed to VK/team/customer/provider-config switch or budget update logic. This matches the test names (“after exhaustion”) and ensures regressions in enforcement cannot silently slip through.

Also applies to: 257-297, 455-491, 571-604, 706-738, 859-891, 994-1026


1080-1235: Make rate‑limit cascade behavior in TestVKDeletionCascadeComplete consistent with 1:1 invariants.

This test correctly asserts that:

  • The VK is removed from memory, and
  • Both VK‑level and provider‑config budgets are gone (Lines 1168-1228).

But for rate limits, it still allows orphans:

_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
if vkRateLimitStillExists || pcRateLimitStillExists {
	t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
}

Per the stored invariant (budgets and rate limits are 1:1 with their parent entities), rate limits should not remain once the only owning VK/provider-config is deleted. Accepting orphans contradicts that model and weakens this test as a guardrail. Based on learnings, ...

Change this to fail when rate limits remain, mirroring the budget checks:

-	_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
-	_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
-	if vkRateLimitStillExists || pcRateLimitStillExists {
-		t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
-	}
+	_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
+	_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
+	if vkRateLimitStillExists || pcRateLimitStillExists {
+		t.Fatalf("Rate limits should be cascade-deleted: VK rate limit exists=%v, PC rate limit exists=%v",
+			vkRateLimitStillExists, pcRateLimitStillExists)
+	}

If design intentionally allows rate-limit reuse across entities, the data model and learnings should be updated, and this test should instead assert the documented behavior explicitly.


1245-1328: Deletion-with-budget tests don’t assert budget removal; add checks on /budgets map.

TestTeamDeletionDeletesBudget and TestCustomerDeletionDeletesBudget:

  • Capture budgetID from the in-memory store before deletion (Lines 1290, 1375) and assert it exists initially,
  • Delete the team/customer, then
  • Only check that the team/customer is gone from its own map.

They never assert that budgetID was removed from /api/governance/budgets?from_memory=true, despite the test names claiming “deletes budget”.

Augment each test after verifying team/customer removal:

// Verify budget is also removed
getBudgetsResp2 := MakeRequest(t, APIRequest{
    Method: "GET",
    Path:   "/api/governance/budgets?from_memory=true",
})
budgetsMap2 := getBudgetsResp2.Body["budgets"].(map[string]interface{})
if _, stillExists := budgetsMap2[budgetID]; stillExists {
    t.Fatalf("Budget should be deleted with %s but still exists: %s", "<team/customer>", budgetID)
}
t.Logf("Budget removed from memory ✓")

This ensures cascade deletion behavior for budgets is actually enforced by the system.

Also applies to: 1330-1413

plugins/governance/e2e_test.go (5)

302-302: Replace all string(rune('0'+i))/mod‑10 tricks with fmt.Sprintf / strconv.Itoa.

Several tests build request content using fragile integer‑to‑string hacks:

  • TestFullBudgetHierarchyEnforcement (Line 302)
  • TestRateLimitTokenAndRequestTogether (Line 643)
  • Similar pattern elsewhere in this file and in rate_limit_test.go.

These only work properly for a narrow range of i and are non-obvious.

Prefer:

Content: fmt.Sprintf("Test hierarchy enforcement request %d", requestNum),

or:

Content: "Request number " + strconv.Itoa(i+1) + ".",

(adding the appropriate import at the top).

Also applies to: 643-643, 1380-1385


1366-1425: Fix provider detection in TestWeightedProviderLoadBalancing (don’t treat azure/gpt-4o as OpenAI).

The weighted LB test currently classifies providers via substring matching:

if model, ok := resp.Body["model"].(string); ok {
    if strings.Contains(model, "openai") || strings.Contains(model, "gpt-4o") {
        openaiCount++
    } else if strings.Contains(model, "azure") {
        azureCount++
    }
}

This misclassifies models like "azure/gpt-4o" as OpenAI (because of "gpt-4o"), skewing counts and making the “distribution” assertions unreliable.

Parse the provider prefix explicitly instead:

-		if model, ok := resp.Body["model"].(string); ok {
-			if strings.Contains(model, "openai") || strings.Contains(model, "gpt-4o") {
-				openaiCount++
-				t.Logf("Request %d routed to OpenAI (model: %s)", i+1, model)
-			} else if strings.Contains(model, "azure") {
-				azureCount++
-				t.Logf("Request %d routed to Azure (model: %s)", i+1, model)
-			}
-		}
+		if model, ok := resp.Body["model"].(string); ok {
+			parts := strings.SplitN(model, "/", 2)
+			provider := parts[0]
+			switch provider {
+			case "openai":
+				openaiCount++
+				t.Logf("Request %d routed to OpenAI (model: %s)", i+1, model)
+			case "azure":
+				azureCount++
+				t.Logf("Request %d routed to Azure (model: %s)", i+1, model)
+			}
+		}

This aligns the test with actual provider prefixes and keeps the load-balancing verification meaningful.
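The difference between the two classification strategies can be shown in isolation. The sketch below assumes, as the suggestion above does, that the response's model field has the form "provider/model". The function names are hypothetical, and strings.Cut is used here in place of the strings.SplitN call from the diff.

```go
package main

import "strings"

// providerBySubstring reproduces the substring classification flagged above.
func providerBySubstring(model string) string {
	if strings.Contains(model, "openai") || strings.Contains(model, "gpt-4o") {
		return "openai"
	} else if strings.Contains(model, "azure") {
		return "azure"
	}
	return ""
}

// providerByPrefix returns the text before the first "/", or "" when the
// model string carries no provider prefix.
func providerByPrefix(model string) string {
	provider, _, found := strings.Cut(model, "/")
	if !found {
		return ""
	}
	return provider
}
```

For "azure/gpt-4o", providerBySubstring returns "openai" while providerByPrefix returns "azure". Returning "" for an unprefixed model is a design choice of this sketch: it lets the caller count such responses separately instead of guessing a provider.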


96-148: Remove shouldStop and rely on the loop bounds + explicit budget rejections.

In TestMultipleVKsSharingTeamBudgetFairness and TestFullBudgetHierarchyEnforcement you use:

consumedBudget := 0.0
requestNum := 1
shouldStop := false
...
requestNum++
if shouldStop {
    break
}
if consumedBudget >= teamBudget { // providerBudget in the hierarchy test
    shouldStop = true
}

This can break the loop purely on local consumedBudget estimates, without requiring that:

  • A VK request ever receives a budget‑exceeded error, or
  • A “post‑limit” request is actually blocked (despite comments describing POST‑HOC enforcement).

Both tests already have:

  • Bounded for loops (requestNum <= 150 / <= 20), and
  • Clear success paths that return when a budget error is observed, plus t.Fatalf at the end when no error occurs.

You can safely drop shouldStop and let the loop either:

  • Exit early on the first budget‑rejection response (via existing return/break), or
  • Run to its bound and then t.Fatalf if enforcement never happens.

This makes the tests’ control flow simpler and aligns them with the intended “observe actual server‑side budget enforcement” semantics.

Also applies to: 286-351


1080-1187: Strengthen team/VK deletion cascade e2e tests with real assertions instead of logs.

TestTeamDeletionCascade and TestVKDeletionCascade currently:

  • Verify the pre-delete VK works / exists (good).

  • Perform the DELETE.

  • After a short sleep, only log the status of a subsequent VK request:

    t.Logf("VK request after team deletion: status=%d", resp2.StatusCode)
    ...
    if resp.StatusCode < 400 {
        t.Logf("Deleted VK still accepts requests ... - may be cached in SDK")
    } else {
        t.Logf("Deleted VK request rejected ...")
    }

No assertions are made about expected behavior, so these tests will “pass” regardless of whether:

  • VKs are supposed to be blocked after team deletion, or
  • Deleted VKs are allowed to continue working for some window.

Given the names (“DeletionCascade”), define a deterministic expectation and assert it:

  • If VKs must be disabled after team/VK deletion, require resp.StatusCode >= 400 and check the error message.
  • If VKs are allowed but must no longer bill team/customer budget, assert on in-memory team/customer/budget state instead.
  • In either case, fail the test when observed behavior doesn’t match the chosen policy.

Right now these tests don’t protect anything from regressions.

Also applies to: 1193-1299


1439-1531: Make the fallback provider test actually assert fallback behavior and provider identity.

TestProviderFallbackMechanism:

  • Sends 5 gpt-4o requests through a VK whose primary provider (Anthropic) does not allow gpt-4o and whose fallback (OpenAI) does.
  • Only checks successCount and logs the response model as “likely via OpenAI fallback”.
  • Never asserts:
    • That all (or even most) requests succeeded, or
    • That the provider used was OpenAI rather than Anthropic.

To turn this into a real guardrail:

  1. Require all 5 requests to succeed (or define an acceptable minimum).
  2. Parse resp.Body["model"].(string) like in the weighted LB test to extract provider prefix.
  3. Assert that no responses are served by Anthropic for this model, and at least one (ideally all) are served by OpenAI.

Example strengthening:

openaiCount := 0
anthropicCount := 0
...
if resp.StatusCode == 200 {
    model, _ := resp.Body["model"].(string)
    parts := strings.SplitN(model, "/", 2)
    provider := parts[0]
    switch provider {
    case "openai":
        openaiCount++
    case "anthropic":
        anthropicCount++
    }
}
...
if openaiCount == 0 {
    t.Fatalf("Expected fallback to OpenAI for model gpt-4o, but saw 0 OpenAI responses (anthropic=%d)", anthropicCount)
}
if anthropicCount > 0 {
    t.Fatalf("Anthropic should not handle gpt-4o responses but did: %d", anthropicCount)
}

Right now the test can pass even if fallback is broken and Anthropic (or nothing) is serving these requests.

plugins/governance/rate_limit_test.go (2)

54-55: Fix fragile request numbering strings in rate-limit tests.

Multiple tests build request content with:

"Short test request " + string(rune('0'+i)) + ...
"Request number " + string(rune('0'+i)) + "."
"Provider token limit test " + string(rune('0'+i)) + "."
"Provider request limit test " + string(rune('0'+i)) + "."
"Test request for token and request limits " + string(rune('0'+i)) + "."

This only behaves correctly for i within [0,9] and is non-obvious.

Use proper integer formatting instead, e.g.:

-Content: "Short test request " + string(rune('0'+i)) + " for token limit.",
+Content: fmt.Sprintf("Short test request %d for token limit.", i+1),

(or strconv.Itoa if you prefer not to import fmt). Apply similarly in the other tests.

Also applies to: 127-128, 206-207, 285-286, 643-644


43-79: Rate-limit tests should fail if no rate-limit rejection is observed, not just log/skip.

In:

  • TestVirtualKeyTokenRateLimit (Lines 43-79)
  • TestVirtualKeyRequestRateLimit (116-152)
  • TestProviderConfigTokenRateLimit (195-231)
  • TestProviderConfigRequestRateLimit (274-310)

the structure is:

  • Loop making up to N requests.
  • If a 4xx+ with “token/request/rate” in the error is seen, log and return (test passes).
  • Otherwise, track successCount; at the end, log "Made %d successful requests", or call t.Skip when successCount == 0.

That means:

  • If rate limiting is never enforced (all requests succeed), the tests still pass with a log line, even though the comment/docstrings say they “test that ... limits are enforced”.

Instead, after the loop, fail if no rate-limit rejection was seen. A simple pattern for each test:

hitRateLimit := false
for i := 0; i < attempts; i++ {
    resp := MakeRequest(...)
    if resp.StatusCode >= 400 {
        if CheckErrorMessage(t, resp, "token") || CheckErrorMessage(t, resp, "rate") {
            t.Logf("Request %d correctly rejected due to token rate limit", i+1)
            hitRateLimit = true
            break
        }
        t.Fatalf("Request %d failed with unexpected error: %v", i+1, resp.Body)
    }
    // track successCount...
}

if !hitRateLimit {
    t.Fatalf("Rate limit not enforced: sent %d requests without any rate-limit rejection", attempts)
}

Apply the equivalent logic for request‑based and provider‑level tests.

Also applies to: 116-152, 195-231, 274-310

plugins/governance/customer_budget_test.go (1)

67-156: Remove shouldStop early-exit logic so tests actually observe server-side budget rejection.

Both TestCustomerBudgetExceededWithMultipleVKs and TestCustomerBudgetExceededWithMultipleTeams use a shouldStop flag (Lines 70, 149-155 and 245, 324-330) to break the loop as soon as local consumedBudget >= customerBudget. This can terminate the loop before sending the next request that should be rejected by the server when the customer budget is exceeded, so the tests might never see the actual enforcement response and only rely on local cost estimates.

Drop shouldStop entirely and let the bounded for requestNum <= 50 loop continue until either:

  • A request fails with the expected budget/customer error (test passes and returns), or
  • The loop exhausts attempts, in which case the final t.Fatalf(...) correctly reports that budget was never enforced.

A minimal change for each loop:

-	consumedBudget := 0.0
-	requestNum := 1
-	var lastSuccessfulCost float64
-	var shouldStop = false
+	consumedBudget := 0.0
+	requestNum := 1
+	var lastSuccessfulCost float64
 	vkIndex := 0
@@
-		requestNum++
-		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= customerBudget {
-			shouldStop = true
-		}
+		requestNum++
+		vkIndex++

Apply the same pattern in the multi-team test.

Also applies to: 242-331

plugins/governance/config_update_sync_test.go (1)

48-66: Guard all GET responses with status/shape checks before indexing into Body maps.

Patterns like:

getVKResp1 := MakeRequest(t, APIRequest{Method: "GET", Path: "/api/governance/virtual-keys?from_memory=true"})
vkData1 := getVKResp1.Body["virtual_keys"].(map[string]interface{})[vkValue].(map[string]interface{})

getRateLimitsResp1 := MakeRequest(t, APIRequest{Method: "GET", Path: "/api/governance/rate-limits?from_memory=true"})
rateLimitsMap1 := getRateLimitsResp1.Body["rate_limits"].(map[string]interface{})
rateLimit1 := rateLimitsMap1[rateLimitID1].(map[string]interface{})

(and similar blocks below) assume:

  • StatusCode == 200, and
  • Body[...] keys exist and are of the expected types.

If the API returns a non‑200 or a differently-shaped error body, these assertions will panic and obscure the real problem.

Refactor to:

  1. Assert StatusCode == 200 before reading Body.
  2. Use comma‑ok assertions at each step and t.Fatalf with a clear message when shape is wrong.

For example:

getVKResp1 := MakeRequest(t, APIRequest{Method: "GET", Path: "/api/governance/virtual-keys?from_memory=true"})
if getVKResp1.StatusCode != 200 {
	t.Fatalf("Failed to get VKs from memory: status=%d body=%v", getVKResp1.StatusCode, getVKResp1.Body)
}
vks, ok := getVKResp1.Body["virtual_keys"].(map[string]interface{})
if !ok {
	t.Fatalf("virtual_keys missing or wrong type in response: %v", getVKResp1.Body)
}
vkData1, ok := vks[vkValue].(map[string]interface{})
if !ok {
	t.Fatalf("VK %s not found in memory response", vkValue)
}

Apply analogous checks for all /virtual-keys, /rate-limits, /budgets, /teams, and /customers GETs in this file.

Also applies to: 97-114, 150-169, 273-291, 305-375, 438-466, 495-513

🧹 Nitpick comments (2)
plugins/governance/tracker_test.go (1)

56-82: No-op assertion doesn't verify behavior.

assert.True(t, true) at Line 81 always passes and doesn't verify anything meaningful. This test should either verify specific error logging occurred or check that the store state remained unchanged.

-	// Just verify it doesn't crash
-	assert.True(t, true)
+	// Verify store is still empty (no VK was added)
+	govData := store.GetGovernanceData()
+	assert.Empty(t, govData.VirtualKeys, "Store should remain empty for non-existent VK")
plugins/governance/config_update_sync_test.go (1)

94-99: Prefer WaitForCondition/polling over fixed time.Sleep for async sync.

Several tests rely on hard-coded sleeps (e.g., time.Sleep(500 * time.Millisecond) or time.Sleep(1500 * time.Millisecond)) to wait for in-memory updates after a request or config change. This is brittle and can cause flakes under slow CI or load; you already have WaitForCondition/WaitForAPICondition utilities in test_utils.go.

Consider replacing sleeps with polling, e.g.:

var tokenUsageAfterUpdate float64
ok := WaitForCondition(t, func() bool {
    resp := MakeRequest(t, APIRequest{Method: "GET", Path: "/api/governance/rate-limits?from_memory=true"})
    if resp.StatusCode != 200 {
        return false
    }
    // ... extract and assign tokenUsageAfterUpdate ...
    return true // when the expected state is visible
}, 3*time.Second, "rate limit update reflected in memory")
if !ok {
    t.Fatalf("rate limit update did not sync to memory in time")
}

This will make the sync tests more deterministic and CI-friendly.

Also applies to: 147-149, 319-321, 487-488, 541-542, 819-821, 887-888, 1050-1051

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b15b300 and cd49e9e.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/changelog.md
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/tracker_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/changelog.md
  • plugins/governance/resolver_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/tracker_test.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/resolver_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/tracker_test.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/resolver_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/tracker_test.go
🧬 Code graph analysis (5)
plugins/governance/config_update_sync_test.go (1)
plugins/governance/test_utils.go (14)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
  • WaitForCondition (371-394)
  • UpdateTeamRequest (221-224)
  • UpdateCustomerRequest (227-230)
plugins/governance/usage_tracking_test.go (1)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • BudgetRequest (177-180)
plugins/governance/e2e_test.go (5)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • CreateCustomerRequest (190-193)
  • APIResponse (84-88)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (1)
  • Role (17-17)
framework/configstore/tables/utils.go (1)
  • ParseDuration (9-43)
plugins/governance/team_budget_test.go (5)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/rate_limit_test.go (2)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (80)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (8)
plugins/governance/changelog.md (1)

1-3: Changelog entries look good.

Both previously flagged grammar issues have been corrected ("an interface" and "e2e tests"). The entries are well-formatted, clear, and align with the scope of governance infrastructure and testing changes in this PR.

plugins/governance/resolver_test.go (1)

1-551: Comprehensive test coverage with clear test organization.

The test suite thoroughly covers BudgetResolver.EvaluateRequest functionality including:

  • Request allow/deny scenarios (VK not found, blocked, provider/model filtering)
  • Rate limit enforcement (token, request, expiration, provider-level)
  • Budget enforcement (exceeded, expired, multi-level hierarchy)
  • Helper method unit tests (provider/model filtering logic)
  • Context population verification

Tests follow good patterns with descriptive names, clear setup/execution/assertion phases, and appropriate use of table-driven tests for filtering logic.

plugins/governance/tracker_test.go (1)

84-152: LGTM!

The streaming optimization test properly verifies non-final vs. final chunk behavior for rate limit tracking. The assertions correctly check that tokens are updated on non-final chunks and request counter increments only on the final chunk.

plugins/governance/test_utils.go (1)

369-424: LGTM!

The WaitForCondition and WaitForAPICondition utilities implement proper progressive backoff polling patterns with clear logging. These are well-designed for testing async updates.
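
For readers unfamiliar with the pattern, progressive backoff polling can be sketched as follows. `pollUntil` is a hypothetical helper written for this illustration; it is not the repository's WaitForCondition, whose exact signature and backoff schedule are not shown in this review.

```go
package main

import (
	"fmt"
	"time"
)

// pollUntil is a hypothetical helper (not the repository's WaitForCondition):
// it calls check until it returns true or timeout elapses, doubling the wait
// between attempts up to maxInterval.
func pollUntil(check func() bool, timeout, initial, maxInterval time.Duration) bool {
	deadline := time.Now().Add(timeout)
	interval := initial
	for {
		if check() {
			return true
		}
		if time.Now().Add(interval).After(deadline) {
			return false // the next attempt would land past the deadline
		}
		time.Sleep(interval)
		interval *= 2
		if interval > maxInterval {
			interval = maxInterval
		}
	}
}

func main() {
	attempts := 0
	ok := pollUntil(func() bool {
		attempts++
		return attempts >= 3
	}, time.Second, 5*time.Millisecond, 50*time.Millisecond)
	fmt.Println(ok, attempts)
}
```

Compared with a fixed sleep, the condition is re-checked early and often, so a fast sync finishes quickly while a slow one still gets the full timeout.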

plugins/governance/vk_budget_test.go (1)

8-45: LGTM!

Test setup correctly creates a VK with a small budget ($0.01) and registers it for cleanup. The use of t.Parallel() and generateRandomID() ensures test isolation.

plugins/governance/team_budget_test.go (1)

8-65: LGTM!

Test setup correctly creates a team with budget and two VKs under it with higher individual budgets, ensuring the team budget is the limiting factor. The VK alternation pattern (Line 75) properly tests shared budget enforcement.

plugins/governance/usage_tracking_test.go (2)

248-255: LGTM!

TestUsageTrackingBudgetReset correctly asserts that budget usage is reset to near-zero after the timeout period with a proper tolerance check (> 0.001).


459-464: LGTM!

TestResetTickerBothBudgetAndRateLimit correctly asserts that budget usage decreased after reset, failing when usageAfterReset >= usageBeforeReset.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-09-chore_governance_tests_added branch from cd49e9e to 4f8ebad Compare December 22, 2025 16:09

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (25)
plugins/governance/resolver_test.go (1)

332-346: Direct manipulation of internal store.budgets bypasses encapsulation.

Line 338 directly accesses store.budgets.Store(), coupling the test to internal implementation details. This could break if the store's internal structure changes.

Consider adding a test helper method to LocalGovernanceStore:

// SetBudgetCurrentUsageForTest is a test-only helper
func (gs *LocalGovernanceStore) SetBudgetCurrentUsageForTest(budgetID string, usage float64) error {
    if val, ok := gs.budgets.Load(budgetID); ok {
        budget := val.(*configstoreTables.TableBudget)
        budget.CurrentUsage = usage
        gs.budgets.Store(budgetID, budget)
        return nil
    }
    return fmt.Errorf("budget not found: %s", budgetID)
}

Then update the test:

-	governanceData := store.GetGovernanceData()
-	vkBudgetToUpdate := governanceData.Budgets["vk-budget"]
-	if vkBudgetToUpdate != nil {
-		vkBudgetToUpdate.CurrentUsage = 100.0
-		store.budgets.Store("vk-budget", vkBudgetToUpdate)
-	}
+	err = store.SetBudgetCurrentUsageForTest("vk-budget", 100.0)
+	require.NoError(t, err)
plugins/governance/go.mod (1)

24-25: Duplicate dependency entry still present.

The github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.16 dependency appears twice. Run go mod tidy to clean up the module file.

plugins/governance/edge_cases_test.go (2)

60-77: VK budget is more restrictive than provider budget - test expectation mismatch.

The comment on line 101 states "Provider budget should be the limiting factor (most restrictive)", but:

  • VK budget: $0.01 (line 61)
  • Provider budget: $0.08 (line 77)

The VK-level budget is actually more restrictive. Either adjust the budget values or update the test expectation.


105-183: Remove shouldStop flag - causes early loop exit before budget enforcement is observed.

The shouldStop pattern (lines 105, 177-183) causes the loop to exit immediately after the client-side threshold is crossed, before sending a request that would trigger server-side budget rejection. This prevents the test from actually validating that the API enforces the budget.

🔎 Suggested fix
 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= 0.08 { // Provider budget
-			shouldStop = true
-		}
 	}
plugins/governance/provider_budget_test.go (2)

63-140: Remove shouldStop flag in OpenAI subtest - same early-exit issue.

The shouldStop logic (lines 63, 134-140) causes the loop to exit before the over-budget request is sent to the API. This prevents validation of server-side budget enforcement.

🔎 Suggested fix
 		providerBudget := 0.01
 		consumedBudget := 0.0
 		requestNum := 1
 		var lastSuccessfulCost float64
-		var shouldStop = false

 		for requestNum <= 50 {
 			// ... request logic ...

 			requestNum++
-
-			if shouldStop {
-				break
-			}
-
-			if consumedBudget >= providerBudget {
-				shouldStop = true
-			}
 		}

153-230: Remove shouldStop flag in Anthropic subtest - same pattern issue.

Same early-exit problem as the OpenAI subtest. Lines 153 and 224-230 should be removed to allow the API to reject the over-budget request.

plugins/governance/vk_budget_test.go (1)

46-127: Remove shouldStop flag—it prevents the test from validating server-side budget enforcement.

The loop exits when consumedBudget >= vkBudget before sending a request that would exceed the budget. This prevents the test from confirming that the API actually rejects over-budget requests. The test should continue looping until the API returns a 400+ status (or the 50-request cap is hit), not break early.

 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= vkBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/usage_tracking_test.go (5)

511-523: Prefer strconv.Itoa (or fmt.Sprintf) for request numbering.

string(rune('0'+i)) only behaves as expected for single‑digit indices and is non‑idiomatic even though this loop currently runs i < 10.

Replace with a proper integer→string conversion for robustness and clarity.
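
A small, self-contained comparison of the two conversions. The function names `runeLabel` and `itoaLabel` are made up for this sketch.

```go
package main

import (
	"fmt"
	"strconv"
)

// runeLabel reproduces the flagged pattern: it adds i to the code point of
// '0', which is only a digit character for 0 <= i <= 9.
func runeLabel(i int) string {
	return string(rune('0' + i))
}

// itoaLabel is the suggested replacement and works for any int.
func itoaLabel(i int) string {
	return strconv.Itoa(i)
}

func main() {
	for _, i := range []int{7, 10, 12} {
		fmt.Printf("i=%d rune=%q itoa=%q\n", i, runeLabel(i), itoaLabel(i))
	}
}
```

Past 9 the rune form walks into the ASCII punctuation that follows the digits, so request labels silently stop matching the loop index.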


99-128: Rate-limit reset test only logs, never asserts actual reset.

After waiting 35s you only log that reset “should have occurred” and don’t inspect token_current_usage for the VK’s rate limit. This means the test will pass even if the reset ticker is broken.

Consider fetching /api/governance/rate-limits?from_memory=true after getDataResp3, locating the VK’s rate limit by rateLimitID, and asserting the usage dropped to ~0 (with a small epsilon).
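
The suggested assertion could look roughly like this. It is a sketch under assumptions: the response shape `{"rate_limits": {"<id>": {"token_current_usage": ...}}}` is inferred from snippets quoted elsewhere in this review, and `tokenUsage` and `nearZero` are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// tokenUsage looks up token_current_usage for one rate limit in a decoded
// response body shaped like {"rate_limits": {"<id>": {...}}}. The shape is an
// assumption based on the snippets quoted in this review.
func tokenUsage(body map[string]interface{}, rateLimitID string) (float64, error) {
	limits, ok := body["rate_limits"].(map[string]interface{})
	if !ok {
		return 0, fmt.Errorf("rate_limits missing or wrong type")
	}
	rl, ok := limits[rateLimitID].(map[string]interface{})
	if !ok {
		return 0, fmt.Errorf("rate limit %s not found", rateLimitID)
	}
	usage, ok := rl["token_current_usage"].(float64) // JSON numbers decode to float64
	if !ok {
		return 0, fmt.Errorf("token_current_usage missing or not a number")
	}
	return usage, nil
}

// nearZero reports whether v is within eps of zero.
func nearZero(v, eps float64) bool { return math.Abs(v) <= eps }

func main() {
	raw := `{"rate_limits": {"rl-1": {"token_current_usage": 0}}}`
	var body map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &body); err != nil {
		panic(err)
	}
	usage, err := tokenUsage(body, "rl-1")
	fmt.Println(usage, err, nearZero(usage, 0.001))
}
```

In a real test, a non-nil error or a false `nearZero` result would be reported with t.Fatalf rather than printed.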


165-223: Add status checks around budgets/virtual-keys GETs in budget reset test.

getVKResp, getBudgetsResp2, and getBudgetsResp3 are used without verifying StatusCode == 200 before type assertions on Body["virtual_keys"] / Body["budgets"]. Any non‑200 (e.g. transient 5xx) will panic and hide the real failure.

Add consistent status checks (like you already do elsewhere in this file) before indexing into Body.

Also applies to: 230-248


257-346: In‑memory usage test doesn’t assert any actual usage tracking.

TestInMemoryUsageUpdateOnRequest currently only:

  • Confirms a request succeeded and had non‑zero tokens.
  • Fetches /virtual-keys?from_memory=true and logs whether rate_limit_id is set.

It never checks that any usage counter (rate‑limit or budget) actually changed, so it won’t catch regressions in in‑memory tracking.

Consider configuring a small rate limit or budget on the VK, issuing a request, then asserting the corresponding usage field increased (or budget decreased) in the in‑memory data.


389-461: Reset ticker test should harden GET status checks and avoid fragile request numbering.

Two issues here:

  • getVKResp, getBudgetsResp, and getBudgetsResp2 are used without checking StatusCode, so non‑200s will panic rather than fail cleanly.
  • Request labels use string(rune('0'+i)), which is a fragile int→string conversion pattern.

Prefer explicit status checks and fmt.Sprintf/strconv.Itoa for the index.

plugins/governance/e2e_test.go (2)

292-352: Avoid fragile integer→string conversion in hierarchy enforcement test.

TestFullBudgetHierarchyEnforcement uses:

Content: "Test hierarchy enforcement request " + string(rune('0'+requestNum%10)),

This only behaves as expected for single‑digit values and is non‑idiomatic.

Prefer fmt.Sprintf("Test hierarchy enforcement request %d", requestNum) (or strconv.Itoa) to make the test robust and clearer.


395-421: Add status checks before indexing into Body in E2E budget / VK tests.

Several tests (TestFailedRequestsDoNotConsumeBudget, TestBudgetStateAfterReset, TestVKDeletionCascade, TestWeightedProviderLoadBalancing) directly access maps like:

virtualKeysMap := getDataResp1.Body["virtual_keys"].(map[string]interface{})
budgetsMap := getBudgetsResp2.Body["budgets"].(map[string]interface{})

without verifying that the GET returned a successful status. A non‑200 will cause a panic, obscuring the real issue.

Add simple if resp.StatusCode != 200 { t.Fatalf(...) } checks after each relevant GET before indexing into Body.

Also applies to: 899-921, 1240-1276, 1357-1371

plugins/governance/advanced_scenarios_test.go (2)

14-528: Long‑running governance scenarios with t.Parallel() are likely to be slow/flaky in CI.

Several tests here (hierarchy switches, budget exhaustion and update flows) run up to ~150 chat completions each and are marked t.Parallel(). In aggregate, this can be very slow and intermittently fail under load/provider variance.

Consider either:

  • Removing t.Parallel() for these heaviest scenarios, or
  • Gating them behind an integration build tag / env flag so they only run in dedicated CI jobs.

Also applies to: 534-1098
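
One possible shape for the env-flag option is sketched below. The variable name `GOVERNANCE_HEAVY_TESTS` and the helper `heavyTestsEnabled` are hypothetical and chosen only for this illustration; the repository may prefer a build tag instead.

```go
package main

import (
	"fmt"
	"os"
)

// heavyTestsEnabled reports whether long-running scenarios should run.
// GOVERNANCE_HEAVY_TESTS is a hypothetical variable name chosen for this
// sketch. The lookup function is injected so the check is easy to test.
func heavyTestsEnabled(lookup func(string) (string, bool)) bool {
	v, ok := lookup("GOVERNANCE_HEAVY_TESTS")
	return ok && v == "1"
}

func main() {
	// In a real test this check would sit at the top of the test function:
	//   if !heavyTestsEnabled(os.LookupEnv) {
	//       t.Skip("set GOVERNANCE_HEAVY_TESTS=1 to run")
	//   }
	fmt.Println("heavy tests enabled:", heavyTestsEnabled(os.LookupEnv))
}
```

With this gate, default runs skip the heavy scenarios and a dedicated CI job opts in by setting the variable.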


1104-1259: VK deletion cascade still allows orphaned rate limits despite 1:1 invariants.

TestVKDeletionCascadeComplete asserts budgets are cascade‑deleted but still treats leftover rate limits as “acceptable”:

_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
if vkRateLimitStillExists || pcRateLimitStillExists {
    t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
}

Per the governance store invariants, budgets and rate limits have 1:1 relationships with their parents and shouldn’t be shared. When a VK (and its provider config) is deleted, their rate limits should be removed too, not orphaned.

Recommend turning this into a hard assertion (mirroring the budget checks) or, if there is a deliberate exception, documenting that design explicitly in both code and tests.

As per retrieved learnings, budgets and rate limits are 1:1 with parents and cascade deletion should not leave orphans.
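
A hard assertion could be built on a small orphan check like the one below. This is a sketch: `orphaned` is a hypothetical helper, and `remaining` models the decoded "rate_limits" map fetched after the VK is deleted.

```go
package main

import (
	"fmt"
	"sort"
)

// orphaned returns the IDs from deleted that are still keys in remaining.
// remaining models the decoded "rate_limits" map fetched after the VK delete.
func orphaned(remaining map[string]interface{}, deleted ...string) []string {
	var out []string
	for _, id := range deleted {
		if _, ok := remaining[id]; ok {
			out = append(out, id)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	remaining := map[string]interface{}{"rl-pc": struct{}{}, "rl-other": struct{}{}}
	if left := orphaned(remaining, "rl-vk", "rl-pc"); len(left) > 0 {
		// A real test would call t.Fatalf here instead of printing.
		fmt.Println("orphaned rate limits:", left)
	}
}
```

Replacing the t.Logf note with a failure on a non-empty result would mirror the existing budget checks.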

plugins/governance/in_memory_sync_test.go (1)

97-105: Guard /budgets and /virtual-keys GETs with status checks before type assertions.

Several places in this file call:

budgetsMap2 := getBudgetsResp2.Body["budgets"].(map[string]interface{})
virtualKeysMap := getDataResp.Body["virtual_keys"].(map[string]interface{})

without first verifying StatusCode == 200. A transient non‑200 will panic instead of failing the test with a clear message.

For consistency with other tests in this suite, add explicit status checks after each MakeRequest to /api/governance/budgets?from_memory=true and /api/governance/virtual-keys?from_memory=true before indexing into Body.

Also applies to: 213-218, 325-330, 387-423

plugins/governance/config_update_sync_test.go (1)

48-67: Add HTTP status guards before indexing into response bodies in sync tests.

Across these tests, you directly assert on nested fields like:

vkData1 := getVKResp1.Body["virtual_keys"].(map[string]interface{})[vkValue].(map[string]interface{})
rateLimit1 := rateLimitsMap1[rateLimitID1].(map[string]interface{})
budget1 := budgetsMap1[budgetID].(map[string]interface{})

without checking StatusCode for the preceding GETs. If /api/governance/* ever returns a non‑200, the tests will panic and hide the underlying failure.

Recommend:

  • Checking resp.StatusCode == 200 (and logging resp.Body on failure) immediately after each GET before accessing Body.
  • Optionally factoring a small helper to safely extract nested maps/IDs with ok checks to avoid brittle chained assertions.

Also applies to: 97-115, 273-292, 322-331, 439-463
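
The optional helper mentioned above might look like this. `dig` is a hypothetical name and is not part of test_utils.go; the sketch only shows how comma-ok assertions at every level turn a panic into a readable error.

```go
package main

import "fmt"

// dig walks body through keys, requiring a JSON object at every step.
// It is a hypothetical helper, not part of test_utils.go.
func dig(body map[string]interface{}, keys ...string) (map[string]interface{}, error) {
	cur := body
	for _, k := range keys {
		next, ok := cur[k].(map[string]interface{})
		if !ok {
			return nil, fmt.Errorf("key %q missing or not an object", k)
		}
		cur = next
	}
	return cur, nil
}

func main() {
	body := map[string]interface{}{
		"virtual_keys": map[string]interface{}{
			"vk-abc": map[string]interface{}{"rate_limit_id": "rl-1"},
		},
	}
	vk, err := dig(body, "virtual_keys", "vk-abc")
	fmt.Println(vk, err)
	_, err = dig(body, "virtual_keys", "missing")
	fmt.Println(err)
}
```

A test would check the HTTP status first, then call the helper and pass any error to t.Fatalf together with the response body.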

plugins/governance/test_utils.go (5)

90-141: Add t.Helper() and HTTP timeout to prevent test hangs and improve error attribution.

The bare http.Client{} has no timeout, so a test can hang indefinitely if the server accepts the connection but never responds. Additionally, without t.Helper(), test failures are attributed to this helper instead of the calling test.

🔎 Proposed fix
 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
+	t.Helper()
-	client := &http.Client{}
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}
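
The effect of the Timeout field can be seen against a deliberately slow local server. This is an illustrative sketch: `getWithTimeout` and `slowServer` are hypothetical names, and the durations are arbitrary.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// getWithTimeout issues a GET using a client bounded by timeout.
func getWithTimeout(url string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return nil
}

// slowServer returns a local test server whose handler waits for delay
// before answering.
func slowServer(delay time.Duration) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(delay)
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := slowServer(200 * time.Millisecond)
	defer srv.Close()
	fmt.Println("short timeout:", getWithTimeout(srv.URL, 20*time.Millisecond))
	fmt.Println("long timeout: ", getWithTimeout(srv.URL, 2*time.Second))
}
```

With a zero Timeout (the default for http.Client{}), the first call would simply wait for the handler instead of returning an error.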

144-152: Remove deprecated rand.Seed call.

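rand.Seed is deprecated since Go 1.20, and the global random source is automatically seeded. Since go.mod specifies Go 1.24.3, this call should be removed. Additionally, reseeding the global source on every invocation means parallel tests keep resetting each other's random sequence, which can lead to colliding IDs (rand.Seed itself is safe for concurrent use, so the concern is collisions rather than a data race).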

🔎 Proposed fix
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
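
A standalone version of the helper without the Seed call might look like this. It is a sketch: `randomID` and its length parameter are assumptions, since the original helper's body and ID length are not shown in this review.

```go
package main

import (
	"fmt"
	"math/rand"
)

const letters = "abcdefghijklmnopqrstuvwxyz0123456789"

// randomID returns n characters drawn from letters using the global source,
// which is seeded automatically since Go 1.20. The length parameter is an
// assumption; the original helper's ID length is not shown in this review.
func randomID(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	fmt.Println(randomID(8), randomID(8))
}
```

The top-level math/rand functions are safe for concurrent use, so parallel tests can call this without extra locking.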

247-268: ExtractIDFromResponse ignores keyPath parameter, creating a misleading API.

The function accepts keyPath but never uses it, only checking hardcoded wrappers ("virtual_key", "team", "customer"). This creates a misleading API where callers might think they can specify custom paths. Additionally, t.Helper() is missing.

🔎 Suggested fix - either implement keyPath or remove the parameter

Option 1: Remove the unused parameter

-func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+func ExtractIDFromResponse(t *testing.T, resp *APIResponse) string {
+	t.Helper()
 	if resp.StatusCode >= 400 {

Option 2: Implement keyPath traversal

 func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+	t.Helper()
 	if resp.StatusCode >= 400 {
 		t.Fatalf("Request failed with status %d: %v", resp.StatusCode, resp.Body)
 	}
 
+	// Honor keyPath if provided
+	if keyPath != "" {
+		cur := any(resp.Body)
+		for _, p := range strings.Split(keyPath, ".") {
+			m, ok := cur.(map[string]interface{})
+			if !ok {
+				t.Fatalf("Could not extract ID (non-object at %q): %v", p, resp.Body)
+			}
+			cur = m[p]
+		}
+		if id, ok := cur.(string); ok && id != "" {
+			return id
+		}
+	}
+
 	// Navigate through the response to find the ID

299-328: GlobalTestData is not thread-safe for parallel tests.

Since tests use t.Parallel(), multiple goroutines may call AddVirtualKey, AddTeam, or AddCustomer concurrently. Appending to slices without synchronization is a data race.

🔎 Proposed fix
+import "sync"
+
 // GlobalTestData stores IDs of created resources for cleanup
 type GlobalTestData struct {
+	mu          sync.Mutex
 	VirtualKeys []string
 	Teams       []string
 	Customers   []string
 }
 
 // AddVirtualKey adds a virtual key ID to the test data
 func (g *GlobalTestData) AddVirtualKey(id string) {
+	g.mu.Lock()
+	defer g.mu.Unlock()
 	g.VirtualKeys = append(g.VirtualKeys, id)
 }
 
 // AddTeam adds a team ID to the test data
 func (g *GlobalTestData) AddTeam(id string) {
+	g.mu.Lock()
+	defer g.mu.Unlock()
 	g.Teams = append(g.Teams, id)
 }
 
 // AddCustomer adds a customer ID to the test data
 func (g *GlobalTestData) AddCustomer(id string) {
+	g.mu.Lock()
+	defer g.mu.Unlock()
 	g.Customers = append(g.Customers, id)
 }

330-367: Cleanup should be protected by mutex and include t.Helper().

If Cleanup runs while another goroutine is adding IDs, there's a race condition. The method should acquire the mutex to snapshot the slices, then release it before performing deletions.

🔎 Proposed fix
 func (g *GlobalTestData) Cleanup(t *testing.T) {
+	t.Helper()
+	g.mu.Lock()
+	vks := make([]string, len(g.VirtualKeys))
+	copy(vks, g.VirtualKeys)
+	teams := make([]string, len(g.Teams))
+	copy(teams, g.Teams)
+	customers := make([]string, len(g.Customers))
+	copy(customers, g.Customers)
+	g.mu.Unlock()
+
 	// Delete virtual keys
-	for _, vkID := range g.VirtualKeys {
+	for _, vkID := range vks {
 		resp := MakeRequest(t, APIRequest{
 			Method: "DELETE",
 			Path:   fmt.Sprintf("/api/governance/virtual-keys/%s", vkID),
 		})
 		if resp.StatusCode >= 400 && resp.StatusCode != 404 {
 			t.Logf("Warning: failed to delete virtual key %s: status %d", vkID, resp.StatusCode)
 		}
 	}
 
 	// Delete teams
-	for _, teamID := range g.Teams {
+	for _, teamID := range teams {
 		resp := MakeRequest(t, APIRequest{
 			Method: "DELETE",
 			Path:   fmt.Sprintf("/api/governance/teams/%s", teamID),
 		})
 		if resp.StatusCode >= 400 && resp.StatusCode != 404 {
 			t.Logf("Warning: failed to delete team %s: status %d", teamID, resp.StatusCode)
 		}
 	}
 
 	// Delete customers
-	for _, customerID := range g.Customers {
+	for _, customerID := range customers {
 		resp := MakeRequest(t, APIRequest{
 			Method: "DELETE",
 			Path:   fmt.Sprintf("/api/governance/customers/%s", customerID),
 		})
 		if resp.StatusCode >= 400 && resp.StatusCode != 404 {
 			t.Logf("Warning: failed to delete customer %s: status %d", customerID, resp.StatusCode)
 		}
 	}
 
 	t.Logf("Cleanup completed: deleted %d VKs, %d teams, %d customers",
-		len(g.VirtualKeys), len(g.Teams), len(g.Customers))
+		len(vks), len(teams), len(customers))
 }
plugins/governance/rate_limit_test.go (2)

54-54: Fragile string conversion for request numbering breaks for i >= 10.

string(rune('0'+i)) produces a correct digit character only for i from 0 to 9; for i = 10 it yields ':' rather than "10". This pattern appears at lines 54, 127, 206, 285, and 643. Because the loops run up to 10 iterations, any loop that reaches i = 10 labels its request incorrectly.

🔎 Proposed fix

Add "strconv" to the imports:

 import (
+	"strconv"
 	"testing"
 	"time"
 )

Then replace each occurrence:

-				Content: "Short test request " + string(rune('0'+i)) + " for token limit.",
+				Content: "Short test request " + strconv.Itoa(i) + " for token limit.",

Apply the same fix at lines 127, 206, 285, and 643.
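
For reference, the difference between the two conversions can be reproduced in isolation. This is a minimal standalone sketch; the helper name runeDigit is hypothetical and only wraps the flagged expression so it can be compared with strconv.Itoa.

```go
package main

import (
	"fmt"
	"strconv"
)

// runeDigit reproduces the flagged expression string(rune('0'+i)).
// Hypothetical helper, used only for this illustration.
func runeDigit(i int) string {
	return string(rune('0' + i))
}

func main() {
	for _, i := range []int{9, 10, 11} {
		// For i >= 10 the rune lands past '9' in ASCII (':' is 58, ';' is 59),
		// while strconv.Itoa always returns the decimal representation.
		fmt.Printf("i=%d rune=%q itoa=%q\n", i, runeDigit(i), strconv.Itoa(i))
	}
}
```

The rune form silently yields punctuation once i leaves the 0-9 range, which is why the decimal conversion is the safer default even when the current loop bounds happen to stay below 10.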


363-374: Add StatusCode check before indexing response body in TestMultipleProvidersSeparateRateLimits.

The code directly indexes into getDataResp.Body without checking StatusCode. If the API returns a non-200 response, the type assertion will panic and obscure the root cause.

This same pattern appears in multiple tests throughout the file.

🔎 Proposed fix pattern
 	getDataResp := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/virtual-keys?from_memory=true",
 	})
+	if getDataResp.StatusCode != 200 {
+		t.Fatalf("Failed to get VKs from memory: status=%d body=%v", getDataResp.StatusCode, getDataResp.Body)
+	}
 
-	virtualKeysMap := getDataResp.Body["virtual_keys"].(map[string]interface{})
+	virtualKeysMap, ok := getDataResp.Body["virtual_keys"].(map[string]interface{})
+	if !ok {
+		t.Fatalf("virtual_keys missing or wrong type in response: %v", getDataResp.Body)
+	}
-	vkData := virtualKeysMap[vkValue].(map[string]interface{})
+	vkData, ok := virtualKeysMap[vkValue].(map[string]interface{})
+	if !ok {
+		t.Fatalf("VK %s not found in response", vkValue)
+	}

Apply analogous checks at:

  • Lines 490-506 (TestRateLimitInMemorySync)
  • Lines 705-737 (TestRateLimitUsageTrackedInMemory)
  • Lines 872-890, 917-942 (TestProviderLevelRateLimitUsageTracking)
🧹 Nitpick comments (2)
plugins/governance/e2e_test.go (1)

1380-1425: Strengthen provider distribution and fallback assertions using extra_fields.

You now read extra_fields.provider / extra_fields.model_requested in TestWeightedProviderLoadBalancing, which is correct—these are the canonical fields for identifying which provider handled a request. However:

  • The weighted test only logs a warning when azureCount > 2; it doesn't assert and fail on distribution violations.
  • TestProviderFallbackMechanism doesn't use extra_fields.provider at all and never proves that OpenAI handled the gpt-4o requests.

To make these tests reliable regression guards:

  • In the weighted LB test, add an assertion that azureCount <= 2 (out of 10 requests, matching the 1% weight), and fail if all traffic routes to a single provider.
  • In the fallback test, check extra_fields.provider to assert that successful gpt-4o responses came from OpenAI only.

Also applies to: 1444-1542

plugins/governance/rate_limit_test.go (1)

527-527: Consider using WaitForCondition instead of fixed time.Sleep for more reliable tests.

The test uses time.Sleep(500 * time.Millisecond) to wait for async updates to propagate to the in-memory store. This same pattern appears at lines 760 and 915. Consider using the WaitForCondition or WaitForAPICondition helpers from test_utils.go for more robust polling with backoff.

This is optional, but would make tests more resilient to timing variations.
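
As an illustration of polling with backoff in place of a fixed sleep, here is a minimal sketch. The function pollUntil and its parameters are hypothetical; the actual WaitForCondition and WaitForAPICondition helpers in test_utils.go may have different signatures and behavior.

```go
package main

import (
	"fmt"
	"time"
)

// pollUntil calls check repeatedly until it returns true or the timeout
// would be exceeded. The delay between attempts doubles each time and is
// capped at maxDelay. Hypothetical stand-in for the suite's wait helpers.
func pollUntil(timeout, initialDelay, maxDelay time.Duration, check func() bool) bool {
	deadline := time.Now().Add(timeout)
	delay := initialDelay
	for {
		if check() {
			return true
		}
		// Give up if the next sleep would carry us past the deadline.
		if time.Now().Add(delay).After(deadline) {
			return false
		}
		time.Sleep(delay)
		delay *= 2
		if delay > maxDelay {
			delay = maxDelay
		}
	}
}

func main() {
	start := time.Now()
	// Simulated async update that becomes visible after roughly 30ms.
	ok := pollUntil(time.Second, 5*time.Millisecond, 50*time.Millisecond, func() bool {
		return time.Since(start) >= 30*time.Millisecond
	})
	fmt.Println("condition met:", ok)
}
```

Compared with time.Sleep(500 * time.Millisecond), a loop like this returns as soon as the in-memory store reflects the update and only waits the full timeout when the update never arrives.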

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd49e9e and 4f8ebad.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/changelog.md
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/tracker_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
✅ Files skipped from review due to trivial changes (2)
  • plugins/governance/fixtures_test.go
  • plugins/governance/changelog.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/tracker_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/resolver_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/go.mod
  • plugins/governance/e2e_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/customer_budget_test.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/resolver_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/customer_budget_test.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/resolver_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/customer_budget_test.go
🧬 Code graph analysis (8)
plugins/governance/resolver_test.go (8)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (75-94)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (720-727)
plugins/governance/resolver.go (11)
  • NewBudgetResolver (70-75)
  • DecisionAllow (18-18)
  • DecisionVirtualKeyNotFound (19-19)
  • DecisionVirtualKeyBlocked (20-20)
  • DecisionProviderBlocked (26-26)
  • DecisionModelBlocked (25-25)
  • DecisionTokenLimited (23-23)
  • DecisionRequestLimited (24-24)
  • DecisionBudgetExceeded (22-22)
  • DecisionRateLimited (21-21)
  • Decision (15-15)
core/schemas/context.go (1)
  • BifrostContext (31-41)
ui/lib/types/governance.ts (5)
  • VirtualKey (61-80)
  • RateLimit (13-25)
  • Budget (5-11)
  • Customer (37-44)
  • Team (27-35)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostContextKey (117-117)
core/schemas/models.go (1)
  • Model (109-129)
plugins/governance/in_memory_sync_test.go (1)
plugins/governance/test_utils.go (3)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • ExtractIDFromResponse (248-268)
plugins/governance/vk_budget_test.go (1)
plugins/governance/test_utils.go (7)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/config_update_sync_test.go (1)
plugins/governance/test_utils.go (13)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
  • CreateTeamRequest (183-187)
  • WaitForCondition (371-394)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
plugins/governance/usage_tracking_test.go (1)
plugins/governance/test_utils.go (8)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • BudgetRequest (177-180)
plugins/governance/edge_cases_test.go (1)
plugins/governance/test_utils.go (12)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateTeamRequest (183-187)
  • CreateVirtualKeyRequest (155-164)
  • ProviderConfigRequest (167-174)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/test_utils.go (1)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
plugins/governance/customer_budget_test.go (3)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • CreateTeamRequest (183-187)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (11)
plugins/governance/go.mod (1)

10-10: LGTM!

Adding testify as a direct dependency is appropriate given the extensive test suite being added in this PR.

plugins/governance/edge_cases_test.go (1)

1-14: LGTM!

Test setup with t.Parallel(), NewGlobalTestData(), and deferred cleanup follows the established pattern in the governance test suite.

plugins/governance/vk_budget_test.go (1)

1-12: LGTM!

Test structure follows the governance test suite patterns with parallel execution and proper cleanup.

plugins/governance/provider_budget_test.go (1)

1-54: LGTM!

Test setup correctly creates a VK with separate provider budgets for OpenAI and Anthropic, enabling isolated testing of per-provider budget enforcement.

plugins/governance/resolver_test.go (6)

1-37: LGTM!

Clean test setup for the happy path scenario. Uses helper builders effectively and validates both the decision and that the virtual key was found.


143-168: LGTM!

Good coverage of token limit enforcement with the rate limit already at max. The assertRateLimitInfo helper validates that rate limit metadata is properly populated.


196-237: LGTM!

Excellent test for rate limit expiration handling. The test correctly simulates ticker behavior by calling ResetExpiredRateLimitsInMemory and ResetExpiredRateLimits before evaluating, ensuring the reset logic is exercised.


296-346: Good coverage of multi-level budget hierarchy.

The test validates budget checking across VK → Team → Customer hierarchy. The logic correctly tests both the under-limit and over-limit scenarios.


403-511: LGTM!

Well-structured table-driven tests for isProviderAllowed and isModelAllowed. Good edge case coverage including empty allowlists and nil provider configs.


513-551: LGTM!

Validates that context is correctly populated with governance metadata (VK ID, team ID, customer ID). This ensures downstream handlers have access to governance context.

plugins/governance/customer_budget_test.go (1)

8-335: Customer budget multi‑VK / multi‑team tests look logically sound.

Both tests correctly:

  • Share a small customer budget across multiple VKs/teams,
  • Track actual token usage via usage and CalculateCost,
  • Require at least one successful request before budget rejection, and
  • Fail if no budget‑related rejection is ever observed.

Loops are bounded (≤50 requests), so this should remain manageable assuming a local test backend.

@danpiths danpiths changed the base branch from 12-08-feat_governance_plugin_refactors to graphite-base/1041 December 23, 2025 08:00
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-09-chore_governance_tests_added branch from 4f8ebad to 2c958eb Compare December 23, 2025 14:42
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (30)
plugins/governance/resolver_test.go (1)

332-339: Add a test helper method instead of directly manipulating internal state.

Line 338 bypasses encapsulation by directly calling store.budgets.Store(), coupling the test to implementation details. Consider adding a test helper method to the store:

🔎 Proposed approach
// In LocalGovernanceStore - test-only helper
func (gs *LocalGovernanceStore) SetBudgetCurrentUsageForTest(budgetID string, usage float64) error {
	if val, ok := gs.budgets.Load(budgetID); ok {
		budget := val.(*configstoreTables.TableBudget)
		budget.CurrentUsage = usage
		gs.budgets.Store(budgetID, budget)
		return nil
	}
	return fmt.Errorf("budget not found: %s", budgetID)
}

Then in the test:

-	governanceData := store.GetGovernanceData()
-	vkBudgetToUpdate := governanceData.Budgets["vk-budget"]
-	if vkBudgetToUpdate != nil {
-		vkBudgetToUpdate.CurrentUsage = 100.0
-		store.budgets.Store("vk-budget", vkBudgetToUpdate)
-	}
+	err = store.SetBudgetCurrentUsageForTest("vk-budget", 100.0)
+	require.NoError(t, err)
plugins/governance/go.mod (1)

24-25: Remove duplicate dependency entry.

The github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.16 dependency appears twice. Run go mod tidy to clean up the module file.

cd plugins/governance && go mod tidy
plugins/governance/edge_cases_test.go (2)

60-77: VK budget is more restrictive than provider budget - test expectation may be incorrect.

The test comment on Line 101 states "Provider budget should be the limiting factor (most restrictive)", but:

  • VK budget: $0.01 (Line 61)
  • Provider budget: $0.08 (Line 77)

The VK-level budget ($0.01) is actually more restrictive. Either adjust the budget values or update the test expectation.


105-183: Remove shouldStop flag - causes early loop exit before budget rejection is observed.

The shouldStop logic (lines 105, 177-183) causes the loop to exit after crossing the threshold, without sending the request that would trigger the server-side budget-enforcement error response. Remove this pattern to let the API enforcement terminate the loop:

🔎 Proposed fix
 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= 0.08 { // Provider budget
-			shouldStop = true
-		}
 	}
plugins/governance/provider_budget_test.go (2)

63-64: Remove shouldStop flag in OpenAI subtest.

Same pattern issue as other budget tests: the shouldStop logic (lines 63, 134-140) causes early loop exit before the budget-exceeded request is sent.

🔎 Proposed fix
 		providerBudget := 0.01
 		consumedBudget := 0.0
 		requestNum := 1
 		var lastSuccessfulCost float64
-		var shouldStop = false

 		for requestNum <= 50 {
 			// ... request logic ...

 			requestNum++
-
-			if shouldStop {
-				break
-			}
-
-			if consumedBudget >= providerBudget {
-				shouldStop = true
-			}
 		}

153-154: Remove shouldStop flag in Anthropic subtest.

Apply the same fix as the OpenAI subtest - remove the shouldStop logic (lines 153, 224-230) to let the loop continue until the API returns a budget rejection.

plugins/governance/usage_tracking_test.go (4)

126-128: Add assertion to verify rate limit actually reset.

The test waits 35 seconds for the reset but only logs "Rate limit reset should have occurred" without verifying the actual values. There's no assertion that token_current_usage decreased to 0.

🔎 Proposed fix
// After getting getDataResp3
virtualKeysMap3 := getDataResp3.Body["virtual_keys"].(map[string]interface{})
vkData3 := virtualKeysMap3[vkValue].(map[string]interface{})

getRateLimitsResp := MakeRequest(t, APIRequest{
    Method: "GET",
    Path:   "/api/governance/rate-limits?from_memory=true",
})

if getRateLimitsResp.StatusCode == 200 {
    rateLimitsMap := getRateLimitsResp.Body["rate_limits"].(map[string]interface{})
    if rateLimit, ok := rateLimitsMap[rateLimitID].(map[string]interface{}); ok {
        tokenUsage, _ := rateLimit["token_current_usage"].(float64)
        if tokenUsage > 0 {
            t.Fatalf("Rate limit not reset: token_current_usage is %.0f (expected 0)", tokenUsage)
        }
        t.Logf("Rate limit correctly reset to 0 ✓")
    }
}

398-400: Fragile string conversion for request numbering.

string(rune('0'+i)) only produces correct digit characters for i in range 0-9. For i >= 10, this produces unexpected characters.

🔎 Proposed fix
-			Content: "Test request " + string(rune('0'+i)) + " for reset ticker test.",
+			Content: "Test request " + strconv.Itoa(i) + " for reset ticker test.",

Add "strconv" to the imports.


413-426: Missing status checks before type assertions.

getVKResp.Body and getBudgetsResp.Body are accessed without verifying StatusCode == 200, which could cause panics on non-200 responses.

🔎 Proposed fix
 	getVKResp := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/virtual-keys?from_memory=true",
 	})
+	if getVKResp.StatusCode != 200 {
+		t.Fatalf("Failed to get virtual keys: status %d", getVKResp.StatusCode)
+	}

 	virtualKeysMap := getVKResp.Body["virtual_keys"].(map[string]interface{})

 	getBudgetsResp := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/budgets?from_memory=true",
 	})
+	if getBudgetsResp.StatusCode != 200 {
+		t.Fatalf("Failed to get budgets: status %d", getBudgetsResp.StatusCode)
+	}

 	budgetsMap := getBudgetsResp.Body["budgets"].(map[string]interface{})

521-523: Fragile string conversion for request numbering.

Same issue as line 399 - string(rune('0'+i)) breaks for i >= 10.

-			Content: "Persistence test request " + string(rune('0'+i)) + ".",
+			Content: "Persistence test request " + strconv.Itoa(i) + ".",
plugins/governance/in_memory_sync_test.go (5)

99-104: Missing status code check for budgets endpoint.

getBudgetsResp2.Body["budgets"] is accessed without verifying the response status.

 	getBudgetsResp2 := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/budgets?from_memory=true",
 	})
+	if getBudgetsResp2.StatusCode != 200 {
+		t.Fatalf("Failed to get budgets: status %d, body: %v", getBudgetsResp2.StatusCode, getBudgetsResp2.Body)
+	}

 	budgetsMap2 := getBudgetsResp2.Body["budgets"].(map[string]interface{})

213-218: Missing status code check for budgets endpoint in team update test.

Same issue as line 104 - add status check before accessing body.


380-381: Potential double-delete on cleanup.

The VK is added to testData at line 381, then manually deleted at lines 402-409. When testData.Cleanup(t) runs via defer, it may attempt to delete the already-deleted VK, potentially causing test noise or failures.

Either remove from testData after manual deletion, or skip adding it since manual deletion is the test's purpose:

 	vkID := ExtractIDFromResponse(t, createVKResp, "id")
-	testData.AddVirtualKey(vkID)
+	// Don't add to testData since we manually delete as part of the test

387-392: Missing status code check before accessing body.

Add status check before type-asserting getDataResp.Body["virtual_keys"]:

 	getDataResp := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/virtual-keys?from_memory=true",
 	})
+	if getDataResp.StatusCode != 200 {
+		t.Fatalf("Failed to get VKs: status %d, body: %v", getDataResp.StatusCode, getDataResp.Body)
+	}

 	virtualKeysMap := getDataResp.Body["virtual_keys"].(map[string]interface{})

416-421: Missing status code check after VK deletion.

Same issue - add status check before accessing body:

 	getDataResp2 := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/virtual-keys?from_memory=true",
 	})
+	if getDataResp2.StatusCode != 200 {
+		t.Fatalf("Failed to get VKs after delete: status %d, body: %v", getDataResp2.StatusCode, getDataResp2.Body)
+	}

 	virtualKeysMap2 := getDataResp2.Body["virtual_keys"].(map[string]interface{})
plugins/governance/customer_budget_test.go (2)

70-156: Remove shouldStop flag to ensure customer budget enforcement is tested.

Same issue as other budget tests: the shouldStop logic (lines 70, 149-155) causes the loop to exit before sending the request that would trigger server-side budget rejection.

🔎 Proposed fix
 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false
 	vkIndex := 0

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
 		vkIndex++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= customerBudget {
-			shouldStop = true
-		}
 	}

245-330: Same shouldStop issue in second test function.

TestCustomerBudgetExceededWithMultipleTeams has the same early-exit problem at lines 245, 324-330. Apply the same fix to remove the shouldStop logic.

plugins/governance/advanced_scenarios_test.go (3)

1148-1152: Avoid manual-only deletion; always register resources in GlobalTestData.

These tests intentionally skip testData.Add* (“we’ll delete manually”). If a test fails before reaching the manual DELETE, those entities will leak and can interfere with later tests.

Given Cleanup already tolerates 404s, it’s safe to both register resources in GlobalTestData and delete them explicitly in the scenario when needed. That keeps cleanup fail‑safe without changing behavior.

Also applies to: 1293-1295, 1397-1399, 1501-1503, 1599-1601


88-133: Assert that budget exhaustion was actually observed before switching hierarchies.

In these three scenarios the loops break either on a budget error or just because they’ve run enough requests / crossed a cost threshold, but after the loop you don’t assert that a “budget” rejection was ever seen. That allows “switch after exhaustion” tests to pass even if enforcement is broken.

Track a sawBudgetRejection boolean set only when CheckErrorMessage(..., "budget") returns true, and after the loop t.Fatalf if it’s still false. You already use this pattern in the *AfterExhaustion tests; reusing it here will make these switch scenarios real regression guards.

Also applies to: 253-297, 455-491
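
The sawBudgetRejection pattern can be sketched independently of the real API. Everything below is a simulation under stated assumptions: the result type, the runUntilRejected function, and the 402 status are hypothetical and stand in for MakeRequest and CheckErrorMessage(..., "budget").

```go
package main

import "fmt"

// result is a hypothetical summary of one simulated request.
type result struct {
	status      int
	budgetError bool // true when the error message mentions the budget
}

// runUntilRejected sends up to maxRequests requests and stops at the first
// budget rejection. It reports how many requests succeeded and whether a
// rejection was actually observed, so the caller can fail when it was not.
func runUntilRejected(maxRequests int, send func(n int) result) (successes int, sawBudgetRejection bool) {
	for n := 1; n <= maxRequests; n++ {
		r := send(n)
		if r.status >= 400 {
			if r.budgetError {
				return successes, true
			}
			continue // unrelated failure: keep going, do not count it
		}
		successes++
	}
	return successes, false
}

func main() {
	// Simulated backend: allows three requests, then rejects on budget.
	send := func(n int) result {
		if n > 3 {
			return result{status: 402, budgetError: true}
		}
		return result{status: 200}
	}
	successes, saw := runUntilRejected(50, send)
	fmt.Printf("successes=%d sawBudgetRejection=%v\n", successes, saw)
	if !saw {
		fmt.Println("a real test would call t.Fatalf here")
	}
}
```

The key point is that the loop's exit reason is returned explicitly: running out of attempts and seeing a rejection are distinguishable, so a "switch after exhaustion" scenario cannot pass without the rejection.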


1246-1259: Make VK deletion cascade strict for rate limits (no orphaned rate limits).

TestVKDeletionCascadeComplete asserts budgets are removed but still treats leftover rate limits as “acceptable” with a log. Given the documented 1:1 parent→rate‑limit relationship, leaving rate limits orphaned contradicts the invariants you’re testing.

Mirror the budget check for rate limits: if either vkRateLimitStillExists or pcRateLimitStillExists is true after VK deletion, fail the test instead of logging.
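
A strict version of the check could look like the sketch below. It assumes the decoded rate_limits object is keyed by rate-limit ID, as the other snippets in this review index it; the function name orphaned and the sample IDs are hypothetical.

```go
package main

import "fmt"

// orphaned returns the IDs from expectedDeleted that still appear as keys in
// remaining, e.g. the decoded "rate_limits" object read after deleting the VK.
func orphaned(remaining map[string]interface{}, expectedDeleted ...string) []string {
	var found []string
	for _, id := range expectedDeleted {
		if _, ok := remaining[id]; ok {
			found = append(found, id)
		}
	}
	return found
}

func main() {
	// Simulated in-memory state after VK deletion: one rate limit was left behind.
	remaining := map[string]interface{}{
		"rl-provider-config": map[string]interface{}{"token_current_usage": 0.0},
	}
	left := orphaned(remaining, "rl-vk", "rl-provider-config")
	if len(left) > 0 {
		// In the real test this would be t.Fatalf rather than a log line.
		fmt.Println("orphaned rate limits:", left)
	}
}
```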

plugins/governance/e2e_test.go (3)

292-303: Replace fragile rune-based numbering with proper integer formatting.

"Test hierarchy enforcement request " + string(rune('0'+requestNum%10)) only works for single digits and is unnecessarily clever.

Use fmt.Sprintf("Test hierarchy enforcement request %d", requestNum) (or strconv.Itoa) for clearer, robust numbering.


1380-1420: Harden provider detection in weighted LB test (avoid chained unsafe assertions).

TestWeightedProviderLoadBalancing pulls provider and model_requested via a chained assertion on resp.Body["extra_fields"] without checking that extra_fields is present and of the expected type. A malformed/changed response shape will panic.

Consider:

  • Asserting resp.StatusCode == 200 before inspecting the body.
  • Extracting extra_fields with comma‑ok into a local map[string]interface{}.
  • Then doing provider, _ := extra["provider"].(string) and model, _ := extra["model_requested"].(string).

That keeps the test from panicking and yields clearer failures if the response shape changes.
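
A minimal sketch of the comma-ok extraction described above, assuming the response body is decoded into map[string]interface{} and that extra_fields carries provider and model_requested as strings. The function name providerInfo is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// providerInfo extracts extra_fields.provider and extra_fields.model_requested
// from a decoded JSON body using comma-ok assertions, so a missing or
// malformed field yields an error instead of a panic.
func providerInfo(body map[string]interface{}) (provider, model string, err error) {
	extra, ok := body["extra_fields"].(map[string]interface{})
	if !ok {
		return "", "", errors.New("extra_fields missing or not an object")
	}
	provider, ok = extra["provider"].(string)
	if !ok {
		return "", "", errors.New("extra_fields.provider missing or not a string")
	}
	// model_requested is treated as optional in this sketch.
	model, _ = extra["model_requested"].(string)
	return provider, model, nil
}

func main() {
	good := map[string]interface{}{
		"extra_fields": map[string]interface{}{
			"provider":        "openai",
			"model_requested": "gpt-4o",
		},
	}
	fmt.Println(providerInfo(good))

	// A changed response shape produces an error, not a panic.
	bad := map[string]interface{}{"extra_fields": "unexpected"}
	fmt.Println(providerInfo(bad))
}
```

In a test, the returned error would typically be passed to t.Fatalf together with the full body, after first checking resp.StatusCode == 200.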


1491-1527: TestProviderFallbackMechanism isn’t asserting actual fallback behavior yet.

This test currently just counts successful responses and logs “likely via OpenAI fallback” based on the top-level model field. It never verifies:

  • That the handling provider is indeed OpenAI (not Anthropic), or
  • That all successful responses use the fallback.

You already have provider info in extra_fields.provider and extra_fields.model_requested (as used in the weighted LB test). To make this a real guardrail:

  • Assert each successful response’s extra_fields.provider == "openai".
  • Optionally fail if any request reports provider == "anthropic" for model_requested == "gpt-4o".
  • Keep the success count check so failures surface clearly.

That will actually validate the fallback routing rather than just surface-level success.

plugins/governance/test_utils.go (3)

247-268: Either honor keyPath in ExtractIDFromResponse or drop the parameter.

ExtractIDFromResponse accepts keyPath but then ignores it and only searches fixed wrappers (virtual_key, team, customer). That’s a misleading API for callers who think they’re specifying a path.

Given all current usage seems to rely on the default wrappers, the simplest fix is to remove the unused parameter from the signature and call sites. Alternatively, implement basic keyPath traversal (e.g., splitting on . and walking nested maps) and fall back to the current wrapper list when keyPath is empty.
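
If the keyPath option is kept, the traversal could be sketched roughly as follows. This is an illustration, not the existing implementation: lookupString and extractID are hypothetical names, and the choice to retry the path under each known wrapper (virtual_key, team, customer) is an assumption made so that existing callers passing "id" keep working.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupString walks a dot-separated keyPath through nested JSON objects and
// returns the string found at the end of the path.
func lookupString(body map[string]interface{}, keyPath string) (string, bool) {
	parts := strings.Split(keyPath, ".")
	current := body
	for _, key := range parts[:len(parts)-1] {
		next, ok := current[key].(map[string]interface{})
		if !ok {
			return "", false
		}
		current = next
	}
	s, ok := current[parts[len(parts)-1]].(string)
	return s, ok
}

// extractID tries keyPath at the top level first, then under each known
// wrapper object. An empty keyPath defaults to "id" (assumption).
func extractID(body map[string]interface{}, keyPath string) (string, bool) {
	if keyPath == "" {
		keyPath = "id"
	}
	if id, ok := lookupString(body, keyPath); ok {
		return id, true
	}
	for _, wrapper := range []string{"virtual_key", "team", "customer"} {
		if id, ok := lookupString(body, wrapper+"."+keyPath); ok {
			return id, true
		}
	}
	return "", false
}

func main() {
	body := map[string]interface{}{
		"virtual_key": map[string]interface{}{"id": "vk-123"},
	}
	fmt.Println(extractID(body, "id"))             // found under the wrapper
	fmt.Println(extractID(body, "virtual_key.id")) // found via explicit path
	fmt.Println(extractID(body, "team.id"))        // not present
}
```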


90-141: Add t.Helper() and an HTTP client timeout to MakeRequest.

MakeRequest currently:

  • Doesn’t call t.Helper(), so failures are reported from inside the helper, not the calling test.
  • Uses a bare http.Client{} with no timeout, so if localhost:8080 is slow or unavailable, tests can hang indefinitely.

Wrapping it as a proper test helper makes failures clearer and prevents hangs:

Proposed adjustment
 func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-	client := &http.Client{}
+	t.Helper()
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}

143-152: Remove deprecated, racy rand.Seed from generateRandomID.

generateRandomID reseeds the global math/rand source on each call using rand.Seed(time.Now().UnixNano()). With t.Parallel() tests, this:

  • Reseeds the shared global RNG from concurrent tests, so calls that land on the same timestamp can produce identical IDs.
  • Uses an API that has been deprecated since Go 1.20.

Since IDs are only for tests, you can safely drop the explicit seeding and rely on the default source, or use a local rand.New(rand.NewSource(...)) if you really want per-call seeding.

At minimum:

 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }
plugins/governance/rate_limit_test.go (3)

45-55: Replace string(rune('0'+i)) with proper integer→string conversion.

These tests use string(rune('0'+i)) to number requests, which only behaves as expected for i in [0,9] and is unnecessarily opaque.

Use strconv.Itoa(i) or fmt.Sprintf("%d", i) instead; e.g.:

Example change
-                        Content: "Short test request " + string(rune('0'+i)) + " for token limit.",
+                        Content: fmt.Sprintf("Short test request %d for token limit.", i),

Apply the same pattern to the other numbered request messages.

Also applies to: 118-128, 197-207, 277-286, 635-644


43-79: Ensure rate‑limit tests fail when enforcement is never observed.

TestVirtualKeyTokenRateLimit, TestVirtualKeyRequestRateLimit, TestProviderConfigTokenRateLimit, and TestProviderConfigRequestRateLimit all:

  • Loop over a fixed number of requests.
  • Treat a budget/rate error as success and return.
  • But if no request is ever rejected, they just log or t.Skip instead of failing.

That means all four “RateLimit” tests can pass even if rate limiting is completely broken. Track a sawRateLimitRejection boolean and, after the loop, t.Fatalf if it’s still false (and you managed at least one 200). This is the same pattern you’re already using in several budget tests.

Also applies to: 116-152, 195-231, 274-310


360-377: Add StatusCode and shape checks before indexing in-memory maps.

These rate‑limit sync/usage tests call GET endpoints (e.g., /api/governance/virtual-keys?from_memory=true, /api/governance/rate-limits?from_memory=true) and immediately do chained assertions like:

virtualKeysMap := resp.Body["virtual_keys"].(map[string]interface{})
vkData := virtualKeysMap[vkValue].(map[string]interface{})

If the endpoint returns non‑200 or a different error shape, the test will panic rather than explain the failure.

Mirror the stricter pattern you use elsewhere:

  • Assert resp.StatusCode == 200 and t.Fatalf with status+body otherwise.
  • Use comma‑ok when casting virtual_keys, rate_limits, and per‑ID entries to map[string]interface{} and fail with a clear message if anything is missing.

This will turn subtle panics into actionable test failures.

Also applies to: 490-506, 705-713, 718-737, 875-882, 917-942

plugins/governance/config_update_sync_test.go (1)

48-66: Guard ?from_memory=true reads with StatusCode and safe shape checks.

All these blocks assume 200 responses and use chained type assertions like getVKResp.Body["virtual_keys"].(map[string]interface{})[vkValue].(map[string]interface{}) and rateLimitsMap[id].(map[string]interface{}) without checking StatusCode or ok. If the API returns 4xx/5xx or a slightly different shape, the tests will panic instead of failing with a clear message.

Consider a small helper that:

  • Asserts resp.StatusCode == 200 and fails with the body on error.
  • Safely extracts virtual_keys, budgets, rate_limits, and child maps using comma‑ok with t.Fatalf on mismatch.

Apply it to these GETs to make failures deterministic instead of panics.

Also applies to: 97-115, 151-169, 273-291, 323-332, 439-457, 490-507, 549-576, 592-603, 641-660, 674-708, 746-772, 790-819, 845-877, 895-938, 960-1007, 1021-1055, 1087-1120

🧹 Nitpick comments (1)
plugins/governance/config_update_sync_test.go (1)

338-351: Use usage-derived “lower than current” limits to avoid brittle reset tests.

Places where you hard‑code very small new limits (e.g., newLowerBudget := 0.001) rely on the consumed usage being “large enough” to guarantee new limit < current usage. If a model/pricing change ever yields tiny costs, these tests could fail even though behavior is correct.

You already use a safer pattern in TestVKBudgetUpdateSyncToMemory (newLowerBudget := usageBeforeUpdate * 0.5). Mirroring that everywhere you want “new limit < current usage” will make the reset semantics tests more robust.

Also applies to: 710-720, 905-916, 1068-1082
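A small sketch of the usage-derived approach; `limitBelowUsage` is a hypothetical helper name, and the 0.5 factor simply mirrors the one used in TestVKBudgetUpdateSyncToMemory.

```go
package main

import "fmt"

// limitBelowUsage derives a new limit that is strictly below the usage
// observed so far. It returns an error when there is no usage yet, because
// no positive limit can then be guaranteed to sit below it.
func limitBelowUsage(currentUsage float64) (float64, error) {
	if currentUsage <= 0 {
		return 0, fmt.Errorf("no usage recorded yet (%v); cannot derive a lower limit", currentUsage)
	}
	return currentUsage * 0.5, nil
}
```

Failing fast on zero usage also makes it obvious when the setup requests never consumed any budget, rather than letting the reset assertion fail later for an unrelated reason.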

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f8ebad and 2c958eb.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/changelog.md
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/tracker_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • plugins/governance/vk_budget_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/rate_limit_enforcement_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/customer_budget_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/go.mod
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/changelog.md
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/test_utils.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/customer_budget_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/test_utils.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/customer_budget_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/test_utils.go
🧬 Code graph analysis (6)
plugins/governance/edge_cases_test.go (1)
plugins/governance/test_utils.go (13)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateTeamRequest (183-187)
  • CreateVirtualKeyRequest (155-164)
  • ProviderConfigRequest (167-174)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/in_memory_sync_test.go (1)
plugins/governance/test_utils.go (10)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
plugins/governance/usage_tracking_test.go (2)
plugins/governance/test_utils.go (8)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • BudgetRequest (177-180)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
plugins/governance/e2e_test.go (5)
plugins/governance/test_utils.go (16)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
  • CreateRateLimitRequest (202-207)
  • WaitForAPICondition (398-424)
  • APIResponse (84-88)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (1)
  • Role (17-17)
framework/configstore/tables/utils.go (1)
  • ParseDuration (9-43)
plugins/governance/advanced_scenarios_test.go (1)
plugins/governance/test_utils.go (17)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
  • UpdateVirtualKeyRequest (210-218)
  • CreateCustomerRequest (190-193)
  • UpdateBudgetRequest (196-199)
  • UpdateTeamRequest (221-224)
  • UpdateCustomerRequest (227-230)
  • ProviderConfigRequest (167-174)
  • CreateRateLimitRequest (202-207)
plugins/governance/rate_limit_test.go (2)
plugins/governance/test_utils.go (10)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (3)
plugins/governance/changelog.md (1)

1-3: LGTM!

The changelog entries are clear and correctly describe the refactors and test additions for the governance plugin.

plugins/governance/go.mod (1)

10-10: LGTM - testify dependency added for test assertions.

The github.com/stretchr/testify v1.11.1 dependency is appropriate for the comprehensive test suite being added.

plugins/governance/resolver_test.go (1)

1-37: Good test structure for BudgetResolver.

The test suite is well-organized with clear test cases covering happy paths, edge cases, and various blocking conditions (VK not found, inactive, provider/model filtering, rate limits, budgets).

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (18)
plugins/governance/resolver_test.go (1)

332-346: Consider adding a test helper method instead of directly manipulating internal state.

Line 338 bypasses encapsulation by directly calling store.budgets.Store(), coupling the test to implementation details.

🔎 Proposed fix

Add a test helper method to LocalGovernanceStore:

// SetBudgetCurrentUsageForTest is a test-only helper to set budget usage
func (gs *LocalGovernanceStore) SetBudgetCurrentUsageForTest(budgetID string, usage float64) error {
    if val, ok := gs.budgets.Load(budgetID); ok {
        budget := val.(*configstoreTables.TableBudget)
        budget.CurrentUsage = usage
        gs.budgets.Store(budgetID, budget)
        return nil
    }
    return fmt.Errorf("budget not found: %s", budgetID)
}

Then update the test:

-	governanceData := store.GetGovernanceData()
-	vkBudgetToUpdate := governanceData.Budgets["vk-budget"]
-	if vkBudgetToUpdate != nil {
-		vkBudgetToUpdate.CurrentUsage = 100.0
-		store.budgets.Store("vk-budget", vkBudgetToUpdate)
-	}
+	err = store.SetBudgetCurrentUsageForTest("vk-budget", 100.0)
+	require.NoError(t, err)
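A self-contained variation on that helper, for illustration: `budget` and `store` are hypothetical stand-ins for configstoreTables.TableBudget and LocalGovernanceStore, and this sketch stores a modified copy instead of mutating the shared pointer in place, which is a design choice of the sketch rather than something the store is known to require.

```go
package main

import (
	"fmt"
	"sync"
)

// budget is a hypothetical stand-in for configstoreTables.TableBudget.
type budget struct {
	ID           string
	CurrentUsage float64
}

// store is a hypothetical stand-in for LocalGovernanceStore.
type store struct {
	budgets sync.Map // budget ID -> *budget
}

// setBudgetUsageForTest replaces the stored budget with a modified copy, so
// code holding the old pointer never sees it change underneath.
func (s *store) setBudgetUsageForTest(id string, usage float64) error {
	val, ok := s.budgets.Load(id)
	if !ok {
		return fmt.Errorf("budget not found: %s", id)
	}
	current, ok := val.(*budget)
	if !ok {
		return fmt.Errorf("budget %s has unexpected type %T", id, val)
	}
	updated := *current // copy the struct before changing it
	updated.CurrentUsage = usage
	s.budgets.Store(id, &updated)
	return nil
}
```

The checked type assertion also means a wrongly typed entry produces an error rather than a panic inside the helper.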
plugins/governance/go.mod (1)

24-25: Remove duplicate dependency entry.

The github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.16 dependency appears twice in the indirect requirements list. Run go mod tidy to clean up the module file.

🔎 Suggested fix
 	github.com/aws/aws-sdk-go-v2/internal/ini v1.8.4 // indirect
 	github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.16 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.16 // indirect
 	github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.4 // indirect

Then run:

cd plugins/governance && go mod tidy
plugins/governance/edge_cases_test.go (2)

60-78: VK budget is more restrictive than provider budget - test expectation is incorrect.

The test comment on Line 101 states "Provider budget should be the limiting factor (most restrictive)", but:

  • VK budget: $0.01 (Line 61)
  • Provider budget: $0.08 (Line 77)

The VK-level budget ($0.01) is actually more restrictive than the provider-level budget ($0.08). The VK budget should be exhausted first, not the provider budget.

🔎 Proposed fix - raise the VK budget so the provider budget is the most restrictive
 	// Create a VK with even tighter budget and provider-specific budgets
-	vkBudget := 0.01
+	vkBudget := 0.15
 	createVKResp := MakeRequest(t, APIRequest{

Or alternatively, lower the provider budget below the VK budget:

 				Budget: &BudgetRequest{
-					MaxLimit:      0.08, // Even tighter provider budget
+					MaxLimit:      0.005, // Most restrictive - provider budget
 					ResetDuration: "1h",
 				},

105-184: Remove shouldStop flag - it can skip the actual rejection you're trying to validate.

The shouldStop logic (lines 105, 177-183) causes early loop exit before the budget-exceeded request is sent. This pattern exits the loop right after crossing the threshold, without observing the API's budget-enforcement error response.

🔎 Proposed fix
 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= 0.08 { // Provider budget
-			shouldStop = true
-		}
 	}
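To illustrate why the loop should end on the observed rejection rather than on local accounting, here is a sketch with a hypothetical `fakeServer`. It assumes the server accepts a request while recorded usage is below the limit and rejects afterwards; that enforcement model is an assumption of the sketch, not a statement about the plugin.

```go
package main

// fakeServer is a hypothetical stand-in for the API. It accepts a request
// while usage is below the limit, then starts rejecting.
type fakeServer struct {
	limit, used, costPerRequest float64
}

// handle reports whether the request was accepted and records its cost.
func (s *fakeServer) handle() bool {
	if s.used >= s.limit {
		return false
	}
	s.used += s.costPerRequest
	return true
}

// requestsUntilRejection keeps sending until the server rejects a request or
// maxRequests is reached. It never stops based on the caller's own estimate
// of consumed budget, so the rejection itself is always observed if it occurs.
func requestsUntilRejection(s *fakeServer, maxRequests int) (accepted int, rejected bool) {
	for i := 0; i < maxRequests; i++ {
		if !s.handle() {
			return accepted, true
		}
		accepted++
	}
	return accepted, false
}
```

A test built this way can fail outright when `rejected` is false after the loop, which is the signal that enforcement never kicked in.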
plugins/governance/advanced_scenarios_test.go (1)

1254-1259: Inconsistent cascade deletion assertion for rate limits vs budgets.

The test correctly asserts that budgets are cascade-deleted (Lines 1249-1252 use t.Fatalf), but rate limits are still allowed to be orphaned with just a log message (Lines 1257-1259). This is inconsistent - if cascade deletion should remove budgets, it should likely remove rate limits too.

🔎 Proposed fix
 	// Rate limits should be gone
 	_, vkRateLimitStillExists := rateLimitsMap2[vkRateLimitID]
 	_, pcRateLimitStillExists := rateLimitsMap2[pcRateLimitID]
-	if vkRateLimitStillExists || pcRateLimitStillExists {
-		t.Logf("Note: Rate limits may still exist in memory (orphaned) - this is acceptable")
-	}
+	if vkRateLimitStillExists || pcRateLimitStillExists {
+		t.Fatalf("Rate limits should be cascade-deleted: VK rate limit exists=%v, PC rate limit exists=%v",
+			vkRateLimitStillExists, pcRateLimitStillExists)
+	}

Based on learnings, budgets and rate limits have 1:1 relationships with their parent entities and cascade deletion should apply to both.

plugins/governance/provider_budget_test.go (1)

63-64: Remove shouldStop flag - same loop termination issue as other budget tests.

The shouldStop flag causes early exit before the budget-exceeded request is sent. This pattern is present in both the OpenAI subtest (lines 63-64, 134-140) and Anthropic subtest (lines 153-154, 224-230).

🔎 Proposed fix for OpenAI subtest
 		providerBudget := 0.01
 		consumedBudget := 0.0
 		requestNum := 1
 		var lastSuccessfulCost float64
-		var shouldStop = false

 		for requestNum <= 50 {
 			// ... request logic ...

 			requestNum++
-
-			if shouldStop {
-				break
-			}
-
-			if consumedBudget >= providerBudget {
-				shouldStop = true
-			}
 		}

Apply the same fix to the Anthropic subtest (lines 153-154, 224-230).

plugins/governance/store_test.go (1)

149-154: Don't ignore the NewLocalGovernanceStore error or the GetVirtualKey exists flag in tests.

Line 149 discards the error from NewLocalGovernanceStore, and line 154 discards the exists boolean from GetVirtualKey. If store construction fails or the VK doesn't exist, the test will panic or produce misleading results.

🔎 Proposed fix
-			testStore, _ := NewLocalGovernanceStore(context.Background(), logger, nil, &configstore.GovernanceConfig{
+			testStore, err := NewLocalGovernanceStore(context.Background(), logger, nil, &configstore.GovernanceConfig{
 				VirtualKeys: []configstoreTables.TableVirtualKey{*testVK},
 				Budgets:     []configstoreTables.TableBudget{*testBudget},
 			})
+			require.NoError(t, err)

-			testVK, _ = testStore.GetVirtualKey("sk-bf-test")
+			testVK, exists := testStore.GetVirtualKey("sk-bf-test")
+			require.True(t, exists, "VK should exist")
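-			err := testStore.CheckBudget(context.Background(), testVK, &EvaluationRequest{Provider: schemas.OpenAI}, nil)
+			err = testStore.CheckBudget(context.Background(), testVK, &EvaluationRequest{Provider: schemas.OpenAI}, nil)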
plugins/governance/in_memory_sync_test.go (3)

99-104: Missing status code check for budgets endpoint.

getBudgetsResp2.Body["budgets"] is accessed without verifying the response status. This can cause panics if the request fails.

🔎 Proposed fix
 	getBudgetsResp2 := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/budgets?from_memory=true",
 	})
+	if getBudgetsResp2.StatusCode != 200 {
+		t.Fatalf("Failed to get budgets: status %d, body: %v", getBudgetsResp2.StatusCode, getBudgetsResp2.Body)
+	}

 	budgetsMap2 := getBudgetsResp2.Body["budgets"].(map[string]interface{})

The same pattern appears at lines 213-218 and 325-330. Apply status checks consistently before accessing response bodies.


386-397: Add status check and consider cleanup registration.

Two issues in this section:

  1. getDataResp.Body["virtual_keys"] is accessed without status verification (line 392)
  2. The VK is added to testData at line 381 then manually deleted at lines 402-409, potentially causing double-delete on cleanup
🔎 Proposed fix for status check
 	getDataResp := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/virtual-keys?from_memory=true",
 	})
+	if getDataResp.StatusCode != 200 {
+		t.Fatalf("Failed to get VKs: status %d, body: %v", getDataResp.StatusCode, getDataResp.Body)
+	}

 	virtualKeysMap := getDataResp.Body["virtual_keys"].(map[string]interface{})

For the double-delete, either remove the VK from testData after manual deletion, or don't add it since the test's purpose is to delete it manually (assuming Cleanup handles 404s gracefully).


416-421: Missing status code check after VK deletion.

Same issue - add a status check before accessing the body.

🔎 Proposed fix
 	getDataResp2 := MakeRequest(t, APIRequest{
 		Method: "GET",
 		Path:   "/api/governance/virtual-keys?from_memory=true",
 	})
+	if getDataResp2.StatusCode != 200 {
+		t.Fatalf("Failed to get VKs after delete: status %d, body: %v", getDataResp2.StatusCode, getDataResp2.Body)
+	}

 	virtualKeysMap2 := getDataResp2.Body["virtual_keys"].(map[string]interface{})
plugins/governance/fixtures_test.go (1)

3-12: Honor variadic format arguments in MockLogger methods

MockLogger.Error/Warn/Info/Debug/Fatal currently append the raw format string and ignore args, so callers using logger.Error("failed: %v", err) will never see the interpolated error.

Consider formatting when args are present:

Suggested change
-import (
-	"sync"
-	"testing"
-	"time"
-
-	"github.com/maximhq/bifrost/core/schemas"
-	configstoreTables "github.com/maximhq/bifrost/framework/configstore/tables"
-	"github.com/stretchr/testify/assert"
-	"github.com/stretchr/testify/require"
-)
+import (
+	"fmt"
+	"sync"
+	"testing"
+	"time"
+
+	"github.com/maximhq/bifrost/core/schemas"
+	configstoreTables "github.com/maximhq/bifrost/framework/configstore/tables"
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
@@
 func (ml *MockLogger) Error(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.errors = append(ml.errors, format)
+	msg := format
+	if len(args) > 0 {
+		msg = fmt.Sprintf(format, args...)
+	}
+	ml.errors = append(ml.errors, msg)
 }
@@
 func (ml *MockLogger) Warn(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.warnings = append(ml.warnings, format)
+	msg := format
+	if len(args) > 0 {
+		msg = fmt.Sprintf(format, args...)
+	}
+	ml.warnings = append(ml.warnings, msg)
 }
@@
 func (ml *MockLogger) Info(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.infos = append(ml.infos, format)
+	msg := format
+	if len(args) > 0 {
+		msg = fmt.Sprintf(format, args...)
+	}
+	ml.infos = append(ml.infos, msg)
 }
@@
 func (ml *MockLogger) Debug(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.debugs = append(ml.debugs, format)
+	msg := format
+	if len(args) > 0 {
+		msg = fmt.Sprintf(format, args...)
+	}
+	ml.debugs = append(ml.debugs, msg)
 }
@@
 func (ml *MockLogger) Fatal(format string, args ...interface{}) {
 	ml.mu.Lock()
 	defer ml.mu.Unlock()
-	ml.errors = append(ml.errors, format)
+	msg := format
+	if len(args) > 0 {
+		msg = fmt.Sprintf(format, args...)
+	}
+	ml.errors = append(ml.errors, msg)
 }

Also applies to: 38-66

plugins/governance/rate_limit_enforcement_test.go (2)

478-491: Avoid fragile string(rune('0'+i)) integer-to-string conversion

Using string(rune('0'+i)) only works for single digits and is easy to break if the loop bound changes. It’s clearer and safer to use standard formatting:

Suggested change
-						Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+						Content: fmt.Sprintf("Request %d to test both limits.", i+1),

This relies on fmt, which is already imported at the top of this file.
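A short demonstration of the failure mode; `fragileLabel` and `safeLabel` are hypothetical names used only for this comparison.

```go
package main

import "strconv"

// fragileLabel mirrors the string(rune('0'+i)) pattern: it maps i to the
// character i positions after '0', which is a digit only for 0 through 9.
func fragileLabel(i int) string {
	return string(rune('0' + i))
}

// safeLabel formats any integer as its decimal representation.
func safeLabel(i int) string {
	return strconv.Itoa(i)
}
```

For i = 10 the fragile version yields ":" (the character after '9' in ASCII), while strconv.Itoa or fmt.Sprintf yields "10".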


521-615: Strengthen TestRateLimitInMemoryUsageTracking to assert usage counters

This test is named to verify in‑memory usage tracking, but it currently only checks that a rate_limit_id exists and logs a message. If usage accounting regressed (e.g., counters stuck at 0), the test would still pass.

Given you already extract tokensUsed from the response earlier in the test, consider asserting that the corresponding rate‑limit entry reflects non‑zero usage, for example:

Illustrative enhancement
-	vkData := virtualKeysMap[vkValue].(map[string]interface{})
-	rateLimitID, _ := vkData["rate_limit_id"].(string)
-
-	if rateLimitID != "" {
-		t.Logf("Rate limit %s is configured and tracking usage ✓", rateLimitID)
-	} else {
-		t.Logf("Rate limit is configured ✓")
-	}
+	vkData := virtualKeysMap[vkValue].(map[string]interface{})
+	rateLimitID, _ := vkData["rate_limit_id"].(string)
+	if rateLimitID == "" {
+		t.Fatalf("VK missing rate_limit_id in in-memory store")
+	}
+
+	getRateLimitsResp := MakeRequest(t, APIRequest{
+		Method: "GET",
+		Path:   "/api/governance/rate-limits?from_memory=true",
+	})
+	if getRateLimitsResp.StatusCode != 200 {
+		t.Fatalf("Failed to get rate limits from memory: status=%d body=%v",
+			getRateLimitsResp.StatusCode, getRateLimitsResp.Body)
+	}
+	rateLimitsMap, ok := getRateLimitsResp.Body["rate_limits"].(map[string]interface{})
+	if !ok {
+		t.Fatalf("rate_limits missing or wrong type in response")
+	}
+	rl, ok := rateLimitsMap[rateLimitID].(map[string]interface{})
+	if !ok {
+		t.Fatalf("Rate limit %s not found in map", rateLimitID)
+	}
+	if tokenUsage, ok := rl["token_current_usage"].(float64); !ok || int(tokenUsage) <= 0 {
+		t.Fatalf("Token usage not tracked: expected > 0, got %v", tokenUsage)
+	}
+	t.Logf("Rate limit %s tracked token usage ✓", rateLimitID)
plugins/governance/test_utils.go (2)

143-152: Remove deprecated rand.Seed call in generateRandomID

generateRandomID reseeds the global math/rand source on every call:

rand.Seed(time.Now().UnixNano())

rand.Seed is deprecated as of Go 1.20 and unnecessary for tests: since Go 1.20 the top‑level source is seeded randomly at program start, and repeated reseeding can reduce randomness and add noise under parallel execution.

You can safely drop the Seed call:

Suggested simplification
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
 		b[i] = letters[rand.Intn(len(letters))]
 	}
 	return string(b)
 }

90-141: Add t.Helper() and an HTTP timeout to MakeRequest

MakeRequest currently:

  • Creates a bare http.Client{} with no timeout.
  • Does not call t.Helper().

If the local test server is unavailable or a request stalls, tests can hang indefinitely. Without t.Helper(), failures are also reported at a line inside MakeRequest rather than at the calling test.

Consider:

Suggested hardening for `MakeRequest`
-// MakeRequest makes an HTTP request to the Bifrost API
-func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-	client := &http.Client{}
+// MakeRequest makes an HTTP request to the Bifrost API
+func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
+	t.Helper()
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}

This keeps tests from hanging on network issues and makes stack traces point at the actual test call sites.
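To see the timeout take effect in isolation, the following sketch uses net/http/httptest with a handler that blocks until released. `fetchStatus`, `stalledServer`, and `shortTimeout` are hypothetical names, and the 200ms value is chosen only to keep the demonstration quick.

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"time"
)

// shortTimeout is deliberately small so a stalled request fails quickly.
const shortTimeout = 200 * time.Millisecond

// fetchStatus issues a GET with a bounded client and returns the status code,
// or an error if the request fails or exceeds the timeout.
func fetchStatus(url string, timeout time.Duration) (int, error) {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// stalledServer returns a test server whose handler blocks until release is
// closed, simulating a backend that never answers.
func stalledServer(release <-chan struct{}) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		<-release
	}))
}
```

Against a stalled server, `fetchStatus` returns an error once the timeout elapses instead of blocking the test run; with a bare `http.Client{}` the same call would wait until the handler is released.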

plugins/governance/config_update_sync_test.go (1)

48-66: Guard GET responses with status and safe type checks before indexing

Across these tests, GET calls like /api/governance/virtual-keys?from_memory=true, /api/governance/rate-limits?from_memory=true, and /api/governance/budgets?from_memory=true are immediately indexed and type‑asserted:

  • No StatusCode check on the GETs themselves.
  • Direct assertions such as resp.Body["virtual_keys"].(map[string]interface{}) and rateLimitsMap[id].(map[string]interface{}).

If the backing API returns a non‑200 or a slightly different shape, the tests will panic and obscure the real failure instead of reporting a clear assertion error.

Consider a small helper pattern and apply it consistently in this file, e.g.:

Example pattern for safe access
 getVKResp1 := MakeRequest(t, APIRequest{
   Method: "GET",
   Path:   "/api/governance/virtual-keys?from_memory=true",
 })
+if getVKResp1.StatusCode != 200 {
+  t.Fatalf("Failed to get virtual keys from memory: status=%d body=%v",
+    getVKResp1.StatusCode, getVKResp1.Body)
+}
- vkData1 := getVKResp1.Body["virtual_keys"].(map[string]interface{})[vkValue].(map[string]interface{})
+vks, ok := getVKResp1.Body["virtual_keys"].(map[string]interface{})
+if !ok {
+  t.Fatalf("virtual_keys missing or wrong type in response: %v", getVKResp1.Body)
+}
+vkData1, ok := vks[vkValue].(map[string]interface{})
+if !ok {
+  t.Fatalf("VK %s not found in in-memory store", vkValue)
+}

Apply analogous checks for budgets and rate‑limits maps so failures surface as explicit test errors rather than panics.

Also applies to: 97-115, 151-169, 273-291, 323-371, 439-457, 503-561, 640-655, 686-701, 840-857, 890-899, 1001-1019, 1052-1061

plugins/governance/e2e_test.go (1)

1444-1543: TestProviderFallbackMechanism doesn’t verify which provider actually handled the requests

This test currently only:

  • Counts successful requests for model: "gpt-4o", and
  • Logs the raw model string from the response as “likely via OpenAI fallback”.

It never asserts that OpenAI actually served the traffic or that Anthropic did not, so the test can pass even if the fallback logic is broken, as long as some provider responds.

You already have a robust provider‑detection pattern in TestWeightedProviderLoadBalancing using extra_fields.provider and extra_fields.model_requested. Reusing that here would make this a real regression guard, e.g.:

Illustrative tightening of assertions
-		if resp.StatusCode == 200 {
-			successCount++
-
-			// Try to detect which provider actually handled it
-			model := ""
-			if m, ok := resp.Body["model"].(string); ok {
-				model = m
-			}
-
-			t.Logf("Request %d succeeded (model: %s) - likely via OpenAI fallback", i+1, model)
-		} else {
+		if resp.StatusCode == 200 {
+			successCount++
+
+			extra, ok := resp.Body["extra_fields"].(map[string]interface{})
+			if !ok {
+				t.Fatalf("Missing extra_fields in response for request %d: %v", i+1, resp.Body)
+			}
+			provider, ok := extra["provider"].(string)
+			if !ok {
+				t.Fatalf("Missing provider in extra_fields for request %d: %v", i+1, extra)
+			}
+			modelRequested, _ := extra["model_requested"].(string)
+
+			// All successful gpt-4o requests should be handled by OpenAI, not Anthropic.
+			if provider != "openai" {
+				t.Fatalf("Fallback failed: request %d served by %q (model_requested=%q), expected openai",
+					i+1, provider, modelRequested)
+			}
+			t.Logf("Request %d succeeded via OpenAI fallback (model_requested=%s)", i+1, modelRequested)
+		} else {
 			t.Logf("Request %d failed with status %d", i+1, resp.StatusCode)
 		}

You can then also assert that successCount == numRequests to ensure all fallback attempts succeed.

plugins/governance/rate_limit_test.go (1)

8-79: Rate‑limit tests should fail if no enforcement is observed (currently they can pass silently)

In these four tests:

  • TestVirtualKeyTokenRateLimit
  • TestVirtualKeyRequestRateLimit
  • TestProviderConfigTokenRateLimit
  • TestProviderConfigRequestRateLimit

the loop returns as soon as a rate‑limit error is seen, which is fine. But if all requests succeed and no rate‑limit rejection ever occurs, the tests:

  • Only log Made %d successful requests... and
  • Never call t.Fail/t.Fatalf (or even assert on successCount),

so they can pass without actually verifying enforcement (despite their names and comments).

To make them real guards, the post‑loop path should fail when no limit is hit. For example, in TestVirtualKeyTokenRateLimit:

Example change for one test (apply similar pattern to the others)
 	for i := 0; i < 10; i++ {
 		// ... request logic ...
 	}
 
-	if successCount > 0 {
-		t.Logf("Made %d successful requests before hitting token limit ✓", successCount)
-	} else {
-		t.Skip("Could not make requests to test token limit")
-	}
+	if successCount == 0 {
+		t.Skip("Could not make any successful requests to exercise token limit")
+	}
+	t.Fatalf("Made %d successful requests but never observed a token rate-limit rejection", successCount)

You can adapt the message and thresholds per test (e.g., include configured limit values) but the key point is: if the loop completes without seeing a limit error, the test should fail, not quietly succeed.

(Separately, the string(rune('0'+i)) patterns used in request content generation are a bit brittle; using fmt.Sprintf("... %d ...", i+1) or strconv.Itoa would be clearer if you touch these tests again.)

Also applies to: 81-152, 154-231, 233-310

🧹 Nitpick comments (4)
plugins/governance/advanced_scenarios_test.go (2)

131-133: Add assertion to verify budget exhaustion was actually observed.

The loop may complete all 150 iterations without encountering a budget rejection (e.g., if pricing changed or budget enforcement is broken). Using only consumedBudget < team1Budget doesn't confirm that a rejection was seen.

🔎 Proposed fix
+	sawBudgetRejection := false
 	for requestNum <= 150 {
 		resp := MakeRequest(t, APIRequest{
 			// ... existing config ...
 		})

 		if resp.StatusCode >= 400 {
 			if CheckErrorMessage(t, resp, "budget") {
+				sawBudgetRejection = true
 				t.Logf("Team1 budget exhausted at request %d (consumed: $%.6f)", requestNum, consumedBudget)
 				break
 			} else {
 				t.Fatalf("Request %d failed with unexpected error: %v", requestNum, resp.Body)
 			}
 		}
 		// ... existing token tracking ...
 	}

-	if consumedBudget < team1Budget {
-		t.Fatalf("Could not exhaust Team1 budget")
+	if !sawBudgetRejection {
+		t.Fatalf("Did not observe budget rejection after %d requests (consumed: $%.6f)", requestNum-1, consumedBudget)
 	}

1148-1149: Consider registering VK with testData for fail-safe cleanup.

The comment "Don't add to testData since we'll delete manually" at line 1149 means if the test fails or panics before reaching the deletion code, this resource will leak. While past comments noted this, the pattern remains. Since Cleanup handles 404 responses gracefully, double-deletion is safe.

 	vkID := ExtractIDFromResponse(t, createVKResp, "id")
-	// Don't add to testData since we'll delete manually
+	testData.AddVirtualKey(vkID)  // Safe - Cleanup handles 404 gracefully
plugins/governance/e2e_test.go (1)

1200-1307: Clarify expected behavior for requests sent with a deleted VK

TestVKDeletionCascade correctly verifies that the VK disappears from the in‑memory store after deletion, but the final request using the deleted VK only logs whether it was accepted or rejected and treats both cases as success.

If the product expectation is that deleted VKs must no longer be usable (the usual semantics for API keys), it would be better to assert that this last request fails with an appropriate status/error and fail the test otherwise. If the behavior is intentionally undefined (e.g., due to client‑side caching), consider updating the test name/comment to make that explicit and avoid the “✓” log suggesting a hard guarantee.

plugins/governance/test_utils.go (1)

247-268: Optional: make helpers more “test‑helper‑like” and less misleading

Two minor polish points:

  • ExtractIDFromResponse ignores its keyPath parameter and always searches hard‑coded wrappers (virtual_key, team, customer). Either implement keyPath traversal or remove the parameter to avoid a misleading API.
  • WaitForCondition / WaitForAPICondition don’t call t.Helper(), so failures are reported from inside these helpers rather than the caller.

If you want to tighten these up without changing behavior:

Example minimal tweaks
 func ExtractIDFromResponse(t *testing.T, resp *APIResponse, keyPath string) string {
+	t.Helper()
 	if resp.StatusCode >= 400 {
 		t.Fatalf("Request failed with status %d: %v", resp.StatusCode, resp.Body)
 	}
@@
 func WaitForCondition(t *testing.T, checkFunc func() bool, timeout time.Duration, description string) bool {
+	t.Helper()
@@
 func WaitForAPICondition(t *testing.T, req APIRequest, condition func(*APIResponse) bool, timeout time.Duration, description string) (*APIResponse, bool) {
+	t.Helper()

Also applies to: 369-424

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f8ebad and 2c958eb.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/changelog.md
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/tracker_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • plugins/governance/vk_budget_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/tracker_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/store_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/go.mod
  • plugins/governance/edge_cases_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/changelog.md
  • plugins/governance/fixtures_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/test_utils.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/store_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/test_utils.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/store_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/test_utils.go
🧬 Code graph analysis (6)
plugins/governance/provider_budget_test.go (5)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ProviderConfigRequest (167-174)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/edge_cases_test.go (1)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • CreateTeamRequest (183-187)
  • CreateVirtualKeyRequest (155-164)
  • ProviderConfigRequest (167-174)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/config_update_sync_test.go (3)
plugins/governance/test_utils.go (15)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
  • WaitForCondition (371-394)
  • UpdateTeamRequest (221-224)
  • UpdateCustomerRequest (227-230)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/rate_limit_enforcement_test.go (4)
plugins/governance/test_utils.go (10)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/rate_limit_test.go (1)
plugins/governance/test_utils.go (9)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
plugins/governance/test_utils.go (1)
core/schemas/models.go (1)
  • Model (109-129)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Graphite / mergeability_check (listed 4 times)
🔇 Additional comments (6)
plugins/governance/changelog.md (1)

1-3: Changelog entries look good.

The entries accurately reflect the PR changes: store interface extraction for extensibility, rate limit handling improvements, and the new e2e test suite.

plugins/governance/resolver_test.go (1)

1-37: Well-structured resolver tests with good coverage.

The test file provides comprehensive coverage of the BudgetResolver including happy paths, error cases (VK not found, blocked, provider/model filtering), rate limits, and budget hierarchy validation. The table-driven tests for IsProviderAllowed and IsModelAllowed are particularly clean.

plugins/governance/store_test.go (1)

68-101: Good concurrent reads test.

The test properly validates lock-free concurrent access with 100 goroutines performing 10,000 total reads. Using atomic.Int64 for counters and sync.WaitGroup for synchronization is correct.

plugins/governance/in_memory_sync_test.go (1)

431-554: Good data consistency test.

TestDataEndpointConsistency properly validates that governance endpoints return consistent data across VKs, teams, and customers. The tolerance for SQLite locking in parallel tests (lines 549-550) is reasonable.

plugins/governance/go.mod (1)

10-10: testify dependency added for test assertions.

The testify library is a standard choice for Go testing assertions, and v1.11.1 was the latest released version at the time of this review.

plugins/governance/customer_budget_test.go (1)

66-156: Customer budget exhaustion loops correctly exercise POST‑HOC enforcement

The shouldStop pattern here still sends at least one additional request after the locally tracked budget is exceeded and only treats the test as passed once a budget‑related rejection is observed; if no such rejection occurs within the bounded loop, the test fails with a clear Fatalf. The structure looks sound for validating customer‑level budget enforcement across multiple VKs/teams.

Also applies to: 241-331

@danpiths danpiths force-pushed the 12-09-chore_governance_tests_added branch from 4f8ebad to 0753937 Compare December 24, 2025 05:08
@danpiths danpiths changed the base branch from graphite-base/1041 to 12-08-feat_governance_plugin_refactors December 24, 2025 05:08
@danpiths danpiths changed the base branch from 12-08-feat_governance_plugin_refactors to graphite-base/1041 December 24, 2025 05:08
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (37)
plugins/governance/go.mod (1)

24-25: Duplicate dependency entry still present.

The github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.16 requirement appears twice. Run go mod tidy to clean up.

plugins/governance/store_test.go (6)

116-116: Unverified exists return from GetVirtualKey.

The second return value (exists) is discarded. If the VK doesn't exist, subsequent code will operate on a nil pointer.


149-152: Don't ignore NewLocalGovernanceStore errors.

Line 149 discards the error. If store construction fails, the test will proceed with a nil/invalid store and may panic.


154-154: Unverified exists return from GetVirtualKey.

Same pattern as line 116 - exists is discarded.


192-192: Unverified exists return from GetVirtualKey.

Same pattern - add require.True(t, exists) before using vk.


284-286: Unverified exists return from GetVirtualKey.

Same issue at line 284.


319-321: Unverified exists return from GetVirtualKey.

Same pattern at line 319.

plugins/governance/usage_tracking_test.go (5)

398-400: Fragile string conversion for request numbering.

string(rune('0'+i)) only produces correct digit characters for i in 0-9. While the current loop only iterates 0-2, this pattern is fragile and should use strconv.Itoa(i) or fmt.Sprintf.
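
To make the failure mode concrete, here is a minimal sketch comparing the flagged conversion with strconv.Itoa. The helper names fragileLabel and safeLabel are hypothetical and exist only for this illustration; they are not part of the test suite.

```go
package main

import (
	"fmt"
	"strconv"
)

// fragileLabel mirrors the flagged pattern: it adds i to the code point
// of '0', which only yields a digit character for 0 <= i <= 9.
func fragileLabel(i int) string {
	return "Request " + string(rune('0'+i))
}

// safeLabel formats the integer as decimal text, so it works for any i.
func safeLabel(i int) string {
	return "Request " + strconv.Itoa(i)
}

func main() {
	// For i >= 10 the fragile version emits punctuation instead of digits.
	for _, i := range []int{2, 9, 10, 12} {
		fmt.Printf("i=%d fragile=%q safe=%q\n", i, fragileLabel(i), safeLabel(i))
	}
}
```

Once the loop bound grows past 9, the fragile form silently produces characters such as ':' and '<', which is why the formatter-based version is the safer default.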


520-523: Same fragile string conversion pattern.

Same issue as line 399 - use strconv.Itoa(i) instead of string(rune('0'+i)).


112-128: Rate limit reset test lacks assertion on actual reset behavior.

The test waits 35 seconds but only logs "Rate limit reset should have occurred" without verifying that token_current_usage actually decreased to 0. Add an assertion after fetching getDataResp3.


166-178: Missing status checks before type assertions.

getVKResp and getBudgetsResp are accessed without verifying StatusCode == 200 first, which could cause panics on non-200 responses.


413-426: Missing status checks before type assertions.

Same issue - getVKResp and getBudgetsResp lack status checks before accessing body fields.

plugins/governance/in_memory_sync_test.go (5)

380-409: Potential double-delete on cleanup.

The VK is added to testData at line 381, then manually deleted at lines 402-409. When testData.Cleanup(t) runs via defer, it may attempt to delete the already-deleted VK. Either skip adding to testData or remove the VK from testData after manual deletion.


99-104: Missing status code check for budgets endpoint.

getBudgetsResp2.Body["budgets"] is accessed without verifying the response status first.


213-218: Missing status code check for budgets endpoint.

Same issue - add status check before accessing getBudgetsResp2.Body.


387-397: Missing status code check before type assertion.

getDataResp.Body["virtual_keys"] is accessed without checking StatusCode == 200.


416-421: Missing status code check after VK deletion.

Same pattern - add status check before accessing getDataResp2.Body.

plugins/governance/resolver_test.go (1)

332-346: Direct manipulation of internal store state breaks encapsulation.

Line 338 directly accesses store.budgets.Store(), coupling the test to implementation details. Consider adding a test helper method like SetBudgetCurrentUsageForTest(budgetID string, usage float64) to the store.

plugins/governance/customer_budget_test.go (2)

70-156: Remove shouldStop flag to ensure customer budget enforcement is tested.

Same issue as other budget tests: the shouldStop logic (lines 70, 149-155) causes the loop to exit before sending the request that would trigger server-side budget rejection.


245-331: Same shouldStop issue in second test function.

TestCustomerBudgetExceededWithMultipleTeams has the same early-exit problem at lines 245, 324-330.

plugins/governance/config_update_sync_test.go (1)

48-66: Check HTTP status codes before indexing into response bodies.

Multiple chained type assertions (e.g., line 54) assume success. If the server errors, assertions will panic. Add status checks before accessing body fields.

plugins/governance/vk_budget_test.go (1)

46-127: shouldStop early-exit prevents testing server-side budget enforcement.

The shouldStop flag (lines 46, 120-126) ties loop termination to the client-side estimate. Once consumedBudget >= vkBudget, the flag is set and the loop breaks at the end of the next iteration, so at most one over-budget request is sent. If the local cost estimate runs ahead of the server's accounting, that request can still succeed and the loop ends without ever observing a server-side rejection. The test therefore relies on client-side tracking rather than verifying the API's enforcement.

Remove the early-exit logic and let the loop continue until the server rejects an over-budget request (or the 50-request cap is reached):

🔎 Suggested fix
 	consumedBudget := 0.0
 	requestNum := 1
 	var lastSuccessfulCost float64
-
-	var shouldStop = false

 	for requestNum <= 50 {
 		// ... request logic ...

 		requestNum++
-
-		if shouldStop {
-			break
-		}
-
-		if consumedBudget >= vkBudget {
-			shouldStop = true
-		}
 	}
plugins/governance/advanced_scenarios_test.go (4)

14-171: Heavy, long‑running scenarios + t.Parallel are risky for default CI.

All these scenarios send up to 150 real HTTP/chat requests each while running in parallel. This is fragile for normal CI (slow, subject to provider variability, transient network issues).

Consider either (a) dropping t.Parallel() for these heavy tests, or (b) gating the whole file behind an integration build tag / env flag (e.g. //go:build integration or if os.Getenv("INTEGRATION_TESTS") != "1" { t.Skip(...) }), so they only run in explicit integration suites.

Also applies to: 179-334, 340-528, 536-651, 659-792, 798-951, 957-1098


88-133: Switching tests don’t assert that budget exhaustion was actually observed.

In TestVKSwitchTeamAfterBudgetExhaustion, TestVKSwitchCustomerAfterBudgetExhaustion, and TestHierarchicalChainBudgetSwitch, the loops break on a budget error but the tests only assert consumedBudget >= <limit>. If pricing or enforcement changes such that no “budget” rejection is ever returned, these tests can still pass.

Mirror the sawBudgetRejection pattern used in the later “update after exhaustion” tests:

  • Track a sawBudgetRejection boolean, set it when CheckErrorMessage(t, resp, "budget") succeeds.
  • After the loop, t.Fatalf if !sawBudgetRejection before proceeding to the switch logic.

That turns these into reliable guards that truly validate “after exhaustion” semantics instead of only checking approximate spend.

Also applies to: 253-297, 455-491


1246-1259: Cascade deletion test still allows orphaned rate limits.

TestVKDeletionCascadeComplete now correctly fails if VK/provider‑config budgets remain in memory, but still treats leftover rate limits as “acceptable” and only logs a note.

Given the governance store enforces 1:1 relationships between parents and their rate limits, VK and provider‑config rate limits should also be removed when the VK is deleted; leaving them orphaned contradicts that invariant and weakens this regression guard. Based on learnings, budgets and rate limits shouldn’t be shared across entities.

Recommend failing the test if either vkRateLimitStillExists or pcRateLimitStillExists is true, mirroring the budget assertion.


1148-1150: Manual deletion without registering in GlobalTestData risks leaking test fixtures.

Several tests deliberately skip adding entities to GlobalTestData (“Don’t add to testData since we’ll delete manually”). If any of these tests fails or aborts before reaching the explicit DELETE calls, those VKs/teams/customers will never be cleaned up, polluting global state for later tests.

Since Cleanup already tolerates 404s, it’s safe to both add IDs to GlobalTestData and delete them manually in the scenario. That way:

  • Manual deletes remain for the behavior under test, and
  • Cleanup ensures no resources leak on early failure.

Recommend always calling testData.AddVirtualKey/AddTeam/AddCustomer for created entities, even when you also delete them in the test body, and just rely on the existing 404‑tolerant cleanup.

Also applies to: 1293-1295, 1397-1399, 1501-1503, 1599-1601

plugins/governance/rate_limit_enforcement_test.go (3)

486-491: Fragile integer‑to‑string conversion in request content.

"Request " + string(rune('0'+i)) only produces correct digits for i in [0,9] and is easy to misread.

Prefer a proper formatter:

-                        Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+                        Content: fmt.Sprintf("Request %d to test both limits.", i+1),

(Requires importing fmt in this file.)


521-615: In‑memory rate‑limit tracking test doesn’t actually validate usage.

TestRateLimitInMemoryUsageTracking:

  • Makes a request,
  • Extracts rate_limit_id, and
  • Only logs that the rate limit “is configured and tracking usage”.

It never checks the corresponding entry in /api/governance/rate-limits?from_memory=true or asserts that any usage counters increased, so it won’t detect regressions in tracking.

Consider:

  • Fetching the rate limit by ID from the rate_limits map.
  • Asserting that token_current_usage (or request_current_usage) increased relative to zero after the request.

That would make this test a real guard for usage accounting, consistent with the stronger usage tests elsewhere.


478-518: “Both limits enforced” test still accepts any 4xx/5xx as success.

In TestProviderAndVKRateLimitBothEnforced, the first non‑200 response short‑circuits as success as long as at least providerRequestLimit requests succeeded, regardless of whether the error is actually a rate‑limit (e.g., 500 or auth error).

Tighten this by:

  • Asserting resp.StatusCode == 429 (or whatever canonical rate‑limit code your API uses), and/or
  • Using CheckErrorMessage with “rate”/“request” keywords for all non‑200 responses before treating them as expected.

Otherwise this test can pass while provider limits are not actually being enforced.

plugins/governance/rate_limit_test.go (3)

54-55: Use proper formatting instead of string(rune('0'+i)).

Several tests build request content with expressions like:

"Short test request " + string(rune('0'+i)) + " ..."

This only works cleanly for single digits and is harder to read.

Switch to fmt.Sprintf or strconv.Itoa:

Content: fmt.Sprintf("Short test request %d for token limit.", i+1),

Apply similarly to the other sites.

Also applies to: 127-128, 206-207, 285-286, 643-644


490-506: Sync/usage tests index into response bodies without status/shape checks.

In tests like TestRateLimitInMemorySync, TestRateLimitUsageTrackedInMemory, and TestProviderLevelRateLimitUsageTracking, code such as:

virtualKeysMap := getDataResp.Body["virtual_keys"].(map[string]interface{})
vkData := virtualKeysMap[vkValue].(map[string]interface{})
rateLimitsMap := getRateLimitsResp.Body["rate_limits"].(map[string]interface{})
rateLimit := rateLimitsMap[rateLimitID].(map[string]interface{})

assumes:

  • The HTTP status is 200, and
  • All fields exist with the expected types.

If the backing API returns an error or a different shape, these assertions will panic and obscure the real problem.

Recommend:

  • Asserting resp.StatusCode == 200 before indexing.
  • Using comma‑ok type assertions for each map cast and failing the test with a clear t.Fatalf if any step is missing or the wrong type.

This matches the safer pattern you already use in some other tests.

Also applies to: 705-737, 872-890, 917-942


8-79: Rate‑limit tests can pass without ever observing enforcement.

In:

  • TestVirtualKeyTokenRateLimit
  • TestVirtualKeyRequestRateLimit
  • TestProviderConfigTokenRateLimit
  • TestProviderConfigRequestRateLimit

the loop:

  • Returns early when a 4xx+ with “token”/“request”/“rate” is seen (good), but
  • If all requests succeed, the tests simply log Made X successful requests… and pass.

That means these tests don’t actually assert that rate limits are enforced; they only confirm that requests can be made.

Instead, require that at least one request is rejected for rate‑limit reasons:

  • Track a sawRateLimitRejection boolean; set it when a 429 or CheckErrorMessage(..., "rate")/"token"/"request" matches.
  • After the loop, t.Fatalf if the flag is still false (or if you never saw a 429).

This aligns these “basic” tests with the stricter semantics used in rate_limit_enforcement_test.go.

Also applies to: 81-152, 154-231, 233-310
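
The flag-based pattern can be sketched against a stub server. Everything here is an assumption made for illustration: newLimitedServer stands in for the real gateway, observeRejection is a hypothetical helper, and HTTP 429 is assumed to be the rate-limit status.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// newLimitedServer returns a stub that accepts `limit` requests and
// answers every later request with 429 Too Many Requests.
func newLimitedServer(limit int64) *httptest.Server {
	var count atomic.Int64
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if count.Add(1) > limit {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

// observeRejection sends up to maxRequests GETs. It reports whether a
// 429 was seen and how many requests succeeded before that point.
func observeRejection(url string, maxRequests int) (bool, int, error) {
	succeeded := 0
	for i := 0; i < maxRequests; i++ {
		resp, err := http.Get(url)
		if err != nil {
			return false, succeeded, err
		}
		resp.Body.Close()
		switch resp.StatusCode {
		case http.StatusTooManyRequests:
			return true, succeeded, nil // enforcement observed
		case http.StatusOK:
			succeeded++
		default:
			return false, succeeded, fmt.Errorf("unexpected status %d", resp.StatusCode)
		}
	}
	return false, succeeded, nil
}

func main() {
	srv := newLimitedServer(3)
	defer srv.Close()
	sawRateLimitRejection, succeeded, err := observeRejection(srv.URL, 10)
	fmt.Println(sawRateLimitRejection, succeeded, err)
}
```

In a test, the caller would t.Fatalf when the returned flag is false, so a run where every request succeeds fails instead of passing silently.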

plugins/governance/test_utils.go (3)

143-152: Avoid reseeding math/rand on every generateRandomID call.

rand.Seed(time.Now().UnixNano()) inside generateRandomID:

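  • Is deprecated since Go 1.20 (the global generator is seeded randomly at startup unless Seed is called).
  • Restarts the global sequence on every call, so parallel tests that reseed with the same timestamp can generate identical IDs.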

You can safely drop the seeding line and rely on the default global seeding:

 func generateRandomID() string {
-    rand.Seed(time.Now().UnixNano())
     const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
     b := make([]byte, 8)
     for i := range b {
         b[i] = letters[rand.Intn(len(letters))]
     }
     return string(b)
 }

If you need deterministic or isolated randomness later, consider a local rand.Rand with its own source.
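
A minimal sketch of the local-source option, assuming the same 8-character alphabet as the snippet above. newIDGenerator is a hypothetical name, not an existing helper in test_utils.go.

```go
package main

import (
	"fmt"
	"math/rand"
)

const letters = "abcdefghijklmnopqrstuvwxyz0123456789"

// newIDGenerator returns a function that produces 8-character IDs from
// its own random source, leaving the global generator untouched.
// A *rand.Rand is not safe for concurrent use, so each goroutine should
// own its generator or guard it with a mutex.
func newIDGenerator(seed int64) func() string {
	rng := rand.New(rand.NewSource(seed))
	return func() string {
		b := make([]byte, 8)
		for i := range b {
			b[i] = letters[rng.Intn(len(letters))]
		}
		return string(b)
	}
}

func main() {
	next := newIDGenerator(42)
	fmt.Println(next(), next())
}
```

Two generators built from the same seed yield the same sequence, which makes failures reproducible; distinct seeds per test keep the sequences isolated.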


247-268: ExtractIDFromResponse ignores keyPath, making the API misleading.

The function signature accepts keyPath but the implementation only looks under hard‑coded keys virtual_key, team, and customer, never using keyPath.

Either:

  • Implement keyPath traversal (e.g. split on "." and walk nested maps accordingly), or
  • Remove the keyPath parameter from the signature and all call sites to reflect actual behavior.

As it stands, callers may assume custom paths are honored when they’re silently ignored.
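
If the first option is chosen, the traversal could look roughly like this. It is a sketch only: lookupString is a hypothetical helper that returns an error rather than calling t.Fatalf, and the dot-separated path syntax is an assumption taken from the suggestion above.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupString walks body along a dot-separated keyPath such as
// "virtual_key.id" and returns the string stored at the end of it.
func lookupString(body map[string]interface{}, keyPath string) (string, error) {
	var current interface{} = body
	for _, key := range strings.Split(keyPath, ".") {
		// Comma-ok assertion: report a shape mismatch instead of panicking.
		m, ok := current.(map[string]interface{})
		if !ok {
			return "", fmt.Errorf("segment %q: parent is %T, not an object", key, current)
		}
		next, found := m[key]
		if !found {
			return "", fmt.Errorf("segment %q: key not found", key)
		}
		current = next
	}
	s, ok := current.(string)
	if !ok {
		return "", fmt.Errorf("value at %q is %T, not a string", keyPath, current)
	}
	return s, nil
}

func main() {
	body := map[string]interface{}{
		"virtual_key": map[string]interface{}{"id": "vk-1"},
	}
	fmt.Println(lookupString(body, "virtual_key.id"))
	fmt.Println(lookupString(body, "team.id"))
}
```

ExtractIDFromResponse could wrap such a helper and call t.Fatalf on error, which keeps the existing call sites while making keyPath meaningful.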


90-141: MakeRequest should be a proper test helper and set an HTTP timeout.

Right now it:

  • Doesn’t call t.Helper(), so failures are attributed to inside the helper.
  • Uses a bare http.Client{} with no timeout, which can hang tests indefinitely if localhost:8080 is slow or down.

Recommend:

-func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
-    client := &http.Client{}
+func MakeRequest(t *testing.T, req APIRequest) *APIResponse {
+    t.Helper()
+    client := &http.Client{
+        Timeout: 30 * time.Second,
+    }

This keeps failures pointing at the calling test and avoids suite hangs on network issues.

plugins/governance/e2e_test.go (2)

802-841: Calling MakeRequest (uses testing.T) inside goroutines is unsafe.

TestConcurrentRequestsToSameVK invokes MakeRequest(t, ...) from multiple goroutines. MakeRequest uses t.Fatalf internally, and Go’s testing package requires FailNow (and therefore Fatal/Fatalf) to be called from the goroutine running the test function; called from another goroutine, it exits only that goroutine and does not stop the test, which can produce confusing failures.

Prefer a goroutine‑safe pattern:

  • Either refactor MakeRequest to return an error instead of calling t.Fatalf, and check errors in the main test goroutine, or
  • Have each goroutine record its result or error in a channel or mutex‑guarded slice, then after wg.Wait() fail once in the main goroutine if any occurred. A defer/recover wrapper alone is not enough, because t.Fatalf exits via runtime.Goexit rather than a panic.

This matches the safer concurrency pattern already discussed for other governance tests.


1491-1539: Fallback test doesn’t assert that OpenAI actually handled the requests.

TestProviderFallbackMechanism currently:

  • Counts how many requests to model: "gpt-4o" succeed, and
  • Logs the model field,

but never checks which provider actually served the traffic. The test will pass even if all requests go to Anthropic (or fail for non‑fallback reasons) as long as some succeed.

Align this with the weighted LB test by:

  • Inspecting extra_fields.provider (or a similar explicit provider field) for each successful response.
  • Asserting that successful responses are attributed to the OpenAI provider and not Anthropic when requesting gpt-4o.
  • Optionally failing if any successful response reports Anthropic as the provider.

That turns the fallback scenario into a real guardrail instead of a logging‑only check.

🧹 Nitpick comments (3)
plugins/governance/config_update_sync_test.go (1)

184-232: "Higher limit" branch could use a more meaningful assertion.

The test only logs tokenUsageAfterSecondUpdate without asserting. Since usage is already 0 at this point, this section doesn't verify preservation behavior. Consider consuming tokens before raising the limit to meaningfully test that usage is preserved.

plugins/governance/advanced_scenarios_test.go (1)

83-85: Unchecked type assertions on response bodies can panic and hide root causes.

Patterns like:

vk := createVKResp.Body["virtual_key"].(map[string]interface{})
vkValue := vk["value"].(string)

(and similar .(map[string]interface{}) / .(string) throughout the file) will panic if the shape of the response changes or an earlier step fails, instead of failing with a clear test error.

For better diagnostics, prefer the comma‑ok form with explicit failures:

vkObj, ok := createVKResp.Body["virtual_key"].(map[string]interface{})
if !ok {
    t.Fatalf("Response missing virtual_key object: %v", createVKResp.Body)
}
vkValue, ok := vkObj["value"].(string)
if !ok || vkValue == "" {
    t.Fatalf("virtual_key.value missing or not a string: %v", vkObj)
}

Same applies to other nested map assertions in these tests.

Also applies to: 248-250, 452-454, 562-563, 703-705, 862-864, 991-993, 1151-1153, 1313-1314, 1417-1418, 1522-1523, 1620-1621

plugins/governance/e2e_test.go (1)

1404-1419: Provider detection in weighted LB test should guard extra_fields shape.

TestWeightedProviderLoadBalancing directly chains:

if provider, ok := resp.Body["extra_fields"].(map[string]interface{})["provider"].(string); ok {
    model, ok := resp.Body["extra_fields"].(map[string]interface{})["model_requested"].(string)
    ...
}

If extra_fields is missing or not a map[string]interface{}, this will panic instead of failing with a useful message.

Refactor to a safer sequence:

extra, ok := resp.Body["extra_fields"].(map[string]interface{})
if !ok {
    t.Fatalf("Response missing extra_fields: %v", resp.Body)
}
provider, ok := extra["provider"].(string)
// handle !ok
model, ok := extra["model_requested"].(string)
// handle !ok

This keeps the provider classification robust as response shapes evolve.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c958eb and 0753937.

⛔ Files ignored due to path filters (1)
  • plugins/governance/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/changelog.md
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/tracker_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/vk_budget_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • plugins/governance/edge_cases_test.go
  • plugins/governance/changelog.md
  • plugins/governance/team_budget_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/provider_budget_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/customer_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/vk_budget_test.go
  • plugins/governance/go.mod
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/customer_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/vk_budget_test.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/customer_budget_test.go
  • plugins/governance/store_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/vk_budget_test.go
🧬 Code graph analysis (11)
plugins/governance/customer_budget_test.go (1)
plugins/governance/test_utils.go (6)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • ExtractIDFromResponse (248-268)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/store_test.go (5)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (75-94)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (720-727)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
ui/lib/types/governance.ts (4)
  • Customer (37-44)
  • Team (27-35)
  • RateLimit (13-25)
  • Budget (5-11)
plugins/governance/config_update_sync_test.go (5)
plugins/governance/test_utils.go (13)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
  • CreateTeamRequest (183-187)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/rate_limit_test.go (4)
plugins/governance/test_utils.go (10)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
  • UpdateVirtualKeyRequest (210-218)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/in_memory_sync_test.go (1)
plugins/governance/test_utils.go (12)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • CreateTeamRequest (183-187)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
plugins/governance/usage_tracking_test.go (5)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • BudgetRequest (177-180)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/rate_limit_enforcement_test.go (1)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
plugins/governance/advanced_scenarios_test.go (4)
plugins/governance/test_utils.go (14)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatMessage (242-245)
  • CalculateCost (64-73)
  • UpdateVirtualKeyRequest (210-218)
  • UpdateBudgetRequest (196-199)
  • UpdateTeamRequest (221-224)
  • UpdateCustomerRequest (227-230)
  • ProviderConfigRequest (167-174)
  • CreateRateLimitRequest (202-207)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/resolver_test.go (7)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (75-94)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (720-727)
plugins/governance/resolver.go (11)
  • NewBudgetResolver (70-75)
  • DecisionAllow (18-18)
  • DecisionVirtualKeyNotFound (19-19)
  • DecisionVirtualKeyBlocked (20-20)
  • DecisionProviderBlocked (26-26)
  • DecisionModelBlocked (25-25)
  • DecisionTokenLimited (23-23)
  • DecisionRequestLimited (24-24)
  • DecisionBudgetExceeded (22-22)
  • DecisionRateLimited (21-21)
  • Decision (15-15)
core/schemas/context.go (1)
  • BifrostContext (31-41)
ui/lib/types/governance.ts (5)
  • VirtualKey (61-80)
  • RateLimit (13-25)
  • Budget (5-11)
  • Customer (37-44)
  • Team (27-35)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostContextKey (117-117)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
plugins/governance/vk_budget_test.go (1)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CalculateCost (64-73)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (16)
plugins/governance/go.mod (1)

10-10: LGTM - testify added for governance tests.

The addition of github.com/stretchr/testify v1.11.1 as a direct dependency is appropriate given the extensive test suite being added.

plugins/governance/vk_budget_test.go (1)

1-6: LGTM - imports are minimal and appropriate.

The test file uses only strconv and testing, relying on package-level test utilities for the rest.

plugins/governance/store_test.go (2)

1-15: LGTM - Good test structure and imports.

The test file properly imports required packages and uses require/assert from testify for assertions.


68-101: LGTM - Concurrent reads test is well-designed.

Good use of sync.WaitGroup, atomic counters, and clear verification of 10,000 successful reads with 0 errors.
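
For readers unfamiliar with the pattern, the sketch below shows one way such a concurrent-read check can be shaped. It is not the code from `store_test.go`: `readOnlyStore`, `runConcurrentReads`, and the 100 goroutines x 100 reads split are assumptions chosen to reach 10,000 reads.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// readOnlyStore is a hypothetical stand-in for the governance store.
type readOnlyStore struct {
	mu   sync.RWMutex
	keys map[string]string
}

// Get takes a read lock so many goroutines can read at the same time.
func (s *readOnlyStore) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.keys[key]
	return v, ok
}

// runConcurrentReads starts `workers` goroutines that each perform
// `readsPerWorker` lookups and returns the success and failure counts.
func runConcurrentReads(s *readOnlyStore, key string, workers, readsPerWorker int) (int64, int64) {
	var wg sync.WaitGroup
	var okCount, errCount int64
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < readsPerWorker; j++ {
				if _, found := s.Get(key); found {
					atomic.AddInt64(&okCount, 1) // counters are shared, so update atomically
				} else {
					atomic.AddInt64(&errCount, 1)
				}
			}
		}()
	}
	wg.Wait() // all goroutines have finished before the counters are read
	return atomic.LoadInt64(&okCount), atomic.LoadInt64(&errCount)
}

func main() {
	store := &readOnlyStore{keys: map[string]string{"vk-1": "active"}}
	ok, failed := runConcurrentReads(store, "vk-1", 100, 100)
	fmt.Println(ok, failed)
}
```

Running a test of this shape with `go test -race` additionally lets the race detector flag unsynchronized access.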

plugins/governance/usage_tracking_test.go (1)

1-128: Good test structure for rate limit reset verification.

The test correctly creates a VK with a 30s reset duration, consumes tokens, and waits for the reset ticker. Consider adding the actual assertion for completeness.

plugins/governance/in_memory_sync_test.go (1)

1-130: LGTM - VK update sync test is well-structured.

Good coverage of in-memory state verification after VK creation and budget updates. The test properly checks for VK existence and budget propagation.

plugins/governance/resolver_test.go (5)

15-37: LGTM - Happy path test is clear and well-structured.

Tests the basic allowed request flow with proper assertions on decision and VK found status.


79-106: LGTM - Provider blocking test correctly validates filtering.

Good coverage of provider allowlist enforcement with clear test setup.


196-237: LGTM - Expired rate limit test correctly simulates ticker behavior.

The test properly resets expired rate limits before evaluation, verifying that the system treats expired limits as reset.


403-511: LGTM - Table-driven tests for provider and model filtering.

Good use of table-driven tests for isProviderAllowed and isModelAllowed with comprehensive edge cases (no configs, empty allowlists, matching/non-matching).
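
A rough sketch of the table-driven shape, written as a standalone program so it runs without the test harness. `providerAllowed` is a hypothetical stand-in for `isProviderAllowed`, and the rule that an empty or missing allowlist permits every provider is an assumption made for illustration, not confirmed behaviour of the plugin.

```go
package main

import "fmt"

// providerAllowed is a hypothetical stand-in for isProviderAllowed.
// ASSUMPTION: a nil or empty allowlist permits every provider.
func providerAllowed(allowlist []string, provider string) bool {
	if len(allowlist) == 0 {
		return true
	}
	for _, p := range allowlist {
		if p == provider {
			return true
		}
	}
	return false
}

// providerCase is one row of the table: inputs plus the expected result.
type providerCase struct {
	name      string
	allowlist []string
	provider  string
	want      bool
}

// providerCases lists the edge cases named in the review comment.
func providerCases() []providerCase {
	return []providerCase{
		{"no allowlist configured", nil, "openai", true},
		{"empty allowlist", []string{}, "openai", true},
		{"matching provider", []string{"openai", "anthropic"}, "openai", true},
		{"non-matching provider", []string{"anthropic"}, "openai", false},
	}
}

// failedCases returns the names of rows whose result differs from want.
func failedCases(cases []providerCase) []string {
	var failed []string
	for _, tc := range cases {
		if got := providerAllowed(tc.allowlist, tc.provider); got != tc.want {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failed cases:", len(failedCases(providerCases())))
}
```

In a real `_test.go` file the same table would usually be iterated with `t.Run(tc.name, ...)` so that each row is reported as its own subtest.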


513-551: LGTM - Context population test validates governance metadata.

Verifies that BifrostContext is correctly populated with VK, team, and customer IDs after evaluation.

plugins/governance/customer_budget_test.go (2)

8-63: LGTM - Test setup for customer budget with multiple VKs is well-structured.

Creates a customer with budget and two VKs correctly, with proper cleanup registration.


162-239: LGTM - Multi-team customer budget test has good hierarchical setup.

Creates customer → teams → VKs hierarchy correctly to test shared customer budget across teams.

plugins/governance/config_update_sync_test.go (3)

1-46: LGTM - Test structure and setup are well-organized.

Clear section headers, parallel execution, and proper cleanup registration.


586-764: LGTM - Team budget update sync test uses WaitForCondition properly.

Good use of polling with timeout for async state verification (lines 688-702, 732-748), which is more robust than fixed sleeps.


956-1123: LGTM - Provider budget sync test completes the coverage.

Tests provider-level budget updates with the same pattern, ensuring in-memory sync works across all entity types.

Contributor

akshaydeo commented Dec 24, 2025

Merge activity

  • Dec 24, 12:11 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Dec 24, 12:12 PM UTC: @akshaydeo merged this pull request with Graphite.

@akshaydeo akshaydeo changed the base branch from graphite-base/1041 to v1.4.0 December 24, 2025 12:12
@akshaydeo akshaydeo merged commit e42e8da into v1.4.0 Dec 24, 2025
3 checks passed
@akshaydeo akshaydeo deleted the 12-09-chore_governance_tests_added branch December 24, 2025 12:12
@coderabbitai coderabbitai bot mentioned this pull request Dec 24, 2025
