deadlock fix #1049
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
plugins/governance/store.go(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
plugins/governance/store.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
plugins/governance/store.go
🧬 Code graph analysis (1)
plugins/governance/store.go (2)
core/schemas/models.go (1): Model (109-129)
framework/configstore/tables/budget.go (2): TableBudget (11-20), TableBudget (23-23)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: Graphite / mergeability_check
+	// Update each budget atomically using direct UPDATE to avoid deadlocks
+	// (SELECT + Save pattern causes deadlocks when multiple instances run concurrently)
 	for _, inMemoryBudget := range budgets {
-		// Check if budget exists in database
-		var budget configstoreTables.TableBudget
-		if err := tx.WithContext(ctx).First(&budget, "id = ?", inMemoryBudget.ID).Error; err != nil {
-			// If budget not found then it must be deleted, so we remove it from the in-memory store
-			if errors.Is(err, gorm.ErrRecordNotFound) {
-				budgetsToDelete = append(budgetsToDelete, inMemoryBudget.ID)
-				continue
-			}
-			return fmt.Errorf("failed to get budget %s: %w", inMemoryBudget.ID, err)
+		// Calculate the new usage value
+		newUsage := inMemoryBudget.CurrentUsage
+		if baseline, exists := baselines[inMemoryBudget.ID]; exists {
+			newUsage += baseline
 		}
 
-		// Update usage
-		if baseline, exists := baselines[inMemoryBudget.ID]; exists {
-			budget.CurrentUsage = inMemoryBudget.CurrentUsage + baseline
-		} else {
-			budget.CurrentUsage = inMemoryBudget.CurrentUsage
+		// Direct UPDATE avoids read-then-write lock escalation that causes deadlocks
+		result := tx.WithContext(ctx).
+			Model(&configstoreTables.TableBudget{}).
+			Where("id = ?", inMemoryBudget.ID).
+			Update("current_usage", newUsage)
+
+		if result.Error != nil {
+			return fmt.Errorf("failed to update budget %s: %w", inMemoryBudget.ID, result.Error)
 		}
-		if err := tx.WithContext(ctx).Save(&budget).Error; err != nil {
-			return fmt.Errorf("failed to save budget %s: %w", inMemoryBudget.ID, err)
+
+		// If no rows affected, budget was deleted from database
+		if result.RowsAffected == 0 {
+			budgetsToDelete = append(budgetsToDelete, inMemoryBudget.ID)
 		}
Do not treat RowsAffected == 0 as "budget deleted" – this will silently drop active budgets.
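The new RowsAffected == 0 check is unsafe because affected-row counts are driver-dependent. MySQL, by default, counts only rows whose values actually changed, so an UPDATE that sets a column to its existing value reports RowsAffected == 0 even when the row still exists. (SQLite, Bifrost's default, and PostgreSQL count matched rows, so they are not affected in the same way.) On a backend with changed-rows semantics, this function behaves as follows: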
- Any budget whose current_usage hasn't changed since the last flush will be interpreted as "deleted from database".
- Its ID is added to budgetsToDelete, and then removed from gs.budgets via the deletion loop below this block.
- Subsequent budget checks that rely on collectBudgetsFromHierarchy / gs.budgets will silently stop enforcing those budgets.
That's a correctness bug: live budgets disappear from governance enforcement just because their usage was stable between dumps.
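To make the failure mode concrete, here is a small self-contained Go sketch. It does not use GORM or a real database: fakeTable, its changedRowsOnly flag, and flush are hypothetical stand-ins that only mimic the two ways a driver may count affected rows (rows matched versus rows actually changed).

```go
package main

import "fmt"

// fakeTable is a hypothetical stand-in for the budgets table.
// It is not a GORM or database/sql type.
type fakeTable struct {
	usage           map[string]float64
	changedRowsOnly bool // true mimics drivers that count only rows whose values changed
}

// update stores the new usage for id and returns a simulated affected-row count.
func (t *fakeTable) update(id string, newUsage float64) int64 {
	old, ok := t.usage[id]
	if !ok {
		return 0 // the row really is gone
	}
	t.usage[id] = newUsage
	if t.changedRowsOnly && old == newUsage {
		return 0 // the row exists, but nothing changed
	}
	return 1
}

// flush mirrors the loop in this PR: any zero affected-row count
// is treated as "budget deleted from database".
func flush(t *fakeTable, inMemory map[string]float64) []string {
	var budgetsToDelete []string
	for id, usage := range inMemory {
		if t.update(id, usage) == 0 {
			budgetsToDelete = append(budgetsToDelete, id)
		}
	}
	return budgetsToDelete
}

func main() {
	inMemory := map[string]float64{"budget-1": 10} // usage unchanged since last flush
	for _, changedOnly := range []bool{false, true} {
		t := &fakeTable{
			usage:           map[string]float64{"budget-1": 10},
			changedRowsOnly: changedOnly,
		}
		fmt.Printf("changedRowsOnly=%v -> flagged for deletion: %v\n",
			changedOnly, flush(t, inMemory))
	}
}
```

With matched-rows counting nothing is flagged; with changed-rows counting, budget-1 is flagged for removal even though its row still exists.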
Given the deadlock fix goal, a safe minimal change is to keep the direct UPDATE but stop mutating in‑memory state based solely on RowsAffected. You can still log this condition for observability and add a more dialect-aware existence check later if needed.
Concrete suggestion for this block:
 	// Direct UPDATE avoids read-then-write lock escalation that causes deadlocks
 	result := tx.WithContext(ctx).
 		Model(&configstoreTables.TableBudget{}).
 		Where("id = ?", inMemoryBudget.ID).
 		Update("current_usage", newUsage)
 	if result.Error != nil {
 		return fmt.Errorf("failed to update budget %s: %w", inMemoryBudget.ID, result.Error)
 	}
-	// If no rows affected, budget was deleted from database
-	if result.RowsAffected == 0 {
-		budgetsToDelete = append(budgetsToDelete, inMemoryBudget.ID)
-	}
+	// NOTE: RowsAffected == 0 is ambiguous across drivers (e.g. MySQL by default when the
+	// value doesn't actually change). Don't treat this as "budget deleted" to avoid
+	// silently dropping valid budgets from the in-memory store. If we need automatic
+	// cleanup of deleted budgets, we should add a dialect-aware existence check instead.
With this change, budgetsToDelete will remain empty and the cleanup loop below becomes a no-op, preserving the existing in-memory budget behavior while still fixing the deadlock by using a direct UPDATE.
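If automatic cleanup of deleted budgets is still wanted, one option is to confirm that the row is absent before removing anything from memory. The following is a minimal sketch of that idea, not Bifrost code: the budgetStore interface and its updateUsage and exists methods are hypothetical names, and memStore is an in-memory fake that deliberately uses the worst-case "changed rows only" counting.

```go
package main

import "fmt"

// budgetStore is a hypothetical abstraction over the database layer.
type budgetStore interface {
	updateUsage(id string, usage float64) (rowsAffected int64, err error)
	exists(id string) (bool, error)
}

// persistUsage writes the usage and reports whether the budget is truly gone.
func persistUsage(s budgetStore, id string, usage float64) (deleted bool, err error) {
	n, err := s.updateUsage(id, usage)
	if err != nil {
		return false, fmt.Errorf("failed to update budget %s: %w", id, err)
	}
	if n > 0 {
		return false, nil // the row was written, so it exists
	}
	// Zero affected rows is ambiguous, so confirm before treating it as a deletion.
	ok, err := s.exists(id)
	if err != nil {
		return false, fmt.Errorf("failed to check budget %s: %w", id, err)
	}
	return !ok, nil
}

// memStore is an in-memory fake that counts only rows whose value changed.
type memStore map[string]float64

func (m memStore) updateUsage(id string, usage float64) (int64, error) {
	old, ok := m[id]
	if !ok || old == usage {
		return 0, nil
	}
	m[id] = usage
	return 1, nil
}

func (m memStore) exists(id string) (bool, error) {
	_, ok := m[id]
	return ok, nil
}

func main() {
	s := memStore{"budget-1": 10}
	for _, id := range []string{"budget-1", "budget-2"} {
		deleted, err := persistUsage(s, id, 10)
		fmt.Println(id, "deleted:", deleted, "err:", err)
	}
}
```

The extra read only runs in the ambiguous case. Whether issuing it inside the same transaction is acceptable for the deadlock scenario this PR targets would need to be verified against the actual database.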
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
	// Update each budget atomically using direct UPDATE to avoid deadlocks
	// (SELECT + Save pattern causes deadlocks when multiple instances run concurrently)
	for _, inMemoryBudget := range budgets {
		// Calculate the new usage value
		newUsage := inMemoryBudget.CurrentUsage
		if baseline, exists := baselines[inMemoryBudget.ID]; exists {
			newUsage += baseline
		}
		// Direct UPDATE avoids read-then-write lock escalation that causes deadlocks
		result := tx.WithContext(ctx).
			Model(&configstoreTables.TableBudget{}).
			Where("id = ?", inMemoryBudget.ID).
			Update("current_usage", newUsage)
		if result.Error != nil {
			return fmt.Errorf("failed to update budget %s: %w", inMemoryBudget.ID, result.Error)
		}
		// NOTE: RowsAffected == 0 is ambiguous across drivers (e.g. MySQL by default when the
		// value doesn't actually change). Don't treat this as "budget deleted" to avoid
		// silently dropping valid budgets from the in-memory store. If we need automatic
		// cleanup of deleted budgets, we should add a dialect-aware existence check instead.
🤖 Prompt for AI Agents
In plugins/governance/store.go around lines 547 to 569, the code treats
result.RowsAffected == 0 as "budget deleted", which is unreliable because
affected-row counts are driver-dependent (MySQL, by default, reports 0 affected
rows for an UPDATE that doesn't change values); remove the logic that appends
IDs to budgetsToDelete based solely on RowsAffected, and instead only log the
zero-rows-affected case for observability (or perform a dialect-aware existence
check if needed later), so the direct UPDATE remains to avoid deadlocks but
in-memory budgets are not removed when usage was unchanged.

Summary
Fix deadlocks in budget persistence by replacing the SELECT + UPDATE pattern with direct UPDATE statements.
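The mechanism named in the code comment ("read-then-write lock escalation") can be illustrated with a toy model. The sketch below is not how any particular database implements locking, and lockState and its methods are hypothetical. It only shows why two transactions that both read a row and then both try to upgrade to a write lock can block each other, while transactions that request the write lock up front simply queue.

```go
package main

import "fmt"

// lockState is a toy model of one row's lock:
// any number of shared holders, at most one exclusive holder.
type lockState struct {
	shared    map[string]bool
	exclusive string
}

func newLockState() *lockState { return &lockState{shared: map[string]bool{}} }

// tryShared grants a shared lock unless another transaction holds the exclusive lock.
func (l *lockState) tryShared(tx string) bool {
	if l.exclusive != "" && l.exclusive != tx {
		return false
	}
	l.shared[tx] = true
	return true
}

// tryExclusive grants the exclusive lock only if no other transaction holds any lock.
func (l *lockState) tryExclusive(tx string) bool {
	if l.exclusive != "" && l.exclusive != tx {
		return false
	}
	for holder := range l.shared {
		if holder != tx {
			return false
		}
	}
	l.exclusive = tx
	return true
}

// release drops every lock held by tx, as a commit would.
func (l *lockState) release(tx string) {
	delete(l.shared, tx)
	if l.exclusive == tx {
		l.exclusive = ""
	}
}

// readThenWrite models SELECT followed by Save in two concurrent transactions.
func readThenWrite() (aUpgraded, bUpgraded bool) {
	l := newLockState()
	l.tryShared("A")
	l.tryShared("B")
	// Each upgrade is blocked by the other's shared lock, and neither will release.
	return l.tryExclusive("A"), l.tryExclusive("B")
}

// directWrite models a direct UPDATE in two concurrent transactions.
func directWrite() (aGranted, bGrantedAfterCommit bool) {
	l := newLockState()
	aGranted = l.tryExclusive("A")
	bWaited := !l.tryExclusive("B") // B waits, but holds nothing that A needs
	l.release("A")
	return aGranted, bWaited && l.tryExclusive("B")
}

func main() {
	a, b := readThenWrite()
	fmt.Println("read-then-write: upgrades granted:", a, b)
	a, b = directWrite()
	fmt.Println("direct write: A granted, B granted after A commits:", a, b)
}
```

In the first scenario neither upgrade can ever be granted, which is the deadlock; in the second, B only waits for A to commit.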
Changes
- Removed the errors package import, as it's no longer needed
Type of change
Affected areas
How to test
Run multiple instances of the application concurrently to verify that budget persistence no longer causes deadlocks:
Breaking changes
Related issues
Fixes deadlocks that occur when multiple instances attempt to persist budgets simultaneously.
Security considerations
No security implications.
Checklist
docs/contributing/README.md and followed the guidelines