Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 18 additions & 18 deletions plugins/governance/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@ package governance

import (
"context"
"errors"
"fmt"
"strings"
"sync"
Expand Down Expand Up @@ -545,27 +544,28 @@ func (gs *LocalGovernanceStore) DumpBudgets(ctx context.Context, baselines map[s
budgets := gs.GetAllBudgets()
budgetsToDelete := make([]string, 0)
if err := gs.configStore.ExecuteTransaction(ctx, func(tx *gorm.DB) error {
// Update each budget atomically
// Update each budget atomically using direct UPDATE to avoid deadlocks
// (SELECT + Save pattern causes deadlocks when multiple instances run concurrently)
for _, inMemoryBudget := range budgets {
// Check if budget exists in database
var budget configstoreTables.TableBudget
if err := tx.WithContext(ctx).First(&budget, "id = ?", inMemoryBudget.ID).Error; err != nil {
// If budget not found then it must be deleted, so we remove it from the in-memory store
if errors.Is(err, gorm.ErrRecordNotFound) {
budgetsToDelete = append(budgetsToDelete, inMemoryBudget.ID)
continue
}
return fmt.Errorf("failed to get budget %s: %w", inMemoryBudget.ID, err)
// Calculate the new usage value
newUsage := inMemoryBudget.CurrentUsage
if baseline, exists := baselines[inMemoryBudget.ID]; exists {
newUsage += baseline
}

// Update usage
if baseline, exists := baselines[inMemoryBudget.ID]; exists {
budget.CurrentUsage = inMemoryBudget.CurrentUsage + baseline
} else {
budget.CurrentUsage = inMemoryBudget.CurrentUsage
// Direct UPDATE avoids read-then-write lock escalation that causes deadlocks
result := tx.WithContext(ctx).
Model(&configstoreTables.TableBudget{}).
Where("id = ?", inMemoryBudget.ID).
Update("current_usage", newUsage)

if result.Error != nil {
return fmt.Errorf("failed to update budget %s: %w", inMemoryBudget.ID, result.Error)
}
if err := tx.WithContext(ctx).Save(&budget).Error; err != nil {
return fmt.Errorf("failed to save budget %s: %w", inMemoryBudget.ID, err)

// If no rows affected, budget was deleted from database
if result.RowsAffected == 0 {
budgetsToDelete = append(budgetsToDelete, inMemoryBudget.ID)
}
Comment on lines +547 to 569
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Do not treat RowsAffected == 0 as "budget deleted" – this will silently drop active budgets.

The new RowsAffected == 0 check is unsafe: on SQLite (Bifrost's default) and MySQL, an UPDATE that sets a column to its existing value reports RowsAffected == 0 even when the row still exists. In this function that means:

  • Any budget whose current_usage hasn't changed since the last flush will be interpreted as "deleted from database".
  • Its ID is added to budgetsToDelete, and then removed from gs.budgets via the deletion loop below this block.
  • Subsequent budget checks that rely on collectBudgetsFromHierarchy / gs.budgets will silently stop enforcing those budgets.

That's a correctness bug: live budgets disappear from governance enforcement just because their usage was stable between dumps.

Given the deadlock fix goal, a safe minimal change is to keep the direct UPDATE but stop mutating in‑memory state based solely on RowsAffected. You can still log this condition for observability and add a more dialect-aware existence check later if needed.

Concrete suggestion for this block:

-        // Direct UPDATE avoids read-then-write lock escalation that causes deadlocks
+        // Direct UPDATE avoids read-then-write lock escalation that causes deadlocks
         result := tx.WithContext(ctx).
             Model(&configstoreTables.TableBudget{}).
             Where("id = ?", inMemoryBudget.ID).
             Update("current_usage", newUsage)

         if result.Error != nil {
             return fmt.Errorf("failed to update budget %s: %w", inMemoryBudget.ID, result.Error)
         }

-        // If no rows affected, budget was deleted from database
-        if result.RowsAffected == 0 {
-            budgetsToDelete = append(budgetsToDelete, inMemoryBudget.ID)
-        }
+        // NOTE: RowsAffected == 0 is ambiguous across drivers (e.g. SQLite/MySQL when the
+        // value doesn't actually change). Don't treat this as "budget deleted" to avoid
+        // silently dropping valid budgets from the in-memory store. If we need automatic
+        // cleanup of deleted budgets, we should add a dialect-aware existence check instead.

With this change, budgetsToDelete will remain empty and the cleanup loop below becomes a no-op, preserving existing in‑memory budgets behavior while still fixing the deadlock by using a direct UPDATE.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Update each budget atomically using direct UPDATE to avoid deadlocks
// (SELECT + Save pattern causes deadlocks when multiple instances run concurrently)
for _, inMemoryBudget := range budgets {
// Check if budget exists in database
var budget configstoreTables.TableBudget
if err := tx.WithContext(ctx).First(&budget, "id = ?", inMemoryBudget.ID).Error; err != nil {
// If budget not found then it must be deleted, so we remove it from the in-memory store
if errors.Is(err, gorm.ErrRecordNotFound) {
budgetsToDelete = append(budgetsToDelete, inMemoryBudget.ID)
continue
}
return fmt.Errorf("failed to get budget %s: %w", inMemoryBudget.ID, err)
// Calculate the new usage value
newUsage := inMemoryBudget.CurrentUsage
if baseline, exists := baselines[inMemoryBudget.ID]; exists {
newUsage += baseline
}
// Update usage
if baseline, exists := baselines[inMemoryBudget.ID]; exists {
budget.CurrentUsage = inMemoryBudget.CurrentUsage + baseline
} else {
budget.CurrentUsage = inMemoryBudget.CurrentUsage
// Direct UPDATE avoids read-then-write lock escalation that causes deadlocks
result := tx.WithContext(ctx).
Model(&configstoreTables.TableBudget{}).
Where("id = ?", inMemoryBudget.ID).
Update("current_usage", newUsage)
if result.Error != nil {
return fmt.Errorf("failed to update budget %s: %w", inMemoryBudget.ID, result.Error)
}
if err := tx.WithContext(ctx).Save(&budget).Error; err != nil {
return fmt.Errorf("failed to save budget %s: %w", inMemoryBudget.ID, err)
// If no rows affected, budget was deleted from database
if result.RowsAffected == 0 {
budgetsToDelete = append(budgetsToDelete, inMemoryBudget.ID)
}
// Update each budget atomically using direct UPDATE to avoid deadlocks
// (SELECT + Save pattern causes deadlocks when multiple instances run concurrently)
for _, inMemoryBudget := range budgets {
// Calculate the new usage value
newUsage := inMemoryBudget.CurrentUsage
if baseline, exists := baselines[inMemoryBudget.ID]; exists {
newUsage += baseline
}
// Direct UPDATE avoids read-then-write lock escalation that causes deadlocks
result := tx.WithContext(ctx).
Model(&configstoreTables.TableBudget{}).
Where("id = ?", inMemoryBudget.ID).
Update("current_usage", newUsage)
if result.Error != nil {
return fmt.Errorf("failed to update budget %s: %w", inMemoryBudget.ID, result.Error)
}
// NOTE: RowsAffected == 0 is ambiguous across drivers (e.g. SQLite/MySQL when the
// value doesn't actually change). Don't treat this as "budget deleted" to avoid
// silently dropping valid budgets from the in-memory store. If we need automatic
// cleanup of deleted budgets, we should add a dialect-aware existence check instead.
🤖 Prompt for AI Agents
In plugins/governance/store.go around lines 547 to 569, the code treats
result.RowsAffected == 0 as "budget deleted", which is incorrect on SQLite/MySQL
where UPDATE that doesn't change values yields 0 affected rows; remove the logic
that appends IDs to budgetsToDelete based solely on RowsAffected, and instead
only log the zero-rows-affected case for observability (or perform a
dialect-aware existence check if needed later), so the direct UPDATE remains to
avoid deadlocks but in-memory budgets are not removed when usage was unchanged.

}
return nil
Expand Down