Skip to content

Prevent DEVICE_OVERLOAD during legacy sync and make remove spentExtraRecs field in favor of more robust checks#528

Open
oskarszoon wants to merge 7 commits intobsv-blockchain:mainfrom
oskarszoon:bugfix/legacy-unspend-context
Open

Prevent DEVICE_OVERLOAD during legacy sync and make remove spentExtraRecs field in favor of more robust checks#528
oskarszoon wants to merge 7 commits intobsv-blockchain:mainfrom
oskarszoon:bugfix/legacy-unspend-context

Conversation

@oskarszoon
Copy link
Contributor

No description provided.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 25, 2026

🤖 Claude Code Review

Status: Complete


Current Review:

This PR removes the spentExtraRecs counter mechanism and replaces it with direct child record verification, along with reducing spendBatcherConcurrency to prevent DEVICE_OVERLOAD errors during legacy sync.

Architecture Change:
The PR replaces an incremental counter (spentExtraRecs) with explicit batch verification (verifyAllChildrenSpent). For multi-record transactions, when all local UTXOs are spent, the system now reads all child records to confirm they are fully spent before setting DeleteAtHeight (DAH). This approach is more robust against interrupted rollbacks but adds read overhead for large transactions.

Key Changes:

  • Removed: spentExtraRecs bin, IncrementSpentRecordsMulti(), incrementBatcher
  • Added: verifyAllChildrenSpent() batch read verification
  • Modified: Lua setDeleteAtHeight() now signals Go to verify children instead of checking counter
  • Rollback safety: Uses context.Background() for unspend operations to ensure rollback completes even if parent context is cancelled
  • Performance: Reduced spendBatcherConcurrency from 32 to 4 to prevent overwhelming Aerospike during sync

Trade-offs:

  • Pro: Eliminates drift bugs from cancelled contexts during spend→unspend→respend cycles
  • Pro: Simpler state model (no counter synchronization)
  • Con: Every ALLSPENT transition triggers N child reads (where N = number of child records)
  • Con: Higher read load for transactions with many outputs (>UtxoBatchSize)

Analysis:

No critical issues found. The implementation appears sound with proper error handling and context management. The trade-off of read overhead for correctness is appropriate given the production failures this fixes.


Note: Previous inline comments about handle_block.go and context removal in unspendLua are no longer applicable to this version of the PR.

…unspend->respend actions with contexts being cancelled from causing drift in this incremented value
@oskarszoon oskarszoon changed the title Prevent DEVICE_OVERLOAD during legacy sync and make sure unspend completes Prevent DEVICE_OVERLOAD during legacy sync and make remove spentExtraRecs field in favor of more robust checks Feb 27, 2026
g, gCtx := errgroup.WithContext(context.Background()) // we don't want the tracing to be linked to these calls
util.SafeSetLimit(g, spendBatcherSize*spendBatcherConcurrency) // we limit the number of concurrent requests, to not overload Aerospike
g, gCtx := errgroup.WithContext(context.Background()) // we don't want the tracing to be linked to these calls
util.SafeSetLimit(g, spendBatcherSize*2) // keep enough goroutines in flight to feed the batcher without overwhelming Aerospike
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this more dynamic? Why not just reduce the spendBatcherConcurrency?

g, gCtx := errgroup.WithContext(context.Background()) // we don't want the tracing to be linked to these calls
util.SafeSetLimit(g, spendBatcherSize*spendBatcherConcurrency) // we limit the number of concurrent requests, to not overload Aerospike
g, gCtx := errgroup.WithContext(context.Background()) // we don't want the tracing to be linked to these calls
util.SafeSetLimit(g, spendBatcherSize*2) // keep enough goroutines in flight to feed the batcher without overwhelming Aerospike
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this more dynamic? Why not just reduce the spendBatcherConcurrency?

// - prometheusUtxoMapReset: Successful unspends
// - prometheusUtxoMapErrors: Failed operations
func (s *Store) unspendLua(ctx context.Context, spend *utxo.Spend) error {
func (s *Store) unspendLua(spend *utxo.Spend) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't remove the context from the function call, even if it is not being used now, convention. Should add a listener that stops execution when the context cancels.

@sonarqubecloud
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
50.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants