test: wait for new trigger for up to 1s while mining in maturity window #6921
base: develop
Conversation
✅ No Merge Conflicts Detected: This PR currently has no conflicts with other open PRs.
Greptile Overview
Greptile Summary
This PR fixes a race condition in the feature_governance.py functional test where mining blocks too quickly during the superblock maturity window caused multiple masternodes to create competing governance triggers simultaneously. The competing triggers voted against each other, preventing any trigger from achieving a positive AbsoluteYesCount and causing intermittent test failures. The fix splits the superblock mining loop into two phases: batch-mine through the immaturity window, then mine block-by-block through the maturity window while polling for a valid trigger with a 1-second timeout. The do_assert=False flag allows mining to continue if no consensus emerges within 1s, preventing false test failures while still ensuring a valid trigger exists before the final superblock is mined.
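For orientation, here is a condensed sketch of the refactored flow, pieced together from the diff shown further down; the surrounding method context and exact line placement are omitted, and `n`, `sb_maturity_window`, `sb_block_height`, and `have_trigger_for_height` refer to names already used by the test:

```python
# Phase 1: batch-mine through the immaturity window; no trigger is expected yet.
self.bump_mocktime(n)
self.generate(self.nodes[0], n, sync_fun=self.sync_blocks())

# Phase 2: mine block-by-block through the maturity window, giving masternodes
# up to 1 second per block to converge on a single winning trigger.
for _ in range(sb_maturity_window - 1):
    self.bump_mocktime(1)
    self.generate(self.nodes[0], 1, sync_fun=self.sync_blocks())
    # do_assert=False: a timeout here is not a test failure, just move on to the next block.
    self.wait_until(lambda: have_trigger_for_height(self.nodes, sb_block_height),
                    timeout=1, do_assert=False)

# A valid trigger must exist before the superblock itself is mined.
self.wait_until(lambda: have_trigger_for_height(self.nodes, sb_block_height))
```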
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| test/functional/feature_governance.py | 5/5 | Refactored superblock mining loop to wait up to 1s for trigger consensus on each maturity window block, fixing race condition |
Confidence score: 5/5
- This PR is safe to merge with minimal risk; it's a test-only change that doesn't affect production code
- Score reflects that the fix correctly addresses the identified race condition using appropriate test framework primitives (`wait_until` with timeout/`do_assert`), and the logic aligns with the governance protocol's two-phase trigger creation window
- No files require special attention; the single-file change is straightforward and well-motivated by the PR description
Sequence Diagram
```mermaid
sequenceDiagram
participant Test as Test Framework
participant Node0 as Node[0]
participant Nodes as Other Nodes
participant Governance as Governance System
Test->>Node0: bump_mocktime(1)
Test->>Node0: generate(1 block)
Node0->>Governance: Mine block in maturity window
Governance->>Governance: Process trigger creation
Test->>Nodes: wait_until(have_trigger_for_height, timeout=1s)
alt Trigger found within 1s
Nodes-->>Test: Trigger exists
Test->>Test: Continue immediately
else No trigger after 1s
Nodes-->>Test: Timeout (no trigger)
Test->>Test: Move to next block
end
Test->>Node0: bump_mocktime(1)
Test->>Node0: generate(1 block - superblock)
Node0->>Governance: Mine superblock with trigger
    Test->>Node0: Check superblock payments
```
Context used:
- Context from dashboard - CLAUDE.md (source)

1 file reviewed, 1 comment
```python
for _ in range(sb_maturity_window - 1):
    self.bump_mocktime(1)
    self.generate(self.nodes[0], 1, sync_fun=self.sync_blocks())
    self.wait_until(lambda: have_trigger_for_height(self.nodes, sb_block_height), timeout=1, do_assert=False)
```
style: lambda captures sb_block_height from the loop iteration. Python's late binding means this works correctly only because wait_until is evaluated before the loop variable changes. Consider using a default argument if this pattern is reused elsewhere. Is there a risk that the lambda will capture the wrong sb_block_height if wait_until is deferred or if the loop iteration completes before evaluation?
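For context, a minimal standalone illustration of the late-binding behaviour being discussed (the names here are hypothetical, not from the test):

```python
# Late binding: each lambda looks up `i` when it is *called*, so all of them
# see the loop's final value if the call happens after the loop finishes.
deferred = [lambda: i for i in range(3)]
print([f() for f in deferred])   # [2, 2, 2]

# Binding via a default argument captures the value at definition time instead.
bound = [lambda i=i: i for i in range(3)]
print([f() for f in bound])      # [0, 1, 2]
```

In the test, the lambda is invoked synchronously by `wait_until` within the same iteration, so it reads the intended `sb_block_height` either way; the default-argument form mainly silences the linter and guards against any future deferred use.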
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough
The test file modifies governance synchronization to use immaturity and maturity windows instead of the full superblock cycle for block advancement. It replaces open-ended advancement with exact-step sequences: advance by `n = sb_immaturity_window` blocks, then step through `sb_maturity_window - 1` blocks with height-based waits for a trigger at `sb_block_height`.

Sequence Diagram(s)
```mermaid
sequenceDiagram
participant Test as Test Logic
participant Chain as Blockchain
participant Governance as Governance State
rect rgba(230,126,34,0.08)
Note over Test,Chain: Old Flow (conceptual)
Test->>Chain: Loop until next Superblock (sb_cycle)
Chain->>Governance: Update governance state per block
end
rect rgba(52,152,219,0.08)
Note over Test,Governance: New Flow (windowed)
Test->>Test: Compute n = sb_immaturity_window (assert n > 0)
Test->>Chain: For i in 1..n: advance time & produce block
Chain->>Governance: Update state per block
Note over Test,Governance: Enter maturity-window phase
Test->>Chain: For j in 1..(sb_maturity_window - 1): advance time & produce block
Chain->>Governance: Update state per block
Test->>Governance: Wait for trigger at sb_block_height (have_trigger_for_height)
end
rect rgba(46,204,113,0.06)
Note over Test: Outer loop refactor
Test->>Chain: Repeat: advance sb_immaturity_window blocks
Test->>Chain: Then iterate sb_maturity_window - 1 with height-based waits
Chain->>Governance: Trigger/vote sync occurs within windows
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 0
🧹 Nitpick comments (3)
test/functional/feature_governance.py (3)
352-359: Make the `n` computation robust; avoid the brittle assert. If we're already at the start of the maturity window, `n` becomes 0 and the assert would fail. Use modulo to safely handle all phases and drop the assert.
```diff
- n = sb_immaturity_window - block_count % sb_cycle
- assert n > 0
+ # Distance to the next maturity-window start (0..sb_immaturity_window)
+ n = (sb_immaturity_window - (block_count % sb_cycle)) % sb_cycle
+ # If n == 0 we're already at the start; just proceed.
```
360-367: LGTM: per-block 1s wait reduces flakiness in the maturity window. Matches the PR goal: mine, briefly wait for a winner; if none, continue. Consider extracting this repeated pattern into a tiny helper to DRY the later loop (a sketch of such a helper follows these comments).
375-383: Bind the loop variable in lambdas to silence Ruff B023 and avoid late-binding pitfalls. Capture `sb_block_height` via a default argument so each wait uses the intended height.
```diff
- self.wait_until(lambda: have_trigger_for_height(self.nodes, sb_block_height), timeout=1, do_assert=False)
+ self.wait_until(lambda h=sb_block_height: have_trigger_for_height(self.nodes, h), timeout=1, do_assert=False)
@@
- self.wait_until(lambda: have_trigger_for_height(self.nodes, sb_block_height))
+ self.wait_until(lambda h=sb_block_height: have_trigger_for_height(self.nodes, h))
```

Based on static analysis hints.
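A possible shape for the helper mentioned above (purely illustrative; the name `mine_one_and_wait_for_trigger` and its exact signature are assumptions, not part of the PR):

```python
def mine_one_and_wait_for_trigger(self, sb_block_height, timeout=1, do_assert=False):
    """Mine a single block, then give nodes up to `timeout` seconds to agree
    on a trigger for `sb_block_height`, optionally tolerating a timeout."""
    self.bump_mocktime(1)
    self.generate(self.nodes[0], 1, sync_fun=self.sync_blocks())
    # Bind the height via a default argument so the lambda is safe even if
    # this helper is ever called with changing heights inside a loop.
    self.wait_until(lambda h=sb_block_height: have_trigger_for_height(self.nodes, h),
                    timeout=timeout, do_assert=do_assert)
```

Inside the helper the lambda captures a function parameter rather than a loop variable, which also addresses the B023 warning at the call sites.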
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
test/functional/feature_governance.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
test/functional/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Functional tests should be written in Python and placed in test/functional/
Files:
test/functional/feature_governance.py
🧬 Code graph analysis (1)
test/functional/feature_governance.py (2)
src/node/interfaces.cpp (1)
- info (162-189)
test/functional/test_framework/test_framework.py (4)
- bump_mocktime (880-897)
- generate (806-809)
- sync_blocks (826-846)
- wait_until (907-908)
🪛 Ruff (0.14.1)
test/functional/feature_governance.py
382-382: Function definition does not bind loop variable sb_block_height
(B023)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: linux64_fuzz-build / Build source
- GitHub Check: linux64-build / Build source
- GitHub Check: arm-linux-build / Build source
- GitHub Check: mac-build / Build source
- GitHub Check: Lint / Run linters
ACK f56ce76
This PR fixed this failure for me:
```
2025-11-06T16:31:54.221000Z TestFramework (INFO): Should see same YES and NO vote count for both triggers on all nodes now
...
AssertionError: Predicate ''''
self.wait_until(lambda: node.gobject("list", "valid", "triggers")[isolated_trigger_hash]['YesCount'] == 1, timeout=5)
''' not true after 5 seconds
```
Also I haven't noticed any significant performance drop, consider my patch
| self.log.info("Move remaining n blocks until the next Superblock") | ||
| for _ in range(n - 1): | ||
| self.log.info("Move remaining n blocks until the next maturity window") | ||
| for _ in range(n): |
It seems that, outside the maturity window, it doesn't matter whether blocks are generated one-by-one or in a batch.
Consider db94922
It makes the functional test about 6 seconds faster on average (~10%).
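Roughly the shape of the change being proposed (a sketch only; db94922 itself is not shown in this thread, so the exact batched form is an assumption):

```python
# Before: advance through the immaturity window one block at a time.
for _ in range(n):
    self.bump_mocktime(1)
    self.generate(self.nodes[0], 1, sync_fun=self.sync_blocks())

# After: outside the maturity window pacing doesn't matter, so mine the
# whole batch with a single generate call.
self.bump_mocktime(n)
self.generate(self.nodes[0], n, sync_fun=self.sync_blocks())
```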
Applied. I remember that generating many blocks at once caused some node sync issues back then but maybe it's no longer an issue, let's see.
> Applied. I remember that generating many blocks at once caused some node sync issues back then but maybe it's no longer an issue, let's see.
It still happens sometimes in other functional tests, but in all cases that I have seen the batch size needs to be more than 50 blocks to create a problematic situation. Here the batches are never bigger than 20.
re-ACK 6a13041
Issue being fixed or feature implemented
Mining too fast can result in creation of multiple triggers, none of which is going to win - too many "no" votes from competing mns for `AbsoluteYesCount` to go above 0.

What was done?
Wait for new trigger for up to 1s while mining in maturity window. If no winning trigger was found in 1s we move to the next block. When there is a winning trigger, `wait_until` is basically instant.
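For intuition, a minimal stand-in for the polling behaviour described above (an illustrative sketch, not the actual `wait_until` from the test framework, which takes additional parameters):

```python
import time

def wait_until_sketch(predicate, timeout=1.0, do_assert=True, poll_interval=0.05):
    """Poll `predicate` until it returns True or `timeout` expires.

    With do_assert=False a timeout is not an error: the caller simply moves on
    (here: mines the next block). If the predicate is already True, the call
    returns on the first poll, i.e. "basically instant".
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if predicate():
            return True
        time.sleep(poll_interval)
    if do_assert:
        raise AssertionError("Predicate not true within timeout")
    return False
```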
How Has This Been Tested?
Run multiple `feature_governance.py` jobs in parallel.
- develop: some jobs fail on `wait_until(lambda: have_trigger_for_height(...))`
- this PR: should be little to no difference in test duration compared to develop, with no failures on `wait_until(lambda: have_trigger_for_height(...))`

Breaking Changes
Checklist: