network backoff resolution strategy #2775
This PR is part of a stack of 11 bookmarks (created with jj-stack).
Claude finished @insipx's task. PR Review - November 14, 2025 14:43 UTC
Code Review Feedback
Critical Issue: Potential Infinite Loop in network_backoff.rs
Location:
The retry logic may not properly handle cases where the network repeatedly returns the same incomplete set of envelopes. If
Recommendation: Consider adding detection for when consecutive queries return identical results (no progress), and exit early with an unresolved error rather than continuing to retry.
Semantic Issue:
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2775      +/-   ##
==========================================
- Coverage   75.00%   74.85%   -0.16%
==========================================
  Files         376      377       +1
  Lines       48451    48498      +47
==========================================
- Hits        36340    36302      -38
- Misses      12111    12196      +85
Try to order envelopes by their dependencies and resolve any that are missing. If resolution fails, drop the missing envelopes and continue with the query. The ordering extension is modified again in #2775 to remove envelopes from `missing`. I may backport those changes to this PR for consistency.
Introduce a backoff-based network dependency resolver and update query paths to use `Topic` types across `QueryEnvelope` in network_backoff.rs and related modules
Add `NetworkBackoffResolver` with exponential backoff, change `ResolveDependencies` to return `Resolved`, add `merge_least` to `VectorClock`/`GlobalCursor`, and switch `QueryEnvelope` topics to `Topic` values with byte encoding at call time.
📍 Where to Start
Start with the `ResolveDependencies` trait changes in dependency_resolution.rs, then review the `resolve` implementation in network_backoff.rs and the ordering updates in order.rs.
📊 Macroscope summarized 930d4c8. 10 files reviewed, 7 issues evaluated, 6 issues filtered, 1 comment posted
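To make the retry behavior discussed in this PR concrete, here is a minimal sketch of a retry loop with capped exponential backoff and the no-progress early exit that the review recommends. It is not the PR's `NetworkBackoffResolver`: the names `Outcome`, `backoff_delay`, `resolve_with_backoff`, and `MAX_STALLS`, the use of `u64` ids in place of cursors, and the injected `fetch` and `sleep` callbacks are all assumptions made for illustration.

```rust
use std::collections::BTreeSet;
use std::time::Duration;

/// Hypothetical outcome type; the PR's real `Resolved` type may look different.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Every missing dependency was fetched.
    Resolved,
    /// Gave up; these ids are still missing.
    Unresolved(BTreeSet<u64>),
}

/// Assumed limit: stop after this many consecutive attempts that fetch nothing new.
const MAX_STALLS: u32 = 2;

/// Delay before the retry that follows attempt `attempt` (0-based):
/// `base * 2^attempt`, capped at `max`.
pub fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Retry `fetch` until nothing is missing, the attempt budget runs out,
/// or `MAX_STALLS` consecutive attempts make no progress.
/// `fetch` receives the ids still missing and returns the ids it found.
/// `sleep` is injected so callers can supply an async or test-friendly wait.
pub fn resolve_with_backoff<F, S>(
    mut missing: BTreeSet<u64>,
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut fetch: F,
    mut sleep: S,
) -> Outcome
where
    F: FnMut(&BTreeSet<u64>) -> BTreeSet<u64>,
    S: FnMut(Duration),
{
    let mut stalls: u32 = 0;
    for attempt in 0..max_attempts {
        if missing.is_empty() {
            break;
        }
        let found = fetch(&missing);
        let before = missing.len();
        missing.retain(|id| !found.contains(id));
        if missing.is_empty() {
            break;
        }
        // Count consecutive attempts that resolved nothing new.
        stalls = if missing.len() == before { stalls + 1 } else { 0 };
        if stalls >= MAX_STALLS {
            // The network keeps returning the same incomplete set: exit early.
            break;
        }
        sleep(backoff_delay(base, max, attempt));
    }
    if missing.is_empty() {
        Outcome::Resolved
    } else {
        Outcome::Unresolved(missing)
    }
}
```

Returning the unresolved set instead of looping lets the caller decide whether to drop those envelopes or surface an error. The stall limit is a design choice: a limit of 1 exits fastest but gives a slow network no second chance.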
🗂️ Filtered Issues
xmtp_api_d14n/src/endpoints/d14n/query_envelopes.rs: 0 comments posted, 1 evaluated, 1 filtered
`TopicKind::GroupMessagesV1.create(vec![])` (empty bytes) and then unconditionally unwraps the gRPC result with `api::ignore(endpoint).query(&client).await.unwrap()`. Prior behavior explicitly matched an error containing "invalid topic"; the new assertion of success will panic when the server correctly rejects the invalid topic. This is a regression making the test fail/panic under realistic conditions. Fix by either supplying a valid, non-empty topic or restoring explicit error assertion/handling. [ Test / Mock code ]
xmtp_api_d14n/src/protocol/impls/vector_clock.rs: 0 comments posted, 1 evaluated, 1 filtered
`merge_least` incorrectly initializes missing entries with `0` via `self.entry(node).or_insert(0)` and then applies `min` (`*entry = (*entry).min(seq)`), which forces any absent node to become `0` regardless of `other`'s `seq`. This silently loses information and can reduce existing clocks to `0`, violating the intended "take the lowest between self and other" semantics and creating spurious `0` entries. It should insert `seq` when the key is absent (e.g., `or_insert(seq)` or conditionally apply `min` only if an entry exists), matching `GlobalCursor::apply_least`. [ Already posted ]
xmtp_api_d14n/src/protocol/order.rs: 0 comments posted, 1 evaluated, 1 filtered
In `Ordered::order()`, when handling partially unresolved dependencies, the code tries to remove entries from `missing` by checking `if unresolved.contains(&m.cursor()?)` and then retaining elements based on that. `unresolved` is built from `MissingEnvelope` cursors (the unresolved dependency cursors), while `m` iterates over the envelopes that were missing dependencies (type `T`). Comparing the dependent envelope's `cursor()` to the dependency cursors is mismatched and will typically never match, so unresolved dependents are not removed from `missing`. This breaks the intended behavior to drop envelopes whose dependencies couldn't be resolved, potentially causing repeated re-insertion and churn. [ Already posted ]
xmtp_proto/src/types/cursor.rs: 0 comments posted, 2 evaluated, 2 filtered
The `fmt::Display` implementation changed the external string format from `"[sid[{}]:oid[{}]]"` to `"[sid({}):oid({})]"`. If any downstream logic parses or pattern-matches the `Cursor` display output (e.g., logs ingested by tooling, UIs, tests, or protocol text), this change can cause runtime misparsing or failures. The code provides no guard, migration, or dual-format support, so consumers expecting the old format may break at runtime. [ Already posted ]
A doc note states that comparing cursors is unsafe/undefined behavior if originator ids are not equal. However, the `test_ordering` test asserts ordering across different originators (e.g., `Cursor::new(1, 1u32) < Cursor::new(1, 2u32)` is expected `true`). This contradicts the stated contract and enshrines cross-originator comparisons, creating uncertainty about intended semantics. If production code relies on `Ord`/`PartialOrd` across originators, runtime behavior may be semantically invalid per the doc note; if comparisons should be restricted, tests should reject cross-originator ordering. [ Out of scope ]
xmtp_proto/src/types/global_cursor.rs: 0 comments posted, 1 evaluated, 1 filtered
`GlobalCursor::apply_least` initializes a missing entry with `cursor.sequence_id` via `or_insert(cursor.sequence_id)`, then applies `min` only when the key already exists. This violates the visible invariant that a missing originator is treated as `0` (see `get()` returning `0` for absent keys) and creates an asymmetry with least-merge semantics. For a missing originator, the least value between `self` (treated as `0`) and `cursor.sequence_id` should be `0`, but the current code inserts `cursor.sequence_id`. This can incorrectly raise the clock for new originators during a "least" application and diverges from the `VectorClock::merge_least` behavior that uses `or_insert(0)` before `min`. Fix: change `or_insert(cursor.sequence_id)` to `or_insert(0)` so the min is applied consistently for both present and absent keys. [ Already posted ]
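The two filtered comments on `merge_least` and `apply_least` pull in opposite directions, so it may help to see both candidate semantics side by side. The sketch below is an illustration only: `Clock` is a hypothetical stand-in for the PR's `VectorClock`/`GlobalCursor` (a plain `HashMap` with assumed `u32` node ids and `u64` sequence ids), and both function names are invented here.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the PR's clock types: node/originator id -> sequence id.
/// The key and value widths are assumptions.
pub type Clock = HashMap<u32, u64>;

/// Variant A: an absent key counts as 0, so a node known only to `other`
/// always ends up at 0 (the `or_insert(0)` then `min` behavior).
pub fn merge_least_absent_is_zero(this: &mut Clock, other: &Clock) {
    for (&node, &seq) in other {
        let entry = this.entry(node).or_insert(0);
        *entry = (*entry).min(seq);
    }
}

/// Variant B: an absent key adopts the other side's value; `min` is applied
/// only to keys that both clocks already have (the `or_insert(seq)` behavior).
pub fn merge_least_absent_adopts(this: &mut Clock, other: &Clock) {
    for (&node, &seq) in other {
        this.entry(node)
            .and_modify(|e| *e = (*e).min(seq))
            .or_insert(seq);
    }
}
```

For a clock `{1: 5}` merged with `{1: 3, 2: 7}`, variant A yields `{1: 3, 2: 0}` and variant B yields `{1: 3, 2: 7}`. Variant A matches the invariant cited for `get()` (absent means `0`), while variant B avoids introducing `0` entries; whichever is intended, the two types presumably should agree.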