Skip to content

Conversation

@insipx
Copy link
Contributor

@insipx insipx commented Nov 13, 2025

Introduce a backoff-based network dependency resolver and update query paths to use Topic types across QueryEnvelope in network_backoff.rs and related modules

Add NetworkBackoffResolver with exponential backoff, change ResolveDependencies to return Resolved, add merge_least to VectorClock/GlobalCursor, and switch QueryEnvelope topics to Topic values with byte encoding at call time.

📍Where to Start

Start with the ResolveDependencies trait changes in dependency_resolution.rs, then review the resolve implementation in network_backoff.rs and the ordering updates in order.rs.


📊 Macroscope summarized 930d4c8. 10 files reviewed, 7 issues evaluated, 6 issues filtered, 1 comment posted

🗂️ Filtered Issues

xmtp_api_d14n/src/endpoints/d14n/query_envelopes.rs — 0 comments posted, 1 evaluated, 1 filtered
  • line 149: The test constructs an invalid topic via TopicKind::GroupMessagesV1.create(vec![]) (empty bytes) and then unconditionally unwraps the gRPC result with api::ignore(endpoint).query(&client).await.unwrap(). Prior behavior explicitly matched an error containing "invalid topic"; the new assertion of success will panic when the server correctly rejects the invalid topic. This is a regression making the test fail/panic under realistic conditions. Fix by either supplying a valid, non-empty topic or restoring explicit error assertion/handling. [ Test / Mock code ]
xmtp_api_d14n/src/protocol/impls/vector_clock.rs — 0 comments posted, 1 evaluated, 1 filtered
  • line 35: merge_least incorrectly initializes missing entries with 0 via self.entry(node).or_insert(0) and then applies min (*entry = (*entry).min(seq)), which forces any absent node to become 0 regardless of other's seq. This silently loses information and can reduce existing clocks to 0, violating the intended "take the lowest between self and other" semantics and creating spurious 0 entries. It should insert seq when the key is absent (e.g., or_insert(seq) or conditionally apply min only if an entry exists), matching GlobalCursor::apply_least. [ Already posted ]
xmtp_api_d14n/src/protocol/order.rs — 0 comments posted, 1 evaluated, 1 filtered
  • line 84: In Ordered::order(), when handling partially unresolved dependencies, the code tries to remove entries from missing by checking if unresolved.contains(&m.cursor()?) and then retaining elements based on that. unresolved is built from MissingEnvelope cursors (the unresolved dependency cursors), while m iterates over the envelopes that were missing dependencies (type T). Comparing the dependent envelope’s cursor() to the dependency cursors is mismatched and will typically never match, so unresolved dependents are not removed from missing. This breaks the intended behavior to drop envelopes whose dependencies couldn’t be resolved, potentially causing repeated re-insertion and churn. [ Already posted ]
xmtp_proto/src/types/cursor.rs — 0 comments posted, 2 evaluated, 2 filtered
  • line 70: The fmt::Display implementation changed the external string format from "[sid[{}]:oid[{}]]" to "[sid({}):oid({})]". If any downstream logic parses or pattern-matches the Cursor display output (e.g., logs ingested by tooling, UIs, tests, or protocol text), this change can cause runtime misparsing or failures. The code provides no guard, migration, or dual-format support, so consumers expecting the old format may break at runtime. [ Already posted ]
  • line 111: Documentation warns: comparing cursors is unsafe/undefined behavior if originator ids are not equal. However, the test_ordering test asserts ordering across different originators (e.g., Cursor::new(1, 1u32) < Cursor::new(1, 2u32) is expected true). This contradicts the stated contract and enshrines cross-originator comparisons, creating uncertainty about intended semantics. If production code relies on Ord/PartialOrd across originators, runtime behavior may be semantically invalid per the doc note; if comparisons should be restricted, tests should reject cross-originator ordering. [ Out of scope ]
xmtp_proto/src/types/global_cursor.rs — 0 comments posted, 1 evaluated, 1 filtered
  • line 46: GlobalCursor::apply_least initializes a missing entry with cursor.sequence_id via or_insert(cursor.sequence_id), then applies min only when the key already exists. This violates the visible invariant that a missing originator is treated as 0 (see get() returning 0 for absent keys) and creates an asymmetry with least-merge semantics. For a missing originator, the least value between self (treated as 0) and cursor.sequence_id should be 0, but the current code inserts cursor.sequence_id. This can incorrectly raise the clock for new originators during a "least" application and diverges from the VectorClock::merge_least behavior that uses or_insert(0) before min. Fix: change or_insert(cursor.sequence_id) to or_insert(0) so the min is applied consistently for both present and absent keys. [ Already posted ]

@insipx insipx requested a review from a team as a code owner November 13, 2025 20:01
This was referenced Nov 13, 2025
@insipx
Copy link
Contributor Author

insipx commented Nov 13, 2025

@claude
Copy link

claude bot commented Nov 13, 2025

Claude finished @insipx's task —— View job


PR Review - November 14, 2025 14:43 UTC

  • Read and analyze core implementation files
  • Review code quality and identify issues
  • Post review feedback

Code Review Feedback

Critical Issue: Potential Infinite Loop in network_backoff.rs

Location: xmtp_api_d14n/src/protocol/resolve/network_backoff.rs:81

The retry logic may not properly handle cases where the network repeatedly returns the same incomplete set of envelopes. If got contains some but not all required cursors, and the network keeps returning the same partial results, missing.retain() will keep the unresolved items indefinitely while attempts increments. Eventually the backoff will timeout, but this could waste considerable time and resources.

Recommendation: Consider adding detection for when consecutive queries return identical results (no progress), and exit early with an unresolved error rather than continuing to retry.


Semantic Issue: apply_least in global_cursor.rs

Location: xmtp_proto/src/types/global_cursor.rs:39-44

The apply_least function's documentation states it "takes the lowest value of SequenceId between Self and Cursor", but when the originator_id doesn't exist in self, it inserts cursor.sequence_id directly via or_insert(). This breaks the "least" semantics since the implicit value for a missing entry should be 0 (as seen in the get() method at line 48-49 which returns 0 for missing entries).

Recommendation: Change line 44 to .or_insert(0) to maintain consistent "least" semantics. This aligns with the bot's suggestion.


Low Test Coverage

Location: Multiple files, particularly network_backoff.rs

Codecov reports only 6.06% patch coverage with 62 uncovered lines. The new NetworkBackoffResolver has zero test coverage despite implementing complex retry logic with exponential backoff and cursor merging. The order.rs file at line 106 even has a TODO: tests comment.

Recommendation: Add unit tests covering:

  • Successful resolution after retries
  • Timeout/max attempts behavior
  • LCC computation with multiple topics
  • Edge cases (empty missing set, all unresolvable, partial resolution)

Code Quality: Inconsistent Error Handling

Location: xmtp_api_d14n/src/protocol/resolve/network_backoff.rs:75

The API error is wrapped in Box<dyn Error> via .map_err(|e| ResolutionError::Api(Box::new(e))), but the ResolutionError::Api variant already takes a Box<dyn Error>, so this creates an unnecessary allocation. The code works but adds overhead.

Recommendation: If the error type from query() already implements Error, consider defining a more specific error variant or using anyhow::Error to avoid double-boxing.


Performance: Redundant Cursor Collection

Location: xmtp_api_d14n/src/protocol/resolve/network_backoff.rs:77-80

The code collects envelope cursors into a HashSet<Cursor> with Result error handling, but then immediately uses it only for contains() checks. This creates intermediate allocations and error propagation overhead.

Recommendation: Consider whether cursor extraction can fail in practice. If not, use unwrap() or expect() with a clear message. If errors are expected, handle them more gracefully (skip invalid envelopes rather than failing the entire resolution).


Summary

The implementation introduces a solid foundation for network-based dependency resolution with backoff. The primary concerns are:

  1. Potential infinite retry loops when the network provides partial results
  2. Semantic bug in apply_least (already identified by bot)
  3. Severely lacking test coverage (6%)

The retain predicate bug in order.rs has already been correctly identified and acknowledged by the author.


@insipx insipx force-pushed the insipx/network-backoff branch from 3e28f7c to 817c4af Compare November 13, 2025 20:07
@insipx insipx force-pushed the insipx/network-backoff branch from 817c4af to 0503a91 Compare November 13, 2025 20:08
@insipx insipx force-pushed the insipx/network-backoff branch from 0503a91 to e0f1dbf Compare November 13, 2025 20:10
@insipx insipx force-pushed the insipx/network-backoff branch 4 times, most recently from 4949287 to e79eff2 Compare November 13, 2025 21:11
@insipx insipx force-pushed the insipx/resolution branch 2 times, most recently from f908274 to e787b05 Compare November 13, 2025 21:13
@insipx insipx force-pushed the insipx/network-backoff branch from e79eff2 to 36331a4 Compare November 13, 2025 21:13
@insipx insipx force-pushed the insipx/network-backoff branch from 36331a4 to c8e94be Compare November 13, 2025 21:20
@insipx insipx force-pushed the insipx/network-backoff branch from c8e94be to 94609e2 Compare November 13, 2025 21:28
@codecov
Copy link

codecov bot commented Nov 13, 2025

Codecov Report

❌ Patch coverage is 6.06061% with 62 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.85%. Comparing base (f77c10d) to head (91edc55).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...p_api_d14n/src/protocol/resolve/network_backoff.rs 0.00% 30 Missing ⚠️
xmtp_api_d14n/src/protocol/order.rs 0.00% 9 Missing ⚠️
..._d14n/src/protocol/traits/dependency_resolution.rs 0.00% 7 Missing ⚠️
xmtp_proto/src/types/global_cursor.rs 0.00% 7 Missing ⚠️
xmtp_api_d14n/src/protocol/impls/vector_clock.rs 0.00% 6 Missing ⚠️
...tp_api_d14n/src/protocol/types/missing_envelope.rs 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2775      +/-   ##
==========================================
- Coverage   75.00%   74.85%   -0.16%     
==========================================
  Files         376      377       +1     
  Lines       48451    48498      +47     
==========================================
- Hits        36340    36302      -38     
- Misses      12111    12196      +85     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@insipx insipx force-pushed the insipx/network-backoff branch from 94609e2 to 7e16ea2 Compare November 13, 2025 21:37
@insipx insipx force-pushed the insipx/network-backoff branch from 7e16ea2 to d15896d Compare November 13, 2025 22:20
@insipx insipx force-pushed the insipx/network-backoff branch 3 times, most recently from 960a3a9 to 2dc468b Compare November 13, 2025 22:26
@insipx insipx force-pushed the insipx/network-backoff branch from 2dc468b to 41862ac Compare November 13, 2025 22:28
insipx added a commit that referenced this pull request Nov 14, 2025
Try to order envelopes by their dependencies and resolve any that are
missing. if resolution fails, drop the missing envelopes and continue
with the query.



the ordering extension is modified again in #2775 to remove envelopes
from `missing`. i may backport those changes to this pr for consistency
Base automatically changed from insipx/resolution to main November 14, 2025 14:38
@insipx insipx force-pushed the insipx/network-backoff branch from 41862ac to 930d4c8 Compare November 14, 2025 14:39
@insipx insipx force-pushed the insipx/network-backoff branch from 930d4c8 to 91edc55 Compare November 14, 2025 14:41
@insipx insipx enabled auto-merge (squash) November 14, 2025 14:49
@insipx insipx merged commit 2b575f6 into main Nov 14, 2025
19 of 20 checks passed
@insipx insipx deleted the insipx/network-backoff branch November 14, 2025 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants