Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potentially corrupt connectivity after linear connectivity changes #654

Open
antiguru opened this issue Mar 19, 2025 · 2 comments
Open

Potentially corrupt connectivity after linear connectivity changes #654

antiguru opened this issue Mar 19, 2025 · 2 comments

Comments

@antiguru
Copy link
Member

When pointing Materialize and Differential at Timely with #651, I observe crashes in validate_progress:

Increment at (1742294913660, PointStamp { vector: [1] }), not supported by
        consumed: [ChangeBatch { updates: [], clean: 0 }, ChangeBatch { updates: [], clean: 0 }]
        internal: MutableAntichain { updates: ChangeBatch { updates: [((1742294913660, PointStamp { vector: [2] }), 1), ((1742390934535, PointStamp { vector: [] }), 1)], clean: 2 }, frontier: [(1742294913660, PointStamp { vector: [2] }), (1742390934535, PointStamp { vector: [] })], changes: ChangeBatch { updates: [], clean: 0 } }
2025-03-19T13:28:55.277366Z  thread 'timely:work-0' panicked at /home/moritz/.cargo/git/checkouts/timely-dataflow-dbb5eb8e114d5497/d0ea86f/timely/src/progress/subgraph.rs:790:25:
Progress error; internal "BuildingObject(System(535))"
   7: rust_begin_unwind
             at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/std/src/panicking.rs:692:5
   8: core::panicking::panic_fmt
             at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/core/src/panicking.rs:75:14
   9: <timely::progress::subgraph::PerOperatorState<timely::order::product::Product<mz_repr::timestamp::Timestamp, differential_dataflow::dynamic::pointstamp::PointStamp<u64>>>>::validate_progress
             at /home/moritz/.cargo/git/checkouts/timely-dataflow-dbb5eb8e114d5497/d0ea86f/timely/src/progress/subgraph.rs:790:25
  10: <timely::progress::subgraph::Subgraph<mz_repr::timestamp::Timestamp, timely::order::product::Product<mz_repr::timestamp::Timestamp, differential_dataflow::dynamic::pointstamp::PointStamp<u64>>>>::activate_child
             at /home/moritz/.cargo/git/checkouts/timely-dataflow-dbb5eb8e114d5497/d0ea86f/timely/src/progress/subgraph.rs:373:17
  11: <timely::progress::subgraph::Subgraph<mz_repr::timestamp::Timestamp, timely::order::product::Product<mz_repr::timestamp::Timestamp, differential_dataflow::dynamic::pointstamp::PointStamp<u64>>> as timely::scheduling::Schedule>::schedule
             at /home/moritz/.cargo/git/checkouts/timely-dataflow-dbb5eb8e114d5497/d0ea86f/timely/src/progress/subgraph.rs:319:17
  12: <timely::progress::subgraph::PerOperatorState<mz_repr::timestamp::Timestamp>>::schedule
             at /home/moritz/.cargo/git/checkouts/timely-dataflow-dbb5eb8e114d5497/d0ea86f/timely/src/progress/subgraph.rs:712:30
  13: <timely::progress::subgraph::Subgraph<(), mz_repr::timestamp::Timestamp>>::activate_child
             at /home/moritz/.cargo/git/checkouts/timely-dataflow-dbb5eb8e114d5497/d0ea86f/timely/src/progress/subgraph.rs:353:26
  14: <timely::progress::subgraph::Subgraph<(), mz_repr::timestamp::Timestamp> as timely::scheduling::Schedule>::schedule
             at /home/moritz/.cargo/git/checkouts/timely-dataflow-dbb5eb8e114d5497/d0ea86f/timely/src/progress/subgraph.rs:319:17
  15: <timely::worker::Wrapper>::step::{closure#0}

Repro: MaterializeInc/materialize#31947

Differential changes: TimelyDataflow/differential-dataflow#586

@frankmcsherry
Copy link
Member

Seems reasonable. Probably the right next step is to get a minimized repro (or .. just eat this one) and put a timely in place that runs both approaches in tandem and reports what's different. Probably would have been a better initial timely PR too (oops). Typing that out, it seems like a timely next step to "roll back" the PR, or at least to undo the deletion of the existing stuff and instead pair it up with the new stuff, makes the most sense.

@frankmcsherry
Copy link
Member

Talking out a plan: #651 converted Vec<Vec<Antichain<..>>> into Vec<BTreeMap<usize, Antichain<..>>>. The only real change in here, intended change at least, was to convert Vec<..> into BTreeMap<usize, ..>. These two structures should be almost exactly aligned, with the caveat that whenever the former has an empty antichain, that entry does not need to exist in the latter.

The PR also introduces names, Connectivity and PortConnectivity for the Vec<Vec<..>> and Vec<..> types being replaced. The former is a typedef, and while the latter is as well, it could become a type with either/both implementations. This was annoying in the moment, because one needs to implement indexing, iterator collection, stuff like that. But .. seems like a good idea to do now, in part because we can double check the surface area and be sure we aren't accidentally calling .len() or something that exists for both types but with different meanings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants