TQ: Implement prepare and commit for initial config #8682
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Initial configurations can be prepared and committed with the implemented handlers. This is tested along with aborts at Nexus for when the coordinator for the initial configuration has crashed in a new property based test.
The new property based test runs all possible nodes in the universe as the system under test (SUT), rather than running only the coordinator. This allows a full deterministic simulation of the protocol and checking of invariants at all nodes. It's also easier to write and understand as we don't have to capture and mock replies to the coordinator. I had always intended to write this test, but started with modelling the coordinator first since I thought it would be easier to incrementally build the protocol that way. However, it appears just as easy to incrementally build with all nodes as the SUT.
The new test does not have a model of the system, which is exceedingly hard to do for such a protocol. Instead the test checks invariants of the real state of the SUT after every action, and allows peppering in postconditions as necessary for each action or operation.
The Node API has also changed to not worry about time at all, and instead deals in terms of connections and disconnections. This makes for simpler code IMO, and matches what was done for LRTQ. We always are operating over sprockets streams, which run over TLS over TCP and so it makes little sense to model things as if arbitrary packets can get dropped and reordered.
As a result of the new proptest and the change in time usage, I've decided to drop the coordinator test altogether. It's too complicated for its value add and urgency is a priority.