Add StaticRound to eliminate some boilerplate when writing protocols #117

Open
wants to merge 29 commits into
base: master
from

Conversation

fjarri
Member

@fjarri fjarri commented Jun 7, 2025

Main changes:

  • Round now uses associated types for messages, payloads, and artifacts. Fixes Typed Round trait #65. Most dynamic parts (BoxedFormat, ProtocolMessagePart, serialization/deserialization of messages etc) are hidden from the user.
  • Evidence verification methods are moved from Protocol to corresponding Round implementors. This allows one to keep the related logic (receive_message() and evidence verification) together.
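As a rough illustration of the two main changes, here is a minimal sketch of a round trait with associated types for the message, payload, and artifact, with the evidence check living next to `receive_message()`. The trait `TypedRound`, its method signatures, and the "even values only" rule are assumptions made up for this example; they are not the actual manul API.

```rust
/// Hypothetical, simplified round trait: message, payload, and artifact
/// types are associated types rather than type-erased boxes.
pub trait TypedRound {
    type Message;
    type Payload;
    type Artifact;

    /// Creates a message for `destination` plus data kept locally.
    fn make_message(&self, destination: u8) -> (Self::Message, Self::Artifact);

    /// Validates an incoming message and extracts a payload from it.
    fn receive_message(&self, from: u8, message: Self::Message) -> Result<Self::Payload, String>;

    /// Re-checks a message that was reported as invalid. Returns `true` if
    /// the message really violates the protocol (the evidence holds).
    fn evidence_holds(message: &Self::Message) -> bool;
}

pub struct Round1 {
    pub my_id: u8,
}

#[derive(Debug, PartialEq)]
pub struct Round1Message(pub u32);

#[derive(Debug, PartialEq)]
pub struct Round1Payload(pub u32);

impl Round1 {
    /// The (made-up) protocol rule shared by receiving and evidence checking.
    fn is_valid(message: &Round1Message) -> bool {
        message.0 % 2 == 0
    }
}

impl TypedRound for Round1 {
    type Message = Round1Message;
    type Payload = Round1Payload;
    type Artifact = ();

    fn make_message(&self, destination: u8) -> (Round1Message, ()) {
        let value = 2 * (u32::from(self.my_id) + u32::from(destination));
        (Round1Message(value), ())
    }

    fn receive_message(&self, from: u8, message: Round1Message) -> Result<Round1Payload, String> {
        if Self::is_valid(&message) {
            Ok(Round1Payload(message.0 / 2))
        } else {
            Err(format!("party {from} sent an odd value"))
        }
    }

    fn evidence_holds(message: &Round1Message) -> bool {
        // Same rule as in `receive_message`, kept in the same impl block.
        !Self::is_valid(message)
    }
}
```

Because both checks sit in one `impl`, a change to the validity rule only has to be made in one place.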

Corollary changes:

For reviewers

  • misbehave was renamed to extend (with the corresponding renames for types and traits), because it seemed to me like a better wording, but it is still intended for the same purpose: misbehavior tests. Should I bring back the original name?
  • extend is not really a combinator, it's more of a dynamic override API. I am not sure how useful it is outside tests. Should it be moved to dev?
  • BoxedRound is the only "boxed" part of the API visible to the user. Is there a better name for it? (E.g. RoundInfo is also "boxed", but doesn't have a Boxed prefix.) I don't think it can be hidden entirely as long as we allow finalizing into several different rounds.

@dvdplm
Collaborator

dvdplm commented Jun 9, 2025

I have skimmed the code, so this is not a review.

Thoughts:

  • manul is likely never going to be a performance bottleneck
  • …so we can "afford" to prioritize readability, ergonomics, correctness and maintainability.
  • I think we should make opinionated choices, aiming at giving users one correct way to do what they need to do.

Given the above I think we should provide either dynamic or static rounds, but not both. When we have a concrete use case that requires both dynamic and static we can re-assess.

Is switching to only static rounds possible? It'd be interesting to see how synedrion would look with only static rounds and if there'd be less boilerplate.

@fjarri
Member Author

fjarri commented Jun 9, 2025

> Is switching to only static rounds possible?

For protocols themselves, yes, but evidence verification is not limited to one round and needs access to the message types from the previous rounds - so the user would have to manually transform them from some untyped form to typed messages.

That said, deserialization (and the handling of its errors) could be automated by making every Round type responsible for deserializing its messages and putting them in a Box<>, which the user would later downcast to a specific type. A failed downcast would be reported as a LocalError.

This would need some kind of "routing" of round number -> boxed round type (not the round object, since we don't need the round state for that). This may also be used to simplify evidence checking for invalid messages (#82).
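The idea could be sketched roughly as follows, using only the standard library. Everything here is an assumption for illustration: the byte-level "deserializers", the `deserializer_for` routing function, and the `LocalError` struct are stand-ins, not the types from the crate.

```rust
use std::any::Any;

/// Stand-in for the crate's local error type (hypothetical definition).
#[derive(Debug, PartialEq)]
pub struct LocalError(pub String);

#[derive(Debug, PartialEq)]
pub struct Round1Message(pub u8);

#[derive(Debug, PartialEq)]
pub struct Round2Message(pub u16);

type BoxedMessage = Box<dyn Any>;
type Deserializer = fn(&[u8]) -> Result<BoxedMessage, String>;

/// Each round type knows how to deserialize its own message.
fn deserialize_round1(bytes: &[u8]) -> Result<BoxedMessage, String> {
    match bytes {
        [value] => Ok(Box::new(Round1Message(*value))),
        _ => Err("round 1 message must be exactly 1 byte".to_string()),
    }
}

fn deserialize_round2(bytes: &[u8]) -> Result<BoxedMessage, String> {
    match bytes {
        [hi, lo] => Ok(Box::new(Round2Message(u16::from_be_bytes([*hi, *lo])))),
        _ => Err("round 2 message must be exactly 2 bytes".to_string()),
    }
}

/// The "routing": round number -> deserializer. No round state is needed.
pub fn deserializer_for(round_num: u8) -> Option<Deserializer> {
    match round_num {
        1 => Some(deserialize_round1 as Deserializer),
        2 => Some(deserialize_round2 as Deserializer),
        _ => None,
    }
}

/// What evidence verification would call to get a typed message back.
/// A type mismatch is a bug on the caller's side, hence `LocalError`.
pub fn downcast_message<T: 'static>(boxed: BoxedMessage) -> Result<T, LocalError> {
    boxed
        .downcast::<T>()
        .map(|typed| *typed)
        .map_err(|_| LocalError("unexpected message type".to_string()))
}
```

With this split, a deserialization failure can be attributed to the sender, while a failed downcast points at a mistake in the local code.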

@fjarri fjarri force-pushed the static-round branch 2 times, most recently from 8de19fb to 0ef0c98 Compare June 10, 2025 21:35
@coveralls

coveralls commented Jun 10, 2025

Pull Request Test Coverage Report for Build 16578768783

Details

  • 653 of 1298 (50.31%) changed or added relevant lines in 21 files are covered.
  • 76 unchanged lines in 11 files lost coverage.
  • Overall coverage decreased (-5.2%) to 66.91%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| manul/src/dev/run_sync.rs | 6 | 8 | 75.0% |
| manul/src/session/transcript.rs | 0 | 2 | 0.0% |
| manul/src/session/message.rs | 2 | 8 | 25.0% |
| manul/src/session/wire_format.rs | 5 | 11 | 45.45% |
| manul/src/protocol/dyn_evidence.rs | 28 | 35 | 80.0% |
| manul/src/protocol/errors.rs | 0 | 8 | 0.0% |
| manul/src/protocol/rng.rs | 0 | 12 | 0.0% |
| manul/src/protocol/round_id.rs | 22 | 34 | 64.71% |
| manul/src/session/session.rs | 27 | 39 | 69.23% |
| manul/src/protocol/message.rs | 8 | 29 | 27.59% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| manul/src/dev/tokio.rs | 1 | 96.67% |
| manul/src/session/transcript.rs | 1 | 59.56% |
| manul/src/session/echo.rs | 2 | 51.63% |
| manul/src/session/message.rs | 2 | 89.78% |
| manul/src/dev/run_sync.rs | 4 | 83.43% |
| manul/src/protocol/message.rs | 5 | 61.62% |
| manul/src/protocol/round_id.rs | 6 | 60.0% |
| manul/src/combinators/chain.rs | 7 | 48.52% |
| manul/src/session/session.rs | 9 | 79.66% |
| manul/src/session/tokio.rs | 16 | 81.03% |

| Totals | |
|---|---|
| Change from base Build 15695642792 | -5.2% |
| Covered Lines | 2291 |
| Relevant Lines | 3424 |

💛 - Coveralls

@fjarri
Member Author

fjarri commented Jun 11, 2025

Current roadblock: handling protocols where some nodes do not send or do not receive messages (e.g. KeyResharing). This means that the type of the round alone is not enough to determine whether a message is expected; the round's state is needed as well.

In the n-of-n case the static rounds eliminate the need for verify_*_is_invalid() methods, and also allow one to split evidence checking into per-round methods, making them much easier to maintain.
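To make the roadblock concrete, here is a small sketch of a resharing-style round in which the same round type gives different answers depending on the node's state. The struct `ResharingRound` and its `sends_to` / `expects_from` methods are hypothetical names chosen for this example only.

```rust
use std::collections::BTreeSet;

/// Hypothetical resharing round: old holders send, new holders receive.
pub struct ResharingRound {
    pub my_id: u8,
    pub old_holders: BTreeSet<u8>,
    pub new_holders: BTreeSet<u8>,
}

impl ResharingRound {
    /// Whether this node sends a message to `destination` in this round.
    pub fn sends_to(&self, destination: u8) -> bool {
        self.old_holders.contains(&self.my_id) && self.new_holders.contains(&destination)
    }

    /// Whether this node expects a message from `sender` in this round.
    pub fn expects_from(&self, sender: u8) -> bool {
        self.new_holders.contains(&self.my_id) && self.old_holders.contains(&sender)
    }
}
```

Two nodes running the very same `ResharingRound` type can therefore disagree on whether a message is expected, which is why a check based purely on the round's type cannot decide it.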

@fjarri fjarri force-pushed the static-round branch 4 times, most recently from 2b67199 to 49bb410 Compare June 29, 2025 23:25
@fjarri fjarri force-pushed the static-round branch 4 times, most recently from 175e8e4 to 733fd6b Compare July 15, 2025 00:29
@fjarri fjarri marked this pull request as ready for review July 18, 2025 18:55
Collaborator

@dvdplm dvdplm left a comment

Another partial review.

Ok(())
fn round_info(round_id: &RoundId) -> Option<RoundInfo<DinerId, Self>> {
match round_id {
_ if round_id == 1 => Some(RoundInfo::new::<Round1>()),
Collaborator

_ if round_id == 1 is a bit awkward looking, and I think the PartialEq impl between RoundId and RoundNum actually makes it a bit worse. I tinkered a bit and came up with a few alternatives.

  1. More explicit, without any changes to other code:
    fn round_info(round_id: &RoundId) -> Option<RoundInfo<DinerId, Self>> {
        if round_id == 1 {
            Some(RoundInfo::new::<Round1>())
        } else if round_id == 2 {
            Some(RoundInfo::new::<Round2>())
        } else {
            None
        }
    }
  2. Instead of using the PartialEq impl, add a round() method to RoundId to make it clear what we're matching on:
    fn round_info(round_id: &RoundId) -> Option<RoundInfo<DinerId, Self>> {
        match round_id.round() {
            1 => Some(RoundInfo::new::<Round1>()),
            2 => Some(RoundInfo::new::<Round2>()),
            _ => None,
        }
    }
  3. Same as 2 but adding a to_info() method on the Round trait so we can emphasize the round itself:
    fn round_info(round_id: &RoundId) -> Option<RoundInfo<DinerId, Self>> {
        match round_id.round() {
            1 => Some(Round1::to_info()),
            2 => Some(Round2::to_info()),
            _ => None,
        }
    }

Adding the to_info method to the Round trait requires adding + Sized to the bounds, which might be problematic; otherwise, I think 3) is the version I prefer. Regardless of which version you go with, I think the PartialEq impl between RoundNum and RoundId is "too clever" and forces the reader to go deeper into the internals than is actually useful.

Member Author

Ideally, the comparison part would be fixed by #120. I like the to_info() approach, but it's an orthogonal issue.

Member Author

Options 2 and 3 implicitly drop the groups from round ID, which can lead to silent errors. Yes, technically at that point round IDs should not have groups, but if there's some bug in the code, they might.
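The concern can be illustrated with a toy round ID. The `RoundId` below is a simplified stand-in whose fields and methods are assumptions (the real type is different); `ungrouped_round()` is a hypothetical accessor showing one way to keep the `match`-friendly form without dropping groups silently.

```rust
/// Simplified stand-in for a round ID that may be nested in groups.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RoundId {
    round_num: u8,
    groups: Vec<u8>,
}

impl RoundId {
    pub fn new(round_num: u8) -> Self {
        Self { round_num, groups: Vec::new() }
    }

    pub fn grouped(round_num: u8, group: u8) -> Self {
        Self { round_num, groups: vec![group] }
    }

    /// Returns the round number and silently ignores any groups.
    pub fn round(&self) -> u8 {
        self.round_num
    }

    /// Returns the round number only if the ID has no groups.
    pub fn ungrouped_round(&self) -> Option<u8> {
        if self.groups.is_empty() {
            Some(self.round_num)
        } else {
            None
        }
    }
}

/// Comparison with a bare number succeeds only for ungrouped IDs.
impl PartialEq<u8> for RoundId {
    fn eq(&self, other: &u8) -> bool {
        self.groups.is_empty() && self.round_num == *other
    }
}
```

With this model, a grouped ID that reaches `round_info()` by mistake would match round 1 through `round()`, but would fall through to `None` with either the `PartialEq` comparison or `ungrouped_round()`.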

Member Author

to_info() would require a method with a default implementation in the Round trait, which I would really like to make non-overridable, but that involves some sealed trait magic.
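One commonly used way to get a method that implementors cannot override, without a sealed supertrait, is to move it into a separate trait with a blanket impl. The sketch below assumes a toy `Round` trait and a toy `RoundInfo`; `RoundExt` is a hypothetical name, and this is not necessarily how the PR would do it.

```rust
/// Toy stand-in for the type-level description of a round.
#[derive(Debug)]
pub struct RoundInfo {
    pub type_name: &'static str,
}

/// Toy stand-in for the user-implemented round trait.
pub trait Round: 'static {
    fn description(&self) -> String;
}

/// Carries `to_info()`. Users never implement this trait themselves.
pub trait RoundExt: Round + Sized {
    fn to_info() -> RoundInfo;
}

/// The blanket impl covers every `Round`, so a manual
/// `impl RoundExt for MyRound` would be rejected as a conflicting impl.
impl<R: Round> RoundExt for R {
    fn to_info() -> RoundInfo {
        RoundInfo { type_name: std::any::type_name::<R>() }
    }
}

pub struct Round1;
pub struct Round2;

impl Round for Round1 {
    fn description(&self) -> String {
        "first round".to_string()
    }
}

impl Round for Round2 {
    fn description(&self) -> String {
        "second round".to_string()
    }
}
```

The `Sized` requirement lands on `RoundExt` rather than on `Round` itself, which may sidestep the bound mentioned in the review comment above, though whether that fits the crate's use of trait objects would need checking.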

@@ -217,6 +200,14 @@ impl Round<DinerId> for Round1 {

impl Round<DinerId> for Round2 {
type Protocol = DiningCryptographersProtocol;
type ProtocolError = NoProtocolErrors<Self>;
Collaborator

Naming nitpick: the associated type is in the singular, so it's a bit odd to assign a value whose name is plural. OTOH calling it "NoProtocolError" isn't great either. Not sure there is a great solution to be found here (idiomatically it should really be () but that isn't possible as we saw above).

Maybe DummyProtocolError?

Collaborator

A random thought I had:
I wonder how far we'd be able to get if we changed the Round trait to make the type ProtocolError take an impl core::error::Error. Maybe then we could impl Round with () as the error type when there are no errors, but then we'd need some clever trick to transform or cast the impl core::error::Error into an actual ProtocolError with all the methods and trait bounds we need.

I read through rust-lang/rust#99301 which seems to be about ways to access data from nested errors in a generic way. Seems a bit stuck though.

Member Author

Maybe we can instead rename the associated type to ProtocolErrors?

Collaborator

But wouldn't the plural on a type name imply that it is a collection of types of protocol errors?

Collaborator

I tried to impl ProtocolError for core::convert::Infallible (described as "The error type for errors that can never happen." in the docs) and it kinda works, if it weren't for the ser/deser bounds on ProtocolError. That's sort of what I was trying to hint at by saying "it'd be so nice if we could use an impl core::error::Error and then – handwaves – transform/cast to concrete error types": we could have fewer bounds.

Anyway, all of this is nitpicking. The code is fine as-is, modulo perhaps the name.
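For context, a locally defined uninhabited type can satisfy whatever bounds an error trait demands, because every method body is an empty `match`. The sketch below is an assumption about how a type in the spirit of `NoProtocolErrors<R>` could be built; the `Encode` trait is a made-up stand-in for the serialization bounds, and the `ProtocolError` trait here is a toy, not the crate's.

```rust
use std::convert::Infallible;
use std::fmt::Debug;
use std::marker::PhantomData;

/// Made-up stand-in for the serialization bounds on the error trait.
pub trait Encode {
    fn encode(&self) -> Vec<u8>;
}

/// Toy version of the error trait, with an extra bound to satisfy.
pub trait ProtocolError: Debug + Encode {
    fn description(&self) -> String;
}

/// Uninhabited: holds an `Infallible`, so no value can ever be built.
#[derive(Debug)]
pub struct NoProtocolErrors<R>(Infallible, PhantomData<R>);

impl<R> Encode for NoProtocolErrors<R> {
    fn encode(&self) -> Vec<u8> {
        // No value exists, so there is nothing to encode.
        match self.0 {}
    }
}

impl<R: Debug> ProtocolError for NoProtocolErrors<R> {
    fn description(&self) -> String {
        match self.0 {}
    }
}

/// The error branch is statically impossible, so this cannot panic.
pub fn unwrap_infallible<T, R>(result: Result<T, NoProtocolErrors<R>>) -> T {
    match result {
        Ok(value) => value,
        Err(error) => match error.0 {},
    }
}

#[derive(Debug)]
pub struct Round1;
```

This does not resolve the naming question, but it shows why a dedicated "no errors" type works where `()` does not: the type parameter ties it to a round, and all trait obligations are discharged vacuously.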

Collaborator

> Maybe we can instead rename the associated type to ProtocolErrors?

This is better.

I still think it's awkward that there isn't a better way to do this but that has nothing to do with this PR.

};

/// An extension to a round, allowing one to extend or override its methods.
pub trait Extension<Id>: 'static + Debug + Send + Sync + Clone {
Collaborator

While "Extension" works it is a bit too generic imo. It is specific to a round so what do you think about RoundExt? That would be in line with other extension traits (https://rust-lang.github.io/rfcs/0445-extension-trait-conventions.html#the-convention).

Member Author

I am not sure the quoted doc applies here, since it's not an extension of the Round trait, but rather an override. But I have no problem with RoundExt per se.

Collaborator

Hmm, a tricky one; you're right that it's not technically an extension trait, but maybe RoundExt is still a better name?

Collaborator

Looking at this again I still think Extension is too generic and that we need to have "round" in the name somewhere.

  • RoundExt is my preference
  • RoundOverride is explicit and clearly states intent. It also jibes well with a struct OverriddenRound to replace ExtendedRound
  • RoundExtension is also a possibility I guess, but it's longer and could be misunderstood into making the round itself longer in some way.

/// A wrapper for a protocol's [`EntryPoint`], allowing registering [`Extension`] implementors
/// to extend or override [`Round`] methods.
#[derive(Debug)]
pub struct Extendable<Id: PartyId, EP: EntryPoint<Id>> {
Collaborator

Given this is pub I think we need a less generic name. Maybe even the full ExtendableEntryPoint. I have a slight preference for Wrapped over Extendable though.

Collaborator

Looking at this again I think we should avoid naming structs with adjectives, given the (weakly enforced) convention is to use adjectives for traits.

My suggestion here is WrappedEntryPoint.

messages: EvidenceMessages<Id, Self::Round>,
) -> std::result::Result<(), EvidenceError> {
let _message: Round1Message = messages.direct_message()?;
// Message contents would be checked here
Collaborator

Suggested change
- // Message contents would be checked here
+ // Message content would be checked here

Given this is an example and is supposed to be educational, it'd be good to elaborate on what checks one would do here, e.g. "check signatures, size limits, session IDs" (or whatever).


let message = Round1Broadcast {
type Payload = Round1Payload;
type Artifact = ();
Collaborator

Despite the other comment I still struggle with this.

If I understand things correctly this should be read as "Round 1 sends a direct message but does not use an artifact". In this sense, it is perfectly idiomatic to use ().

When the new manul user later discovers that there exists a NoArtifact type, I think there'll be confusion: which should I use?

The rule is:

  • If the round uses direct messages but no artifact, use () here.
  • If the round does not use direct messages, use NoArtifact here.

If the user makes a mistake, it will be impossible to impl make_direct_message but I don't think we do enough to help them figure out that they need to come here and set the associated type properly.

There's some kind of cognitive mismatch here that I can't quite put my finger on. The artifact is not merely a piece of "Associated data created alongside a message in [Self::make_direct_message]". It is actually doing two jobs, the other being to act like a "switch" to make it possible/impossible to send direct messages.

The old signature of make_direct_message was Result<(DirectMessage, Option<Artifact>), LocalError> and I think that the Option<Artifact> made the intention more clear.

I'd have to dig in deeper to know what the consequences of removing this associated type would be, but I think this is a wart.

The Round docs hint at this with "Set to [NoArtifact] if [Self::DirectMessage] is [NoMessage].", but they don't give a deeper understanding and do not spell out when to use () and when to use NoArtifact. I think that, at the very least, we need to explain this properly in both docs and examples.
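The rule could be illustrated along these lines. This is only a sketch: the trait is a toy with a default `make_direct_message()`, and `NoMessage` / `NoArtifact` are modelled as empty enums, which is an assumption about their definition rather than a copy of the crate's code.

```rust
/// Assumed model: uninhabited marker types, so no value can be created.
#[derive(Debug, PartialEq)]
pub enum NoMessage {}

#[derive(Debug, PartialEq)]
pub enum NoArtifact {}

/// Toy round trait with only the parts relevant to direct messages.
pub trait Round {
    type DirectMessage;
    type Artifact;

    /// Default for rounds that do not send direct messages.
    fn make_direct_message(&self) -> Result<(Self::DirectMessage, Self::Artifact), String> {
        Err("this round does not send direct messages".to_string())
    }
}

/// Sends a direct message but keeps no artifact: `Artifact = ()`.
pub struct SendingRound;

impl Round for SendingRound {
    type DirectMessage = u32;
    type Artifact = ();

    fn make_direct_message(&self) -> Result<(u32, ()), String> {
        Ok((5, ()))
    }
}

/// Sends no direct message at all: `Artifact = NoArtifact`.
/// An `Ok((message, artifact))` cannot even be constructed here.
pub struct SilentRound;

impl Round for SilentRound {
    type DirectMessage = NoMessage;
    type Artifact = NoArtifact;
}
```

In this model `()` means "there is a message, but nothing to remember about it", while `NoArtifact` means "there is no message to attach anything to", which is the distinction the docs would need to spell out.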

Member Author

The aim is to do something like

trait Round {
    type DirectMessage: (MessageType, ArtifactType);
}

That is, the user needs to either specify both of these types, or neither. This cannot be expressed directly in current Rust, but it is kind of possible with some additional constructs, like a trait that provides an API for "splitting" a value into a message and an artifact. The problem is that this trait would have to be implemented for every direct message.
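A minimal sketch of such a "splitting" trait might look like this. All names (`DirectMessageParts`, `split`, `Round1Direct`) are hypothetical, and only the case of a round that does send direct messages is shown.

```rust
/// Hypothetical trait tying a message type and an artifact type together.
pub trait DirectMessageParts {
    type Message;
    type Artifact;

    /// Splits the combined value into what is sent and what is kept.
    fn split(self) -> (Self::Message, Self::Artifact);
}

/// Toy round trait: one associated type now implies both parts.
pub trait Round {
    type DirectMessage: DirectMessageParts;

    fn make_direct_message(&self) -> Self::DirectMessage;
}

#[derive(Debug, PartialEq)]
pub struct Round1Message(pub u32);

#[derive(Debug, PartialEq)]
pub struct Round1Artifact(pub u32);

/// The combined value produced by the round before splitting.
pub struct Round1Direct {
    public_part: u32,
    secret_part: u32,
}

// This impl is the per-message boilerplate mentioned above.
impl DirectMessageParts for Round1Direct {
    type Message = Round1Message;
    type Artifact = Round1Artifact;

    fn split(self) -> (Round1Message, Round1Artifact) {
        (Round1Message(self.public_part), Round1Artifact(self.secret_part))
    }
}

pub struct Round1 {
    pub secret: u32,
}

impl Round for Round1 {
    type DirectMessage = Round1Direct;

    fn make_direct_message(&self) -> Round1Direct {
        Round1Direct { public_part: self.secret + 1, secret_part: self.secret }
    }
}
```

The user can no longer specify a message type without an artifact type, at the cost of one extra `impl` per direct message.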

> When the new manul user later discovers that there exists a NoArtifact type, I think there'll be confusion: which should I use?

There may be confusion, but at least NoArtifact is impossible to instantiate, so it can't be used.

> The old signature of make_direct_message was Result<(DirectMessage, Option<Artifact>), LocalError>

The problem with it is that it implies that some calls may return an artifact and some not, which is not really what we want.

Ok(Payload::new(Round1Payload))
message: ProtocolMessage<Id, Self>,
) -> Result<Self::Payload, ReceiveError<Id, Self>> {
assert!(message.direct_message.0 == do_work(self.round_counter + 2));
Collaborator

If this were a debug_assert, do you think it would be triggered in CI when we run the benchmarks once? If yes, then we could switch to that and avoid the work here.
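As background for the question: `debug_assert!` only evaluates its condition when debug assertions are enabled, and Cargo's default `release` and `bench` profiles turn them off, so the check would be skipped in a benchmark run unless the project's profile overrides that setting. A small sketch follows; the body of `do_work` is made up (only its name appears in the diff above), and `receive` and `assertions_enabled` are hypothetical helpers.

```rust
/// Made-up stand-in for the benchmark's workload.
pub fn do_work(seed: u64) -> u64 {
    seed.wrapping_mul(6364136223846793005).wrapping_add(1)
}

/// Checks the incoming value only when debug assertions are enabled.
/// With them disabled, `do_work` is not called here at all.
pub fn receive(value: u64, round_counter: u64) -> u64 {
    debug_assert!(value == do_work(round_counter + 2));
    value
}

/// Reports whether this build would actually run the check.
pub fn assertions_enabled() -> bool {
    cfg!(debug_assertions)
}
```

Calling `assertions_enabled()` from the benchmark binary is a quick way to confirm which behavior a given CI job gets.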

Comment on lines +171 to +172
// TODO (#4): we reuse `EchoBroadcast::none()` (that means `NoMessage` in the typed round)
// to have a second meaning, the node not sending messages at all.
Collaborator

Is this TODO still valid?

use crate::protocol::{DeserializationError, LocalError};
use crate::protocol::LocalError;

/// An error that can be returned during deserialization error.
Collaborator

Suggested change
- /// An error that can be returned during deserialization error.
+ /// An error that can be returned during deserialization.

Comment on lines +96 to +98
// TODO (#4): this branch is unreachable in the absense of bugs in the code
// (the method will not be called in the first place if the node does not send messages).
// Can it be eliminated using the type system?
Collaborator

If we're convinced about this, let's use unreachable!("This node does not send messages in this round. This is a bug in manul") or something.

let verified = self
.normal_broadcast
.clone()
.verify::<SP>(verifier)
Collaborator

Same comment as I made elsewhere: I find it a bit odd that a verify method consumes self. It should be a read-only operation.

Maybe we can inline the code from verify and end up with something like

        let digest = self.normal_broadcast.message_with_metadata.digest::<SP>()?;
        let signature = self.normal_broadcast.signature.deserialize::<SP>()?;
        verifier
            .verify_digest(digest, &signature)
            .map_err(MessageVerificationError::into_evidence_error)?;

        let deserialized = self
            .normal_broadcast
            .payload()
            .deserialize::<EchoRoundMessage<SP::Verifier>>(format)
            .map_err(|error| {
                EvidenceError::InvalidEvidence(format!("Failed to deserialize normal broadcast: {error:?}"))
            })?;
        self.error
            .verify_evidence::<SP>(self.normal_broadcast.metadata(), &deserialized)

(broken code, just to illustrate)
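Independently of the crate's types, the general shape of a read-only `verify` can be sketched as below. `SignedMessage` and its one-byte checksum are toy stand-ins invented for this example; a real signature check would go where the checksum comparison is.

```rust
/// Toy "signed" message: the checksum plays the role of a signature.
pub struct SignedMessage {
    pub payload: Vec<u8>,
    pub checksum: u8,
}

fn compute_checksum(bytes: &[u8]) -> u8 {
    bytes.iter().fold(0u8, |acc, byte| acc.wrapping_add(*byte))
}

impl SignedMessage {
    pub fn new(payload: Vec<u8>) -> Self {
        let checksum = compute_checksum(&payload);
        Self { payload, checksum }
    }

    /// Borrows `self`, so the caller needs no `.clone()` beforehand and
    /// can keep using the message afterwards.
    pub fn verify(&self) -> Result<&[u8], String> {
        if compute_checksum(&self.payload) == self.checksum {
            Ok(&self.payload)
        } else {
            Err("checksum mismatch".to_string())
        }
    }
}
```

A consuming `verify(self)` that returns a distinct "verified" type has its own merits (it prevents using an unverified message by accident), so this is a trade-off rather than a clear win.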

artifacts: BTreeMap<Id, Self::Artifact>,
) -> Result<FinalizeOutcome<Id, Self::Protocol>, LocalError>;
}

Collaborator

Round now has awesome docs, much appreciated.

3 participants