Port the synchronous event serialization core to Rust #19837

Open. erikjohnston wants to merge 28 commits into develop from erikj/rust_event_serialization

Conversation

erikjohnston (Member) commented Jun 9, 2026

Summary

This moves the synchronous core of client event serialization out of synapse/events/utils.py and into Rust (rust/src/events/serialize.rs).

Event serialization is on the hot path for /sync, /messages, and most client-facing endpoints. Previously it was a recursive pure-Python routine (_serialize_event / _inject_bundled_aggregations / only_fields) that interleaved CPU-bound formatting with async DB/IO. This PR separates those two concerns and moves the CPU-bound half to Rust:

  • Async prep stays in Python. EventClientSerializer._prepare_serialization does all DB/IO up front: batch-fetching redaction events and running the registered module unsigned-callback hooks, for the top-level events and every bundled sub-event (edits and thread latest events, which are themselves serialized). The admin/MSC4354 config is resolved once via _update_config, rather than re-checked on every recursive call as the old code did.
  • The synchronous core moves to Rust. Given an event plus the pre-fetched redaction_map, unsigned_additions, and bundle_aggregations, the Rust code produces the client JSON entirely in Rust — including the v1/v2 format transforms, only_event_fields filtering, redaction handling, and recursive bundled aggregations.
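The split described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the PR's code: `FAKE_DB`, `prepare`, `serialize_core`, and `serialize_all` are hypothetical names, the event dicts are simplified stand-ins, and placing the redaction under `unsigned.redacted_because` ignores the per-room-version placement the real serializer handles.

```python
import asyncio
from typing import Any

# Hypothetical stand-in for the database: redaction event ID -> redaction event.
FAKE_DB = {"$r1": {"event_id": "$r1", "type": "m.room.redaction"}}


async def prepare(events: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Async phase: collect every ID we need and fetch them in one batch."""
    wanted = {e["redacted_by"] for e in events if "redacted_by" in e}
    await asyncio.sleep(0)  # stands in for a single DB round trip
    return {rid: FAKE_DB[rid] for rid in wanted if rid in FAKE_DB}


def serialize_core(
    event: dict[str, Any], redaction_map: dict[str, dict[str, Any]]
) -> dict[str, Any]:
    """Sync phase: pure formatting, no IO, only reads pre-fetched data."""
    out: dict[str, Any] = {"event_id": event["event_id"], "type": event["type"]}
    redaction = redaction_map.get(event.get("redacted_by", ""))
    if redaction is not None:
        # Assumption for illustration: the redaction is attached under unsigned.
        out["unsigned"] = {"redacted_because": redaction}
    return out


async def serialize_all(events: list[dict[str, Any]]) -> list[dict[str, Any]]:
    redaction_map = await prepare(events)  # all awaiting happens here
    return [serialize_core(e, redaction_map) for e in events]


EVENTS = [
    {"event_id": "$a", "type": "m.room.message", "redacted_by": "$r1"},
    {"event_id": "$b", "type": "m.room.message"},
]
RESULT = asyncio.run(serialize_all(EVENTS))
```

Because `serialize_core` never awaits, it is the part that can be moved behind a synchronous FFI boundary without changing the async caller.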

Details

  • The Rust entry point is a single batch function, serialize_events, taking a list of (event, membership) pairs. The three lookup maps are shared across the whole batch, so they're read out of Python and converted to Rust
    structures once per batch rather than once per event. EventClientSerializer.serialize_event (singular) is a thin wrapper that calls it with a one-element list.
  • SerializeEventConfig is now a Rust pyclass, and the old event_format callable is replaced by the EventFormat enum (Raw / ClientV1 / ClientV2 / ClientV2WithoutRoomId). Call sites in rest/admin/events.py and
    rest/client/{notifications,room,sync}.py are updated to pass the enum. make_config_for_admin and MSC4354 enablement now go through SerializeEventConfig.for_admin() / with_msc4354().
  • New accessors on EventInternalMetadata (redacted_by, txn_id, device_id, token_id, delay_id, soft_failed, policy_server_spammy) expose to Rust the fields the serializer reads.
  • The _split_field unit tests move from tests/events/test_utils.py to a Rust test in serialize.rs, since the implementation moved.
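As a rough sketch of the batch entry point and its single-event wrapper, assuming simplified dict events: the names below mirror the description, but the bodies are hypothetical, only one of the three lookup maps is modelled, and putting `membership` under `unsigned` is an assumption made purely for illustration.

```python
from typing import Any, Optional

CONVERSIONS = 0  # counts how often the shared lookup map is converted


def convert_additions(
    unsigned_additions: dict[str, dict[str, Any]]
) -> dict[str, dict[str, Any]]:
    """Stands in for reading a Python mapping into a native structure."""
    global CONVERSIONS
    CONVERSIONS += 1
    return {event_id: dict(extra) for event_id, extra in unsigned_additions.items()}


def serialize_events(
    pairs: list[tuple[dict[str, Any], Optional[str]]],
    unsigned_additions: dict[str, dict[str, Any]],
) -> list[dict[str, Any]]:
    additions = convert_additions(unsigned_additions)  # once per batch
    out = []
    for event, membership in pairs:
        serialized = dict(event)
        unsigned = dict(additions.get(event["event_id"], {}))
        if membership is not None:
            unsigned["membership"] = membership  # illustrative placement only
        if unsigned:
            serialized["unsigned"] = unsigned
        out.append(serialized)
    return out


def serialize_event(
    event: dict[str, Any],
    membership: Optional[str],
    unsigned_additions: dict[str, dict[str, Any]],
) -> dict[str, Any]:
    """Thin wrapper: a batch of one."""
    return serialize_events([(event, membership)], unsigned_additions)[0]


ADDITIONS = {"$a": {"com.example.flag": True}}
EVENT_A = {"event_id": "$a", "type": "m.room.message"}
EVENT_B = {"event_id": "$b", "type": "m.room.message"}
BATCH = serialize_events([(EVENT_A, "join"), (EVENT_B, None)], ADDITIONS)
SINGLE = serialize_event(EVENT_A, "join", ADDITIONS)
```

The counter shows the trade-off: the two-event batch pays for one conversion, while each call through the singular wrapper pays for its own.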

Behaviour

This is intended to be a behaviour-preserving refactor — the Rust core mirrors the previous Python output (field ordering, v1 key promotion, redaction placement per room version, null-redacts handling, transaction-ID gating).

Existing serialization, relations, and sync tests pass unchanged.

This comment was marked as outdated.

erikjohnston force-pushed the erikj/rust_event_serialization branch 4 times, most recently from 7ea20c9 to 96a80d1, on June 16, 2026 09:27
erikjohnston and others added 2 commits June 16, 2026 10:32
Move `BundledAggregations` and `_ThreadAggregation` from the Python
`RelationsHandler` into frozen Rust pyclasses exposed from
`synapse.synapse_rust.events`. They are re-exported from
`synapse.handlers.relations` so existing call sites are unaffected.

The referenced events are stored by value (the `Event` pyclass now
derives a cheap, Arc-sharing `Clone`). The `references` field uses the
`JsonObject` type, which gains a `FromPyObject` impl so it can be built
from any Python mapping in place of a plain dict.

As the pyclass is immutable, `get_bundled_aggregations` is reworked to
collect each kind of aggregation and build one instance per event.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

This is in prep for doing the actual serialization in Rust.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Comment thread rust/src/events/serialize.rs
Comment thread synapse/events/utils.py
erikjohnston and others added 2 commits June 16, 2026 11:20
Move the synchronous core of client event serialization out of
`synapse/events/utils.py` and into `rust/src/events/serialize.rs`. The
Python `EventClientSerializer` now performs all DB/IO (fetching
redactions, running module callbacks, resolving admin/MSC4354 config) up
front in `_prepare_serialization`, then hands the pre-fetched data to the
Rust `serialize_event`, which recurses entirely in Rust. Bundled
aggregations are read directly from the Rust `BundledAggregations`
pyclass.

`SerializeEventConfig` and the `event_format` callable become a Rust
pyclass and the `EventFormat` enum respectively; call sites are updated
to pass the enum.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
erikjohnston force-pushed the erikj/rust_event_serialization branch from 96a80d1 to 7d5dc63 on June 16, 2026 10:20
erikjohnston marked this pull request as ready for review on June 16, 2026 10:23
erikjohnston requested a review from a team as a code owner on June 16, 2026 10:23
Comment thread rust/src/events/formats/mod.rs Outdated
Comment on lines +97 to +104
 /// The `signatures` and `unsigned` fields are kept separate from the other
-/// fields as they are mutable (and must be deep-copied if the event is cloned).
-/// `common_fields` and `specific_fields` are both `#[serde(flatten)]`ed so that
-/// the serialised JSON is a single flat object matching the Matrix spec.
+/// fields as they are mutable. Note the derived [`Clone`] is *shallow*: it
+/// shares the mutable `signatures`/`unsigned`/internal state behind their
+/// `Arc`s (cheap, and fine for read-only uses such as bundled aggregations).
+/// Use [`FormattedEvent::deep_copy`] when an independently-mutable copy is
+/// required. `common_fields` and `specific_fields` are both
+/// `#[serde(flatten)]`ed so that the serialised JSON is a single flat object
+/// matching the Matrix spec.

Contributor

The Clone seems like a new footgun

Member Author

Hmm, yes. Not really sure I see much alternative though.

If this was a pure Rust class we could get rid of the interior mutability and have a top level Arc, however to make Python work we need to be able to get a mutable version of EventInternalMetadata and that requires interior mutability.

Contributor

Why don't we do what was previously suggested: "and must be deep-copied if the event is cloned"?

Member Author

Mmm, suppose we could. Really, we want a reference to the same event here, in that we could also store it as a Py<Event> (I somewhat want to avoid storing python references as then you have to partake in the GC). Event::deep_copy is really only intended for if you want a completely new copy of an event that you can edit (which is very rare, mostly we just share a reference to the same event).

I suppose we could also implement a shallow_clone() method instead, so as to force users to choose?

Member Author

I've added a shallow_copy in 1ace531, though in practice it has highlighted a couple of things:

  1. This means that other things (like ThreadAggregation) have to implement Clone manually.
  2. Actually, cloning isn't as cheap as I thought as it will copy the immutable bits in FormattedEvent that aren't in an Arc.

Member Author

I've changed tack and made the aggregations structs store a Py<Event> for now, and dropped both Clone and shallow_copy.

It's not entirely clear to me if we have to implement __traverse__, since we know there won't be any reference cycles between the relation classes and the Event. Nonetheless I've implemented it to make sure we don't accidentally leak memory.

c.f. https://pyo3.rs/v0.28.3/class/protocols#garbage-collector-integration and https://docs.python.org/3/c-api/gcsupport.html

Contributor

Is FormattedEvent meant to still have Clone? and the docstring is accurate? (hasn't changed)

Feels like using Py<Event> means we're forever tying ourselves to Python when we want this code to stand on its own in a pure Rust codebase.

> It's not entirely clear to me if we have to implement __traverse__, since we know there won't be any reference cycles between the relation classes and the Event. Nonetheless I've implemented it to make sure we don't accidentally leak memory.
>
> c.f. https://pyo3.rs/v0.28.3/class/protocols#garbage-collector-integration and https://docs.python.org/3/c-api/gcsupport.html

To make it more clear, why aren't we also implementing __clear__?

Without __clear__, it seems like the latest_event Event will forever be stuck (can't be GC'ed)

> These methods are part of the C API, PyPy does not necessarily honor them. If you are building for PyPy you should measure memory consumption to make sure you do not have runaway memory growth. See this issue on the PyPy bug tracker.
>
> -- https://pyo3.rs/v0.28.3/class/protocols#garbage-collector-integration

Do we care about PyPy at all?

Member Author

> Is FormattedEvent meant to still have Clone? and the docstring is accurate? (hasn't changed)

Removed the clones.

> Feels like using Py<Event> means we're forever tying ourselves to Python when we want this code to stand on its own in a pure Rust codebase.

I see this as a stepping stone while we figure out what we want to do about Event. I'm somewhat leaning towards having struct Event(Arc<EventInner>), but that is a bigger refactor.

> To make it more clear, why aren't we also implementing __clear__?
>
> Without __clear__, it seems like the latest_event Event will forever be stuck (can't be GC'ed)

Have added a comment to __traverse__, but basically __clear__ is only needed to break reference cycles, which we know we don't have. (This is also the reason I'm not sure we need to implement __traverse__.)

> These methods are part of the C API, PyPy does not necessarily honor them. If you are building for PyPy you should measure memory consumption to make sure you do not have runaway memory growth. See this issue on the PyPy bug tracker.
> -- https://pyo3.rs/v0.28.3/class/protocols#garbage-collector-integration
>
> Do we care about PyPy at all?

We do, though I think we should be fine due to lack of reference cycles.
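The "no cycles, so no __clear__ needed" argument can be checked with a pure-Python analogue. This is a sketch under assumptions: `Event` and `ThreadAggregation` here are plain Python stand-ins rather than the PyO3 classes, and the immediate-free behaviour shown is specific to CPython's reference counting (PyPy and other implementations may free the object later).

```python
import gc
import weakref


class Event:
    """Hypothetical stand-in for the Rust Event pyclass."""


class ThreadAggregation:
    """Holds a strong reference to an event, with no reference back to itself."""

    def __init__(self, latest_event: Event) -> None:
        self.latest_event = latest_event


def freed_without_cycle_collector() -> bool:
    gc.disable()  # rule out the cycle collector entirely
    try:
        event = Event()
        probe = weakref.ref(event)  # lets us observe when the event dies
        aggregation = ThreadAggregation(event)
        del event
        assert probe() is not None  # the aggregation still keeps it alive
        del aggregation  # last owner goes away
        return probe() is None
    finally:
        gc.enable()


FREED = freed_without_cycle_collector()
```

On CPython the event is released as soon as the aggregation is dropped, with the cycle collector switched off. That is the sense in which traversal and clearing only matter when a cycle exists.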

Comment thread rust/src/events/internal_metadata.rs
Comment thread rust/src/events/json_object.rs Outdated
Comment thread rust/src/events/serialize.rs Outdated
Comment thread rust/src/events/serialize.rs
Comment thread rust/src/events/serialize.rs

    result.push(unescape(&field[prev_start..]));
    result
}

Contributor

New logic for split_field(...) compared to previous

def _split_field(field: str) -> list[str]:
    """
    Splits strings on unescaped dots and removes escaping.

    Args:
        field: A string representing a path to a field.

    Returns:
        A list of nested fields to traverse.
    """
    # Convert the field and remove escaping:
    #
    # 1. "content.body.thing\.with\.dots"
    # 2. ["content", "body", "thing\.with\.dots"]
    # 3. ["content", "body", "thing.with.dots"]

    # Find all dots (and their preceding backslashes). If the dot is unescaped
    # then emit a new field part.
    result = []
    prev_start = 0
    for match in SPLIT_FIELD_REGEX.finditer(field):
        # If the match is an *even* number of characters than the dot was escaped.
        if len(match.group()) % 2 == 0:
            continue

        # Add a new part (up to the dot, exclusive) after escaping.
        result.append(
            ESCAPE_SEQUENCE_PATTERN.sub(
                _escape_slash, field[prev_start : match.end() - 1]
            )
        )
        prev_start = match.end()

    # Add any part of the field after the last unescaped dot. (Note that if the
    # character is a dot this correctly adds a blank string.)
    result.append(re.sub(r"\\(.)", _escape_slash, field[prev_start:]))

    return result

Haven't vetted this. I do see that we added some tests.

Member Author

Oh yeah, it does it a different way to avoid regex. Sorry, should have flagged. It's more or less the same idea: find every . character and see if it's been escaped or not (taking into account backslash escaping too, so \. is escaped, but \\. is not, as it's a literal backslash followed by a dot).

The test cases are exactly the same though.
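For readers comparing the two approaches, here is a regex-free version of that idea in Python. It is a sketch and not the Rust code from this PR: `split_field` and `unescape` are written from the description above, and the unescape rule (a backslash followed by a backslash or a dot yields the second character, anything else is left alone) is an assumption about what the old `_escape_slash` helper did.

```python
def unescape(part: str) -> str:
    """Turn backslash-backslash and backslash-dot into their second character."""
    out = []
    i = 0
    while i < len(part):
        ch = part[i]
        if ch == "\\" and i + 1 < len(part) and part[i + 1] in ("\\", "."):
            out.append(part[i + 1])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def split_field(field: str) -> list[str]:
    """Split on dots preceded by an even number of backslashes (including zero)."""
    result = []
    prev_start = 0
    backslashes = 0  # length of the backslash run immediately before this char
    for i, ch in enumerate(field):
        if ch == "\\":
            backslashes += 1
            continue
        if ch == "." and backslashes % 2 == 0:
            # Unescaped dot: emit everything since the previous split point.
            result.append(unescape(field[prev_start:i]))
            prev_start = i + 1
        backslashes = 0
    # Trailing part; a field ending in an unescaped dot yields a blank string.
    result.append(unescape(field[prev_start:]))
    return result
```

Counting the backslash run plays the role of the old match-length parity check: an odd run means the dot itself is escaped, an even run means the backslashes pair up and the dot is a separator.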

Comment thread synapse/events/utils.py Outdated
Comment thread synapse/events/utils.py
Comment thread synapse/events/utils.py
Comment on lines +361 to 362
event if not isinstance(event, FilteredEvent) else next(serialized)
for event in events

Contributor

I'm guessing this is "fine" because previously we did serialize_event(...) for all events but it just did this:

# To handle the case of presence events and the like
if not isinstance(event, FilteredEvent):
    return event

Prior art but using serialize_events(...) with JsonDict seems like the wrong call 🤔

For "presence", it looks like it's talking about this problem spot:

# When the user joins a new room, or another user joins a currently
# joined room, we need to send down presence for those users.
to_add: list[JsonDict] = []
for event in events:
    if not isinstance(event, FilteredEvent):
        continue
    if event.event.type == EventTypes.Member:
        if event.event.membership != Membership.JOIN:
            continue
        # Send down presence.
        if event.event.state_key == requester.user.to_string():
            # Send down presence for everyone in the room.
            users: Iterable[str] = await self.store.get_users_in_room(
                event.event.room_id
            )
        else:
            users = [event.event.state_key]
        states = await presence_handler.get_states(users)
        to_add.extend(
            {
                "type": EduTypes.PRESENCE,
                "content": format_user_presence_state(state, time_now),
            }
            for state in states
        )
events.extend(to_add)

Member Author

Yes, it's annoying. I think we also rely on this behaviour at the following spot (not sure if there are other places):

for event in events:
    if not isinstance(event, FilteredEvent):
        continue
    if event.event.type == EventTypes.Member:
        if event.event.membership != Membership.JOIN:
            continue
        # Send down presence.
        if event.event.state_key == requester.user.to_string():
            # Send down presence for everyone in the room.
            users: Iterable[str] = await self.store.get_users_in_room(
                event.event.room_id
            )
        else:
            users = [event.event.state_key]
        states = await presence_handler.get_states(users)
        to_add.extend(
            {
                "type": EduTypes.PRESENCE,
                "content": format_user_presence_state(state, time_now),
            }
            for state in states
        )
events.extend(to_add)
chunks = await self._event_serializer.serialize_events(
    events,
    time_now,
    config=SerializeEventConfig(
        as_client_event=as_client_event, requester=requester
    ),
)

Contributor

(we're pointing at the same spot)

I think it's only that one spot. Something for another PR ⏩

The new workaround code is even more obtuse which sucks a bit but I guess we will live with it for now.

Contributor

ehhh (seeing this again as weird), we should have a FIXME for this.
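For anyone reading the thread later, the passthrough under discussion can be reproduced in isolation. This is a sketch, not the code in synapse/events/utils.py: `FilteredEvent` is a minimal stand-in dataclass, and `fake_serialize_batch` and `serialize_mixed` are hypothetical helpers. It relies on the batch serializer returning results in input order.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class FilteredEvent:
    """Minimal stand-in for the real FilteredEvent wrapper."""

    event_id: str


def fake_serialize_batch(events: list[FilteredEvent]) -> list[dict[str, Any]]:
    """Stands in for the batch serializer; output order matches input order."""
    return [{"event_id": e.event_id} for e in events]


def serialize_mixed(
    items: list[Union[FilteredEvent, dict[str, Any]]]
) -> list[dict[str, Any]]:
    # Only real events go through the serializer.
    real = [item for item in items if isinstance(item, FilteredEvent)]
    serialized = iter(fake_serialize_batch(real))
    # Walk the original list: plain dicts (e.g. presence EDUs) pass through
    # untouched, and each real event is replaced by the next serialized result.
    return [
        item if not isinstance(item, FilteredEvent) else next(serialized)
        for item in items
    ]


MIXED = [FilteredEvent("$a"), {"type": "m.presence"}, FilteredEvent("$b")]
OUTPUT = serialize_mixed(MIXED)
```

The iterator is what keeps positions aligned, which is also why the pattern reads as obtuse: correctness depends on the two passes over `items` applying the same `isinstance` test in the same order.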

erikjohnston and others added 9 commits June 18, 2026 13:50
The previous comment said the method returned "a clone … in a
Value::Object", but it actually returns a plain reference to the
underlying BTreeMap.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The previous name was ambiguous: it sounds like a list of field names
to include *or* exclude. "allowlist" makes the direction explicit.
Updates the Rust struct field, the Python-facing getter, and all call
sites.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The old name was an implementation-detail label. serialize_event better
describes what the function does: serialize a single event, recursing
into its redactions and bundled aggregations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Rather than implicitly updating the config in `serialize_event`, let's
add a helper for creating the config that does the necessary
checks/updates itself.
erikjohnston force-pushed the erikj/rust_event_serialization branch from 19ba756 to a6c349e on June 22, 2026 13:04
Comment on lines +215 to +218
/// Returns a reference to the underlying map of this object's entries.
pub fn as_map(&self) -> &BTreeMap<Box<str>, Value> {
&self.object
}

Contributor

Unused

Helpful?

Member Author

Hmm, yeah it's become unused now. I think it's likely going to be useful in future, so am minded to leave it in now that we wrote it.

Comment thread rust/src/events/serialize.rs Outdated
Comment thread rust/src/events/serialize.rs
Comment thread rust/src/events/serialize.rs Outdated
Comment thread rust/src/events/serialize.rs Outdated
Comment thread rust/src/events/serialize.rs Outdated
Comment thread rust/src/events/serialize.rs
Comment thread synapse/events/utils.py

/// A thread's bundled summary: its latest event, the number of events in the
/// thread, and whether the requesting user has participated.
#[pyclass(frozen, skip_from_py_object, get_all)]

Contributor

Preference on get_all vs manual getter? We have both here



3 participants