Commit 81bad95

MSC4280: Interactive /rooms/ROOM_ID/messages (Client-Server API)
1 parent 8d2fb67 commit 81bad95

1 file changed, +91 -0 lines changed

@@ -0,0 +1,91 @@
# MSC4280: Hint that a /rooms/{room_id}/messages request is interactive

The endpoint [/rooms/{room_id}/messages](https://spec.matrix.org/latest/client-server-api/#get_matrixclientv3roomsroomidmessages)
is used by clients to retrieve older events from a homeserver when the direction is set to
backwards (an operation also called "back-pagination" throughout this MSC). This can be useful in a
few contexts:

- after a gappy sync (i.e. one that set the `limited` flag), to retrieve the events included in the
  gap, that is, all the events that were sent to the homeserver after the client last synced and
  that were not included in the last sync response. This applies both to sync v2 and simplified
  sliding sync.
- as an out-of-sync mechanism to go through all the events in a room from the end to the start, so
  as to apply some mass operation on them, like indexing them for a search engine.
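
As an illustration of the first use case, the following is a minimal sketch of a client-side
back-pagination loop that stops once it reaches an event the client already knows. The
`fetch_page` callable is hypothetical and stands in for the HTTP request; only the `chunk` and
`end` response fields are taken from the `/messages` endpoint, and the `max_requests` safety cap
is an assumption of this sketch.

```python
from typing import Callable, Optional


def back_paginate_until(
    fetch_page: Callable[[Optional[str]], dict],
    start_token: Optional[str],
    known_event_id: str,
    max_requests: int = 50,
) -> list[dict]:
    """Collect events going backwards until `known_event_id` is reached.

    `fetch_page(token)` is a hypothetical helper that performs one
    backwards /messages request and returns the decoded JSON body.
    """
    collected: list[dict] = []
    token = start_token
    for _ in range(max_requests):
        page = fetch_page(token)
        for event in page.get("chunk", []):
            if event["event_id"] == known_event_id:
                return collected  # the gap is filled
            collected.append(event)
        token = page.get("end")
        if token is None:
            break  # no `end` token: nothing older to fetch
    return collected
```

Every iteration of this loop is one round trip the user may be waiting on, which is why the
latency of each individual request matters.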

In fact, this mechanism is crucial in the context of [simplified sliding sync](https://github.com/matrix-org/matrix-spec-proposals/pull/4186).
This sync mechanism generates thin server responses including a minimal set of events
(controlled by the `timeline_limit` request parameter), so as to provide better initial sync times
and ultimately more responsive clients. The client is then expected to use the
`/rooms/{room_id}/messages` endpoint to retrieve the previous events of a room.

As a result, clients should be able to expect this endpoint to be *fast* when the user session is
interactive (i.e. a user is waiting for these events to be retrieved). While it's hard to define
*how* fast, this endpoint is expected to return within a few seconds in the worst case. Otherwise,
the user experience on the clients may be severely degraded.

However, some server implementations, including
[Synapse](https://github.com/element-hq/synapse/blob/5c84f258095535aaa2a4a04c850f439fd00735cc/synapse/handlers/pagination.py#L575-L584),
[Conduit](https://gitlab.com/famedly/conduit/-/blob/a7e6f60b41122761422df2b7bcc0c192416f9a28/src/api/client_server/message.rs#L201)
and
[Conduwuit](https://github.com/girlbossceo/conduwuit/blob/0f81c1e1ccdcb0c5c6d5a27e82f16eb37b1e61c8/src/api/client/message.rs#L94-L101),
may generate, under some implementation-specific conditions, federation requests to
[backfill](https://spec.matrix.org/v1.14/server-server-api/#backfilling-and-retrieving-missing-events)
the room timeline and fetch more events from other servers. This delays the response to the
client, since the request is now blocked on the server waiting for the federation responses to
arrive. Moreover, the time spent retrieving those responses is theoretically unbounded, so the
homeserver and the clients may have to wait indefinitely for such requests to complete.

We need a more responsive way to fetch older events from the server, without having to wait for
federation responses to come back. This is the *raison d'être* of this MSC.

## Proposal

It is proposed that the `/rooms/{room_id}/messages` endpoint be modified to allow clients to
specify a new boolean query parameter `interactive`, which indicates that the client is interested
in getting the response *quickly*.
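
For illustration, a client could build such a request as sketched below. The helper name
`messages_url` and the example homeserver are hypothetical; the path and the `dir`, `from` and
`limit` parameters are those of the existing endpoint, and `interactive` is the parameter proposed
here. The text above does not name an unstable prefix, so the sketch uses the plain parameter name.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def messages_url(
    homeserver: str,
    room_id: str,
    from_token: Optional[str] = None,
    limit: int = 20,
    interactive: bool = False,
) -> str:
    """Build a backwards /messages URL (hypothetical helper)."""
    params = {"dir": "b", "limit": str(limit)}
    if from_token is not None:
        params["from"] = from_token
    if interactive:
        # Only sent when set: omitting it keeps today's behavior.
        params["interactive"] = "true"
    path = f"/_matrix/client/v3/rooms/{quote(room_id, safe='')}/messages"
    return f"{homeserver}{path}?{urlencode(params)}"


url = messages_url("https://example.org", "!abc:example.org",
                   from_token="t1", interactive=True)
```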

If the parameter is missing, it defaults to `false`. Thus, this is not a breaking change: the
server behavior remains the same if the query parameter hasn't been set.
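
On the server side, the default could be handled as in the following sketch. The function name
`parse_interactive` is hypothetical, and rejecting values other than `true` or `false` is an
assumption of this sketch: the proposal only states what happens when the parameter is absent.

```python
def parse_interactive(query: dict[str, str]) -> bool:
    """Read the proposed `interactive` flag from decoded query parameters."""
    raw = query.get("interactive")
    if raw is None:
        return False  # absent: behave exactly as before this proposal
    if raw == "true":
        return True
    if raw == "false":
        return False
    # Assumption: malformed values are treated as a client error.
    raise ValueError(f"invalid boolean for 'interactive': {raw!r}")
```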

When the query parameter is set to `true`, the server is expected to make a best-effort attempt
to provide a response *in a reasonably short time*. Implementations may use one of the following
strategies to achieve this:

- avoid blocking on a backfill request to other homeservers, either by not starting such requests
  at all or by starting them in the background in a non-blocking way.
- start the backfill request, and race its completion against a short timeout. This can be a good
  tradeoff when backfill requests resolve quickly.
- not do anything differently. This doesn't solve the problem, but the query parameter is only a
  hint that the response is expected quickly, not a strict requirement.
- do something completely different, not mentioned in this MSC, that achieves the same goal.
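
The second strategy (racing the backfill against a timeout) could look like the sketch below. It
is not taken from any of the servers linked above: `load_local` and `backfill` are hypothetical
callables, and the 0.5 second default is an arbitrary assumption.

```python
import asyncio
from typing import Awaitable, Callable


async def handle_messages(
    load_local: Callable[[], Awaitable[list[str]]],
    backfill: Callable[[], Awaitable[None]],
    interactive: bool,
    backfill_timeout: float = 0.5,  # assumed value, not specified by the MSC
) -> list[str]:
    """Return locally stored events, waiting for backfill only briefly."""
    if not interactive:
        await backfill()  # existing behavior: block until backfill is done
        return await load_local()

    task = asyncio.ensure_future(backfill())
    try:
        # shield() keeps the backfill running if the timeout cancels the wait.
        await asyncio.wait_for(asyncio.shield(task), timeout=backfill_timeout)
    except asyncio.TimeoutError:
        # Answer with what is stored now; the backfill continues in the
        # background. A real server would keep a reference to `task` and
        # log any error it eventually raises.
        pass
    return await load_local()
```

If the backfill finishes before the timeout, the response already includes the backfilled events;
otherwise they become available to a later request.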

## Potential issues

Even before this proposal, clients could miss events in a room: if they back-paginated through it
using `/messages` and the server then received new events after a netsplit, at a position that
the client had already paginated through, the client would either not receive those events, or
receive them through sync but in a non-topological ordering (i.e. an ordering different from the
one they would have observed by paginating with `/messages`).

This MSC doesn't resolve this problem. On the contrary, it may make it more apparent if *all*
`/messages` requests end up *not* causing any federation backfill. The most likely consequence of
this is that events might be more frequently misordered across clients.

## Alternatives

Instead of an additional query parameter, this MSC could mandate that this becomes the expected
behavior of all implementations. This would be an implicit breaking change, and it may inhibit
use cases where clients might prefer a perfectly backfilled room over a quick response time.

Since this problem is more frequent with simplified sliding sync, one could imagine a client
using a solution specific to simplified sliding sync. For instance, it could increase the
`timeline_limit` window to get more and more events from the end of the room, back to the latest
event it already knew about, and thus *not* cause backfill requests. This workaround would work,
but it would not be optimal in terms of bandwidth and server CPU activity, as it would mean
including lots of events the client has already seen before (viz., the increasing tail of the
room's timeline).
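
To give a rough idea of that overhead, the sketch below counts how many events are downloaded when
a client keeps growing `timeline_limit` until the window covers the gap. The doubling policy, the
initial limit of 20 and the assumption that the room holds at least `timeline_limit` events are
illustrative choices, not part of any specification.

```python
def events_transferred(gap_size: int, initial_limit: int = 20) -> tuple[int, int]:
    """Return (requests, total events downloaded) for a doubling window.

    Assumes the room always contains at least `timeline_limit` events,
    so every response carries a full window.
    """
    limit = initial_limit
    requests = 0
    total = 0
    while True:
        requests += 1
        total += limit  # each response re-sends the whole tail of the room
        if limit >= gap_size:
            return requests, total
        limit *= 2  # assumed growth policy


requests, total = events_transferred(gap_size=100)
```

Under these assumptions, covering a gap of 100 events takes 4 requests and downloads 300 events
(20 + 40 + 80 + 160), i.e. three times the number of events the client was actually missing.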

We could also have a new separate paginated endpoint to retrieve the previous events in the *sync*
ordering, thus not causing any backfill requests. It would be strictly more work to implement, and
it is unclear that it would achieve more than the current proposal.
