@AhmedSoliman AhmedSoliman commented Jan 7, 2026

Previously, BackgroundAppender used recv_many() to batch records by count
only (max_batch_size). This change adds byte-size awareness to batch cutting,
ensuring batches do not exceed the configured record_size_limit.

Closes #4132

Changes:

  • Introduce Batch struct to track accumulated bytes alongside operations
  • Add AppendOperation::cost_in_bytes() to calculate record sizes
  • Replace recv_many() with recv()/try_recv() loop that checks both byte
    and count limits before adding to batch
  • Flush batch when adding a new operation would exceed limits
  • Use cancellation_token() instead of cancellation_watcher() for drain

The batch is cut when either:

  • Adding the next record would exceed batch_limit_bytes
  • The batch reaches max_batch_size count
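
The cutting rule above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it uses `std::sync::mpsc` in place of the async channel, and the `AppendOperation` payload field, the `next_batch` helper, and the `carry` slot are hypothetical names. It also assumes that the first operation of a batch is always admitted, on the premise that oversized single records are rejected earlier at enqueue time.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};

/// Hypothetical stand-in for an append operation; only its size matters here.
pub struct AppendOperation {
    pub payload: Vec<u8>,
}

impl AppendOperation {
    /// Assumed cost model: the payload length in bytes.
    pub fn cost_in_bytes(&self) -> usize {
        self.payload.len()
    }
}

/// Operations collected so far, plus their accumulated byte cost.
#[derive(Default)]
pub struct Batch {
    pub ops: Vec<AppendOperation>,
    pub bytes: usize,
}

impl Batch {
    fn push(&mut self, op: AppendOperation) {
        self.bytes += op.cost_in_bytes();
        self.ops.push(op);
    }
}

/// Collects the next batch: blocks for the first operation, then drains
/// whatever is already queued without blocking. `carry` holds an operation
/// that did not fit into the previous batch. Returns `None` once the channel
/// is closed and nothing is left to flush.
pub fn next_batch(
    rx: &Receiver<AppendOperation>,
    carry: &mut Option<AppendOperation>,
    max_batch_size: usize,
    batch_limit_bytes: usize,
) -> Option<Batch> {
    let mut batch = Batch::default();
    // Start with the carried-over operation, or block for a fresh one.
    let first = match carry.take() {
        Some(op) => op,
        None => rx.recv().ok()?,
    };
    batch.push(first);

    // Cut on count: stop once the batch holds max_batch_size operations.
    while batch.ops.len() < max_batch_size {
        match rx.try_recv() {
            Ok(op) => {
                // Cut on bytes: adding this operation would exceed the limit,
                // so keep it for the next batch and flush the current one.
                if batch.bytes + op.cost_in_bytes() > batch_limit_bytes {
                    *carry = Some(op);
                    break;
                }
                batch.push(op);
            }
            // Nothing else is ready (or the sender is gone): flush what we have.
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
        }
    }
    Some(batch)
}
```

Checking the byte limit before pushing, rather than after, is what keeps a flushed batch at or below the limit; the price is the one-element `carry` slot for the operation that triggered the cut.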

Stack created with Sapling. Best reviewed with ReviewStack.

github-actions bot commented Jan 7, 2026

Test Results

 5 files   -   2   5 suites   - 2   1m 14s ⏱️ - 2m 7s
34 tests  -  13  34 ✅  -  13  0 💤 ±0  0 ❌ ±0 
52 runs   - 148  52 ✅  - 148  0 💤 ±0  0 ❌ ±0 

Results for commit d4a4d9c. ± Comparison against base commit 81a5288.

This pull request removes 47 and adds 34 tests. Note that renamed tests count towards both.
dev.restate.sdktesting.tests.CallOrdering ‑ ordering(boolean[], Client)[1]
dev.restate.sdktesting.tests.CallOrdering ‑ ordering(boolean[], Client)[2]
dev.restate.sdktesting.tests.CallOrdering ‑ ordering(boolean[], Client)[3]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromAdminAPI(BlockingOperation, Client, URI)[1]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromAdminAPI(BlockingOperation, Client, URI)[2]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromAdminAPI(BlockingOperation, Client, URI)[3]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromContext(BlockingOperation, Client)[1]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromContext(BlockingOperation, Client)[2]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromContext(BlockingOperation, Client)[3]
dev.restate.sdktesting.tests.Combinators ‑ awakeableOrTimeoutUsingAwakeableTimeoutCommand(Client)
…
dev.restate.sdktesting.tests.AwakeableIngressEndpointTest ‑ completeWithFailure(Client)
dev.restate.sdktesting.tests.AwakeableIngressEndpointTest ‑ completeWithSuccess(Client)
dev.restate.sdktesting.tests.BackwardCompatibilityTest$NewVersion ‑ completeAwakeable(Client)
dev.restate.sdktesting.tests.BackwardCompatibilityTest$NewVersion ‑ completeRetryableOperation(Client)
dev.restate.sdktesting.tests.BackwardCompatibilityTest$NewVersion ‑ proxyCallShouldBeDone(Client)
dev.restate.sdktesting.tests.BackwardCompatibilityTest$NewVersion ‑ proxyOneWayCallShouldBeDone(Client)
dev.restate.sdktesting.tests.BackwardCompatibilityTest$OldVersion ‑ createAwakeable(Client)
dev.restate.sdktesting.tests.BackwardCompatibilityTest$OldVersion ‑ startOneWayProxyCall(Client)
dev.restate.sdktesting.tests.BackwardCompatibilityTest$OldVersion ‑ startProxyCall(Client)
dev.restate.sdktesting.tests.BackwardCompatibilityTest$OldVersion ‑ startRetryableOperation(Client)
…

github-actions bot commented Jan 7, 2026

Test Results

  7 files  ± 0    7 suites  ±0   5m 7s ⏱️ + 1m 52s
 49 tests + 2   49 ✅ + 2  0 💤 ±0  0 ❌ ±0 
210 runs  +10  210 ✅ +10  0 💤 ±0  0 ❌ ±0 

Results for commit efa7924. ± Comparison against base commit e1bc1d4.

♻️ This comment has been updated with latest results.

@tillrohrmann tillrohrmann left a comment

Great work @AhmedSoliman. Thanks a lot for making our batching logic aware of byte limits so that we don't generate too large batches! LGTM. +1 for merging :-)

let token = sender.notify_committed().await?;
token.await?;

handle.drain().await?;

Is this important for the below assertion to succeed? Or would it be enough to await the token and then do the find_tail operation?

Contributor Author

it's superfluous.

@AhmedSoliman AhmedSoliman force-pushed the pr4144 branch 2 times, most recently from e3c3aca to efa7924 Compare January 8, 2026 10:03
Add a record size limit check at append time in Bifrost to validate that
individual records do not exceed a configured maximum size.

Important note: This configuration option will not be effective until size estimation of records is implemented. At the moment, all typed records are assumed to be 2048 bytes in size, which makes this check useless. Nevertheless, the check will be useful once we implement size estimation of records.

Changes:
- Add `bifrost.record-size-limit` configuration option that defaults to `networking.message-size-limit` (32 MiB) and is clamped to that value
- Add `BatchTooLarge`/`RecordTooLarge` error variants that report when a record or a batch is too large, depending on whether you're using Appender or BackgroundAppender.
- Add record size validation to all `LogSender` enqueue methods in `BackgroundAppender` to fail fast at enqueue time

This prevents oversized records from being written to the log, which could cause issues during replication and network transmission.
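
As a rough sketch of the behavior described above (not the actual Bifrost code), the limit resolution and the fail-fast check could look like this. The function names, the `Option<usize>` representation of the configured value, and the error fields are assumptions made for illustration; only the variant names `RecordTooLarge` and `BatchTooLarge` come from the change list.

```rust
use std::fmt;

/// Hypothetical error type; only the variant names are taken from the change list.
#[derive(Debug, PartialEq, Eq)]
pub enum AppendError {
    RecordTooLarge { size: usize, limit: usize },
    BatchTooLarge { size: usize, limit: usize },
}

impl fmt::Display for AppendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppendError::RecordTooLarge { size, limit } => {
                write!(f, "record of {size} bytes exceeds the limit of {limit} bytes")
            }
            AppendError::BatchTooLarge { size, limit } => {
                write!(f, "batch of {size} bytes exceeds the limit of {limit} bytes")
            }
        }
    }
}

impl std::error::Error for AppendError {}

/// Resolves the effective record size limit: default to the message size
/// limit when nothing is configured, and never exceed it (clamping).
pub fn effective_record_size_limit(
    configured: Option<usize>,
    message_size_limit: usize,
) -> usize {
    configured
        .unwrap_or(message_size_limit)
        .min(message_size_limit)
}

/// Fail-fast check for a single record, intended to run at enqueue time.
pub fn check_record_size(estimated_size: usize, limit: usize) -> Result<(), AppendError> {
    if estimated_size > limit {
        Err(AppendError::RecordTooLarge { size: estimated_size, limit })
    } else {
        Ok(())
    }
}

/// Same check for a whole batch, given the estimated sizes of its records.
pub fn check_batch_size(record_sizes: &[usize], limit: usize) -> Result<(), AppendError> {
    let total: usize = record_sizes.iter().sum();
    if total > limit {
        Err(AppendError::BatchTooLarge { size: total, limit })
    } else {
        Ok(())
    }
}
```

With the fixed 2048-byte estimate mentioned in the note above, `check_record_size` would pass for any limit of at least 2048 bytes, which is why the check only becomes meaningful once real size estimation exists.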

Part of #4130, #4132
The goal is to make sure that the user can still set the VO state to a value within the message size limit without failing.

For instance, using the large-state-service, with this change, we can do:
```
curl http://localhost:8080/LargeState/224/state --silent --json '50000000'
"State set"%
```
while running the server with `RESTATE_NETWORKING__MESSAGE_SIZE_LIMIT=50000000`.


Related to #4130
@AhmedSoliman AhmedSoliman merged commit 4ac7a73 into main Jan 8, 2026
14 of 16 checks passed
@AhmedSoliman AhmedSoliman deleted the pr4144 branch January 8, 2026 10:28
@github-actions github-actions bot locked and limited conversation to collaborators Jan 8, 2026
Development

Successfully merging this pull request may close these issues.

Make Bifrost batching aware of message size limits

3 participants