Summary
When a new region joins a full-replica cache mesh, it should be able to copy the existing dataset from one or more healthy peers and then switch to live replication without serving stale or partial data.
Why
Today the outbox only guarantees delivery to peers that were already configured as replication targets when the write happened. A region added later therefore never receives objects written before it joined, so it needs an explicit bootstrap flow.
Scope
- define a bootstrap state machine for joining nodes
- stream existing manifests and blobs from healthy peers
- add a replication watermark/checkpoint so the node can switch from snapshot copy to live catch-up safely
- do not serve incomplete data until bootstrap is complete, or serve only in an explicitly signalled degraded/read-through mode
- add e2e coverage for node join after existing data is already present
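
To make the first four scope items concrete, here is a minimal sketch of one possible shape for the state machine and watermark handoff. Everything in it is an assumption for illustration, not existing code: the state names, the `JoiningNode` class and its methods are hypothetical, it models a single source peer, and it assumes the peer assigns monotonically increasing sequence numbers to writes and delivers them in order.

```python
from dataclasses import dataclass, field
from enum import Enum


class BootstrapState(Enum):
    JOINING = "joining"              # known to the mesh, holds no data yet
    SNAPSHOT_COPY = "snapshot_copy"  # copying existing manifests and blobs
    CATCH_UP = "catch_up"            # replaying writes made after the watermark
    LIVE = "live"                    # caught up, safe to serve


# Allowed transitions; anything else is treated as a caller bug.
_TRANSITIONS = {
    BootstrapState.JOINING: {BootstrapState.SNAPSHOT_COPY},
    BootstrapState.SNAPSHOT_COPY: {BootstrapState.CATCH_UP},
    BootstrapState.CATCH_UP: {BootstrapState.LIVE},
    BootstrapState.LIVE: set(),
}


@dataclass
class JoiningNode:
    state: BootstrapState = BootstrapState.JOINING
    watermark: int = 0    # peer sequence number the snapshot corresponds to
    applied_seq: int = 0  # highest sequence number applied locally
    objects: dict = field(default_factory=dict)

    def _advance(self, new_state):
        if new_state not in _TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state

    def start_snapshot(self, watermark):
        # Record the checkpoint the peer took the snapshot at.
        self._advance(BootstrapState.SNAPSHOT_COPY)
        self.watermark = watermark

    def load_snapshot(self, snapshot):
        # Install the copied dataset, then switch to live catch-up.
        self._advance(BootstrapState.CATCH_UP)
        self.objects.update(snapshot)
        self.applied_seq = self.watermark

    def apply_live(self, seq, key, value):
        # Live writes are only accepted once the snapshot is installed.
        if self.state not in (BootstrapState.CATCH_UP, BootstrapState.LIVE):
            raise ValueError(f"cannot apply live writes in {self.state.name}")
        # Writes at or below applied_seq are already reflected locally.
        if seq <= self.applied_seq:
            return False
        self.objects[key] = value
        self.applied_seq = seq
        return True

    def mark_caught_up(self, peer_head_seq):
        # Go live only once every write up to the peer's head is applied.
        if (self.state is BootstrapState.CATCH_UP
                and self.applied_seq >= peer_head_seq):
            self._advance(BootstrapState.LIVE)
        return self.state is BootstrapState.LIVE

    def can_serve(self):
        # Strict variant: refuse reads until bootstrap has completed.
        return self.state is BootstrapState.LIVE
```

In this sketch a node would call `start_snapshot(watermark)`, then `load_snapshot(...)`, then `apply_live(...)` for each replicated write, and `mark_caught_up(peer_head_seq)` to attempt the switch to `LIVE`. The degraded/read-through option from the list above would replace the strict `can_serve()` check with a fallback to a healthy peer; that variant is not shown here.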
Notes
This is specifically for the full-replica design where every region should eventually hold every object.