[Feature] Native health/readiness endpoint for K8s-style liveness & readiness probes #24640

### Describe the feature

### Motivation
I run reth as an RPC node behind K8s, and I need a readiness signal that's separate from liveness: not "is the process up" but "is this node caught up enough to return correct results".
The case that bites: a node that just restarted (or is still catching up) has its HTTP server up and answers requests fine, but it's behind the head. If it stays in the service rotation, clients get stale reads, e.g., eth_getTransactionByHash returns null for a tx the rest of the network already has. The usual solution is a readiness probe that pulls the node out of rotation until it catches up, without restarting it.
Right now reth exposes the JSON-RPC API and the Prometheus metrics endpoint, but there's nothing a probe can hit for a plain 200/503 readiness answer. So everyone ends up running a small sidecar that calls eth_getBlockByNumber("latest"), compares the block timestamp to wall clock, and maps the result to a status code. It works, but it's the same glue rewritten by every operator, and metrics aren't really the right thing to gate traffic on.
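For reference, the core of that sidecar glue is tiny. A minimal sketch, assuming the latest block timestamp has already been fetched over JSON-RPC (function names are mine, not reth APIs):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Current wall-clock time as Unix seconds (0 if the clock is before the epoch).
fn now_unix() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}

/// Seconds between the latest block's timestamp and "now" (both Unix seconds).
/// Saturates at 0 if the block timestamp is slightly ahead of the local clock.
fn block_age_secs(latest_block_timestamp: u64, now_unix: u64) -> u64 {
    now_unix.saturating_sub(latest_block_timestamp)
}

/// Maps block age to the HTTP status a readiness probe would see.
fn readiness_status(latest_block_timestamp: u64, now_unix: u64, max_seconds_behind: u64) -> u16 {
    if block_age_secs(latest_block_timestamp, now_unix) <= max_seconds_behind {
        200
    } else {
        503
    }
}
```

Everything around it (the HTTP client, the tiny server, the deployment wiring) is what each operator currently has to rebuild.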

### Additional context


### Proposal
An optional, unauthenticated HTTP endpoint (/health or /livez + /readyz) that returns 200 when ready and 503 otherwise, with a short body saying which check failed.
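To make the response shape concrete, here is a rough sketch of how per-check results could map to a status code and body. The `CheckResult` type and the body format are hypothetical, just one possible shape:

```rust
/// Outcome of one named check (hypothetical type, not a reth API).
struct CheckResult {
    name: &'static str,
    ok: bool,
    detail: String,
}

/// Returns (HTTP status, body): 200 "ok" if every check passed,
/// otherwise 503 with one line per failed check.
/// With no checks enabled the result is 200, since all checks are optional.
fn readiness_response(results: &[CheckResult]) -> (u16, String) {
    let failed: Vec<String> = results
        .iter()
        .filter(|r| !r.ok)
        .map(|r| format!("{}: {}", r.name, r.detail))
        .collect();
    if failed.is_empty() {
        (200, "ok".to_string())
    } else {
        (503, failed.join("\n"))
    }
}
```

Keeping the body to failed checks only means a probe log line immediately says why the node was pulled from rotation.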
The part I'd want to get right is keeping the freshness threshold operator-configurable rather than hardcoded, since "how stale is too stale" depends on the chain's block time (Ethereum L1 ~12s, OP Stack ~2s, and so on). Either of these approaches would work:

- header-driven, Erigon-style: the caller passes thresholds per request (max_seconds_behind, min_peer_count, ...)
- flag-driven: thresholds set once at startup
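The two approaches can also be combined, with per-request values overriding startup defaults. A sketch under stated assumptions: the `name=value` format below is made up for illustration and is not Erigon's actual header syntax, and `Thresholds` is a hypothetical type:

```rust
/// Thresholds for the optional checks; `None` means the check is disabled.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
struct Thresholds {
    max_seconds_behind: Option<u64>,
    min_peer_count: Option<u64>,
}

impl Thresholds {
    /// Parses a comma-separated `name=value` list, e.g.
    /// "max_seconds_behind=30,min_peer_count=5" (format invented for this sketch).
    /// Unknown names and bad numbers are errors, so a typo cannot silently
    /// disable a check.
    fn parse(spec: &str) -> Result<Self, String> {
        let mut t = Thresholds::default();
        for item in spec.split(',').map(str::trim).filter(|s| !s.is_empty()) {
            let (name, raw) = item
                .split_once('=')
                .ok_or_else(|| format!("expected name=value, got {:?}", item))?;
            let name = name.trim();
            let value: u64 = raw
                .trim()
                .parse()
                .map_err(|_| format!("invalid number for {}: {:?}", name, raw))?;
            match name {
                "max_seconds_behind" => t.max_seconds_behind = Some(value),
                "min_peer_count" => t.min_peer_count = Some(value),
                other => return Err(format!("unknown check {:?}", other)),
            }
        }
        Ok(t)
    }

    /// Per-request values win; anything the request leaves unset falls back
    /// to the startup configuration.
    fn or_defaults(self, startup: Thresholds) -> Thresholds {
        Thresholds {
            max_seconds_behind: self.max_seconds_behind.or(startup.max_seconds_behind),
            min_peer_count: self.min_peer_count.or(startup.min_peer_count),
        }
    }
}
```

For example, with startup values of 30 seconds and 5 peers, a request carrying `max_seconds_behind=4` would be checked against 4 seconds and 5 peers.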

Checks I'd find useful (all optional):

- max_seconds_behind: age of the latest block timestamp vs now. This is the main freshness signal. One thing to get right: read the timestamp via the normal eth_getBlockByNumber("latest") path, not an internal/snapshot source, to avoid the 0-timestamp bug Erigon hit (see below).
- min_peer_count
- a trivial process-liveness check for /livez
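Roughly how I picture the readiness checks being evaluated. `NodeSnapshot` is a hypothetical shape for whatever the endpoint reads at probe time, and treating a zero timestamp as "unavailable" is my own choice for this sketch, not something taken from Erigon's fix:

```rust
/// What the endpoint would read from the node at probe time (hypothetical shape).
struct NodeSnapshot {
    /// Unix seconds, read via the same path eth_getBlockByNumber("latest") uses.
    latest_block_timestamp: u64,
    peer_count: u64,
}

/// Returns one message per failed check; an empty vector means ready.
/// A `None` threshold disables the corresponding check.
fn failed_checks(
    snap: &NodeSnapshot,
    now_unix: u64,
    max_seconds_behind: Option<u64>,
    min_peer_count: Option<u64>,
) -> Vec<String> {
    let mut failures = Vec::new();
    if let Some(max) = max_seconds_behind {
        if snap.latest_block_timestamp == 0 {
            // Assumption: 0 means the source had no real value, so report that
            // explicitly instead of a misleading age of several decades.
            failures.push("max_seconds_behind: latest block timestamp unavailable".to_string());
        } else {
            let age = now_unix.saturating_sub(snap.latest_block_timestamp);
            if age > max {
                failures.push(format!("max_seconds_behind: {}s > {}s", age, max));
            }
        }
    }
    if let Some(min) = min_peer_count {
        if snap.peer_count < min {
            failures.push(format!("min_peer_count: {} < {}", snap.peer_count, min));
        }
    }
    failures
}
```

/livez would skip all of this and simply answer 200 whenever the process can serve the request.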

### Prior art

- Erigon does this with X-ERIGON-HEALTHCHECK headers (max_seconds_behind, min_peer_count, synced, check_block). It also has a known bug, erigontech/erigon#9357, where the check read the timestamp from an internal source and got 0, which is why I mention reading it from the normal path above.
- ChainSafe Forest added /livez and /readyz with a ?verbose flag that lists each check: ChainSafe/forest#3949.

### Where it should live
I think this belongs in core reth's RPC layer as a generic, chain-agnostic feature, so downstreams on the Reth SDK (op-reth included) pick it up through the normal port process instead of carrying a separate patch. Any downstream-specific wiring, like exposing an enable flag in op-reth's CLI, can be added there at port time; the core only gets written once.
One limitation to call out: for rollups the EL can't know on its own how far it is behind the sequencer's unsafe head, since that lives in the consensus layer (op-node's optimism_syncStatus). So this endpoint can cover block-age and peer-count freshness, but a full "caught up to the sequencer" check is out of scope here.

Scope:

- generic and chain-agnostic in core reth, no chain-specific freshness constants
- thresholds operator-configurable, not opinionated defaults that gate traffic
- no consensus-layer / sequencer-head state in core reth
- rollup-specific cross-checks (e.g. optimism_syncStatus) stay downstream

### Questions

- Is there appetite for this in core reth, or do you consider readiness gating an operator-side (sidecar) concern?
- If there's interest: a single /health with header-driven checks, or /livez + /readyz with startup-configured thresholds?
- Where in the RPC layer should it sit, and should it be behind its own flag or port (e.g., --http.health)?

Happy to implement it if there's interest and a rough agreement on the shape.
