Summary
The validator and relayer always send checkpoint S3 traffic to real AWS, even when configured for an S3-compatible store. Two compounding bugs are in play. First, --checkpointSyncer.endpoint is parsed but never reaches the SDK. Second, the standard AWS escape hatch AWS_ENDPOINT_URL_S3 is also ignored, because the aws-config dependency pin predates the version that added support for it. The result is InvalidAccessKeyId from real AWS S3, and the validator panics on its first report_agent_metadata call.
Reproduction
Run gcr.io/abacus-labs-dev/hyperlane-agent:main with --checkpointSyncer.endpoint=https://nyc3.digitaloceanspaces.com and DO Spaces credentials in AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY. Setting AWS_ENDPOINT_URL_S3 to the same URL does not change the behavior.
thread 'main' panicked at agents/validator/src/validator.rs:287:14:
Failed to report agent metadata: service error
Caused by:
1: Error { code: "InvalidAccessKeyId", aws_request_id: "<AWS-format-id>" }
Location: hyperlane-base/src/types/s3_storage.rs:67:9
The AWS-format aws_request_id confirms the request reached real AWS, not the configured endpoint.
Root cause
Bug 1 — the CLI flag is dropped at the SDK boundary. S3Storage carries only bucket, region, and folder fields, and it builds the SDK config with .region(...) only. .endpoint_url(...) is never called, so the parsed CLI value is discarded.
Bug 2 — the documented env-var fallback can't save you. rust/main/Cargo.toml pins aws-config = "1.1.7", released February 2024. Support for service-specific endpoint env vars like AWS_ENDPOINT_URL_S3 landed in aws-config 1.2 via smithy-rs#3568, two months later. As a result, ConfigLoader::default() in this build silently ignores the variable.
Fixing either bug alone would be enough to reach an S3-compatible store, but as things stand neither path works.
Proposed fix
- Bump aws-config = "1.1.7" → "1.5" in rust/main/Cargo.toml. This is a one-line change that makes AWS_ENDPOINT_URL_S3 work as documented.
- Add endpoint: Option<String> to S3Storage and wire --checkpointSyncer.endpoint through to .endpoint_url(...). This is roughly 20 lines and makes the CLI flag work as the docs imply.
- Optionally add --checkpointSyncer.forcePathStyle for backends that require path-style addressing (Cloudflare R2, MinIO defaults).
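As a rough illustration of the second item, here is a minimal sketch that uses only the standard library, so it does not depend on the pinned SDK version. `S3StorageConf`, `effective_endpoint`, and the precedence order (CLI flag first, then the `AWS_ENDPOINT_URL_S3` value) are assumptions made for this example, not the actual Hyperlane types. In the real change, the resulting value would be handed to `.endpoint_url(...)` on the SDK config builder.

```rust
/// Hypothetical stand-in for the fields S3Storage would carry after the fix.
/// The real struct lives in hyperlane-base; names here are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct S3StorageConf {
    pub bucket: String,
    pub region: String,
    pub folder: Option<String>,
    /// New field: the value parsed from --checkpointSyncer.endpoint, if any.
    pub endpoint: Option<String>,
}

/// Treats blank strings as "not set" so an empty flag or env var is ignored.
fn non_empty(s: &str) -> Option<String> {
    let trimmed = s.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}

/// Picks the endpoint override to pass to the SDK, if any.
/// Assumed precedence: explicit CLI value, then the env var value.
/// `None` means "let the SDK use the default AWS endpoint for the region".
pub fn effective_endpoint(conf: &S3StorageConf, env_value: Option<&str>) -> Option<String> {
    conf.endpoint
        .as_deref()
        .and_then(non_empty)
        .or_else(|| env_value.and_then(non_empty))
}
```

A caller would pass `std::env::var("AWS_ENDPOINT_URL_S3").ok().as_deref()` as `env_value` and call `.endpoint_url(...)` only when the result is `Some`. Keeping the decision in one small function makes the flag and env-var behavior easy to unit-test without touching the network.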
Workaround
A re-signing reverse-proxy sidecar inside the validator/relayer pod lets the agent keep talking to an S3-compatible backend without any code change to Hyperlane. The pod's hostAliases route the AWS S3 hostnames the agent uses (both the bare service host and the bucket-prefixed virtual-host form) to 127.0.0.1, where the sidecar listens on :443.
The sidecar terminates TLS with a server certificate signed by a locally generated CA. An init container writes both the cert and a merged CA bundle into a shared emptyDir so the agent container can mount the bundle over /etc/ssl/certs/ca-certificates.crt via a subPath mount. AWS_CA_BUNDLE is also unsupported in this build, so the system trust store is the only path that works.
For each request, the sidecar strips the AWS SigV4 authorization headers, rewrites the host to its DO Spaces equivalent (for example, <bucket>.s3.us-east-1.amazonaws.com becomes <bucket>.nyc3.digitaloceanspaces.com), re-signs with the same credentials, and forwards. The re-signing is straightforward with SignHTTP from github.com/aws/aws-sdk-go-v2/aws/signer/v4; the only non-obvious detail is that you have to set the X-Amz-Content-Sha256 header explicitly before signing, otherwise DO Spaces rejects the signature.
The relevant pod-spec wiring looks like this:
spec:
  hostAliases:
    - ip: "127.0.0.1"
      hostnames:
        - "s3.us-east-1.amazonaws.com"
        - "s3.amazonaws.com"
        - "<bucket>.s3.us-east-1.amazonaws.com"
        - "<bucket>.s3.amazonaws.com"
  initContainers:
    - name: gen-cert
      image: <sigproxy-image>
      args: ["--mode=init", "--cert-dir=/shared"]
      volumeMounts:
        - { name: sigproxy-shared, mountPath: /shared }
  containers:
    - name: sigproxy
      image: <sigproxy-image>
      args: ["--mode=serve", "--cert-dir=/shared",
             "--upstream=https://nyc3.digitaloceanspaces.com"]
      env:
        - { name: AWS_ACCESS_KEY_ID, valueFrom: { secretKeyRef: { name: do-spaces, key: AWS_ACCESS_KEY_ID } } }
        - { name: AWS_SECRET_ACCESS_KEY, valueFrom: { secretKeyRef: { name: do-spaces, key: AWS_SECRET_ACCESS_KEY } } }
      volumeMounts:
        - { name: sigproxy-shared, mountPath: /shared, readOnly: true }
    - name: validator
      image: gcr.io/abacus-labs-dev/hyperlane-agent:main
      # ... usual validator args ...
      volumeMounts:
        - name: sigproxy-shared
          mountPath: /etc/ssl/certs/ca-certificates.crt
          subPath: ca-bundle.crt
          readOnly: true
  volumes:
    - name: sigproxy-shared
      emptyDir: { medium: Memory, sizeLimit: 1Mi }
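The host rewrite the sidecar performs can be sketched as a small pure function. The sidecar described here re-signs with the Go SDK; this Rust version only illustrates the hostname mapping, and `rewrite_host` and its parameters are hypothetical names, not part of any existing tool. It assumes the agent only uses the four hostname shapes listed under hostAliases.

```rust
/// Maps an AWS S3 hostname to the equivalent host under `upstream_host`
/// (for example "nyc3.digitaloceanspaces.com").
/// Returns None when the host is not one of the recognised AWS S3 names,
/// so the caller can reject the request instead of forwarding it blindly.
pub fn rewrite_host(host: &str, aws_region: &str, upstream_host: &str) -> Option<String> {
    // Normalise: DNS names are case-insensitive and may carry a trailing dot.
    let host = host.trim_end_matches('.').to_ascii_lowercase();

    // Regional and global forms of the bare S3 service host.
    let service_hosts = [
        format!("s3.{aws_region}.amazonaws.com"),
        "s3.amazonaws.com".to_string(),
    ];

    for svc in &service_hosts {
        if host == *svc {
            // Bare service host: the bucket, if any, is in the URL path.
            return Some(upstream_host.to_string());
        }
        // Virtual-host form: "<bucket>.<service host>".
        let suffix = format!(".{svc}");
        if let Some(bucket) = host.strip_suffix(suffix.as_str()) {
            if !bucket.is_empty() {
                return Some(format!("{bucket}.{upstream_host}"));
            }
        }
    }
    None
}
```

Returning `None` for unknown hosts keeps the proxy from forwarding traffic that was never meant for S3, which matters because it terminates TLS for whatever the hostAliases entries point at it.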
Environment
- Image: hyperlane-agent:main (commit c558a9f)
- aws-config 1.1.7, aws-sdk-s3 1.65.0, aws-smithy-runtime 1.8.1
- Backend: DigitalOcean Spaces (nyc3)
Happy to send a PR — the dep bump alone is one line. Let me know which scope you'd prefer.