feat(substrate): bump agent substrate to 0.0.8 #2140
Force-pushed from ffc7635 to cb7e67d.
Pull request overview
This PR updates kagent to work with substrate v0.0.8, adapting all actor lifecycle call sites to the new atespace-scoped identity model, updating atenet-router host header formatting, and improving local/helm dev ergonomics around substrate installation.
Changes:
- Bump substrate dependency to `v0.0.8` and refactor actor operations to use `(atespace, actorID)` / `ActorRef`.
- Update atenet-router Host/DNS shape to include atespace as a DNS label, propagated through gateway + A2A transports.
- Add helm/build/dev improvements (templated substrate chart repo, kind registry env overrides, optional insecure curl for digest resolution).
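As a rough illustration of the atespace-aware Host shape, here is a minimal sketch. `actorHost` and `checkLabel` are hypothetical names, not the functions changed in this PR, and the validation rules are assumptions; only the `<name>.<atespace>.actors.resources.substrate.ate.dev` layout comes from the PR description.

```go
package main

import (
	"fmt"
	"strings"
)

// actorHostSuffix is the DNS suffix quoted in the PR description.
const actorHostSuffix = "actors.resources.substrate.ate.dev"

// actorHost is a hypothetical helper (not kagent's real ActorHost). It places
// the actor name and the atespace as two DNS labels in front of the suffix.
func actorHost(atespace, name string) (string, error) {
	if err := checkLabel("atespace", atespace); err != nil {
		return "", err
	}
	if err := checkLabel("actor name", name); err != nil {
		return "", err
	}
	return name + "." + atespace + "." + actorHostSuffix, nil
}

// checkLabel rejects values that cannot form a single DNS label:
// empty, longer than 63 bytes, or containing a dot. (Assumed rules.)
func checkLabel(field, v string) error {
	switch {
	case v == "":
		return fmt.Errorf("%s is required", field)
	case len(v) > 63:
		return fmt.Errorf("%s %q exceeds 63 characters", field, v)
	case strings.Contains(v, "."):
		return fmt.Errorf("%s %q must be a single DNS label", field, v)
	}
	return nil
}
```

Under these assumptions, `actorHost("kagent", "harness-demo")` yields `harness-demo.kagent.actors.resources.substrate.ate.dev`, and an empty atespace is rejected before any request is routed.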
Reviewed changes
Copilot reviewed 29 out of 30 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| scripts/kind/setup-kind.sh | Add env overrides for local registry name/port/scheme and adjust containerd hosts config + registry documentation. |
| scripts/controller-digest-ldflags.sh | Add optional DIGEST_CURL_INSECURE to skip TLS verification when resolving image digests. |
| Makefile | Add SUBSTRATE_REPO override and pass it through chart stamping. |
| helm/kagent/Chart-template.yaml | Template substrate chart repository. |
| helm/kagent-crds/Chart-template.yaml | Template substrate-crds chart repository. |
| go/go.sum | Update substrate to v0.0.8 and refresh indirect deps. |
| go/go.mod | Replace substrate to v0.0.8 and bump indirect deps. |
| go/core/pkg/sandboxbackend/substrate/openclaw.go | Pass atespace through actor calls and include it in backend Handle; update ActorHost format. |
| go/core/pkg/sandboxbackend/substrate/openclaw_test.go | Update ActorHost test for new DNS shape. |
| go/core/pkg/sandboxbackend/substrate/lifecycle_delete.go | Delete golden actors using the reserved ate-golden atespace. |
| go/core/pkg/sandboxbackend/substrate/lifecycle_delete_test.go | Update test client to the new ActorRef API and add atespace RPC stubs. |
| go/core/pkg/sandboxbackend/substrate/lifecycle_actortemplate.go | Explicitly set SnapshotsConfig defaults to avoid reconcile drift loops. |
| go/core/pkg/sandboxbackend/substrate/gateway.go | Require atespace and include it in router Host construction. |
| go/core/pkg/sandboxbackend/substrate/gateway_test.go | Update gateway target tests for new atespace-aware host format and validations. |
| go/core/pkg/sandboxbackend/substrate/delete_actor.go | Thread atespace through delete/suspend/resume paths. |
| go/core/pkg/sandboxbackend/substrate/delete_actor_test.go | Update deleteActor test signature for atespace parameter. |
| go/core/pkg/sandboxbackend/substrate/client.go | Switch to ActorRef in RPCs; add EnsureAtespace helper; update method signatures. |
| go/core/pkg/sandboxbackend/substrate/agentharness_actor.go | Update AgentHarness actor ops for atespace-scoped identity and handles. |
| go/core/pkg/sandboxbackend/substrate/agent_lifecycle.go | Set SnapshotsConfig defaults explicitly for SandboxAgent ActorTemplates. |
| go/core/pkg/sandboxbackend/substrate/agent_actor.go | Update SandboxAgent actor ops for atespace-scoped identity and handles. |
| go/core/pkg/sandboxbackend/substrate/actor_reachability.go | Thread atespace through reachability checks and router targeting. |
| go/core/pkg/sandboxbackend/substrate/actor_reachability_test.go | Update tests for new Host/DNS shape and new atespace parameter. |
| go/core/pkg/sandboxbackend/async.go | Extend backend Handle to carry Atespace for substrate backends. |
| go/core/internal/httpserver/handlers/substrate.go | Add Atespace to /api/substrate/status actor entries. |
| go/core/internal/httpserver/handlers/agentharness_gateway.go | Resolve gateway target using atespace and log it. |
| go/core/internal/httpserver/handlers/agentharness_gateway_test.go | Update ACP proxy tests for atespace-in-host format. |
| go/core/internal/a2a/substrate_transport.go | Update substrate round-tripper creation to include atespace. |
| go/core/internal/a2a/substrate_sandbox_transport.go | Pass atespace into substrate round-tripper when proxying A2A via atenet-router. |
| go/api/httpapi/substrate.go | Add atespace field to SubstrateActorEntry JSON model. |
| examples/substrate-openclaw/README.md | Document atelet image pull args for kind/local registries and subchart prefixing. |
Comments suppressed due to low confidence (1)
scripts/kind/setup-kind.sh:23
- `REG_SCHEME` can be set to `https`, but when the registry container is created by this script it runs `registry:2` with no TLS. If a user sets `REG_SCHEME=https` and the container doesn't already exist, containerd will be configured to use HTTPS against an HTTP registry and pulls will fail. Consider rejecting non-http schemes when bootstrapping a new registry container (or force `reg_scheme=http` in that branch).
# Override REG_NAME / REG_PORT / REG_SCHEME to reuse an existing local registry
# (e.g. an HTTPS registry on another port) instead of creating a fresh kind-registry.
reg_name="${REG_NAME:-kind-registry}"
reg_port="${REG_PORT:-5001}"
reg_scheme="${REG_SCHEME:-http}"
if [ "$("${CONTAINER_RUNTIME}" inspect -f '{{.State.Running}}' "${reg_name}" 2>/dev/null || true)" != 'true' ]; then
"${CONTAINER_RUNTIME}" run \
-d --restart=always -p "127.0.0.1:${reg_port}:5000" --network bridge --name "${reg_name}" \
registry:2
fi
Force-pushed from cb7e67d to 692abdf.
Adapts kagent for substrate v0.0.8's atespace-scoped ActorRef identity
model (rename of ActorId→ActorRef{Atespace,Name} on all actor RPCs). Maps
atespace 1:1 to the SandboxAgent/AgentHarness Kubernetes namespace, adds
an EnsureAtespace idempotent helper, and updates the atenet-router Host
header shape to include the atespace label.
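The idempotent "ensure" pattern mentioned above can be sketched roughly as follows. Everything here is illustrative: `atespaceCreator`, `errAlreadyExists`, `memoryStore`, and `ensureTwice` are hypothetical stand-ins, and the real helper presumably inspects whatever status the substrate API returns instead of a sentinel error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errAlreadyExists stands in for the "already exists" status of the real API.
var errAlreadyExists = errors.New("atespace already exists")

// atespaceCreator is a hypothetical one-method view of the substrate client.
type atespaceCreator interface {
	CreateAtespace(ctx context.Context, name string) error
}

// ensureAtespace creates the atespace and treats "already exists" as success,
// so it is safe to call before every actor creation.
func ensureAtespace(ctx context.Context, c atespaceCreator, name string) error {
	if name == "" {
		return errors.New("atespace is required")
	}
	err := c.CreateAtespace(ctx, name)
	if err == nil || errors.Is(err, errAlreadyExists) {
		return nil
	}
	return fmt.Errorf("ensure atespace %q: %w", name, err)
}

// memoryStore is an in-memory fake used only to exercise the sketch.
type memoryStore struct{ seen map[string]bool }

func (m *memoryStore) CreateAtespace(_ context.Context, name string) error {
	if m.seen[name] {
		return errAlreadyExists
	}
	if m.seen == nil {
		m.seen = map[string]bool{}
	}
	m.seen[name] = true
	return nil
}

// ensureTwice shows the idempotency: the second call hits "already exists"
// inside the fake and still returns nil.
func ensureTwice(name string) error {
	store := &memoryStore{}
	ctx := context.Background()
	if err := ensureAtespace(ctx, store, name); err != nil {
		return err
	}
	return ensureAtespace(ctx, store, name)
}
```

Because atespace maps 1:1 to the Kubernetes namespace, a caller would pass the resource's namespace as `name`; any error other than "already exists" is wrapped and returned so the reconcile can retry.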
Also fixes a pre-existing kagent bug that PR kagent-dev#2109's ActorTemplate spec
immutability change surfaced: SnapshotsConfig.{OnPause,OnCommit} were
left zero-value in kagent's desired spec but the API server defaults
them to "Full" on admission, causing apiequality.Semantic.DeepEqual to
report drift every reconcile and hot-loop delete/recreate the
ActorTemplate CR.
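The drift described above can be reproduced in miniature. This is a sketch under stated assumptions: the types are simplified stand-ins for the ActorTemplate spec, `admit` imitates server-side defaulting, and `reflect.DeepEqual` substitutes for `apiequality.Semantic.DeepEqual`.

```go
package main

import "reflect"

// snapshotScope and snapshotScopeFull are simplified stand-ins for the real types.
type snapshotScope string

const snapshotScopeFull snapshotScope = "Full"

// snapshotsConfig mirrors only the two fields named in the commit message.
type snapshotsConfig struct {
	OnPause  snapshotScope
	OnCommit snapshotScope
}

// templateSpec is a hypothetical, trimmed-down ActorTemplate spec.
type templateSpec struct {
	Image     string
	Snapshots snapshotsConfig
}

// admit imitates API-server defaulting: empty scopes become "Full".
func admit(s templateSpec) templateSpec {
	if s.Snapshots.OnPause == "" {
		s.Snapshots.OnPause = snapshotScopeFull
	}
	if s.Snapshots.OnCommit == "" {
		s.Snapshots.OnCommit = snapshotScopeFull
	}
	return s
}

// drifted reports whether the stored spec differs from the desired one.
func drifted(existing, desired templateSpec) bool {
	return !reflect.DeepEqual(existing, desired)
}

// desiredSpec builds the controller's desired spec; explicitDefaults toggles
// between the old behaviour (zero values) and the fix (explicit "Full").
func desiredSpec(image string, explicitDefaults bool) templateSpec {
	s := templateSpec{Image: image}
	if explicitDefaults {
		s.Snapshots = snapshotsConfig{OnPause: snapshotScopeFull, OnCommit: snapshotScopeFull}
	}
	return s
}
```

With `explicitDefaults` set to false, `drifted(admit(d), d)` is true on every pass, which is the condition that triggered the delete/recreate loop once the spec became immutable; with it set to true, the admitted and desired specs compare equal.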
Verified end-to-end on colima+kind with substrate v0.0.8 published
charts: SandboxAgent (declarative Go) and AgentHarness (openclaw) both
reach Ready=True and chat round-trip works.
Signed-off-by: Jonathan Jamroga <jjamroga@gmail.com>
Force-pushed from 692abdf to d40449e.
Addressed both Copilot review comments in d40449e6.
Summary
Bumps kagent's substrate dependency to substrate `v0.0.8` and updates every actor call site for the new atespace-scoped identity model.
Code changes
- `ActorRef` refactor: substrate v0.0.8 replaces the flat `ActorId string` field on `Get/Create/Resume/Suspend/DeleteActorRequest` (and the new `PauseActorRequest`) with `ActorRef{Atespace, Name}`. Every method on `substrate.Client` grew an `atespace` parameter, and every caller was updated to pass it through. Introduces an `EnsureAtespace` helper that idempotently creates an atespace before the first actor is created in it (substrate returns `FailedPrecondition: Atespace X not found` otherwise).
- Golden actors are deleted using the reserved `ate-golden` atespace.
- `sandboxbackend.Handle` gained an `Atespace` field so cross-reconcile lookups (`GetStatus`, `DeleteAgentHarness`) can address the actor.
- The atenet-router Host shape is now `<name>.<atespace>.actors.resources.substrate.ate.dev`. `ActorHost` and `GatewayRouterTarget` signatures are updated accordingly, and the change is propagated through the ACP proxy handler and the A2A round-tripper.
- `SnapshotsConfig` defaults: bug fix surfaced by PR #2109 (feat(substrate): bump agent substrate to 0.0.7) making `ActorTemplate.spec` immutable. Kagent's `desired` spec left `OnPause`/`OnCommit` as zero-value empty strings while the API server defaulted them to `Full` on admission, so `apiequality.Semantic.DeepEqual(existing, desired)` reported drift on every reconcile, causing an infinite delete/recreate loop on the `ActorTemplate` CR. Kagent now sets both fields to `SnapshotScopeFull` explicitly in `agent_lifecycle.go` (SandboxAgent) and `lifecycle_actortemplate.go` (AgentHarness).
- `SubstrateActorEntry.Atespace` added to the `/api/substrate/status` response so the UI can surface the new field.
Helm / build ergonomics
- `SUBSTRATE_REPO` override: `Makefile` + `helm/kagent/Chart-template.yaml` + `helm/kagent-crds/Chart-template.yaml` now template the substrate subchart repository, so local dev can point at a self-published chart without hand-editing.
- `scripts/kind/setup-kind.sh`: env-driven `REG_NAME` / `REG_PORT` / `REG_SCHEME` / `REG_INTERNAL_PORT` so an already-running local registry (HTTPS, non-standard port, etc.) can be reused instead of unconditionally creating `kind-registry`.
- `scripts/controller-digest-ldflags.sh`: `DIGEST_CURL_INSECURE=true` env var to skip TLS verification when the digest resolver hits a self-signed local registry.
Verified locally
Full E2E on colima+kind+substrate v0.0.8 published charts:
- `SandboxAgent` (declarative, `runtime: go`) reaches `Ready=True`, chat round-trip through the config-hashed session-actor path works.
- `AgentHarness` (`backend: openclaw`) reaches `Ready=True`, chat via the shared-actor ACP path works.
Local E2E validation (substrate v0.0.8)
Verified against substrate v0.0.8's published OCI chart — no local substrate build required.
Prerequisites
- Colima with `--vm-type vz` on Apple Silicon. Docker Desktop's linuxkit kernel breaks gVisor checkpoint/restore on arm64.
- Docker Desktop not running. It takes `127.0.0.1:5001` with its own registry the moment it launches, and every `docker push localhost:5001/...` from your Mac hits Docker Desktop's stale registry instead of colima's kind-registry.
- `OPENAI_API_KEY` exported in your shell.
1. Kind cluster + local registry
Creates a `kagent` kind cluster, a `kind-registry` container (host port 5001 → container port 5000), and installs a containerd `hosts.toml` on the node aliasing `localhost:5001` → `http://kind-registry:5000` so kubelet can pull our locally-built images.
2. Buildx builder on the `kind` network
Buildx's default `--driver-opt network=host` can't reach `kind-registry` by name. Recreate the builder on the `kind` docker network with a buildkitd config that marks the local registries HTTP so pushes don't try HTTPS.
3. Build + push kagent images
The build will fail at the `build-controller` step because `controller-digest-ldflags.sh` runs on the host, where `kind-registry` doesn't resolve. Everything up to and including the sandbox images is pushed successfully. Re-run just the controller step against the host-reachable `localhost:5001` alias (it is the same registry, so the content-addressed digests are identical).
4. Install kagent CRDs (with substrate CRDs enabled)
5. Install kagent + substrate v0.0.8
Pulls the substrate subchart from `oci://ghcr.io/kagent-dev/substrate/helm` (the default) at the version pinned in `go/go.mod`. No local substrate build.
Two overrides worth calling out:
- `substrate.atelet.extraArgs=[--localhost-registry-replacement=kind-registry:5000]`: Atelet uses `go-containerregistry` directly to pull the ActorTemplate container image (not containerd), so containerd's `hosts.toml` alias on the kind node doesn't apply to it. This flag tells atelet's puller: for any ref starting with `localhost:*`, rewrite the hostname to `kind-registry:5000` (which resolves inside pods via cluster DNS on the kind docker network) AND parse it with `name.Insecure` so the fetch is plain HTTP. Both effects come from that single flag.
- `controller.substrate.ateApiServer.{namespace,serviceAccount}`: Substrate is deployed as a subchart of kagent, so `substrate.fullname` prefixes every resource with the release name. The ate-api-server SA ends up as `kagent-ate-api-server` in the `kagent` namespace (not the default `ate-api-server` in `ate-system`). Kagent's `substrate-ate-api-rbac.yaml` needs the correct namespace + SA to bind its secret-read Role for `KAGENT_CONFIG_JSON` env resolution during `CallAteletRestore`.
6. Apply the default gVisor `SandboxConfig`
Substrate ships the CRD but not a default `SandboxConfig` resource. Without one, `ate-controller` fails resume with `no default SandboxConfig for class "gvisor"`.
7. Verify with a SandboxAgent
8. Verify with an AgentHarness
9. Chat via the UI
kubectl --context kind-kagent port-forward -n kagent svc/kagent-ui 3000:8080
# open http://localhost:3000 → pick substrate-demo or harness-demo → send a message