Status: Living
Last updated: 2026-06-14
- Related: llm-provider-seam.md, tool-registry.md, run-plan.md, built-in-tools.md, ../contracts/sse-event-schema.md, ../../decisions/0038-agentrunner-llm-call-boundary.md, ../../decisions/0039-same-provider-reasoning-replay.md, ../../decisions/0036-run-loop-substrate-event-bus-and-execution-host.md, ../../decisions/0037-engine-tool-execution-boundary.md, ../../standards/error-handling.md
The AgentRunner is the single dispatching NodeExecutor (ADR-0036) the run loop holds. It runs an agent vertex's LLM turn(s) end to end against the @relavium/llm seam, through the ToolRegistry, and returns one NodeOutcome. This page is the canonical home for its injection boundary and turn contract; the decision is ADR-0038.
| Layer | Where | Concern |
|---|---|---|
| Turn core | internal (`engine/agent-turn.ts`) | A correlation-key-agnostic driver: assemble → `chain.stream` → fold the stream into `agent:*` events → tool-call loop → settle. It takes messages + tools + the fallback plan + emit + signal + nodeId + the registry + limits, and emits envelope-less event bodies. No `NodeExecContext`, no `runId`/`sessionId`. AgentSession (1.V) reuses it unchanged (ADR-0024/0025/0026); its parameter shape is a frozen internal contract, not on the public surface. |
| Dispatching adapter | exported (`engine/agent-runner.ts`) | `createAgentNodeExecutor(deps)` → the NodeExecutor. Switches on `ctx.vertex.type`: an agent vertex runs the turn core; every non-agent type is a loud, typed failed stub (`internal`) until the 1.P handlers land. Owns the run-path concerns the core excludes (below). |
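The split between the two layers can be pictured with a minimal sketch. Everything here is a simplified assumption made for illustration: the real `NodeExecutor` is asynchronous and receives a full `NodeExecContext`, and `createDispatchingExecutor`, `VertexCtx`, and the `runTurn` callback are hypothetical names, not the engine's API.

```typescript
// Hypothetical, simplified shapes (assumptions): the real executor is async and
// returns the engine's NodeOutcome; this sketch is synchronous for brevity.
type NodeOutcome =
  | { status: "completed"; output: unknown }
  | { status: "failed"; code: "internal"; message: string };

interface VertexCtx {
  vertex: { id: string; type: string };
}

type NodeExecutor = (ctx: VertexCtx) => NodeOutcome;

// `runTurn` stands in for the internal turn core.
function createDispatchingExecutor(runTurn: (nodeId: string) => unknown): NodeExecutor {
  return (ctx) => {
    if (ctx.vertex.type === "agent") {
      return { status: "completed", output: runTurn(ctx.vertex.id) };
    }
    // Every non-agent type is a loud, typed failure rather than a silent no-op.
    return {
      status: "failed",
      code: "internal",
      message: `no handler for vertex type '${ctx.vertex.type}'`,
    };
  };
}

const dispatch = createDispatchingExecutor((nodeId) => `turn for ${nodeId}`);
const agentOutcome = dispatch({ vertex: { id: "n1", type: "agent" } });
const stubOutcome = dispatch({ vertex: { id: "n2", type: "condition" } });
```

The vertex type `"condition"` is only an example of a non-agent type; the point is that an unhandled type yields a typed `internal` failure instead of throwing.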
The host injects only platform capabilities; the credential is threaded opaquely and is never stored, inspected, logged, persisted, or sent to the frontend by @relavium/core (ADR-0038, rule 6).
- `resolveProvider(providerId): LlmProvider | undefined`: the one genuinely new capability. The authored `Agent.provider` / `fallback_chain[].provider` are provider-id strings, but a `FallbackPlanEntry.provider` is a concrete adapter instance, which the engine cannot construct (vendor SDK + `@types/node` break engine purity). `undefined` ⇒ a host-wiring gap → a `NodeFailure{ code: 'internal' }`.
- `resolveMediaSurface?(model): MediaSurface | undefined`: the catalog projection of `model_catalog.media_surface` (1.AG Section C, ADR-0045 §1) that selects the inline-vs-generative dispatch. `'generative'` routes the node to the separate-endpoint `generateMedia`; `'chat'` (the default, and the value when this dep is absent or returns `undefined`) uses the normal turn. The engine is platform-pure (no DB), so the host injects the lookup; the production catalog wiring is deferred to 1.AH (until then every model is `'chat'`, so no generative model is runtime-reachable).
- `registry` + `tools`: the shared `ToolRegistry` (for dispatch) and its `ToolDef`s (the source of the LLM-visible schema + descriptions for the granted tools).
- `keyFor` / `sleep` / `now?` / `onAuthError?`: forwarded into the per-node `FallbackChain` (the existing `FallbackChainOptions` seam, not re-declared as a parallel credential surface). `onAuthError` (the single out-of-band credential refresh) is host-owned.
- `resolverCapabilities?` (the `read_file` filter for a prompt), `fsScope?` (default `'sandboxed'`), `limits?`, `preEgress?`.
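As a rough illustration of the two resolver capabilities, the sketch below shows how an absent provider could become an `internal` failure and how the media surface falls back to `'chat'`. The types are simplified stand-ins and `resolveForNode` is a hypothetical helper; only the two method signatures come from the list above.

```typescript
// Simplified stand-in types (assumptions), not the real @relavium/llm seam types.
type MediaSurface = "chat" | "generative";
interface LlmProvider {
  id: string;
}

interface AgentRunnerDeps {
  resolveProvider(providerId: string): LlmProvider | undefined;
  resolveMediaSurface?(model: string): MediaSurface | undefined;
}

type Resolution =
  | { ok: true; provider: LlmProvider; surface: MediaSurface }
  | { ok: false; code: "internal"; message: string };

// Hypothetical helper: resolves the host-injected capabilities for one node.
function resolveForNode(deps: AgentRunnerDeps, providerId: string, model: string): Resolution {
  const provider = deps.resolveProvider(providerId);
  if (provider === undefined) {
    // A host-wiring gap, not an authoring error.
    return { ok: false, code: "internal", message: `host did not wire provider '${providerId}'` };
  }
  // Absent dep or an undefined result both mean the normal chat turn.
  const surface = deps.resolveMediaSurface?.(model) ?? "chat";
  return { ok: true, provider, surface };
}

const hostDeps: AgentRunnerDeps = {
  resolveProvider: (id) => (id === "acme" ? { id } : undefined),
};
const wired = resolveForNode(hostDeps, "acme", "acme-large");
const unwired = resolveForNode(hostDeps, "missing", "acme-large");
```

The provider id `"acme"` and model `"acme-large"` are made-up values for the example.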
The runner owns the cost path itself — one CostTracker per node execution and its own onAttempt→cost:updated — never a host-supplied (shared) tracker, because the executor is shared across concurrent runs.
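A minimal sketch of why the tracker is created per node execution rather than injected: with a fresh tracker each time, attempt numbering for one node cannot leak into another execution that shares the same executor. `createCostTracker`, `executeNode`, and the record shapes are hypothetical; the real `CostTracker` API may differ.

```typescript
// Hypothetical shapes (assumptions); only the event name and the idea of one
// tracker per node execution come from the text.
interface AttemptRecord {
  model: string;
  costMicrocents: number;
  skipped: boolean;
}

interface CostUpdatedBody {
  type: "cost:updated";
  nodeId: string;
  model: string;
  costMicrocents: number;
  attemptNumber: number;
}

function createCostTracker(nodeId: string, emit: (body: CostUpdatedBody) => void) {
  let attemptNumber = 0; // private to this node execution
  return {
    onAttempt(record: AttemptRecord): void {
      if (record.skipped) return; // skipped attempts emit nothing
      attemptNumber += 1;
      emit({
        type: "cost:updated",
        nodeId,
        model: record.model,
        costMicrocents: record.costMicrocents,
        attemptNumber,
      });
    },
  };
}

// The executor is shared, so the tracker is built inside each execution.
function executeNode(
  nodeId: string,
  attempts: AttemptRecord[],
  emit: (body: CostUpdatedBody) => void,
): void {
  const tracker = createCostTracker(nodeId, emit);
  for (const record of attempts) tracker.onAttempt(record);
}

const emitted: CostUpdatedBody[] = [];
const sink = (body: CostUpdatedBody): void => {
  emitted.push(body);
};
executeNode(
  "node-a",
  [
    { model: "m1", costMicrocents: 120, skipped: false },
    { model: "m2", costMicrocents: 0, skipped: true },
    { model: "m2", costMicrocents: 80, skipped: false },
  ],
  sink,
);
executeNode("node-b", [{ model: "m1", costMicrocents: 50, skipped: false }], sink);
```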
- Resolve the agent. An absent `resolvedAgent` (run-plan.md §AgentPlanConfig) ⇒ `NodeFailure{ code: 'validation' }` naming the `agent_ref` (an authoring error, distinct from an unresolved provider id, which is `internal`). Never a raw throw.
- Build the fallback plan. Primary `{ provider, model: node.model ?? agent.model, maxAttempts: (node.retry ?? agent.retry)?.max ?? 1, backoff: (node.retry ?? agent.retry)?.backoff }` (node-retry overrides the agent default) + each `fallback_chain` entry. One `FallbackChain` per node execution, reused across the tool loop so per-provider cooldown and the ADR-0039 strip-latch survive.
  - Generative fork (1.AG Section C, ADR-0045 §1/§6). After the plan + the resolved prompt, the primary model's surface is read via `resolveMediaSurface?.(primary.model) ?? 'chat'`. A `'generative'` result dispatches to the separate-endpoint `generateMedia` (one provider, no chain failover, because a generative call is provider-bound) and returns early: an empty-prompt / multi-modality node fails `validation`; a pre-egress budget gate runs first (gate-only: the ADR-0028 pre-egress governor; the token estimate is pinned to zero for a token-free generative call, see the code comment); the SYNC `{ media }` becomes the `{ text: '', media: [part] }` node output (de-inlined to a `media://` handle like the inline path); exactly one realized `cost:updated` is emitted (request volume × per-model media rate, degrade-to-0 on a missing rate); a `jobId` (async LRO) is Section D. A `'chat'` model continues to the turn core below.
- Narrow the tool grant. `node.tools` must be a subset of `agent.tools`; a widening attempt ⇒ `validation` (ADR-0029).
- Assemble messages. `system` = authored text ONLY (`agent.system_prompt` + `node.system_prompt_append`), concatenated verbatim. 1.O does not interpolate the system role, so an untrusted `{{ run.outputs }}` / `read_file` reference can never resolve into `system` (it ships as literal authored text). The parser still collects a `{{ … }}` reference in `system_prompt_append` for the parse-time secret-taint scan (catching a stray `{{ secrets.* }}`), but does not promise dispatch-time resolution of system fields. Only the resolved `prompt_template`, which may draw on untrusted `run.outputs` / `read_file`, lands in a `user` position, never `system` (security-review.md §Prompt-injection, the structural placement guarantee; no value-level taint carrier is needed for an agent node, which cannot launder a secret into `run.outputs`). (A future parse-time gate that admits trusted `{{ inputs }}` / `{{ ctx }}` in system fields while rejecting untrusted sources is a recorded follow-up.)
- `output_schema` (node override wins over the agent default). Lowered to `LlmRequest.responseFormat` (a request-side hint), and validated node-side: the seam's `responseFormat` does not guarantee a schema-conformant response (DeepSeek degrades to bare `json_object`), so the runner parses the output and a non-JSON result ⇒ `validation` (ADR-0038, error-handling.md). Phase-1 scope: parse-as-JSON; deep JSON-Schema conformance is a recorded follow-up (needs a validator dependency/ADR).
- Run the turn core, map its result to `NodeOutcome.completed` (`output` = the parsed structured value or the assistant text; `tokensUsed = { input, output, model }`), or map a classified `AgentTurnError` to `NodeOutcome.failed`.
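Three of the steps above (plan building, tool-grant narrowing, and the parse-as-JSON check) can be sketched as small pure functions. This is an illustration under stated assumptions: the types are simplified stand-ins, providers are kept as id strings rather than adapter instances, fallback entries are given `maxAttempts: 1`, a node without `tools` is assumed to inherit the agent grant, and the `backoff` type is a guess. None of these helper names are the engine's.

```typescript
// Simplified stand-in types (assumptions); field names follow the steps above.
interface Retry {
  max?: number;
  backoff?: string;
}
interface AgentCfg {
  provider: string;
  model: string;
  retry?: Retry;
  tools: string[];
  fallback_chain?: { provider: string; model: string }[];
}
interface NodeCfg {
  model?: string;
  retry?: Retry;
  tools?: string[];
}
interface PlanEntry {
  provider: string;
  model: string;
  maxAttempts: number;
  backoff: string | undefined;
}

function buildFallbackPlan(node: NodeCfg, agent: AgentCfg): PlanEntry[] {
  const retry = node.retry ?? agent.retry; // node-retry overrides the agent default
  const primary: PlanEntry = {
    provider: agent.provider,
    model: node.model ?? agent.model,
    maxAttempts: retry?.max ?? 1,
    backoff: retry?.backoff,
  };
  // Assumption: each fallback entry gets a single attempt and no backoff.
  const fallbacks = (agent.fallback_chain ?? []).map(
    (entry): PlanEntry => ({
      provider: entry.provider,
      model: entry.model,
      maxAttempts: 1,
      backoff: undefined,
    }),
  );
  return [primary, ...fallbacks];
}

type GrantResult =
  | { ok: true; tools: string[] }
  | { ok: false; code: "validation"; message: string };

function narrowToolGrant(node: NodeCfg, agent: AgentCfg): GrantResult {
  const requested = node.tools ?? agent.tools; // assumption: absent means inherit
  const widened = requested.filter((tool) => !agent.tools.includes(tool));
  if (widened.length > 0) {
    return { ok: false, code: "validation", message: `node widens the grant: ${widened.join(", ")}` };
  }
  return { ok: true, tools: requested };
}

type ParseResult = { ok: true; value: unknown } | { ok: false; code: "validation" };

// Phase-1 style check: parse as JSON only, no schema conformance.
function parseStructuredOutput(text: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch {
    return { ok: false, code: "validation" };
  }
}

const agentCfg: AgentCfg = {
  provider: "acme",
  model: "acme-large",
  retry: { max: 3 },
  tools: ["read_file", "http_get"],
  fallback_chain: [{ provider: "other", model: "other-small" }],
};
const plan = buildFallbackPlan({ model: "acme-small" }, agentCfg);
const widening = narrowToolGrant({ tools: ["shell"] }, agentCfg);
const notJson = parseStructuredOutput("Sure! Here is the JSON you asked for");
```

Returning a tagged result instead of throwing mirrors the "never a raw throw" rule in the first step; the provider, model, and tool names other than `read_file` are invented for the example.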
The core streams one turn (emitting agent:token per text delta, accumulating text / tool-call / reasoning parts), and on a tool_use stop appends the assistant turn (including its reasoning ContentPart for the same-provider replay, ADR-0039), dispatches each tool call, appends the results, and continues — bounded by a runner-default max-tool-turns cap (a DoS guard; the authored hard cap + the loud turn_limit surfacing is the 1.V session knob).
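The bounded loop described above can be approximated as follows. This is a synchronous simplification for readability: the real core is asynchronous, streams deltas, carries reasoning parts, and dispatches through the `ToolRegistry`. `runToolLoop`, `streamTurn`, `dispatchTool`, and the message shapes are hypothetical names chosen for the sketch.

```typescript
// Simplified, synchronous stand-ins (assumptions), not the engine's types.
type Message =
  | { role: "user" | "assistant"; text: string }
  | { role: "tool"; toolId: string; text: string };

interface ToolCall {
  toolId: string;
  input: unknown;
}

interface TurnResult {
  text: string;
  stop: "end_turn" | "tool_use";
  toolCalls: ToolCall[];
}

type LoopOutcome =
  | { status: "completed"; text: string }
  | { status: "failed"; code: "turn_limit" };

function runToolLoop(
  messages: Message[],
  streamTurn: (history: Message[]) => TurnResult,
  dispatchTool: (call: ToolCall) => string,
  maxToolTurns: number,
): LoopOutcome {
  const history = [...messages];
  for (let turn = 0; turn <= maxToolTurns; turn++) {
    const result = streamTurn(history);
    if (result.stop !== "tool_use") return { status: "completed", text: result.text };
    if (turn === maxToolTurns) break; // cap reached: no further tool round
    // Append the assistant turn, then each tool result, and go around again.
    history.push({ role: "assistant", text: result.text });
    for (const call of result.toolCalls) {
      history.push({ role: "tool", toolId: call.toolId, text: dispatchTool(call) });
    }
  }
  return { status: "failed", code: "turn_limit" };
}

// A scripted model: asks for one tool, then finishes once it sees a tool result.
const scripted = (history: Message[]): TurnResult =>
  history.some((m) => m.role === "tool")
    ? { text: "done", stop: "end_turn", toolCalls: [] }
    : { text: "", stop: "tool_use", toolCalls: [{ toolId: "read_file", input: { path: "a.txt" } }] };

let dispatchCount = 0;
const finished = runToolLoop(
  [{ role: "user", text: "summarize a.txt" }],
  scripted,
  () => {
    dispatchCount += 1;
    return "file contents";
  },
  4,
);

// A model that never stops calling tools hits the cap.
const runaway = runToolLoop(
  [{ role: "user", text: "loop forever" }],
  () => ({ text: "", stop: "tool_use", toolCalls: [{ toolId: "read_file", input: {} }] }),
  () => "ignored",
  2,
);
```

Here `maxToolTurns` is interpreted as the number of tool rounds allowed before the loop gives up, which is one plausible reading of the cap rather than a confirmed definition.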
The error mapping to the closed ErrorCode (error-handling.md) — cancel wins over all others:
| Source | ErrorCode | Retryable | Note |
|---|---|---|---|
| abort (`ctx.signal`) / `ToolCancelledError` / chain cancelled | `cancelled` | false | precedence over every other classification |
| `ToolPolicyError` | `tool_denied` | false | not fed back as a correctable result (re-asking a denied tool burns budget) |
| `UnknownToolError` / `ToolArgsInvalidError` | (model-correctable) | n/a | converted to an `isError` tool result fed back, within a bounded correction budget; after it ⇒ `tool_failed` |
| `ToolExecutionError` | `tool_failed` | true | |
| absent host capability | `internal` | false | |
| chain-exhausted `LlmError` | `provider_auth` / `provider_rate_limit` / `provider_unavailable` / `content_filter` (`content_filter`, 1.AG/ADR-0045 §6) / `validation` (`bad_request`) / `internal` (`unknown`) | per `LlmError.retryable` | classified from `error.kind`, never `error.message` |
| max-tool-turns hit | `turn_limit` | false | |
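The table can be read as a single classification function in which the abort check comes first. The sketch below is illustrative only: `TurnFailure` is a hypothetical tagged union standing in for the real error classes, and the `LlmError` kind strings other than `bad_request` and `unknown` are assumed names, not confirmed values from the seam.

```typescript
// Simplified stand-ins (assumptions) for the engine's error classes.
type ErrorCode =
  | "cancelled"
  | "tool_denied"
  | "tool_failed"
  | "internal"
  | "provider_auth"
  | "provider_rate_limit"
  | "provider_unavailable"
  | "content_filter"
  | "validation"
  | "turn_limit";

type TurnFailure =
  | { source: "tool_policy" }
  | { source: "tool_execution" }
  | { source: "missing_capability" }
  | { source: "max_tool_turns" }
  | { source: "llm"; kind: string; retryable: boolean };

// Kind-to-code table; classification reads error.kind, never error.message.
// The kind names "auth", "rate_limit", "unavailable" are hypothetical.
const LLM_KIND_TO_CODE = new Map<string, ErrorCode>([
  ["auth", "provider_auth"],
  ["rate_limit", "provider_rate_limit"],
  ["unavailable", "provider_unavailable"],
  ["content_filter", "content_filter"],
  ["bad_request", "validation"],
  ["unknown", "internal"],
]);

function classify(
  failure: TurnFailure,
  signal: { aborted: boolean },
): { code: ErrorCode; retryable: boolean } {
  // Cancel wins over every other classification.
  if (signal.aborted) return { code: "cancelled", retryable: false };
  switch (failure.source) {
    case "tool_policy":
      return { code: "tool_denied", retryable: false };
    case "tool_execution":
      return { code: "tool_failed", retryable: true };
    case "missing_capability":
      return { code: "internal", retryable: false };
    case "max_tool_turns":
      return { code: "turn_limit", retryable: false };
    case "llm":
      return {
        code: LLM_KIND_TO_CODE.get(failure.kind) ?? "internal",
        retryable: failure.retryable,
      };
  }
}

const cancelledWins = classify({ source: "tool_execution" }, { aborted: true });
const rateLimited = classify({ source: "llm", kind: "rate_limit", retryable: true }, { aborted: false });
const unmappedKind = classify({ source: "llm", kind: "something_else", retryable: false }, { aborted: false });
```

The model-correctable row (`UnknownToolError` / `ToolArgsInvalidError`) is deliberately absent from the sketch, since those are fed back to the model as tool results rather than classified directly.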
The runner emits, per sse-event-schema.md (envelope-less; the bus stamps runId/timestamp/sequenceNumber):
- `agent:token` `{ nodeId, token, model }`: `model` is the active attempt's model (see the model-attribution note in `agent-turn.ts`; the accurate per-attempt model is always on `cost:updated`).
- `agent:tool_call` `{ nodeId, model, toolId, toolInput, attemptNumber? }` and `agent:tool_result` `{ nodeId, toolId, success, outputSummary, attemptNumber? }`: assembled from the registry's partial `events.call` / `events.result` (the runner adds `type` + `nodeId` + `model`); the registry does not carry them.
- `cost:updated` `{ nodeId, model, inputTokens, outputTokens, costMicrocents, cumulativeCostMicrocents, attemptNumber? }`: one per non-skipped attempt; `attemptNumber` counts non-skipped records. `cumulativeCostMicrocents` is a placeholder the engine overwrites authoritatively (it owns the run-wide total).
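To make the envelope-less idea concrete, here is a small sketch in which the runner produces only event bodies and a stand-in bus stamps the envelope. The body fields follow the list above; `createBus`, `toToolCallEvent`, the numeric `timestamp`, the `string` type of `outputSummary`, and sequence numbers starting at 1 are all assumptions for illustration.

```typescript
// Event bodies as listed above; the value types are assumptions.
type AgentEventBody =
  | { type: "agent:token"; nodeId: string; token: string; model: string }
  | {
      type: "agent:tool_call";
      nodeId: string;
      model: string;
      toolId: string;
      toolInput: unknown;
      attemptNumber?: number;
    }
  | {
      type: "agent:tool_result";
      nodeId: string;
      toolId: string;
      success: boolean;
      outputSummary: string;
      attemptNumber?: number;
    };

interface Envelope {
  runId: string;
  timestamp: number;
  sequenceNumber: number;
}

// Hypothetical bus: the only place that knows runId and ordering.
function createBus(runId: string, now: () => number) {
  const log: Array<AgentEventBody & Envelope> = [];
  let sequenceNumber = 0;
  return {
    log,
    emit(body: AgentEventBody): void {
      sequenceNumber += 1;
      log.push({ ...body, runId, timestamp: now(), sequenceNumber });
    },
  };
}

// The registry's partial call event lacks type/nodeId/model; the runner adds them.
function toToolCallEvent(
  partial: { toolId: string; toolInput: unknown },
  nodeId: string,
  model: string,
): AgentEventBody {
  return { type: "agent:tool_call", nodeId, model, ...partial };
}

const bus = createBus("run-1", () => 1000);
bus.emit(toToolCallEvent({ toolId: "read_file", toolInput: { path: "a.txt" } }, "n1", "acme-large"));
bus.emit({ type: "agent:token", nodeId: "n1", token: "Hel", model: "acme-large" });
```

Keeping `runId` out of the body is what lets the same turn core serve both the run path and a session path with a different correlation key.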
The runner emits no budget:* / run_timeout (run-level, not in the in-node set). A pure always-pass pre-egress hook runs before each tool-loop turn's seam call — the coarse ADR-0028 insertion point 1.AC fills; the precise per-attempt budget gate (a FallbackChain makes several egresses per turn) is a chain pre-attempt hook 1.AC adds.