- Status: Accepted
- Date: 2026-05-29
- Deciders: @cemililik
ADR-0030 settles how a syscall is made — the register calling convention (x8 = number, x0–x5 = arguments, x0 = status, x1–x7 = payload), the SVC #0 trap, and the SyscallError space. This ADR settles which syscalls exist in the B-phase and the concrete per-call register layout each one instantiates.
The set must be small. Phase B § B5 names the floor and the ceiling in one sentence: "At minimum: send, recv, console_write (debug-gated), task_yield, task_exit. No more in v1." The reasoning is the same "smallest shape that works now" discipline ADR-0029 (raw-flat image) and ADR-0035 (bitmap PMM) applied: every syscall is a permanent piece of the userspace ABI surface and a panic-free dispatch path the kernel must keep correct forever. Adding a syscall is cheap to write and expensive to ever remove, so v1 ships exactly the calls B6's first "hello" userspace task needs to do useful work and exercise the boundary — and not one more.
What B6's first userspace task must be able to do: print to the serial console (console_write), exit cleanly so the kernel reclaims it (task_exit), and — to prove the IPC path works end-to-end across the privilege boundary, not just kernel-internally — send and receive on an endpoint (send / recv). task_yield rounds out cooperative multitasking from EL0. Capability-management syscalls (cap_copy / cap_derive / cap_revoke), notify, and address-space map/unmap are deliberately not exposed in v1 — no v1 userspace consumer needs them, and the kernel-internal surfaces remain reachable only from EL1.
The stakes: a syscall number, once a userspace binary depends on it, is part of a stable contract. A set that is too large leaves unused attack surface in the dispatcher; a set that is too small blocks B6. A wrong per-call layout means re-churning the tyrne-user wrapper. The decision is bounded because the numbers and layouts are pinned by host ABI tests when the dispatcher lands (T-021), so an error is caught mechanically by the test suite rather than discovered at runtime.
- B6 sufficiency. The set must be exactly enough for B6's "hello from userspace" + clean exit + an IPC round-trip, and no more. See phase-b §B6.
- Panic-free dispatch surface. Every syscall is a handler the dispatcher must keep panic-free on all untrusted input (ADR-0030, B0 hardening). Fewer syscalls = smaller audited surface.
- Register-budget fit. Each call's arguments must fit in `x0–x5` (six words) and its results in `x0–x7` per ADR-0030, without spilling to a stack-passed block that would need its own copy-from-user validation. The widest call drives the budget.
- Reuse of existing kernel surfaces. A syscall handler should be a thin validator + a call into an existing kernel primitive (`ipc_send` / `ipc_recv` / scheduler `yield_now` / the console HAL), not new subsystem logic. The syscall layer adds the EL0 boundary, not new capability semantics.
- Defence-in-depth on the number space. An uninitialised `x8` (zero) must not accidentally name a real syscall; a release build must not expose a debug-only console.
- No capability authority widening. The syscalls expose operations the caller's capabilities already authorise; the syscall layer is a gate, never a new grant. `console_write` is the one debug affordance and is gated accordingly.
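The register-budget driver can be checked with simple arithmetic. The sketch below is illustrative only: the constant names are hypothetical (not from the Tyrne source), and the word counts are taken from the per-call layouts this ADR fixes (a four-word `Message`, one endpoint handle, one transfer handle). It uses compile-time assertions so that a layout outgrowing the ADR-0030 budget would fail the build.

```rust
/// Registers available per ADR-0030: x0..x5 carry arguments, x0..x7 carry results.
pub const ARG_REGS: usize = 6;
pub const RET_REGS: usize = 8;

/// A `Message` is four words: `label` plus three `params`.
pub const MESSAGE_WORDS: usize = 1 + 3;

/// `send` arguments: endpoint handle + message + transfer handle.
pub const SEND_ARG_WORDS: usize = 1 + MESSAGE_WORDS + 1;

/// `recv` results: status + outcome + message + transferred handle.
pub const RECV_RET_WORDS: usize = 1 + 1 + MESSAGE_WORDS + 1;

/// `console_write` arguments: capability handle + buffer VA + length.
pub const CONSOLE_WRITE_ARG_WORDS: usize = 3;

// Compile-time checks: the build fails if a layout outgrows the register budget.
const _: () = assert!(SEND_ARG_WORDS <= ARG_REGS);
const _: () = assert!(RECV_RET_WORDS <= RET_REGS);
const _: () = assert!(CONSOLE_WRITE_ARG_WORDS <= ARG_REGS);
```

Under these counts `send` is the widest call on the argument side (all six argument registers) and `recv` is the widest on the result side (seven of eight result registers), which is why neither needs a stack-passed block.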
1. The phase-b floor set: `send`, `recv`, `console_write`, `task_yield`, `task_exit` (five). Exactly what B6 needs.
2. A larger "useful from day one" set adding `notify`, `cap_copy` / `cap_derive` / `cap_revoke`, and address-space `map` / `unmap`, so userspace can manage its own capabilities and memory without a later ABI bump.
3. An ultra-minimal set: `console_write` + `task_exit` (two). The absolute floor to make B6's greeting + exit work, deferring even IPC from EL0 to a later milestone.
Chosen option: Option 1 — the five-syscall phase-b floor set.
It is the smallest set that lets B6's first userspace task both do something observable (console_write, task_exit) and exercise the boundary that B5 exists to build (send / recv cross the EL0→EL1 line, proving capability-gated IPC works from userspace, not just kernel-internally). task_yield makes cooperative multitasking reachable from EL0 with a near-zero-cost handler. Option 3 is too small — it would ship a syscall boundary that never carries an IPC message, leaving the most security-relevant path (capability-gated send/recv from untrusted EL0) unexercised until a later milestone, which defeats the point of building the boundary now. Option 2's extra calls have no v1 consumer; each would be unused dispatch surface to keep panic-free, and IpcError/CapError are #[non_exhaustive], so the set can grow without breaking the ABI when a real consumer appears.
Numbers instantiate ADR-0030's convention: x8 = number, arguments in x0–x5, x0 = status (0 = Ok), payload in x1–x7. Number 0 is reserved-invalid (an uninitialised x8 must fault, not dispatch) and always returns SyscallError::BadSyscallNumber. The integers 1–5 below are a fixed decision, not tentative: as an Accepted ABI ADR, this table is the contract. T-021's host tests regression-verify these numbers and layouts; they do not get to choose them.
| `x8` | Name | Arguments (`x0`…) | Returns (`x0`=status, then payload) | Capability checked | Backing primitive |
|---|---|---|---|---|---|
| `0` | (reserved-invalid) | n/a | always `BadSyscallNumber` | n/a | n/a |
| `1` | `send` | `x0`=ep cap handle, `x1`=`msg.label`, `x2..x4`=`msg.params[0..3]`, `x5`=transfer cap handle (or the reserved null-handle sentinel = "no transfer") | `x1`=`SendOutcome` (0=`Delivered`, 1=`Enqueued`) | endpoint cap (SEND) | `ipc_send` |
| `2` | `recv` | `x0`=ep cap handle | `x1`=`RecvOutcome` (0=`Received`, 1=`Pending`), `x2`=`msg.label`, `x3..x5`=`msg.params[0..3]`, `x6`=transferred cap handle (or null sentinel if none) | endpoint cap (RECV) | `ipc_recv` |
| `3` | `task_yield` | none (args ignored) | (no payload) | self (current task) | scheduler `yield_now` |
| `4` | `task_exit` | `x0`=exit code | does not return to the caller | self (current task) | scheduler task-termination (B6) |
| `5` | `console_write` | `x0`=debug-console cap handle, `x1`=user VA of byte buffer, `x2`=length | `x1`=bytes written | debug-console cap (write) | console HAL `write_bytes`, via copy-from-user |
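As a rough illustration of how the number column could be decoded, the sketch below maps a raw `x8` value to a syscall. It is an assumption-laden sketch, not the T-021 dispatcher: the `Syscall` enum, the `decode` function, and the `debug_console` parameter are hypothetical names, and the parameter stands in for whichever release debug-gate mechanism T-021 picks. Only `SyscallError::BadSyscallNumber` and the numbers 0 to 5 come from this ADR.

```rust
/// Error space sketch; only the variant this example needs.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyscallError {
    BadSyscallNumber,
}

/// The five v1 syscalls with the numbers fixed by the table (hypothetical type name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u64)]
pub enum Syscall {
    Send = 1,
    Recv = 2,
    TaskYield = 3,
    TaskExit = 4,
    ConsoleWrite = 5,
}

/// Decode the raw `x8` value.
///
/// `debug_console` is an assumption that models the release debug gate:
/// when it is `false`, number 5 is treated as unknown.
pub fn decode(x8: u64, debug_console: bool) -> Result<Syscall, SyscallError> {
    match x8 {
        1 => Ok(Syscall::Send),
        2 => Ok(Syscall::Recv),
        3 => Ok(Syscall::TaskYield),
        4 => Ok(Syscall::TaskExit),
        5 if debug_console => Ok(Syscall::ConsoleWrite),
        // 0 (reserved-invalid), a gated 5, and every other value land here.
        _ => Err(SyscallError::BadSyscallNumber),
    }
}
```

The match is total over `u64`, so no input can panic the decoder; an uninitialised `x8` of zero falls through to the same error as any unknown number.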
Notes that bind the table:
- `send` / `recv` carry the `Message` in registers, not via a user-pointer buffer. A `Message` is four words (`label` + three `params`); register-passing fits the ADR-0030 budget (`send` uses `x0–x5` for args; `recv` returns in `x1–x6`) and avoids a copy-from/to-user round-trip on the common small-message path. When messages grow past the register budget (post-v1), a pointer-buffer variant lands without disturbing these numbers. The null-handle sentinel that means "no transfer" / "no cap received" is a reserved `CapHandle` value no live handle can take; its exact bit pattern is T-021's encoder detail (it must round-trip with `Option<CapHandle>`).
- Every syscall that names a separate object is capability-gated, per P1 / P4. `send` / `recv` check the endpoint capability (SEND / RECV); `console_write` checks a debug-console capability (its first argument, `x0`). `task_yield` / `task_exit` act on the caller's own task: the kernel identifies the caller from its trusted current-task pointer (set at dispatch, not a forgeable argument). This is the caller's inherent authority over its own execution thread, not ambient authority over another object, so these two take no object-capability argument; the trust-boundary check P4 demands is "is there a valid current task?" (always true on the syscall path) plus the kernel never letting the caller name a different task. No syscall reaches a privileged effect on another object without a capability.
- `console_write` is capability-gated and debug-gated: two independent gates. (1) Capability gate (authority): the caller must hold a debug-console capability (arg `x0`); the dispatcher validates it (resolves, kind = debug-console, carries the write right) before any output, returning a typed `SyscallError` otherwise. This is the P1 / P4-mandated authority check, present in all builds. The concrete `CapObject` kind for the debug console and its grant-at-load wiring are T-021 (object + check) and B6 (grant to the first userspace task). (2) Debug gate (defence-in-depth): in a non-debug build the dispatcher additionally treats number `5` as unknown and returns `BadSyscallNumber`, so the debug console is absent from the production syscall surface even for a holder of the capability. The exact debug-gate mechanism (`cfg!(debug_assertions)` arm vs. a Cargo feature) is T-021's implementation choice; the two-gate contract is fixed here.
- `console_write` is the only syscall that takes a user pointer: its handler validates `[ptr, ptr+len)` against the active address space via copy-from-user before touching a byte (B5 sub-item 5); it never dereferences the raw pointer.
- `task_exit` does not return. Control does not come back to the caller, so the ABI defines no return value for it. Its real semantics (mark the EL0 task terminated, drop its context, dispatch the next ready task) depend on the per-task EL0 context register file that does not exist until B6 (gated on the ADR-0033 high-half placeholder). T-021 implements the dispatch and a kernel-stub stand-in; the real EL0-task termination lands with B6's first userspace task.
- `task_yield` always succeeds in v1 (status `Ok`); it is a thin EL0-reachable wrapper over the scheduler's cooperative `yield_now`, acting on the caller's own task.
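The register-passing note can be made concrete with a small encode/decode sketch. Everything here is hypothetical scaffolding for illustration: the `CapHandle` newtype, the helper names, and above all the sentinel value `u64::MAX`, which is a placeholder because this ADR leaves the real bit pattern to T-021. What it shows is the property the note demands, namely that `Option<CapHandle>` round-trips through a single register.

```rust
/// Hypothetical handle newtype; the real `CapHandle` is defined by ADR-0014.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CapHandle(u64);

/// ASSUMPTION: placeholder sentinel. The ADR leaves the bit pattern to T-021.
pub const NULL_HANDLE: u64 = u64::MAX;

impl CapHandle {
    /// Refuses the sentinel, so no live handle can collide with "none".
    pub fn new(raw: u64) -> Option<CapHandle> {
        if raw == NULL_HANDLE { None } else { Some(CapHandle(raw)) }
    }
}

pub fn encode_cap(cap: Option<CapHandle>) -> u64 {
    match cap {
        Some(CapHandle(raw)) => raw,
        None => NULL_HANDLE,
    }
}

pub fn decode_cap(raw: u64) -> Option<CapHandle> {
    CapHandle::new(raw)
}

/// Four-word message: `label` plus three `params`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Message {
    pub label: u64,
    pub params: [u64; 3],
}

/// `send` arguments as they would sit in x0..x5.
pub fn encode_send_args(ep: CapHandle, msg: Message, transfer: Option<CapHandle>) -> [u64; 6] {
    [ep.0, msg.label, msg.params[0], msg.params[1], msg.params[2], encode_cap(transfer)]
}

/// `recv` results in x0..x7 for the `Received` outcome (x0 = 0 Ok, x1 = 0 Received).
pub fn encode_recv_received(msg: Message, cap: Option<CapHandle>) -> [u64; 8] {
    [0, 0, msg.label, msg.params[0], msg.params[1], msg.params[2], encode_cap(cap), 0]
}

/// Userspace-side unpack; returns `None` unless status is Ok and outcome is Received.
pub fn decode_recv(regs: [u64; 8]) -> Option<(Message, Option<CapHandle>)> {
    if regs[0] != 0 || regs[1] != 0 {
        return None;
    }
    let msg = Message { label: regs[2], params: [regs[3], regs[4], regs[5]] };
    Some((msg, decode_cap(regs[6])))
}
```

Because `CapHandle::new` rejects the sentinel, the "no transfer" encoding cannot be confused with a live handle, whatever value T-021 eventually reserves.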
Representative invocations walking the ADR-0030 convention ((state-pre, action, state-post, observable)):
| Step | State pre | Action | State post | Observable effect |
|---|---|---|---|---|
| 0 | caller: `x8=5` (`console_write`), `x0`=valid debug-console cap, `x1`=buf VA, `x2`=len; debug build; buf mapped in active AS | dispatch → cap check on `x0` passes → copy-from-user validates `[buf,buf+len)` → console `write_bytes` | bytes emitted on serial | `x0←0` (Ok), `x1←len`; no raw user-ptr deref |
| 1 | caller: `x8=5`, `x0`=stale / wrong-kind / no-write debug-console cap | dispatch → cap check on `x0` fails before any output | unchanged | `x0`←typed `SyscallError` (Cap/Ipc-family); console untouched (authority gate, all builds) |
| 2 | caller: `x8=5`, `x0`=valid cap, `x1`=buf VA, `x2`=len; buf not mapped in active AS | cap check passes; copy-from-user range check fails | unchanged | `x0←FaultAddress`; kernel never read the buffer |
| 3 | caller: `x8=2` (`recv`), `x0`=ep cap; a sender already delivered `{label, params, cap}` | `ipc_recv` → `Ok(Received{msg, cap})`; install cap into caller table | endpoint Idle; cap in caller table | `x0←0`, `x1←0` (Received), `x2←label`, `x3..x5←params`, `x6`←new cap handle |
| 4 | caller: `x8=4` (`task_exit`), `x0`=code | mark caller (the current task) terminated; dispatch next ready task | caller gone; scheduler runs another task | no return to caller; kernel reports termination (B6) |
The second, independent release debug-gate (number 5 → BadSyscallNumber in non-debug builds, even for a capability holder) is not a separate row — it short-circuits dispatch ahead of row 0's cap check and is covered in the binding note above.
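Rows 0 to 2 and the release debug-gate can be walked through on the host with a small model. This is a sketch under stated assumptions, not the T-021 handler: `Cap`, `CapKind`, `UserSpace`, the `debug_build` flag, and the `InvalidCapability` variant are all hypothetical (the ADR only says the capability failure is a typed `SyscallError`), and the address space is modelled as a single mapped region. `BadSyscallNumber` and `FaultAddress` are the names the ADR uses.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyscallError {
    BadSyscallNumber,
    FaultAddress,
    InvalidCapability, // hypothetical stand-in for the Cap/Ipc-family error
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CapKind {
    Endpoint,
    DebugConsole,
}

#[derive(Debug, Clone, Copy)]
pub struct Cap {
    pub kind: CapKind,
    pub write: bool,
}

/// Model of the active address space: one mapped region starting at `base`.
pub struct UserSpace<'a> {
    pub base: u64,
    pub bytes: &'a [u8],
}

impl<'a> UserSpace<'a> {
    /// Validate `[ptr, ptr+len)` before any byte is read; overflow is a fault.
    pub fn checked_range(&self, ptr: u64, len: u64) -> Result<&'a [u8], SyscallError> {
        let end = ptr.checked_add(len).ok_or(SyscallError::FaultAddress)?;
        let region_end = self.base.saturating_add(self.bytes.len() as u64);
        if ptr < self.base || end > region_end {
            return Err(SyscallError::FaultAddress);
        }
        let start = (ptr - self.base) as usize;
        let bytes: &'a [u8] = self.bytes;
        Ok(&bytes[start..start + len as usize])
    }
}

/// `console_write` model: debug gate, then capability gate, then range check.
pub fn console_write(
    debug_build: bool,
    caps: &[Option<Cap>],
    user: &UserSpace,
    console: &mut Vec<u8>,
    x0: u64,
    x1: u64,
    x2: u64,
) -> Result<u64, SyscallError> {
    // Debug gate: short-circuits ahead of the capability check.
    if !debug_build {
        return Err(SyscallError::BadSyscallNumber);
    }
    // Capability gate: handle must resolve, be a debug console, and carry write.
    let cap = usize::try_from(x0)
        .ok()
        .and_then(|i| caps.get(i))
        .and_then(|slot| slot.as_ref())
        .ok_or(SyscallError::InvalidCapability)?;
    if cap.kind != CapKind::DebugConsole || !cap.write {
        return Err(SyscallError::InvalidCapability);
    }
    // Copy-from-user: only a validated range is ever touched.
    let bytes = user.checked_range(x1, x2)?;
    console.extend_from_slice(bytes);
    Ok(x2) // x1 <- bytes written
}
```

Each failing path returns before `console` is modified, which mirrors the "unchanged" state-post column in rows 1 and 2.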
Per the write-adr skill §Procedure step 5 sub-bullet, all rows are discharged by T-021 (the dispatcher task), because every row is a trampoline/dispatch behaviour:
- Row 0 (`console_write` happy path, cap check passes) → T-021 host copy-from-user test + the QEMU kernel-stub-SVC smoke trace showing the emitted bytes.
- Row 1 (debug-console capability check fails) → T-021 host dispatcher test asserting a stale/wrong-kind/no-write cap yields a typed `SyscallError` with no console output (the P1 / P4 authority gate).
- Row 2 (`FaultAddress`) → T-021 host copy-from-user out-of-range test.
- Row 3 (`recv` register unpack) → T-021 host ABI encode/decode round-trip test over `RecvOutcome` + `Message` + `Option<CapHandle>`.
- Row 4 (`task_exit` no-return) → T-021 dispatcher test (kernel-stub stand-in); real EL0 termination → B6.
- The release debug-gate → T-021 host dispatcher test asserting number `5` → `BadSyscallNumber` under `not(debug_assertions)`.
The IPC error-taxonomy rows these syscalls inherit (a send to a stale/wrong-kind/no-SEND cap) are discharged by T-020 per ADR-0030 §Simulation row 3. The runtime EL0-vs-EL1 verification split (B5 kernel-stub via the current-EL 0x200 vector vs. B6 real EL0 via 0x400) is recorded in ADR-0030 §Simulation row-to-verification mapping.
For this decision to be fully in effect:
1. Syscall calling convention + SyscallError space. — ADR-0030 (opens with this ADR)
2. Panic-free dispatcher decoding x8 → one of {1..5}, else
BadSyscallNumber; number 0 reserved-invalid. — T-021 (opens with this ADR)
3. Handlers wiring each syscall to its backing primitive
(ipc_send/ipc_recv/yield_now/console write_bytes/terminate). — T-021
4. Debug-console capability kind (CapObject) + the dispatcher's
capability check for console_write (the P1/P4 authority gate). — T-021 (object + check)
5. copy-from-user for console_write's buffer. — T-021
6. The release debug-gate mechanism for console_write. — T-021 (design-notes choice)
7. EL0-ready Task context so task_exit/task_yield have a real EL0
task to terminate/reschedule. — ADR-0033 (placeholder) + Phase B6
8. Debug-console capability granted to the first userspace task. — Phase B6
9. tyrne-user safe wrappers exposing these five calls. — Phase B6 (deferred)
Steps 1–6 are grounded in ADR-0030 + T-021, opened in the same commit set as this ADR per ADR-0025 §Rule 1. Steps 7–9 are explicit forward-flags (the ADR-0033 high-half placeholder and B6), the same shape ADR-0029 used for its deferred build-pipeline step. Until step 7 lands, the five syscalls are exercised by an EL1 kernel-stub caller (B5 acceptance criterion #7), not a real EL0 task.
- The dispatch surface is minimal and fully audited. Five real syscalls + one reserved-invalid number; every handler is a thin validator over an existing kernel primitive. Nothing to keep panic-free that no consumer needs.
- The boundary is exercised, not just built. `send` / `recv` from EL0 prove capability-gated IPC across the privilege line (the highest-value B5 test) rather than deferring it.
- Every object-naming syscall is capability-gated; no ambient authority. `send` / `recv` check the endpoint cap, `console_write` checks a debug-console cap, and `task_yield` / `task_exit` act only on the caller's own (trusted-current-task) identity, upholding P1 / P4 uniformly across the v1 set.
- `0`-reserved + the release debug-gate are defence-in-depth on top of the capability gate. An uninitialised syscall number faults; production builds drop `console_write` from the surface entirely even for a capability holder.
- Register-passing keeps the common path allocation-free and copy-free. Only `console_write` touches user memory, so only one handler carries the copy-from-user cost; `send` / `recv` stay register-only.
- `#[non_exhaustive]` error spaces mean the set can grow safely. Adding `notify` or `cap_*` later is additive (new numbers, new `From` paths) with no break to the v1 five.
- No userspace capability management in v1. A v1 EL0 task cannot `cap_copy` / `derive` / `revoke` its own caps; it works only with the caps the loader/parent granted. Mitigation: no v1 userspace needs this; the kernel-internal cap operations remain available, and the syscalls land when a real consumer (a multi-task userspace service, post-B6) surfaces.
- No `notify` from EL0. An EL0 task cannot signal a notification. We accept this: v1's notification users are kernel-internal, and the syscall is additive later.
- `Message` is register-bound to four words. A larger payload needs a pointer-buffer variant. Mitigation: the four-word `Message` is fixed by ADR-0017; a wider message is a separate ADR with its own syscall number, leaving these layouts intact.
- `task_exit` semantics are only half-real in B5. Until B6's EL0 context exists, `task_exit` is dispatcher-plumbing over a kernel-stub. Mitigation: the ABI shape ("does not return") is fixed now; the termination behaviour lands with the task it terminates.
- Syscall numbers `0–5` are a fixed decision, not tentative. As an Accepted ABI ADR, this table is the contract; T-021's host tests regression-verify the numbers and layouts, they do not choose them.
- The release debug-gate mechanism is left to T-021 (a `cfg!(debug_assertions)` arm vs. a Cargo feature), but the gate's existence and the capability check are both fixed decisions here.
- A new `CapObject` kind (debug console) lands in T-021. This is the smallest object addition that keeps `console_write` capability-gated; it is the first capability kind introduced by a syscall rather than by the kernel-object subsystem directly.
- The set maps one-to-one onto the future `tyrne-user` crate's public API. B6's wrapper crate exposes exactly these five.
- Pro: Exactly B6's needs; smallest panic-free dispatch surface.
- Pro: Exercises the capability-gated IPC boundary from EL0 (the key B5 test).
- Pro: Register-only for four of five calls; one copy-from-user path.
- Con: No EL0 cap-management / `notify` in v1 (additive later; no v1 consumer).
- Pro: Userspace can manage its own caps + memory without a later ABI bump.
- Con: Every added call is unused dispatch surface that must be kept panic-free with no v1 consumer to validate it — speculative ABI.
- Con: Larger audited attack surface at the most security-sensitive boundary, for zero v1 benefit.
- Con: `#[non_exhaustive]` already makes growth non-breaking, so the "avoid a later bump" pro is moot.
- Pro: Absolute smallest path to B6's greeting + exit.
- Con: Ships a syscall boundary that never carries an IPC message. The most security-relevant EL0→EL1 path (capability-gated `send` / `recv`) stays unexercised, defeating the purpose of building the boundary in B5.
- Con: `task_yield` is near-free to add and makes cooperative EL0 multitasking reachable; omitting it is false economy.
- Phase B §B5, Syscall boundary: the floor/ceiling sentence this ADR implements.
- Phase B §B6, First userspace "hello": the consumer that justifies the set.
- ADR-0030, Syscall ABI and userspace error taxonomy: the convention these calls instantiate.
- ADR-0017, IPC primitive set: `send` / `recv` / `notify` and the four-word `Message` shape.
- ADR-0014, Capability representation: the `CapHandle` the null-sentinel reserves against.
- `docs/architecture/ipc.md`: the IPC operations `send` / `recv` wrap.
- seL4 manual §"System Calls": minimal syscall-set prior art for a capability kernel.