Federated agent execution. A server orchestrates; the client's device executes the leaves. The contract between them is JSON.
Today's LLM agents run end-to-end on the server. Every classification, embedding, and short generation crosses the network — even when the user's laptop has a perfectly capable GPU sitting idle.
NeoProtocol partitions agent execution across a server/client boundary:
- Server: frontier-model reasoning + graph decomposition (1 call per request)
- Client: bounded leaves — sentiment, classification, embedding, short summarization — running on a tiny model (~17–85 MB) downloaded once
- Wire format: JSON graph spec + JSON result envelope
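
As a rough illustration of the split above, the sketch below builds a toy graph spec, runs its single leaf over local items, and returns only aggregate counts in the result envelope. The field names (`task_id`, `nodes`, `kind`, `op`, `results`) and the keyword-based `classify` stub are illustrative assumptions; they are not the schema defined in SPEC.md, and the stub stands in for the on-device model.

```python
import json
from collections import Counter

# Hypothetical Task Offer: field names are illustrative, not the SPEC.md schema.
offer_json = json.dumps({
    "task_id": "demo-1",
    "nodes": [{"id": "sentiment", "kind": "leaf", "op": "classify"}],
})

def classify(text: str) -> str:
    """Stand-in for the on-device model: a trivial keyword rule."""
    return "negative" if "bad" in text.lower() else "positive"

def execute(offer_json: str, items: list[str]) -> str:
    """Run every leaf locally and return a JSON envelope with aggregates only."""
    offer = json.loads(offer_json)
    results = {}
    for node in offer["nodes"]:
        if node["kind"] == "leaf" and node["op"] == "classify":
            counts = Counter(classify(item) for item in items)
            results[node["id"]] = dict(counts)
    # The raw items never leave this function; only counts are serialized.
    return json.dumps({"task_id": offer["task_id"], "results": results})

envelope = execute(offer_json, ["Great phone", "Bad battery", "Works fine"])
print(envelope)
```

The point of the sketch is the shape of the exchange: JSON in, JSON out, with the review text staying on the device.
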
The economics flip:
| | Server-only | NeoProtocol |
|---|---|---|
| Inference cost (200 reviews) | ~$0.05 | ~$0.001 (decomposition only) |
| User data sent to server | full reviews | aggregate stats |
| Latency | 200 × RTT | 1 × decomposition + local |
| Scaling ceiling | server GPUs | user device count |
LangChain.js graphs are JavaScript. Changing orchestration means redeploying the client. The server has no leverage to vary the topology per-request.
NeoProtocol graphs are JSON payloads. The server ships a fresh graph per request — A/B tests, free-vs-paid tiers, per-user customization become server-side decisions with zero client redeploy. The client is just a dynamic loader + a tiny model.
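
A minimal sketch of what "a fresh graph per request" could look like on the server side, assuming a hypothetical `build_offer` helper and illustrative `nodes`/`edges` field names (neither is taken from SPEC.md). The topology changes with the caller's tier while the client code stays the same.

```python
import json

def build_offer(request_id: str, tier: str) -> str:
    """Return a JSON graph whose topology depends on the caller's tier.

    Hypothetical shape: 'nodes' and 'edges' are illustrative field names.
    """
    nodes = [{"id": "classify", "kind": "leaf"}]
    edges = []
    if tier == "paid":
        # Paid requests get an extra summarization leaf fed by the classifier.
        nodes.append({"id": "summarize", "kind": "leaf"})
        edges.append({"from": "classify", "to": "summarize"})
    return json.dumps({"task_id": request_id, "nodes": nodes, "edges": edges})

free_offer = json.loads(build_offer("req-1", "free"))
paid_offer = json.loads(build_offer("req-2", "paid"))
print(len(free_offer["nodes"]), len(paid_offer["nodes"]))
```

An A/B test would follow the same pattern, with the branch keyed on an experiment bucket instead of a tier.
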
v0.3 — three of four IETF "real protocol" criteria reached: spec (~700 lines, gemini-checklist passing), reference implementation (browser + server), ≥2 independent interoperating implementations (JS browser + Python CLI executor — RFC 2026 criterion), conformance test suite (Originator Level 0, 18/18 passing). The remaining criterion (external users) needs the social path that public release unlocks.
Read order:
- SPEC.md — protocol specification (17 sections + appendix)
- PLAN.md — milestone roadmap with status table
- docs/federated-mode.md — design + cross-network safety profile for §16
- docs/roadmap-collaborative-workspace.md — vision document: how Federated Mode evolves into browser-native multi-user multi-agent coding
- CHANGELOG.md — version history
- CONTRIBUTING.md — sign-off + commit style
- conformance/ — language-neutral self-cert harness
- examples (below) — runnable demos + spec fixtures
| Example | Conformance Level | Status | What it shows |
|---|---|---|---|
| examples/sentiment-poc/ | 0 | ✅ runnable | Single-leaf sentiment, 4-button consent (local / BYOK / browser-builtin / decline), ?server=URL mode wires to the v0.2 Originator |
| examples/multi-leaf-poc/ | 1 | ✅ runnable | Fan-out (2 parallel leaves) + reducer, channels with append and replace, mixed Model A + Model B, ~80-line pure-JS executor |
| examples/python-executor/ | 0 | ✅ runnable | Second independent reference Executor: Python + optimum + ONNX Runtime native CPU. Same q8 ONNX bytes as browser, identical scores (interop validation) |
| examples/spec-examples/01-email-triage | 1 | 📜 fixture | Large fan-out (200 items), three impl options per leaf (local / BYOK OpenAI / BYOK Anthropic) |
| examples/spec-examples/02-pii-redact-conditional | 2 | 📜 fixture | Conditional edges with when predicates, T1/T3 capability split via complexity gate |
| examples/spec-examples/03-clinical-scribe-interrupt | 3 | 📜 fixture | interrupt_before per-leaf consent, deeply chained Model B for healthcare-specific logic |
| examples/p2p-acp-poc/ | Federated | ✅ runnable | Browser↔browser agents. ACP (Zed Agent Client Protocol) over WebRTC RTCDataChannel. Two signaling modes: Originator-as-relay (Standard) and SDP-via-URL (zero-server Minimal). See SPEC §16 and docs/federated-mode.md |
| examples/cowork-poc/ | Workspace Stages 1+2+3+5 | ✅ runnable | Two browsers, one document, two agents: agents talk, can run no-cloud, and a single decomposed Task Offer now fans out across both peers with per-leaf attribution. Y.js CRDT over a Workspace data channel + cross-agent ACP on a second multiplexed channel + Task Offer fan-out via Y.Map broadcast (§17.8). Agent dropdown: OpenAI (default) / Anthropic / Local (transformers.js + WebGPU) / Mock. Stage 5 = the merge of §6/§7 Task Offer plane and §16/§17 Workspace plane: deterministic hash(leaf_id) % sorted(client_ids) assignment, race-free by construction. See SPEC §17.8 and roadmap |

Runnable = working code, end-to-end verified against the
reference Originator. Fixture = JSON-only worked example,
validates against the canonical schemas (node examples/spec-examples/validate.mjs).
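
The cowork-poc row describes leaf assignment as hash(leaf_id) % sorted(client_ids). One plausible reading is: hash the leaf id, reduce it modulo the number of peers, and use the result as an index into the sorted list of client ids. The sketch below follows that reading; the choice of SHA-256 and the name `assign_leaf` are assumptions, and SPEC §17.8 remains the authority on the actual rule.

```python
import hashlib

def assign_leaf(leaf_id: str, client_ids: list[str]) -> str:
    """Pick the peer that executes leaf_id; same inputs give the same answer on every peer."""
    ordered = sorted(client_ids)  # result must not depend on arrival order
    digest = hashlib.sha256(leaf_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]

peers_seen_by_a = ["peer-b", "peer-a"]
peers_seen_by_b = ["peer-a", "peer-b"]
for leaf in ["leaf-1", "leaf-2", "leaf-3"]:
    # Both peers compute the same owner without exchanging any messages.
    assert assign_leaf(leaf, peers_seen_by_a) == assign_leaf(leaf, peers_seen_by_b)
    print(leaf, "->", assign_leaf(leaf, peers_seen_by_a))
```

A stable hash matters here: Python's built-in `hash()` is salted per process for strings, so two peers would disagree if they used it.
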

    cd examples/sentiment-poc
    python3 -m http.server 8000
    # open http://localhost:8000 in a Chrome/Edge/Safari tab

The page fetches graph.json (a hardcoded Task Offer), shows a
consent dialog with three runtime choices (local ONNX / BYOK OpenAI
key / browser built-in AI), runs the leaves, and prints the result
envelope that would post back.

    # terminal 1: Originator
    cd server
    npm install && npm start   # listens on :3001

    # terminal 2: static page
    cd examples/sentiment-poc
    python3 -m http.server 8000

    # browser
    http://localhost:8000/index.html?server=http://localhost:3001

The page now starts with a prompt textarea and POSTs to /tasks. The
stub decomposer matches the prompt to a fixture and the offer comes
back; you pick a runtime, the leaves run locally, the envelope POSTs
to /tasks/:id/results, and the server's ack appears in the page.
The server log records the round-trip. Server-side ajv schema validation
and data_locality whitelist enforcement (defense in depth) run along the way.

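
The real checks live in server/ and use ajv; the following is only a sketch of the general idea behind whitelist enforcement on a result envelope. It assumes the offer declares which result keys are allowed to leave the device. The `check_envelope` function, the `allowed_keys` parameter, and the envelope shape are all hypothetical.

```python
def check_envelope(envelope: dict, allowed_keys: set[str]) -> list[str]:
    """Return a list of violations; an empty list means the envelope is acceptable."""
    violations = []
    if not isinstance(envelope.get("task_id"), str):
        violations.append("task_id must be a string")
    results = envelope.get("results")
    if not isinstance(results, dict):
        violations.append("results must be an object")
        return violations
    for key in sorted(results):
        if key not in allowed_keys:
            # Reject anything the offer did not explicitly permit to leave the device.
            violations.append(f"field not whitelisted: {key}")
    return violations

ok = {"task_id": "t1", "results": {"positive": 120, "negative": 80}}
leaky = {"task_id": "t1", "results": {"positive": 120, "raw_reviews": ["..."]}}
print(check_envelope(ok, {"positive", "negative"}))
print(check_envelope(leaky, {"positive", "negative"}))
```

Checking on the server as well as the client is what makes this defense in depth: a buggy or modified client cannot push raw data past the boundary unnoticed.
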
- Model domain mismatch. DistilBERT-SST2 is trained on movie reviews. Product-review cues like "returning it" or "chemical smell" can fool it. r8 in the sample set is a known miss. This is a picked-model issue, not a protocol/runtime issue; swap in a product-domain classifier (e.g. Xenova/twitter-roberta-base-sentiment-latest) for sturdier results in your own deployment.
- q8 + WebGPU is unsafe in transformers.js v3 in the browser. The picker in index.html skips this combo (q8 → WASM EP only; WebGPU needs fp16/fp32). If you set device_pref: ["webgpu"] AND quantized: true, the picker falls through to wasm.
- No real server yet. graph.json is static; result envelope is logged to console rather than POSTed.
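
The q8 + WebGPU fall-through rule described in the list above can be written as a small decision function. This Python sketch mirrors the described behavior only; the real picker is JavaScript in index.html, and `pick_backend` is a hypothetical name.

```python
def pick_backend(device_pref: list[str], quantized: bool) -> str:
    """Return 'webgpu' or 'wasm' following the rule: q8 never runs on WebGPU."""
    for device in device_pref:
        if device == "webgpu" and quantized:
            continue  # q8 + WebGPU is skipped; keep looking
        if device in ("webgpu", "wasm"):
            return device
    return "wasm"  # fall through to WASM when nothing else qualifies

print(pick_backend(["webgpu"], quantized=True))
print(pick_backend(["webgpu", "wasm"], quantized=False))
```
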
- v0 (now): static task offer, sentiment-only PoC
- v0.1: BYOK path — user supplies own API key, leaves run via that
- v0.2: real server in server/ (Node/Python) generating graphs from natural-language requests via a frontier model
- v0.3: capability negotiation handshake + Federated Mode (browser↔browser ACP-over-WebRTC, two signaling modes: Originator rendezvous and SDP-via-URL; SPEC §16)
- v1: streaming results, multi-node graphs, decline-fallback, TURN policy + room auth tokens for Federated Mode
- future: NeoGraph WASM runtime as drop-in reference engine (replaces ad-hoc JS leaf executor with the full graph engine)
Apache License 2.0. See LICENSE and NOTICE.
The Apache 2.0 license includes an explicit patent grant from each contributor and a patent retaliation clause — chosen over MIT specifically because protocols are vulnerable to patent troll attacks and Apache 2.0's automatic license termination on hostile patent litigation provides a real defense mechanism. Industry-standard for protocols (gRPC, OpenAPI, Kubernetes API).