You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
mcp: server modes, elicitation gating, audit subsystem (PR2) (#652)
* mcp: server modes, elicitation gating, audit subsystem (PR2)
* robot: use canonical mocked SQL in new mode-contract scenarios
The new mode-contract scenarios (MCP HTTP Mode Read Only / Delete Safe /
Full Access / Audit Basic / Audit Disabled) all failed in CI with rc=2.
Root cause: I synthesised SQL like
select name from google.compute.networks where project = 'stackql-demo'
delete from google.compute.firewalls where project = 'p' and firewall = 'f'
against the test registry. google.compute.networks is registered as
metadata (used by describe_resource / list_methods in earlier scenarios)
but is not mocked as an executable SELECT target. The synthetic DELETE
values 'p' / 'f' don't match the canonical mock either. The bundled
mcp_client panics on unknown-target query errors with exit code 2.
Switch all five scenarios to the canonical mocked SQL already used by
the working MCP HTTPS Server Query / Exec Query Canonical scenarios:
select name, id from google.storage.buckets where project = 'stackql-demo'
delete from google.compute.firewalls where project = 'mutable-project' and firewall = 'deletable-firewall'
The MCP HTTP Mode Safe Refuses Mutations Without Elicitation scenario
already happened to pass (it expected rc != 0, which the synthetic SQL
also produced) but the same SQL substitution makes its intent clearer.
Also:
- Clean up six empty stackql_mcp_server_*.log files inadvertently
committed in the PR2 squash. These were left by unit tests that
exercised audit-on paths before connectInProcess gained its disable-
audit default.
- Add stackql_mcp_server_*.log to .gitignore so future stray audit
files don't get tracked.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* robot: fix Windows audit-path failure; preflight 9923 reachability
MCP HTTP Audit Basic Records Tool Calls failed on Windows CI with
rc=1 on the first SELECT against the audit-enabled 9923 server.
Root cause: the audit log path embedded in mcp.config was
`${CURDIR}/tmp/mcp-audit-9923.log`. On Windows, ${CURDIR} expands to
`D:\a\stackql\stackql\test\robot\functional`. Inside a JSON string,
the resulting `\a` `\s` `\t` etc are invalid backslash escapes, so
json.Unmarshal in cmd/mcp.go's runMCPServer fails silently and
config stays as zero-value. The server then binds the default
address 127.0.0.1:9876 instead of 9923, the client sees connection
refused, and rc=1.
Fix the embedded path to a relative filename `mcp-audit-9923.log`.
The stackql process resolves it through filepath.Abs against its cwd
(the directory robot was invoked from), so the file lands in the
repo root regardless of OS path-separator semantics. Update the
test's Get File call to use ${EXECDIR} to match.
Also:
- Add a preflight server_info call at the start of the audit test
so a future audit-server startup failure surfaces with a clear
message ("9923 server must be reachable...") instead of a generic
rc-comparison failure.
- Add mcp-audit-*.log to .gitignore so the audit file from a robot
run doesn't get accidentally committed.
The audit-disabled scenario is unaffected (its glob
stackql_mcp_server_*.log does not match mcp-audit-9923.log).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review updates
* updates
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -18,11 +18,43 @@ We have a nice debug config for running an MCP server with `vscode`, please see
18
18
19
19
```
20
20
21
-
To run a read-only server (mutation and lifecycle tools refuse to execute):
21
+
The default mode is `safe`: SELECTs proceed, mutations and lifecycle operations require user approval via the [MCP elicitation flow](#server-modes). An audit log is written to `./stackql_mcp_server_<RFC3339-utc-second>.log` by default (see [Audit Log](#audit-log)).
22
+
23
+
### Mode examples
24
+
25
+
Read-only (mutations and lifecycle refused immediately):
|`run_mutation_query`| KV | Execute INSERT/UPDATE/REPLACE/DELETE. **Real side effects.** Returns `{messages, timestamp}`. Refused in read-only mode. |
103
-
|`run_lifecycle_operation`| KV | Execute a stackql `EXEC` lifecycle operation. Returns `{messages, timestamp}`. Refused in read-only mode. |
137
+
|`run_mutation_query`| KV | Execute INSERT/UPDATE/REPLACE/DELETE. **Real side effects.** Returns `{messages, timestamp}`. Gated by the server [mode](#server-modes). |
138
+
|`run_lifecycle_operation`| KV | Execute a stackql `EXEC` lifecycle operation. Returns `{messages, timestamp}`. Gated by the server [mode](#server-modes). |
104
139
105
140
## Canonical agent prompts
106
141
@@ -111,6 +146,89 @@ One static prompt is published:
111
146
`EnabledTools` and `EnabledPrompts` on `Config` are independent allowlists. When omitted or empty everything is published; when populated they restrict the published surface to the named items. See [the `pkg/mcp_server` README](/pkg/mcp_server/README.md) for details.
112
147
113
148
149
+
## Server modes
150
+
151
+
`Config.Server.Mode` picks one of four safety contracts. All four allow SELECTs and metadata reads; they differ in how they handle mutations and lifecycle operations.
- If the client advertised elicitation at initialise, the server sends an `elicitation/create` request describing the action (tool name, query class, SQL). Branch on the user response: `accept` -> proceed; `decline` or `cancel` -> return an error.
165
+
- If the client did **not** advertise elicitation, the tool is refused with a message that points the operator at `full_access` mode.
166
+
167
+
The bundled `stackql_mcp_client` does NOT advertise elicitation, so against a `safe` or `delete_safe` server every mutation/lifecycle call is refused with the no-elicitation message. This is by design - the bundled client exists for scripting and regression tests, not interactive use. Elicitation-capable MCP clients (eg Claude Desktop, Cursor) prompt the user normally.
168
+
169
+
### Breaking change vs PR1
170
+
171
+
PR1 had a single `read_only: true/false` flag with a default of "no enforcement; mutations proceed." PR2 replaces that flag with `mode: safe` as the default, which means **mutations now require user approval out of the box.** Operators running an elicitation-capable client see one approval prompt per mutation. Operators running automation or the bundled client must explicitly opt into `full_access`.
172
+
173
+
The legacy `read_only: true` JSON / YAML key still parses for back-compat and is treated as equivalent to `mode: read_only`. `mode` wins when both are set.
174
+
175
+
176
+
## Audit log
177
+
178
+
Every tool call writes one JSONL record. The audit answers "what did the agent do," not "what did the agent see" - result rows from SELECTs are intentionally not recorded.
-`audit.file.path` -- a complete file path (absolute, or relative to cwd).
207
+
-`audit.file.dir` -- a directory; the sink chooses the basename (`stackql_mcp_server_<RFC3339-utc-second>.log`) inside it.
208
+
209
+
When neither is set in `mcp.config`, the MCP server defaults `dir` to cwd so existing operators see PR2 behaviour. The underlying [generic `pkg/sink`](/pkg/sink) package itself refuses to silently pick a directory -- the "where do logs land" decision is always made by the caller (here, the MCP server's `initAuditSink`).
210
+
211
+
The resolved absolute path is logged to stderr at startup as `sink file: /path/to/file.log`.
212
+
213
+
### Failure modes
214
+
215
+
When the sink returns an error, the response behaviour depends on `failure_mode`:
216
+
217
+
| failure_mode | Effect |
218
+
|---|---|
219
+
|`strict` (default) | The tool call returns the audit error to the client, even if the underlying tool succeeded. Intentional: better an ambiguous client than an undetected DELETE. |
220
+
|`strict_mutations`| SELECT / metadata reads proceed; mutations and lifecycle ops surface the audit error. |
221
+
|`best_effort`| Always log to stderr and proceed. |
222
+
223
+
### Sequencing
224
+
225
+
The audit write happens AFTER the tool executes (or is skipped because it was gated out) but BEFORE the response returns to the client. In strict mode an audit-write failure on a successful DELETE means the row is gone but the client sees an error - by design.
226
+
227
+
### Breaking change vs PR1
228
+
229
+
PR1 had no audit subsystem; PR2 enables audit by default. To preserve PR1 behaviour, pass `"audit": {"disabled": true}` in `mcp.config`.
230
+
231
+
114
232
## Example responses
115
233
116
234
Responses below show the typed structured payload (what `--exec` prints as JSON). The MCP `Content` block also carries a rendered text view (markdown table or KV) that the LLM consumes; it is not shown here.
## Example audit log line (`mcp-audit.log`) for the call above.
352
+
{"timestamp":"2026-05-16T00:47:33Z","tool":"run_mutation_query","mode":"full_access","decision":"allow","query_class":"mutation_delete","sql":"delete from google.compute.networks where project = 'stackql-demo' and network = 'returning-test-01' ;","args":{"sql":"delete from google.compute.networks where project = 'stackql-demo' and network = 'returning-test-01' ;","row_limit":0},"duration_ms":42}
0 commit comments