[Bug]: model_switch tool does not persist across turns; gateway/UI path ignores it entirely #6173

### Affected component

runtime/daemon

### Severity

S2 - degraded behavior

### Current behavior

The `model_switch` tool advertises that it switches the active model "immediately for the current conversation"
(`crates/zeroclaw-runtime/src/tools/model_switch.rs:26`), but in practice the
switch is silently lost in both major entry paths.

**Path A — channel orchestrator (`process_channel_message` → `handle_message`)**

When the LLM calls `model_switch`, the tool sets a process-global
`MODEL_SWITCH_REQUEST` (`crates/zeroclaw-runtime/src/agent/loop_.rs:91`).
`run_tool_call_loop` picks it up and bubbles it out as an error;
the orchestrator handler at
`crates/zeroclaw-channels/src/orchestrator/mod.rs:3189-3217` catches it,
builds a new provider, mutates the **local** `route` variable, clears the
global flag, and continues the current loop:

    Ok(new_prov) => {
        active_provider = Arc::from(new_prov);
        route.provider = new_provider;          // local var only
        route.model = new_model;                // local var only
        clear_model_switch_request();
        // ❌ no set_route_selection(ctx, &history_key, route.clone())
        continue;
    }

The change is never written back to `ctx.route_overrides`. The next inbound
message calls `get_route_selection(ctx, &history_key)`
(`orchestrator/mod.rs:2636`), reads the stale override (or the default), and
runs on the original provider/model. By contrast, the `/model` slash command
handler (`orchestrator/mod.rs:1791, 1830`) calls `set_route_selection`, so its
change does persist across messages.
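
To make the missing write-back concrete, here is a minimal, self-contained sketch. It is not the orchestrator code: `RouteSelection`, `ChannelCtx`, and `handle_message_with_switch` are hypothetical stand-ins, and the real `get_route_selection` / `set_route_selection` signatures and locking may well differ. It only models the override map as a plain `HashMap`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the orchestrator's route selection.
#[derive(Clone, Debug, PartialEq)]
pub struct RouteSelection {
    pub provider: String,
    pub model: String,
}

/// Hypothetical stand-in for the channel context; only the override map is modeled.
pub struct ChannelCtx {
    pub default_route: RouteSelection,
    pub route_overrides: HashMap<String, RouteSelection>,
}

/// Returns the sticky override for a conversation, or the default route.
pub fn get_route_selection(ctx: &ChannelCtx, history_key: &str) -> RouteSelection {
    ctx.route_overrides
        .get(history_key)
        .cloned()
        .unwrap_or_else(|| ctx.default_route.clone())
}

/// Persists a route so that later messages in the same conversation see it.
pub fn set_route_selection(ctx: &mut ChannelCtx, history_key: &str, route: RouteSelection) {
    ctx.route_overrides.insert(history_key.to_string(), route);
}

/// Simulates one message in which the model asks to switch.
/// `persist` toggles the write-back that this issue reports as missing.
pub fn handle_message_with_switch(
    ctx: &mut ChannelCtx,
    history_key: &str,
    new_provider: &str,
    new_model: &str,
    persist: bool,
) -> RouteSelection {
    let mut route = get_route_selection(ctx, history_key);
    route.provider = new_provider.to_string(); // local copy only
    route.model = new_model.to_string(); // local copy only
    if persist {
        set_route_selection(ctx, history_key, route.clone());
    }
    route
}
```

With `persist = false` the returned route reflects the switch for the current message, but a later `get_route_selection` for the same key still yields the default, which is the behaviour described above.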

The same handler also has two related defects:

- it uses `ctx.api_key` (`mod.rs:3191`) — the startup global key — instead of
  the route-specific `api_key` from `ctx.model_routes`, so a switch into a
  provider that needs a different key fails with auth errors;
- the freshly built provider is **not written into `provider_cache`**, so each
  switch rebuilds a provider instance.
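
The two related defects could be addressed along the following lines. This is a sketch under assumptions: `ModelRoute`, `Provider`, `resolve_api_key`, and `get_or_build_provider` are hypothetical names, the real `ctx.model_routes` and `provider_cache` types are not shown in this issue, and keying the cache by provider name alone is a simplification.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical route entry; the real `ctx.model_routes` layout may differ.
pub struct ModelRoute {
    pub provider: String,
    pub model: String,
    pub api_key: Option<String>,
}

/// Hypothetical provider handle; only the key it was built with is tracked.
#[derive(Debug)]
pub struct Provider {
    pub name: String,
    pub api_key: Option<String>,
}

/// Prefers the key configured for the matching route, else the startup key.
pub fn resolve_api_key(
    routes: &[ModelRoute],
    global_key: Option<&str>,
    provider: &str,
    model: &str,
) -> Option<String> {
    routes
        .iter()
        .find(|r| r.provider == provider && r.model == model)
        .and_then(|r| r.api_key.clone())
        .or_else(|| global_key.map(str::to_string))
}

/// Returns the cached provider for `name`, building it at most once.
pub fn get_or_build_provider(
    cache: &mut HashMap<String, Arc<Provider>>,
    name: &str,
    api_key: Option<String>,
) -> Arc<Provider> {
    cache
        .entry(name.to_string())
        .or_insert_with(|| {
            Arc::new(Provider {
                name: name.to_string(),
                api_key,
            })
        })
        .clone()
}
```

A second switch to the same provider then returns the cached `Arc` instead of constructing a new instance.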

**Path B — gateway / built-in daemon UI (`/ws/chat`, `/webhook`)**

The gateway's WebSocket chat (`crates/zeroclaw-gateway/src/ws.rs:166`)
constructs a fresh `Agent` via `Agent::from_config` and runs
`agent.turn_streamed`, which calls `self.provider.stream_chat` directly
(`crates/zeroclaw-runtime/src/agent/agent.rs:1122`). It does not invoke
`run_tool_call_loop`, and `grep MODEL_SWITCH agent/agent.rs` returns zero
matches. `ModelSwitchTool` is registered in `all_tools_with_runtime`
(`tools/mod.rs:368`), so the LLM can call it and the global
`MODEL_SWITCH_REQUEST` gets set — but **nothing consumes it on this path**,
so the switch is a complete no-op for the duration of the WS connection.
`/webhook` (`run_gateway_chat_simple`, `lib.rs:1374-1384`) has the same
defect: it calls `state.provider.chat()` directly with no tool loop.
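
The "set but never consumed" pattern can be modeled in a few lines. The sketch below is illustrative only: the static mirrors the name `MODEL_SWITCH_REQUEST`, but its type, the helper functions, and this `Agent` struct are assumptions and not the runtime's actual definitions. `turn_with_consumer` shows roughly what draining the request between iterations could look like.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the process-global request slot:
/// a (provider, model) pair left behind by a tool call.
static MODEL_SWITCH_REQUEST: Mutex<Option<(String, String)>> = Mutex::new(None);

/// What the tool does: record the request and return.
pub fn request_model_switch(provider: &str, model: &str) {
    *MODEL_SWITCH_REQUEST.lock().unwrap() = Some((provider.to_string(), model.to_string()));
}

/// Drains the pending request, if any.
pub fn take_model_switch_request() -> Option<(String, String)> {
    MODEL_SWITCH_REQUEST.lock().unwrap().take()
}

/// Hypothetical agent; provider and model are plain strings here.
pub struct Agent {
    pub provider: String,
    pub model: String,
}

impl Agent {
    /// Models the reported gateway behaviour: the turn never looks at the
    /// global slot, so a pending request has no effect.
    pub fn turn_without_consumer(&self) -> String {
        format!("{}/{}", self.provider, self.model)
    }

    /// Models a turn that drains the slot first and applies the switch.
    pub fn turn_with_consumer(&mut self) -> String {
        if let Some((provider, model)) = take_model_switch_request() {
            self.provider = provider;
            self.model = model;
        }
        format!("{}/{}", self.provider, self.model)
    }
}
```

Note that a process-global slot is shared by every conversation in the process, so a real consumer would also need to decide which sender the request belongs to.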

### Expected behavior

`model_switch` should either persist the change for at least the rest of the
sender's conversation in every path where the tool is exposed, or — if the
intent really is "current turn only" — it should be removed from paths where
even that is impossible, and the tool description should reflect the actual
guarantee.

**Preferred fix — eliminate the divergence at the source:** implement the
built-in web UI's `/ws/chat` (and ideally `/webhook` plus the WhatsApp / Linq
/ Nextcloud Talk gateway endpoints) as a proper channel under
`zeroclaw-channels`, so every inbound message flows through
`process_channel_message → handle_message → run_tool_call_loop` like any
other channel. That single change makes `model_switch`, `/model`, sticky
`route_overrides`, classifier-based routing, the `provider_cache`, the config
mtime hot-reload, autosave-on-message, and all future channel-level features
work uniformly across CLI, Telegram, Discord, the daemon UI, and webhooks —
without each gateway entrypoint reinventing a parallel, partially-broken
agent loop. The current bifurcation (`Agent::turn_streamed` vs.
`run_tool_call_loop`) is the *root cause* of this bug class; surface fixes on
either side will keep diverging.

If the architectural unification above is out of scope for this fix, the
narrower per-path repairs are:

- **Channel path**: after a successful in-loop swap, call
  `set_route_selection(ctx, &history_key, route.clone())` so the new
  provider/model survives into the next message; resolve the per-route
  `api_key` from `ctx.model_routes` (mirroring the `SetModel` slash-command
  handler) instead of falling back to `ctx.api_key`; and seed the
  `provider_cache` with the new instance.
- **Gateway/UI path** (only as a stopgap until the unification above): either
  (a) consume `MODEL_SWITCH_REQUEST` inside `Agent::turn_streamed` between
  iterations and rebuild `self.provider`, or (b) drop `ModelSwitchTool` from
  the gateway-built tool registry and surface a clear error to the LLM if it
  is invoked.
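
For stopgap option (b), one possible shape is to filter the registry when it is built for the gateway and return an explicit error if the tool is called anyway. The names `ToolSpec`, `LOOP_ONLY_TOOLS`, `gateway_tools`, and `dispatch` are hypothetical; the real registry presumably holds trait objects rather than plain structs.

```rust
/// Hypothetical tool descriptor used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolSpec {
    pub name: String,
}

/// Assumption: tools that only work when a loop can swap providers.
const LOOP_ONLY_TOOLS: &[&str] = &["model_switch"];

/// Builds the gateway's tool list by dropping loop-only tools.
pub fn gateway_tools(all: Vec<ToolSpec>) -> Vec<ToolSpec> {
    all.into_iter()
        .filter(|t| !LOOP_ONLY_TOOLS.contains(&t.name.as_str()))
        .collect()
}

/// Dispatches a tool call, returning a clear error for unavailable tools
/// instead of a misleading success message.
pub fn dispatch(tools: &[ToolSpec], name: &str) -> Result<String, String> {
    if tools.iter().any(|t| t.name == name) {
        Ok(format!("ran {name}"))
    } else {
        Err(format!("tool `{name}` is not available on this entry path"))
    }
}
```

The error string is what the LLM would see, so it can stop planning around a switch that will not happen.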

In all cases, the tool description in `tools/model_switch.rs:26` should state
the actual scope of the switch.

### Steps to reproduce

```bash
# Channel path (e.g., Telegram, Discord, CLI):
# 1. Start daemon with default provider = anthropic, model = claude-sonnet-4-6.
# 2. Send a message that prompts the LLM to call:
#      model_switch { action: "set", provider: "openai", model: "gpt-4o" }
# 3. Observe that within the same turn, subsequent LLM calls go to gpt-4o.
# 4. Send a second user message in the same conversation.
# 5. Observe that the agent is back on anthropic/claude-sonnet-4-6.

# Gateway/UI path:
# 1. Start daemon, open the built-in web UI and connect to /ws/chat.
# 2. Prompt the model to call model_switch as above.
# 3. Observe: the tool returns "Model switch requested" but every subsequent
#    streamed turn is still served by the original provider/model. This holds
#    for the entire WS connection's lifetime.
```

### Impact

Affected users: anyone relying on `model_switch` for runtime model swapping
(agents using cost-tier routing, fallback escalation, or "use a strong model
for this one task" patterns) — i.e. exactly the use case the tool advertises.

Frequency: always.

Consequence: the tool silently lies. On the channel path it appears to work
within a single turn but reverts on the next message; on the gateway/UI path
it is a complete no-op. Models that rely on the documented "immediately for
the current conversation" semantics will plan around a guarantee that is not
delivered, leading to wrong-model responses, unexpected billing, and
hard-to-debug routing behaviour.

### Logs / stack traces

N/A — this is a logic defect, not a crash. The misleading
`{"message":"Model switch requested", ...}` tool result in
`tools/model_switch.rs:156-164` is itself the smoking gun: the tool reports
success even when nobody downstream will act on it.

### ZeroClaw version

`master` at `eebd7b634f91c37f7a976e03ba3f29d9b76a1ca9`.

### Operating system

Linux

### Regression?

Unknown

### Pre-flight checks

- [x] I reproduced this on the latest master branch or latest release.
- [x] I redacted secrets, tokens, and personal data from all submitted content.
