Add model-judge provider #153

Merged: pbdeuchler merged 7 commits into master from claude/model-judge-feature-omhh6u on Jun 16, 2026

pbdeuchler (Owner) commented on Jun 15, 2026

Summary

Adds model-judge slots for models.default and models.subagent. A model-judge slot is implemented as a synthetic ModelJudgeProvider, so the runtime still sees an ordinary Provider, and the fan-out, judging, and default-model handoff all stay inside the provider boundary.

Flow

  1. Panel - the original request, including context and tool specs, is sent to every model in models.model_judge.panel concurrently.
  2. Synthesis - the models.model_judge.synthesis model receives the labeled panel responses plus the injected rank_responses tool. It emits ranking telemetry, then writes synthesis guidance that judges the candidate responses rather than merging them directly.
  3. Default - the synthesis text is appended as transient internal guidance to the request sent to models.model_judge.default. The default model's stream is returned to the caller and still owns all real tool execution.
  4. Fallback - if the panel produces no usable responses, or synthesis fails, the provider falls back to the default model alone.

The synthesis guidance is not persisted to the session transcript and does not require a protocol-level message variant.
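
For reviewers, the four steps above can be sketched roughly as follows. This is a simplified, synchronous illustration and not the code in this PR: ModelFn, the ModelJudge struct, and the bracketed label strings are hypothetical stand-ins, the real Provider is assumed to be streaming, and the rank_responses tool and telemetry are left out.

```rust
use std::thread;

/// Hypothetical stand-in for one model call (assumption: the real Provider
/// trait streams its output; this sketch returns a whole String instead).
pub type ModelFn = Box<dyn Fn(&str) -> Result<String, String> + Send + Sync>;

/// Hypothetical container for the three roles named in [models.model_judge].
pub struct ModelJudge {
    pub panel: Vec<ModelFn>,
    pub synthesis: ModelFn,
    pub default: ModelFn,
}

impl ModelJudge {
    pub fn complete(&self, request: &str) -> Result<String, String> {
        // 1. Panel: send the original request to every panel member concurrently.
        //    Failed or empty responses are dropped as unusable.
        let responses: Vec<String> = thread::scope(|s| {
            let handles: Vec<_> = self
                .panel
                .iter()
                .map(|m| s.spawn(move || m(request)))
                .collect();
            handles
                .into_iter()
                .filter_map(|h| h.join().ok().and_then(Result::ok))
                .filter(|r| !r.trim().is_empty())
                .collect()
        });

        // 4. Fallback: no usable panel responses, so use the default model alone.
        if responses.is_empty() {
            return (self.default)(request);
        }

        // 2. Synthesis: label the candidates and ask the synthesis model to judge them.
        let labeled = responses
            .iter()
            .enumerate()
            .map(|(i, r)| format!("[candidate {}]\n{}", i + 1, r))
            .collect::<Vec<_>>()
            .join("\n\n");
        let guidance = match (self.synthesis)(&labeled) {
            Ok(g) => g,
            // 4. Fallback: synthesis failed, so use the default model alone.
            Err(_) => return (self.default)(request),
        };

        // 3. Default: append the guidance to this one request only. Nothing is
        //    written back to the caller's transcript.
        let augmented = format!("{request}\n\n[internal guidance]\n{guidance}");
        (self.default)(&augmented)
    }
}
```

Because the guidance only exists in the local augmented string, the caller's request is never mutated, which is the property that lets the guidance stay out of the session transcript.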

Config

models.default and models.subagent accept either the existing inline model table or the string "model_judge", which references one shared [models.model_judge] block:

[models]
default = "model_judge"
subagent = "model_judge"

[models.model_judge.default]
provider = "anthropic"
model = "claude-opus-4-8"

[models.model_judge.synthesis]
provider = "openai"
model = "gpt-5"

[[models.model_judge.panel]]
provider = "openai"
model = "gpt-5"

[[models.model_judge.panel]]
provider = "anthropic"
model = "claude-sonnet-4-6"

models.small remains a single concrete model. When it is unset and the default slot is "model_judge", it falls back to the model-judge default leaf instead of fanning out.
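
A rough sketch of how the slot and the models.small fallback could be modeled. ModelSlot and ModelJudgeConfig are the type names listed under Changes, but the field layout, the ModelRef and Models types, and both method names are assumptions made for illustration. The behavior when models.small is unset and the default slot is an inline model is also assumed here, since the description only covers the model-judge case.

```rust
/// Hypothetical leaf: one concrete provider/model pair.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelRef {
    pub provider: String,
    pub model: String,
}

/// Either the existing inline model table or the string "model_judge".
#[derive(Debug, Clone, PartialEq)]
pub enum ModelSlot {
    Inline(ModelRef),
    ModelJudge,
}

/// Mirrors the shared [models.model_judge] block (field layout assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct ModelJudgeConfig {
    pub default: ModelRef,
    pub synthesis: ModelRef,
    pub panel: Vec<ModelRef>,
}

/// Hypothetical view of the [models] table.
pub struct Models {
    pub default: ModelSlot,
    pub subagent: ModelSlot,
    pub small: Option<ModelRef>,
    pub model_judge: Option<ModelJudgeConfig>,
}

impl Models {
    /// Rejects a "model_judge" slot without a block, and an empty panel.
    pub fn validate(&self) -> Result<(), String> {
        let uses_judge = matches!(self.default, ModelSlot::ModelJudge)
            || matches!(self.subagent, ModelSlot::ModelJudge);
        match &self.model_judge {
            None if uses_judge => {
                Err("slot references model_judge but [models.model_judge] is missing".to_string())
            }
            Some(judge) if judge.panel.is_empty() => {
                Err("models.model_judge.panel must not be empty".to_string())
            }
            _ => Ok(()),
        }
    }

    /// models.small is always a single concrete model and never fans out.
    pub fn resolve_small(&self) -> Option<&ModelRef> {
        if let Some(small) = &self.small {
            return Some(small);
        }
        match &self.default {
            // Assumption: an inline default doubles as the small model.
            ModelSlot::Inline(model) => Some(model),
            // Stated in the description: use the model-judge default leaf.
            ModelSlot::ModelJudge => self.model_judge.as_ref().map(|j| &j.default),
        }
    }
}
```

With the example config above, resolve_small would yield the anthropic claude-opus-4-8 leaf rather than a panel.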

Telemetry

Panel responses, synthesis output, and ranking entries are emitted as structured tracing events on halter::model_judge.

Changes

  • config: adds ModelSlot and ModelJudgeConfig with default, synthesis, and non-empty panel validation.
  • providers: adds ModelJudgeProvider, panel collection, ranking telemetry, synthesis guidance, and fallback behavior.
  • builder: resolves model-judge slots into synthetic providers while sharing provider-family clients across members.
  • docs: updates the workspace README, halter-config README, and changelog for the model-judge config and flow.
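
The builder change, sharing provider-family clients across members, could look roughly like the cache below. Client, ClientCache, and get_or_build are hypothetical names used only for illustration; the actual builder code is not shown in this description.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical provider-family client (for example an HTTP client plus credentials).
pub struct Client {
    pub family: String,
}

/// Resolves each provider family once and hands out shared handles.
#[derive(Default)]
pub struct ClientCache {
    clients: HashMap<String, Arc<Client>>,
    /// Number of clients actually constructed, kept here to make sharing observable.
    pub constructed: usize,
}

impl ClientCache {
    pub fn get_or_build(&mut self, family: &str) -> Arc<Client> {
        // Reuse the existing client when this family was already resolved.
        if let Some(existing) = self.clients.get(family) {
            return Arc::clone(existing);
        }
        // Otherwise construct it once and remember it for later members.
        self.constructed += 1;
        let client = Arc::new(Client {
            family: family.to_string(),
        });
        self.clients.insert(family.to_string(), Arc::clone(&client));
        client
    }
}
```

For the example config, the default, synthesis, and two panel leaves span only the anthropic and openai families, so a cache like this would construct two clients for four members.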

Testing

  • cargo test -q -p halter-config -p halter-providers -p halter --no-default-features
  • cargo test -q --workspace --all-features

Add a Meta message variant (neither user nor assistant) carried only
transiently to the default model. Providers render it as a framed
user-role turn; runtime sites treat it like a system message for
token estimation, pruning signal, and prompt rendering.

models.default and models.subagent accept either an inline model or
the string "fusion", which references a shared [models.fusion] block
of a default model, a synthesis/judge model, and panelist models.
Adds validation and provider-family enumeration across fusion leaves.

FusionProvider implements the Provider trait by multiplexing a request
to N panel members, having a synthesis member stack-rank (via a
rank_responses tool) and judge their responses, then handing the
synthesis to the default member as a meta message whose stream is
returned to the caller. Panel responses, the synthesis message, and the
stack rankings are emitted as structured tracing telemetry. Falls back
to the default member alone if panels or synthesis fail.

build_model_registry resolves default/subagent slots to either an
inline model or a FusionProvider built from [models.fusion]. Provider
clients are resolved once per family and shared across roles and fusion
members. The small slot stays a single concrete model (the fusion
default leaf when the default slot is fusion).

Add fusion slot documentation to the workspace README, the
halter-config README, and the changelog: the panel/judge/synthesis
flow, the [models.fusion] schema, and the halter::fusion tracing
telemetry.

pbdeuchler changed the title from "Add fusion (model-judge) provider" to "Add model-judge provider" on Jun 16, 2026

Rename the fusion-facing API to model_judge, keep synthesis guidance inside the provider boundary, and remove the protocol-level meta message path.

pbdeuchler force-pushed the claude/model-judge-feature-omhh6u branch from 104f54e to 95db268 on June 16, 2026 at 01:11
pbdeuchler merged commit bb10436 into master on Jun 16, 2026
2 checks passed