Add model-judge provider #153
Merged
Validate positive tokens_per_minute
Add a Meta message variant (neither user nor assistant) carried only transiently to the default model. Providers render it as a framed user-role turn; runtime sites treat it like a system message for token estimation, pruning signal, and prompt rendering.
models.default and models.subagent accept either an inline model or the string "fusion", which references a shared [models.fusion] block of a default model, a synthesis/judge model, and panelist models. Adds validation and provider-family enumeration across fusion leaves.
FusionProvider implements the Provider trait by multiplexing a request to N panel members, having a synthesis member stack-rank (via a rank_responses tool) and judge their responses, then handing the synthesis to the default member as a meta message whose stream is returned to the caller. Panel responses, the synthesis message, and the stack rankings are emitted as structured tracing telemetry. Falls back to the default member alone if panels or synthesis fail.
build_model_registry resolves default/subagent slots to either an inline model or a FusionProvider built from [models.fusion]. Provider clients are resolved once per family and shared across roles and fusion members. The small slot stays a single concrete model (the fusion default leaf when the default slot is fusion).
Add fusion slot documentation to the workspace README, the halter-config README, and the changelog: the panel/judge/synthesis flow, the [models.fusion] schema, and the halter::fusion tracing telemetry.
Rename the fusion-facing API to model_judge, keep synthesis guidance inside the provider boundary, and remove the protocol-level meta message path.
104f54e to 95db268
Summary
Adds model-judge slots for `models.default` and `models.subagent`. A model-judge slot is implemented as a synthetic `ModelJudgeProvider`, so the runtime still sees an ordinary `Provider`, and the fanout, judging, and default-model handoff stay inside the provider boundary.
Flow
1. The request is sent to the `models.model_judge.panel` models concurrently.
2. The `models.model_judge.synthesis` model receives the labeled panel responses plus the injected `rank_responses` tool. It emits ranking telemetry, then writes synthesis guidance that judges the candidate responses rather than merging them directly.
3. The synthesis guidance is handed to `models.model_judge.default`. The default model's stream is returned to the caller and still owns all real tool execution.
The synthesis guidance is not persisted to the session transcript and does not require a protocol-level message variant.
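The three steps above can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the `ModelJudgeProvider` implementation: the `Member` alias, `judge_flow`, and the `[candidate N]` / `[guidance]` framing are hypothetical names invented here, members are modeled as synchronous closures rather than streaming providers, and the `rank_responses` tool and telemetry are omitted. The fallback to the default member alone follows the fallback behavior described in this PR.

```rust
use std::thread;

/// Hypothetical stand-in for one model behind the provider boundary:
/// takes a prompt, returns a completion or an error string.
type Member<'a> = &'a (dyn Fn(&str) -> Result<String, String> + Sync);

/// Sketch of the panel -> synthesis -> default handoff.
fn judge_flow(
    panel: &[Member<'_>],
    synthesis: Member<'_>,
    default: Member<'_>,
    request: &str,
) -> Result<String, String> {
    // 1. Fan the request out to every panel member concurrently.
    //    All threads are spawned before any is joined; failed members are dropped.
    let responses: Vec<String> = thread::scope(|s| {
        let handles: Vec<_> = panel
            .iter()
            .map(|m| s.spawn(move || m(request)))
            .collect();
        handles
            .into_iter()
            .filter_map(|h| h.join().ok().and_then(Result::ok))
            .collect()
    });

    // Fallback: no usable panel response, so ask the default member alone.
    if responses.is_empty() {
        return default(request);
    }

    // 2. Label the candidates and let the synthesis member judge them.
    let labeled = responses
        .iter()
        .enumerate()
        .map(|(i, r)| format!("[candidate {}] {}", i + 1, r))
        .collect::<Vec<_>>()
        .join("\n");
    let guidance = match synthesis(&labeled) {
        Ok(g) => g,
        // Fallback: synthesis failed, so ask the default member alone.
        Err(_) => return default(request),
    };

    // 3. Hand the guidance to the default member; its output goes to the caller.
    default(&format!("{}\n\n[guidance]\n{}", request, guidance))
}
```

Keeping the whole sequence inside one function mirrors the design choice in the summary: the caller only ever sees a single result from something that behaves like an ordinary provider.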
Config
`models.default` and `models.subagent` accept either the existing inline model table or the string `"model_judge"`, which references one shared `[models.model_judge]` block.
`models.small` remains a single concrete model. When it is unset and the default slot is `"model_judge"`, it falls back to the model-judge default leaf instead of fanning out.
Telemetry
Panel responses, synthesis output, and ranking entries are emitted as structured `tracing` events on `halter::model_judge`.
Changes
- `ModelSlot` and `ModelJudgeConfig` with `default`, `synthesis`, and non-empty `panel` validation.
- `ModelJudgeProvider`, panel collection, ranking telemetry, synthesis guidance, and fallback behavior.
- Workspace README, `halter-config` README, and changelog updates for the model-judge config and flow.
Testing
- `cargo test -q -p halter-config -p halter-providers -p halter --no-default-features`
- `cargo test -q --workspace --all-features`
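For reviewers, the slot rules from the Config section can be sketched as follows. The type names `ModelSlot` and `ModelJudgeConfig` come from this PR, but everything else is an assumption made for illustration: the single-field `Model`, the `Models` struct, the `validate` and `resolve_small` helpers, and the error strings are hypothetical and are not the `halter-config` API. The behavior when `small` is unset and the default slot is inline is also assumed here; the PR only specifies the `"model_judge"` case.

```rust
/// Hypothetical concrete model; the real inline table has more fields.
#[derive(Debug, Clone, PartialEq)]
struct Model {
    name: String,
}

/// A slot is either an inline model or a reference to the shared block.
#[derive(Debug, Clone, PartialEq)]
enum ModelSlot {
    Inline(Model),
    ModelJudge,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct ModelJudgeConfig {
    default: Model,
    synthesis: Model,
    panel: Vec<Model>,
}

/// Hypothetical view of the `[models]` table (subagent omitted for brevity).
#[derive(Debug, Clone)]
struct Models {
    default: ModelSlot,
    small: Option<Model>,
    model_judge: Option<ModelJudgeConfig>,
}

impl Models {
    /// Look up the shared block, failing if a slot references it but it is absent.
    fn judge(&self) -> Result<&ModelJudgeConfig, String> {
        self.model_judge
            .as_ref()
            .ok_or_else(|| "[models.model_judge] is missing".to_string())
    }

    /// A "model_judge" slot needs the shared block with a non-empty panel.
    fn validate(&self) -> Result<(), String> {
        if self.default == ModelSlot::ModelJudge && self.judge()?.panel.is_empty() {
            return Err("models.model_judge.panel must not be empty".to_string());
        }
        Ok(())
    }

    /// The small slot always resolves to one concrete model, never a fan-out.
    fn resolve_small(&self) -> Result<Model, String> {
        if let Some(m) = &self.small {
            return Ok(m.clone());
        }
        match &self.default {
            // Assumption: an inline default doubles as the small model.
            ModelSlot::Inline(m) => Ok(m.clone()),
            // From the PR: fall back to the model-judge default leaf.
            ModelSlot::ModelJudge => Ok(self.judge()?.default.clone()),
        }
    }
}
```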