add a critique model to the mix to improve the answers  #23

@Aaronminer1

Enhanced Workflow with Scoring Mechanism for Model-Only Process
Scoring Criteria
The criteria for evaluation will remain the same:

Clarity: How clear and understandable the response is.
Relevance: How relevant the response is to the prompt.
Accuracy: How factually correct the response is.
Completeness: How thoroughly the response addresses the prompt.
Coherence: How logically consistent the response is.
Each criterion is scored from 0 to 10, and the overall score is the average of these scores. The minimum passing score will be set at 7.
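The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the names `CRITERIA`, `PASSING_SCORE`, `overall_score`, and `passes` are hypothetical and not part of any existing code in this project, and the example scores are made up.

```python
from statistics import mean

# Assumed names for the five criteria listed above.
CRITERIA = ("clarity", "relevance", "accuracy", "completeness", "coherence")
PASSING_SCORE = 7.0  # minimum passing score proposed in this issue


def overall_score(scores: dict[str, float]) -> float:
    """Average the five criterion scores, each expected in the range 0 to 10."""
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    for name in CRITERIA:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    return mean(scores[name] for name in CRITERIA)


def passes(scores: dict[str, float]) -> bool:
    """True when the averaged score reaches the passing threshold."""
    return overall_score(scores) >= PASSING_SCORE


# Made-up scores for one response.
example = {"clarity": 8, "relevance": 9, "accuracy": 6, "completeness": 7, "coherence": 8}
print(overall_score(example))  # 7.6
print(passes(example))         # True
```

Note that a plain average lets one weak criterion (accuracy at 6 here) be offset by the others; whether that is desirable is an open design question.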

Workflow Steps
Input Prompt:

The initial input is fed into the first layer.
Layer 1:

Three agents $A_{1,1}$, $A_{1,2}$, and $A_{1,3}$ process the input independently.
Intermediate outputs are generated and concatenated.
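A minimal sketch of one layer, assuming each agent can be modeled as a function from text to text. The `Agent` alias, `run_layer`, the separator, and the stand-in agents are all hypothetical; in practice each agent would wrap a model call.

```python
from typing import Callable

# Assumption: an agent is any callable that maps input text to output text.
Agent = Callable[[str], str]


def run_layer(agents: list[Agent], layer_input: str, separator: str = "\n\n") -> str:
    """Run every agent on the same input and concatenate their outputs in order."""
    outputs = [agent(layer_input) for agent in agents]
    return separator.join(outputs)


# Stand-in agents for A_{1,1}, A_{1,2}, A_{1,3}; real agents would call a model.
layer_1 = [lambda prompt, i=i: f"[A1,{i}] draft for: {prompt}" for i in (1, 2, 3)]

combined = run_layer(layer_1, "What is a critique model?")
print(combined)
```

Because the agents only see the layer input and not each other's output, they could also be run concurrently.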
Critique 1 with Scoring:

A critique agent evaluates the concatenated output using the criteria (Clarity, Relevance, Accuracy, Completeness, Coherence).
Each criterion is scored from 0 to 10.
The overall score is the average of the criteria scores.
If the overall score is >= 7, the output is passed to Layer 2.
If the overall score is < 7, the output is sent back to Layer 1 for revision by the agents.
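The pass-or-revise gate could look roughly like the sketch below. Two things here are assumptions that this issue does not specify: how a revision is requested (the previous attempt is appended to the layer input) and a `max_revisions` cap, added so that a layer which never reaches the threshold cannot loop forever. `gated_layer` and the stub layer and critic are hypothetical names.

```python
from statistics import mean
from typing import Callable

Layer = Callable[[str], str]                 # whole layer: input text -> concatenated output
Critic = Callable[[str], dict[str, float]]   # critique agent: output -> per-criterion scores


def gated_layer(layer: Layer, critic: Critic, layer_input: str,
                threshold: float = 7.0, max_revisions: int = 3) -> tuple[str, float, bool]:
    """Run a layer, score it, and send it back for revision while the score is too low."""
    output = layer(layer_input)
    score = mean(critic(output).values())
    revisions = 0
    while score < threshold and revisions < max_revisions:
        # Assumed revision protocol: show the agents their previous attempt and its score.
        revision_input = f"{layer_input}\n\nPrevious attempt (score {score:.1f}):\n{output}"
        output = layer(revision_input)
        score = mean(critic(output).values())
        revisions += 1
    return output, score, score >= threshold


# Deterministic stand-ins so the control flow can be followed without a model.
def stub_layer(text: str) -> str:
    return "draft (revised)" if "Previous attempt" in text else "draft"


def stub_critic(text: str) -> dict[str, float]:
    value = 8.0 if "revised" in text else 5.0
    names = ("clarity", "relevance", "accuracy", "completeness", "coherence")
    return {name: value for name in names}


result = gated_layer(stub_layer, stub_critic, "prompt")
print(result)  # ('draft (revised)', 8.0, True)
```

The returned flag lets the caller decide what to do when the cap is reached without a passing score, for example stopping or forwarding the best attempt.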
Layer 2:

The adjusted output from Critique 1 is processed by agents $A_{2,1}$, $A_{2,2}$, and $A_{2,3}$.
Intermediate outputs are generated and concatenated.
Critique 2 with Scoring:

A critique agent evaluates the outputs from Layer 2 using the same criteria.
Outputs are scored and averaged.
If the overall score is >= 7, the output is passed to Layer 3.
If the overall score is < 7, the output is sent back to Layer 2 for revision by the agents.
Layer 3:

The adjusted output from Critique 2 is processed by agents $A_{3,1}$, $A_{3,2}$, and $A_{3,3}$.
Intermediate outputs are generated and concatenated.
Critique 3 with Scoring:

A final critique agent evaluates the outputs from Layer 3.
Outputs are scored and averaged.
If the overall score is >= 7, the output is passed to Layer 4.
If the overall score is < 7, the output is sent back to Layer 3 for revision by the agents.
Layer 4:

The final adjusted output is processed by agent $A_{4,1}$.
The Final Output is produced.
Final Output:
The output from Layer 4 is the final output, having passed all critique evaluations and scoring criteria.
Diagram Summary:
Input Prompt -> Layer 1 -> Critique 1 with Scoring -> (Pass if score >= 7 or Revise if score < 7) -> Layer 2 -> Critique 2 with Scoring -> (Pass or Revise) -> Layer 3 -> Critique 3 with Scoring -> (Pass or Revise) -> Layer 4 -> Final Output
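The whole flow in the diagram could be wired together along these lines. This is a sketch under assumptions, not a proposed final implementation: `run_pipeline`, `make_agent`, and the two critic helpers are hypothetical, the revision prompt format and the `max_revisions` cap are not specified in this issue, and when the cap is reached the last attempt is simply forwarded.

```python
from statistics import mean
from typing import Callable

Agent = Callable[[str], str]
Critic = Callable[[str], dict[str, float]]
CRITERIA = ("clarity", "relevance", "accuracy", "completeness", "coherence")


def run_pipeline(prompt: str, gated_layers: list[tuple[list[Agent], Critic]],
                 final_agent: Agent, threshold: float = 7.0, max_revisions: int = 3):
    """Layers 1-3 are each followed by a critique gate; Layer 4 is a single ungated agent."""
    text = prompt
    trace = []  # (layer index, attempt number, overall score)
    for index, (agents, critic) in enumerate(gated_layers, start=1):
        source = text
        layer_input = source
        for attempt in range(max_revisions + 1):
            candidate = "\n".join(agent(layer_input) for agent in agents)
            score = mean(critic(candidate).values())
            trace.append((index, attempt, score))
            if score >= threshold:
                break
            # Assumed revision protocol: resend the input together with the failed attempt.
            layer_input = f"{source}\n\nRevise this earlier attempt:\n{candidate}"
        text = candidate  # forwarded even if the cap was hit (assumption)
    return final_agent(text), trace


def make_agent(name: str) -> Agent:
    return lambda text: f"{name} saw {len(text)} chars"


def always_pass(text: str) -> dict[str, float]:
    return dict.fromkeys(CRITERIA, 8.0)


def fail_once() -> Critic:
    calls = {"n": 0}

    def critic(text: str) -> dict[str, float]:
        calls["n"] += 1
        return dict.fromkeys(CRITERIA, 5.0 if calls["n"] == 1 else 8.0)

    return critic


layers = [
    ([make_agent(f"A1,{i}") for i in (1, 2, 3)], always_pass),
    ([make_agent(f"A2,{i}") for i in (1, 2, 3)], fail_once()),
    ([make_agent(f"A3,{i}") for i in (1, 2, 3)], always_pass),
]
final, trace = run_pipeline("Input prompt", layers, make_agent("A4,1"))
print(trace)  # [(1, 0, 8.0), (2, 0, 5.0), (2, 1, 8.0), (3, 0, 8.0)]
```

In the trace, Layer 2 fails its first critique and passes on the first revision, which matches the "Pass or Revise" branch in the diagram.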
Example Diagram Description:
Input Prompt: Initial input is fed into Layer 1.
Layer 1: Agents $A_{1,1}$, $A_{1,2}$, and $A_{1,3}$ process the input independently, generating intermediate outputs which are concatenated.
Critique 1 with Scoring: A critique agent evaluates the concatenated output, scoring it on clarity, relevance, accuracy, completeness, and coherence. If the score is >= 7, the output passes to Layer 2; otherwise, it is sent back to Layer 1.
Layer 2: Agents $A_{2,1}$, $A_{2,2}$, and $A_{2,3}$ process the adjusted output, generating new intermediate outputs which are concatenated.
Critique 2 with Scoring: The critique agent evaluates the new outputs, scoring them as before. Outputs scoring >= 7 pass to Layer 3; others are sent back to Layer 2.
Layer 3: Agents $A_{3,1}$, $A_{3,2}$, and $A_{3,3}$ process the further adjusted output, generating final intermediate outputs which are concatenated.
Critique 3 with Scoring: The final critique agent evaluates and scores the outputs. Outputs scoring >= 7 pass to Layer 4; others are sent back to Layer 3.
Layer 4: The final agent $A_{4,1}$ processes the output to produce the final answer.
This workflow ensures each layer's output meets a quality threshold before advancing, thereby enhancing the final output's overall quality.
