Enhanced Workflow with Scoring Mechanism for Model-Only Process
Scoring Criteria
The criteria for evaluation will remain the same:
Clarity: How clear and understandable the response is.
Relevance: How relevant the response is to the prompt.
Accuracy: How factually correct the response is.
Completeness: How thoroughly the response addresses the prompt.
Coherence: How logically consistent the response is.
Each criterion is scored from 0 to 10, and the overall score is the average of these scores. The minimum passing score will be set at 7.
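As a minimal sketch of the scoring rule above, the overall score can be computed as a plain average of the five criteria and compared against the threshold of 7. The names `CRITERIA`, `PASSING_SCORE`, `overall_score`, and `passes` are hypothetical and chosen only for illustration; the range check is an assumption about how out-of-range scores should be handled.

```python
from typing import Dict

# The five criteria listed above; each is scored from 0 to 10.
CRITERIA = ("clarity", "relevance", "accuracy", "completeness", "coherence")
PASSING_SCORE = 7.0  # minimum passing score from the proposal


def overall_score(scores: Dict[str, float]) -> float:
    """Return the average of the five criterion scores."""
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    for name in CRITERIA:
        # Assumption: scores outside 0..10 are rejected rather than clamped.
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    return sum(scores[name] for name in CRITERIA) / len(CRITERIA)


def passes(scores: Dict[str, float]) -> bool:
    """True when the overall score meets the passing threshold."""
    return overall_score(scores) >= PASSING_SCORE


example = {
    "clarity": 8,
    "relevance": 9,
    "accuracy": 6,
    "completeness": 7,
    "coherence": 7,
}
print(overall_score(example))  # 7.4
print(passes(example))         # True
```

Note that an unweighted average lets a weak criterion (accuracy at 6 here) pass when the others compensate.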
Workflow Steps
Input Prompt:
The initial input is fed into the first layer.
Layer 1:
Three agents, $A_{1,1}$, $A_{1,2}$, and $A_{1,3}$, process the input independently.
Intermediate outputs are generated and concatenated.
Critique 1 with Scoring:
A critique agent evaluates the concatenated output using the criteria (Clarity, Relevance, Accuracy, Completeness, Coherence).
Each criterion is scored from 0 to 10.
The overall score is the average of the criteria scores.
If the overall score is >= 7, the output is passed to Layer 2.
If the overall score is < 7, the output is sent back to Layer 1 for revision by the agents.
Layer 2:
The adjusted output from Critique 1 is processed by agents $A_{2,1}$, $A_{2,2}$, and $A_{2,3}$.
Intermediate outputs are generated and concatenated.
Critique 2 with Scoring:
A critique agent evaluates the outputs from Layer 2 using the same criteria.
Each criterion is scored from 0 to 10, and the overall score is the average of the criteria scores.
If the overall score is >= 7, the output is passed to Layer 3.
If the overall score is < 7, the output is sent back to Layer 2 for revision by the agents.
Layer 3:
The adjusted output from Critique 2 is processed by agents $A_{3,1}$, $A_{3,2}$, and $A_{3,3}$.
Intermediate outputs are generated and concatenated.
Critique 3 with Scoring:
A final critique agent evaluates the outputs from Layer 3.
Each criterion is scored from 0 to 10, and the overall score is the average of the criteria scores.
If the overall score is >= 7, the output is passed to Layer 4.
If the overall score is < 7, the output is sent back to Layer 3 for revision by the agents.
Layer 4:
The final adjusted output is processed by agent $A_{4,1}$.
The Final Output is produced.
Final Output:
The output from Layer 4 is the final output, having passed all critique evaluations and scoring criteria.
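The layer-plus-critique step used in Layers 1 to 3 could be sketched roughly as follows. This is an illustration under stated assumptions, not a reference implementation: agents and the critique agent are modeled as plain callables, the rejected output is fed back to the same agents as their revision input, and a `max_revisions` cap is added so the loop terminates (the proposal does not specify a limit). All names here are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Agent = Callable[[str], str]     # takes text, returns a response
Critic = Callable[[str], float]  # takes text, returns an overall score (0 to 10)


def run_gated_layer(
    agents: Sequence[Agent],
    critic: Critic,
    layer_input: str,
    threshold: float = 7.0,
    max_revisions: int = 3,  # assumption: the proposal sets no cap
) -> Tuple[str, float, int]:
    """Run one layer until its concatenated output passes the critique.

    Returns (accepted_output, score, revisions_used).
    """
    current = layer_input
    for revision in range(max_revisions + 1):
        # Each agent processes the same input independently;
        # the intermediate outputs are then concatenated.
        combined = "\n".join(agent(current) for agent in agents)
        score = critic(combined)
        if score >= threshold:
            return combined, score, revision
        # Below threshold: send the output back to this layer for revision.
        current = combined
    raise RuntimeError(f"no output reached {threshold} after {max_revisions} revisions")
```

In a real system each `Agent` would wrap a model call and the `Critic` would wrap a critique agent that returns the averaged criteria score.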
Diagram Summary:
Input Prompt -> Layer 1 -> Critique 1 with Scoring -> (Pass if score >= 7 or Revise if score < 7) -> Layer 2 -> Critique 2 with Scoring -> (Pass or Revise) -> Layer 3 -> Critique 3 with Scoring -> (Pass or Revise) -> Layer 4 -> Final Output
Example Diagram Description:
Input Prompt: Initial input is fed into Layer 1.
Layer 1: Agents $A_{1,1}$, $A_{1,2}$, and $A_{1,3}$ process the input independently, generating intermediate outputs which are concatenated.
Critique 1 with Scoring: A critique agent evaluates the concatenated output, scoring it on clarity, relevance, accuracy, completeness, and coherence. If the score is >= 7, the output passes to Layer 2; otherwise, it is sent back to Layer 1.
Layer 2: Agents $A_{2,1}$, $A_{2,2}$, and $A_{2,3}$ process the adjusted output, generating new intermediate outputs which are concatenated.
Critique 2 with Scoring: The critique agent evaluates the new outputs, scoring them as before. Outputs scoring >= 7 pass to Layer 3; others are sent back to Layer 2.
Layer 3: Agents $A_{3,1}$, $A_{3,2}$, and $A_{3,3}$ process the further adjusted output, generating final intermediate outputs which are concatenated.
Critique 3 with Scoring: The final critique agent evaluates and scores the outputs. Outputs scoring >= 7 pass to Layer 4; others are sent back to Layer 3.
Layer 4: The final agent $A_{4,1}$ processes the output to produce the final answer.
This workflow ensures each layer's output meets a quality threshold before advancing, thereby enhancing the final output's overall quality.
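Putting the whole flow together, one possible end-to-end sketch is shown below. It assumes stub callables in place of real model-backed agents, a single critique function shared by all three gates, and a hypothetical `max_revisions` cap; none of these details are fixed by the proposal. The returned trace records `(layer, revision, score)` for each critique so it is visible where revisions happened.

```python
from typing import Callable, List, Sequence, Tuple

Agent = Callable[[str], str]
Critic = Callable[[str], float]


def run_pipeline(
    prompt: str,
    gated_layers: Sequence[Sequence[Agent]],  # Layers 1 to 3, three agents each
    final_agent: Agent,                        # Layer 4
    critic: Critic,
    threshold: float = 7.0,
    max_revisions: int = 3,  # assumption: not specified in the proposal
) -> Tuple[str, List[Tuple[int, int, float]]]:
    text = prompt
    trace: List[Tuple[int, int, float]] = []
    for layer_index, agents in enumerate(gated_layers, start=1):
        for revision in range(max_revisions + 1):
            # Agents in a layer work independently; outputs are concatenated.
            candidate = "\n".join(agent(text) for agent in agents)
            score = critic(candidate)
            trace.append((layer_index, revision, score))
            text = candidate  # next input: either the next layer or a revision
            if score >= threshold:
                break
        else:
            raise RuntimeError(f"layer {layer_index} never reached {threshold}")
    # Layer 4 has a single agent and no critique after it.
    return final_agent(text), trace
```

Usage with stub agents might look like `run_pipeline("prompt", [layer1, layer2, layer3], a41, critic)`, where each `layerN` is a list of three callables.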