Iris: Calculate verbalized confidence
#507
base: main
Changes from 2 commits
@@ -0,0 +1,14 @@

```jinja
{# Confidence scoring addon — basic_probscore method (Yang et al. 2024) for small models.
   This section is appended to the main system prompt. #}
---

## Confidence Scoring

After your answer, state the probability between 0.0 and 1.0 that your answer is correct.

**Output format — you MUST follow this exactly:**

Answer: <your response to the student>
Probability: <a single decimal between 0.0 and 1.0>

Do not include any text after the Probability line.
```
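For illustration, a response in this two-line format can be picked apart with ordinary string handling. This is a minimal sketch, not the repository's parser, and the sample text is invented:

```python
import re

# Hypothetical model output following the small-model format above.
sample = (
    "Answer: A mutex allows only one thread into a critical section.\n"
    "Probability: 0.9"
)

# The format guarantees exactly two lines: the answer, then the probability.
answer_line, probability_line = sample.splitlines()
answer = re.sub(r"^answer\s*:\s*", "", answer_line, flags=re.IGNORECASE)
probability = float(
    re.sub(r"^probability\s*:\s*", "", probability_line, flags=re.IGNORECASE)
)

print(answer)       # answer text with its "Answer:" prefix stripped
print(probability)  # 0.9
```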
@@ -0,0 +1,50 @@

```jinja
{# Confidence scoring addon — combo method (Yang et al. 2024) for large models.
   This section is appended to the main system prompt. #}
---

## Confidence Scoring

Your response must include a probability that your answer is correct, expressed as a decimal between 0.0 and 1.0.

When assigning the probability, consider:
- **Task difficulty**: Is the question straightforward or does it require deep reasoning?
- **Knowledge availability**: Did you have access to sufficient, reliable information (via tools or training) to answer confidently?
- **Uncertainty in the question**: Is the question ambiguous or could it be interpreted in multiple ways?

Do not anchor on a comfortable middle value. Calibrate honestly: if you are nearly certain, use a high value; if you are mostly guessing, use a low value.

Here are examples of how to format your response:

---
**Example 1** (very low confidence — topic outside course scope with no tool access):
Guess: I'm not sure this is covered in the course materials, but binary search trees store elements such that each node's left subtree contains only smaller values and the right subtree only larger values, enabling O(log n) average-case search.
Probability: 0.08

---
**Example 2** (low confidence — question is vague and tools returned limited information):
Guess: The submission deadline is likely the end of the semester, but I couldn't find a specific date in the course FAQ or exercise details. I recommend checking the course announcements.
Probability: 0.24

---
**Example 3** (moderate confidence — general knowledge, no direct course evidence):
Guess: The gradient descent algorithm updates model parameters by moving in the direction of the negative gradient of the loss function with respect to those parameters, scaled by a learning rate.
Probability: 0.47

---
**Example 4** (high confidence — answered from retrieved lecture content):
Guess: According to the lecture slides, a mutex (mutual exclusion lock) ensures that only one thread can access a critical section at a time, preventing race conditions.
Probability: 0.77

---
**Example 5** (very high confidence — directly found in course FAQ):
Guess: Yes, you can submit up to 3 days late with a 10% penalty per day, as stated in the course FAQ.
Probability: 0.89

---

**Output format — you MUST follow this exactly:**

Guess: <your best response to the student>
Probability: <a single decimal between 0.0 and 1.0>

Do not include any text after the Probability line.
```
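A quick sketch of how a combo-format response separates into its parts: the probability sits on the final line, and every line above it belongs to the guess. The sample text below is invented, and the prefix stripping is deliberately simplistic compared to the actual parser:

```python
# Invented combo-format response; "Probability:" is always the last line.
sample = (
    "Guess: The deadline is likely the end of the semester,\n"
    "but check the course announcements to be sure.\n"
    "Probability: 0.24"
)

# Split off the last line; the rest is the (possibly multi-line) guess.
*guess_lines, probability_line = sample.splitlines()
guess = "\n".join(guess_lines).removeprefix("Guess: ")
probability = float(probability_line.removeprefix("Probability: "))

print(guess)        # multi-line guess without its "Guess:" prefix
print(probability)  # 0.24
```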
@@ -0,0 +1,87 @@

```python
import re

_LARGE_MODEL_PATTERNS = [
    "70b",
    "72b",
    "110b",
    "32b",
    "gpt-4",
    "gpt-5",
    "gpt-oss",
]

_ANSWER_PREFIX_RE = re.compile(
    r"^(?:answer|guess)\s*:\s*",
    re.IGNORECASE,
)

_PROBABILITY_LINE_RE = re.compile(
    r"(?:probability|confidence|p)\s*:\s*(-?\d+(?:\.\d+)?)(\s*%)?",
    re.IGNORECASE,
)
```
Comment on lines +12 to +15

**Contributor**

Probability parsing is too permissive and can misread normal answer text as confidence. Using an unanchored regex with the bare `p:` alternative means an ordinary answer line containing something like `p: 5` can be picked up as a probability. Anchoring the pattern to the whole line and matching with `re.match` avoids this.

Proposed fix:

```diff
 _PROBABILITY_LINE_RE = re.compile(
-    r"(?:probability|confidence|p)\s*:\s*(-?\d+(?:\.\d+)?)(\s*%)?",
+    r"^\s*(?:probability|confidence|p)\s*:\s*(-?(?:\d+(?:\.\d+)?|\.\d+))\s*(%)?\s*$",
     re.IGNORECASE,
 )
@@
-        m = _PROBABILITY_LINE_RE.search(lines[i])
+        m = _PROBABILITY_LINE_RE.match(lines[i])
```

Also applies to: 55-56
```python
def is_large_model(model_id: str) -> bool:
    """Return True if the model should use the combo confidence prompt.

    Large models include any GPT-4/GPT-5 generation model (including mini
    variants and gpt-oss) and open-source models with ≥32B parameters.
    Everything else is treated as small.
    """
    lower = model_id.lower()
    return any(pattern in lower for pattern in _LARGE_MODEL_PATTERNS)


def parse_confidence_response(raw_response: str) -> tuple[str, float]:
    """Extract (answer_text, probability) from a verbalized confidence response.

    Handles both large-model format (Guess: ... / Probability: ...) and
    small-model format (Answer: ... / Probability: ...). Also accepts
    "Confidence:" and "P:" as alternatives to "Probability:", and values
    expressed as percentages (e.g. "85%" → 0.85). The probability is
    clamped to [0.0, 1.0].

    If parsing fails for any reason this function returns (raw_response, 0.0)
    so that callers never receive an exception. A score of 0.0 will be
    treated as below threshold and discarded by Artemis.
    """
    try:
        lines = raw_response.strip().splitlines()

        # Find the last line that matches a probability pattern.
        prob_line_index = None
        probability = 0.0
        for i in range(len(lines) - 1, -1, -1):
            m = _PROBABILITY_LINE_RE.search(lines[i])
            if m:
                prob_line_index = i
                raw_value = float(m.group(1))
                is_percent = bool(m.group(2) and m.group(2).strip() == "%")
                if is_percent:
                    probability = raw_value / 100.0
                else:
                    probability = raw_value
                probability = max(0.0, min(1.0, probability))
                break

        if prob_line_index is None:
            # No probability line found — safe fallback.
            return raw_response, 0.0

        # Everything before the probability line is the answer block.
        answer_lines = lines[:prob_line_index]

        # Strip the "Answer:" / "Guess:" prefix from the first line if present.
        if answer_lines:
            answer_lines[0] = _ANSWER_PREFIX_RE.sub("", answer_lines[0])

        answer_text = "\n".join(answer_lines).strip()

        # If nothing is left after stripping, fall back to the raw response.
        if not answer_text:
            answer_text = raw_response

        return answer_text, probability

    except Exception:  # pylint: disable=broad-except
        return raw_response, 0.0
```
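To show the behaviors the docstring promises (last matching line wins, percentage conversion, clamping, and the safe fallback), here is a condensed, self-contained re-implementation of the scanning logic for illustration. The module above is the authoritative version; `parse` here is a hypothetical stand-in for `parse_confidence_response`:

```python
import re

# Condensed copies of the module's patterns.
_PROBABILITY_LINE_RE = re.compile(
    r"(?:probability|confidence|p)\s*:\s*(-?\d+(?:\.\d+)?)(\s*%)?",
    re.IGNORECASE,
)
_ANSWER_PREFIX_RE = re.compile(r"^(?:answer|guess)\s*:\s*", re.IGNORECASE)


def parse(raw: str) -> tuple[str, float]:
    """Minimal sketch of parse_confidence_response: scan lines bottom-up."""
    lines = raw.strip().splitlines()
    for i in range(len(lines) - 1, -1, -1):
        m = _PROBABILITY_LINE_RE.search(lines[i])
        if m:
            value = float(m.group(1))
            if m.group(2):  # percentage form, e.g. "85%"
                value /= 100.0
            value = max(0.0, min(1.0, value))  # clamp to [0.0, 1.0]
            answer = _ANSWER_PREFIX_RE.sub("", "\n".join(lines[:i])).strip()
            return (answer or raw), value
    return raw, 0.0  # no probability line: safe fallback


print(parse("Guess: Check the FAQ.\nProbability: 0.89"))   # ('Check the FAQ.', 0.89)
print(parse("Answer: 42\nConfidence: 85%"))                # ('42', 0.85)
print(parse("no probability given"))                       # falls back with 0.0
```

Note how `parse("Probability: 1.5")` clamps to 1.0 and, with no answer text left after stripping, returns the raw response as the answer, matching the fallback in the real function.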