
Iris: Global Search intent routing with SetFit classifier and async webhook#523

Open
Nayer-kotry wants to merge 19 commits into main from
iris/feature/iris-answer-intent-routing

Conversation


@Nayer-kotry Nayer-kotry commented Apr 27, 2026

Summary

  • Intent classification: A SetFit model (paraphrase-multilingual-MiniLM-L12-v2, INT8-quantized ONNX) classifies every query as trigger_ai (student wants an explanation) or skip_ai (bare keyword/navigation). Classification runs in ~2ms on CPU.
  • Smart routing: skip_ai queries skip HyDE and LLM entirely — sources are returned immediately with answer: null. trigger_ai queries go through the full HyDE + LLM pipeline.
  • Async webhook pattern: POST /api/v1/pipelines/global-search/run returns 202 immediately. Two callbacks are sent to Artemis at pipelines/global-search/runs/{runId}/status — one when thinking starts, one when done.
  • Rename: All LectureSearch* pipeline classes, DTOs, and prompts renamed to GlobalSearch* to reflect that this feature serves the Global Search modal, not just lecture search.
  • Tokenizer: Loaded from HuggingFace Hub at runtime (cached after first use) — not committed to the repo.

Artemis counterpart: ls1intum/Artemis#12612
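The routing decision above can be sketched in a few lines. This is a minimal illustration, not the PR's actual code: `route_query` and the injected `classify`/`retrieve`/`generate` callables are hypothetical stand-ins for the classifier, retrieval, and HyDE + LLM steps.

```python
from enum import Enum
from typing import Callable, List


class SearchIntent(Enum):
    TRIGGER_AI = "trigger_ai"
    SKIP_AI = "skip_ai"


def route_query(
    query: str,
    classify: Callable[[str], SearchIntent],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> dict:
    """Route a Global Search query based on its classified intent (sketch)."""
    intent = classify(query)
    sources = retrieve(query)
    if intent is SearchIntent.SKIP_AI:
        # Bare keyword/navigation: return sources immediately, answer stays null.
        return {"answer": None, "sources": sources}
    # Conceptual question: run the full HyDE + LLM step (stubbed here).
    return {"answer": generate(query, sources), "sources": sources}
```

The point of the split is that the cheap path never touches an LLM: a `skip_ai` result is just the retrieval output with `answer: null`.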

Local setup

1. Install Git LFS (required — the ONNX model is stored in LFS)

# Mac
brew install git-lfs && git lfs install

# Linux
apt install git-lfs && git lfs install  # or your package manager

If Git LFS is already installed on your machine, pulling the branch is enough — it downloads the model automatically. If it is not installed, you will get a 134-byte pointer file instead of the model and Iris will crash on the first request.
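If you want to verify the model resolved correctly, note that an un-smudged LFS pointer is a small text file starting with a fixed header. A quick check (the helper name is ours, not part of the PR):

```python
from pathlib import Path

# Git LFS pointer files begin with exactly this line.
_LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """True if `path` is an un-smudged Git LFS pointer rather than the real file."""
    try:
        head = path.read_bytes()[: len(_LFS_HEADER)]
    except OSError:
        return False
    return head == _LFS_HEADER
```

Run it against the committed `.onnx` file; if it returns True, `git lfs pull` has not replaced the pointer with the model yet.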

2. Install dependencies

poetry install

3. Update application.local.yml

Rename the pipeline key:

global_search_pipeline:    # was: lecture_search_answer_pipeline
  default:
    hyde:
      cloud: <your-model>
    answer:
      cloud: <your-model>
    embedding: <your-embedding-model>

4. Run Iris normally

On the first request, the tokenizer (~17MB) downloads from HuggingFace Hub and is cached locally. All subsequent startups load from cache instantly.
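The load-once behavior can be sketched like this. `fetch` stands in for something like `transformers.AutoTokenizer.from_pretrained` (which caches downloads locally on its own); only the `base_model` key in `meta.json` is taken from the PR description, the rest is illustrative.

```python
import json
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=1)
def load_tokenizer(model_dir: str, fetch):
    """Resolve the tokenizer named in meta.json, fetching it once per process.

    `fetch` stands in for a Hub loader; repeated calls with the same
    arguments are served from the lru_cache without re-fetching.
    """
    base_model = json.loads((Path(model_dir) / "meta.json").read_text())["base_model"]
    return fetch(base_model)
```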

Test plan

  • Send a conceptual query (e.g. "what is deep learning") — expect trigger_ai, LLM answer + sources
  • Send a bare keyword (e.g. "recursion") — expect skip_ai, sources only, answer: null
  • Send a bare verb (e.g. "explain") — expect skip_ai, no LLM call
  • Verify callback URL hits pipelines/global-search/runs/{runId}/status on Artemis side
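The callback sequence the test plan exercises can be sketched as a worker loop. This is a hypothetical reduction of the real worker: `post` stands in for the HTTP callback to Artemis, and per the PR a `trigger_ai` run sends two callbacks (thinking started, done) while a `skip_ai` run sends only the final one.

```python
def run_worker(base_url, run_id, query, classify, pipeline, post):
    """Background worker behind the 202 response (illustrative sketch)."""
    url = f"{base_url}/api/iris/internal/pipelines/global-search/runs/{run_id}/status"
    intent = classify(query)
    if intent == "trigger_ai":
        # Callback 1: tell Artemis the LLM pipeline has started thinking.
        post(url, {"stage": "thinking", "state": "in_progress"})
    answer, sources = pipeline(query, intent)
    # Final callback: answer is a string for trigger_ai, null for skip_ai.
    post(url, {"stage": "done", "answer": answer, "sources": sources})
```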

Closes IRIS-264

Summary by CodeRabbit

  • New Features

    • Global search pipeline with intelligent intent classification—automatically determines whether to apply AI processing or return sources only.
    • New /global-search/run endpoint with asynchronous operations and real-time status updates.
  • Bug Fixes

    • Improved handling of ungrounded responses to return null answers instead of explanatory text.
  • Chores

    • Removed legacy /search/ask endpoint; updated configuration and dependencies.

- SetFit model trained on 2,506 examples (1,253 per class) using
  paraphrase-multilingual-MiniLM-L12-v2 as backbone
- Exported to dynamic INT8-quantized ONNX for fast CPU inference (~2ms)
- Singleton classifier loaded via lru_cache with TRIGGER_AI fallback
  when model files are absent
- .gitattributes tracks *.onnx via Git LFS
- .gitignore excludes intermediate training artifacts
- .secrets.baseline updated with tokenizer.json false positive
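The singleton-with-fallback pattern described above looks roughly like this. A sketch only: `_Classifier`'s heuristic is a placeholder, not the trained model, and the real loader builds an ONNX session plus joblib head.

```python
from functools import lru_cache
from pathlib import Path


class _Classifier:
    """Stand-in for the real ONNX session + joblib head (hypothetical)."""

    def predict(self, query: str) -> str:
        # Placeholder heuristic, not the trained model's behavior.
        return "trigger_ai" if len(query.split()) > 1 else "skip_ai"


@lru_cache(maxsize=1)
def _get_classifier(model_dir: str):
    """Load the classifier once per process; None when model files are absent."""
    if not Path(model_dir).is_dir():
        return None
    return _Classifier()


def classify(query: str, model_dir: str) -> str:
    clf = _get_classifier(model_dir)
    if clf is None:
        # Safe default: run the full pipeline rather than silently skipping it.
        return "trigger_ai"
    return clf.predict(query)
```

The fallback direction matters: when in doubt the classifier biases toward TRIGGER_AI, so a missing model degrades to the old always-answer behavior instead of suppressing answers.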
…time

Tokenizer files (tokenizer.json, tokenizer_config.json, special_tokens_map.json)
are identical to the public paraphrase-multilingual-MiniLM-L12-v2 tokenizer on
HuggingFace Hub. Committing them caused a detect-secrets false positive that
required modifying .secrets.baseline — a security risk. Instead, the classifier
now loads the tokenizer from the Hub using the base_model name stored in meta.json.
HuggingFace caches the download locally after the first use.
…t routing and refusal suppression

- Pipeline skips HyDE and LLM entirely when intent is SKIP_AI, returning sources only with answer=null
- LLM refusal safety net: regex catches "not covered/mentioned/provided" strings and converts to null
- Updated all prompts from "lecture content" to "course content"
- implementation_id changed from lecture_search_answer_pipeline to global_search_pipeline
- Callback URL: {base_url}/api/iris/internal/pipelines/global-search/runs/{run_id}/status
- thinking() logs intent decision and sends in_progress to Artemis
- done() sends answer (str or null) and sources list
- GlobalSearchStatusUpdateDTO with answer, sources, and result fields
- POST /api/v1/pipelines/global-search/run replaces the old synchronous ask endpoint
- search.py stripped back to only the synchronous lecture search endpoint
- lecture_global_search_retrieval.py updated to use global_search_pipeline ID
- application.example.yml renamed lecture_search_answer_pipeline to global_search_pipeline
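The refusal safety net in the list above can be sketched as follows. The exact pattern in the PR may differ; this regex only covers the three phrases named in the commit message.

```python
import re

# Illustrative pattern covering "not covered/mentioned/provided" refusals.
_REFUSAL = re.compile(r"\bnot (?:covered|mentioned|provided)\b", re.IGNORECASE)


def suppress_refusal(answer):
    """Convert an LLM refusal string to None so the client renders sources only."""
    if answer and _REFUSAL.search(answer):
        return None
    return answer
```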
@Nayer-kotry Nayer-kotry requested a review from a team as a code owner April 27, 2026 02:39
@github-actions github-actions Bot added the iris label Apr 27, 2026

coderabbitai Bot commented Apr 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7eb1464e-38af-4985-8666-97f6519b2839

📥 Commits

Reviewing files that changed from the base of the PR and between 21cabd1 and fa34433.

📒 Files selected for processing (1)
  • iris/src/iris/web/routers/pipelines.py

📝 Walkthrough

Walkthrough

This PR introduces intent-based global search by adding a trained ONNX intent classifier, restructuring DTOs from lecture-specific to global search, updating the pipeline to conditionally skip AI processing based on intent, adding a new /global-search/run API endpoint with background processing, and removing the legacy /search/ask endpoint.

Changes

Global Search Intent Classification & Restructuring

  • Infrastructure & Dependencies (.gitattributes, pyproject.toml): Git LFS rules added for *.safetensors and *.onnx files; runtime and training dependencies added for ONNX/transformers inference and SetFit training.
  • Model Configuration & Training (iris/models/global_search_intent/train.py, iris/models/global_search_intent/onnx/config.json, iris/models/global_search_intent/onnx/meta.json, iris/models/global_search_intent/onnx/ort_config.json, iris/models/global_search_intent/onnx/pooling_config.json, iris/models/global_search_intent/onnx/model_quantized.onnx, iris/models/global_search_intent/.gitignore): SetFit-based binary intent classifier training script that exports to optimized/quantized ONNX; model configs define BERT architecture, pooling strategy, runtime settings, and evaluation metrics; an LFS pointer tracks the quantized model artifact.
  • Intent Classification Module (iris/src/iris/pipeline/shared/global_search_intent_classifier.py, iris/src/iris/domain/search/search_intent_dto.py): New lazy-loaded ONNX Runtime classifier module with a classify(query) function; the SearchIntent enum defines TRIGGER_AI and SKIP_AI states.
  • Data Shape & DTOs (iris/src/iris/domain/search/lecture_search_dto.py, iris/src/iris/domain/status/global_search_status_update_dto.py): Request/response DTOs renamed to GlobalSearchRequestDTO/GlobalSearchResponseDTO with added settings and optional initial_stages; the response answer becomes nullable; new GlobalSearchStatusUpdateDTO for webhook status updates with result, answer, and sources fields.
  • Pipeline Core Logic (iris/src/iris/pipeline/global_search_pipeline.py, iris/src/iris/pipeline/prompts/global_search_prompts.py): Pipeline renamed to GlobalSearchPipeline with an optional intent parameter; auto-classifies the query when intent is omitted; skips HyDE and generation for SKIP_AI, returning only sources with answer=None; normalizes empty/null answers to None; adds a regex safety net to suppress short refusal patterns; prompt updated to reference course (not lecture) content and return null-answer JSON for unrelated queries.
  • Configuration Updates (iris/application.example.yml, iris/src/iris/retrieval/lecture/lecture_global_search_retrieval.py): Pipeline configuration key renamed from lecture_search_answer_pipeline to global_search_pipeline in YAML and the retrieval module.
  • API Endpoints & Callbacks (iris/src/iris/web/routers/pipelines.py, iris/src/iris/web/routers/search.py, iris/src/iris/web/status/status_update.py): New POST /api/v1/pipelines/global-search/run endpoint spawns a background worker that classifies intent, conditionally sends a "thinking" update, executes the pipeline, and reports results via GlobalSearchCallback; legacy POST /api/v1/search/ask removed; new GlobalSearchCallback manages the status lifecycle with "Thinking" and "Responding" stages.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 A classifier learns what queries seek—
Skip or trigger, strength or weak!
From ONNX models, pooling streams,
Global search fulfills the dreams.
No more lectures, now the whole course gleams! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 44.44%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Description Check: ✅ Passed (check skipped; CodeRabbit's high-level summary is enabled).
  • Title Check: ✅ Passed. The title clearly and accurately captures the main changes: adding intent routing for Global Search using a SetFit classifier and implementing async webhook functionality.
  • Linked Issues Check: ✅ Passed (check skipped; no linked issues were found for this pull request).
  • Out of Scope Changes Check: ✅ Passed (check skipped; no linked issues were found for this pull request).


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Pylint (4.0.5)
iris/src/iris/web/routers/pipelines.py


@Nayer-kotry Nayer-kotry changed the title Iris: Global Search intent routing with SetFit classifier and async webhook Iris: Global Search intent routing with SetFit classifier and async webhook Apr 27, 2026
@Nayer-kotry Nayer-kotry marked this pull request as draft April 27, 2026 02:40

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
iris/src/iris/domain/search/lecture_search_dto.py (1)

49-60: 🛠️ Refactor suggestion | 🟠 Major

initial_stages is declared in the request DTO but never consumed by the pipeline.

GlobalSearchRequestDTO.initial_stages is defined here, but GlobalSearchCallback.__init__ (lines 538–551 in iris/src/iris/web/status/status_update.py) accepts only run_id and base_url—not initial_stages. It builds a fixed [Thinking, Responding] stages list with no support for prepending caller-provided stages, unlike every other callback (CourseChatStatusCallback, ExerciseChatStatusCallback, LectureChatCallback, etc.), which all accept and use initial_stages.

In run_global_search_pipeline_worker (pipelines.py:476–486), the callback is instantiated without passing initial_stages, whereas all other pipeline workers pass initial_stages=dto.initial_stages.

Either:

  • Add initial_stages parameter to GlobalSearchCallback.__init__ and pass it when instantiating the callback, or
  • Remove the field from GlobalSearchRequestDTO to avoid a misleading API contract.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/src/iris/domain/search/lecture_search_dto.py` around lines 49 - 60,
GlobalSearchRequestDTO.initial_stages is never used; update
GlobalSearchCallback.__init__ to accept an initial_stages: List[StageDTO] (or
Optional[List[StageDTO]]) parameter, store it on the instance, and use it when
constructing the stages list instead of always using a fixed [Thinking,
Responding] sequence; then update run_global_search_pipeline_worker to pass
dto.initial_stages into GlobalSearchCallback(...) (mirror how
CourseChatStatusCallback/ExerciseChatStatusCallback/LectureChatCallback handle
initial_stages) so caller-provided stages are prepended/used.
🧹 Nitpick comments (9)
iris/src/iris/web/status/status_update.py (1)

565-579: done() override silently swallows parent kwargs and ignores delivery success.

Two small concerns worth tightening:

  1. The **_kwargs catch-all means a future caller invoking the parent's contract (done(message=..., final_result=..., tokens=...) etc.) on a GlobalSearchCallback instance will have their args silently dropped — there's no signal that those are unsupported here. Either declare the parameters you actually accept (answer, sources, tokens) without **_kwargs, or assert/log when unexpected kwargs arrive.
  2. self.on_status_update() returns a bool indicating delivery success, but it's discarded. The base class uses that signal to decide whether to clear transient fields and to drive retry semantics; here a failed POST silently leaves the stages marked DONE without any retry path.

Also consider documenting in the docstring that this collapses both Thinking + Responding to DONE in a single call (intentional for the skip_ai path where thinking() was never invoked, so a NOT_STARTED → DONE transition is expected).

♻️ Suggested signature/return tightening
-    def done(self, answer=None, sources=None, tokens=None, **_kwargs):
-        """Advance to responding stage, attach result, and notify Artemis."""
+    def done(
+        self,
+        answer: Optional[str] = None,
+        sources: Optional[List["LectureSearchResultDTO"]] = None,
+        tokens: Optional[List[TokenUsageDTO]] = None,
+    ) -> bool:
+        """Advance to responding stage, attach result, and notify Artemis.
+
+        Collapses both Thinking and Responding stages to DONE in a single
+        webhook (intentional for the skip_ai path that bypasses thinking()).
+        """
         logger.info(
             "[ask] → callback: done  answer=%s  sources=%d",
             "present" if answer else "null",
             len(sources) if sources else 0,
         )
         self.stage.state = StageStateEnum.DONE
         self.stage = self.get_next_stage()
         if self.stage:
             self.stage.state = StageStateEnum.DONE
         self.status.answer = answer
         self.status.sources = sources or []
         self.status.tokens = tokens or []
-        self.on_status_update()
+        return self.on_status_update()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/src/iris/web/status/status_update.py` around lines 565 - 579, The done()
override on GlobalSearchCallback currently swallows unexpected kwargs and
ignores delivery success; change the signature to accept only the explicit
params (answer=None, sources=None, tokens=None) instead of **_kwargs or, if you
must accept extras, explicitly assert/log when unexpected kwargs are passed so
callers are not silently dropped; capture the boolean return from
self.on_status_update() and propagate or act on it (e.g., if delivery failed, do
not permanently mark StageStateEnum.DONE or schedule a retry/raise so the
base-class retry/cleanup behavior can run) — update the docstring to state that
this method collapses Thinking + Responding into DONE for the skip_ai path and
reference the methods done() and on_status_update() and the StageStateEnum
transitions so reviewers can find the change.
iris/src/iris/pipeline/shared/global_search_intent_classifier.py (2)

100-101: Defensive _LABELS lookup.

_LABELS[pred_class] will raise KeyError if the joblib head ever returns a class id outside {0, 1} (e.g. if a future multi-class head is loaded by mistake). Since the surrounding classify() doesn't catch — and the upstream pipeline only wraps the call in a broad try/except — a single bad model artifact would surface as a noisy 500 in the pipeline worker. .get() with the safe default keeps behavior consistent with the rest of the module (which already biases toward TRIGGER_AI whenever the classifier is unsure).

♻️ Proposed change
-        pred_class = int(self._head.predict(embedding)[0])
-        return _LABELS[pred_class]
+        pred_class = int(self._head.predict(embedding)[0])
+        return _LABELS.get(pred_class, SearchIntent.TRIGGER_AI)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/src/iris/pipeline/shared/global_search_intent_classifier.py` around
lines 100 - 101, The lookup of _LABELS[pred_class] is unsafe: change the return
to use _LABELS.get(pred_class, _LABELS[TRIGGER_AI]) after obtaining pred_class
(ensure pred_class is an int and handle empty/zero-length predict outputs before
indexing), so replace the direct index in the classify flow (where
self._head.predict(...) -> pred_class) with a .get(...) that falls back to the
TRIGGER_AI label to maintain existing bias when the head returns an out-of-range
or unexpected class id.

70-75: Hardcoded intra_op_num_threads = 4 may starve or oversubscribe pods.

The setting is fixed at 4 regardless of how many cores the pod actually has. Per ONNX Runtime's threading docs, the default (0 / unset) sizes the intra-op pool to the number of physical CPU cores, which is usually the right choice on a server. Hardcoding 4 will:

  • Oversubscribe small pods (e.g. 1–2 vCPU containers running multiple Iris pipelines in parallel), increasing context-switching and latency variance.
  • Underutilize larger hosts where >4 cores are available for a single inference.

Consider either leaving the default (0) or making it configurable via env var (e.g. INTENT_INTRA_OP_THREADS).

♻️ Proposed change
-        sess_opts.intra_op_num_threads = 4
+        # 0 = let ORT pick (= number of physical cores). Override via env if needed.
+        sess_opts.intra_op_num_threads = int(os.environ.get("INTENT_INTRA_OP_THREADS", "0"))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/src/iris/pipeline/shared/global_search_intent_classifier.py` around
lines 70 - 75, Replace the hardcoded sess_opts.intra_op_num_threads = 4 with a
configurable or default behavior: read an INTENT_INTRA_OP_THREADS env var, parse
it to an int (fall back to 0 when unset/invalid) and assign that value to
sess_opts.intra_op_num_threads, or simply leave it at 0 to let ONNX Runtime pick
physical cores; update the SessionOptions setup where sess_opts is created (the
block that calls ort.SessionOptions(), sets graph_optimization_level,
intra_op_num_threads, and constructs self._session via ort.InferenceSession) so
the code uses the parsed env var or 0 instead of the hardcoded 4.
iris/src/iris/domain/status/global_search_status_update_dto.py (1)

17-17: Use default_factory=list for sources to silence RUF012 and match sibling DTOs.

Pydantic v2 will safely deep-copy the literal [] default, so this is not a real bug — but Ruff flags it (RUF012), and the sibling GlobalSearchRequestDTO.initial_stages already uses Field(default_factory=list). Aligning here keeps the static-analysis output clean and the codebase consistent.

♻️ Proposed change
-from typing import List, Optional
+from typing import List, Optional
+
+from pydantic import Field
@@
-    sources: List[LectureSearchResultDTO] = []
+    sources: List[LectureSearchResultDTO] = Field(default_factory=list)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/src/iris/domain/status/global_search_status_update_dto.py` at line 17,
The field default for sources uses a mutable literal ([]), causing RUF012;
change the declaration in GlobalSearchStatusUpdateDTO from `sources:
List[LectureSearchResultDTO] = []` to use a factory: `sources:
List[LectureSearchResultDTO] = Field(default_factory=list)`, and add an import
for Field from pydantic if missing so it matches the pattern used by
GlobalSearchRequestDTO.initial_stages and silences the linter.
iris/src/iris/pipeline/global_search_pipeline.py (2)

172-174: Inconsistent log tag.

The router worker logs [global-search] ...; this line uses [ask], which is a leftover from the previous naming. Align for greppability.

Proposed fix
-            logger.info(
-                "[ask] LLM refusal detected in answer text — suppressing to null"
-            )
+            logger.info(
+                "[global-search] LLM refusal detected in answer text — suppressing to null"
+            )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/src/iris/pipeline/global_search_pipeline.py` around lines 172 - 174, The
log line using logger.info currently tags messages with "[ask]" which is
inconsistent; update the logger.info call in global_search_pipeline.py (the
logger.info that logs "LLM refusal detected in answer text — suppressing to
null") to use the "[global-search]" tag so it matches the router worker's
convention and is greppable across the codebase.

109-111: Rename enum to IRIS_GLOBAL_SEARCH_ANSWER_PIPELINE for consistency with component refactoring.

The class, DTO, and prompts module were renamed from LectureSearch* to GlobalSearch*, but token accounting still uses PipelineEnum.IRIS_LECTURE_SEARCH_ANSWER_PIPELINE (lines 110, 179). Create a corresponding IRIS_GLOBAL_SEARCH_ANSWER_PIPELINE enum member to ensure telemetry is properly bucketed under the new pipeline name.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/src/iris/pipeline/global_search_pipeline.py` around lines 109 - 111,
Replace the old enum member PipelineEnum.IRIS_LECTURE_SEARCH_ANSWER_PIPELINE
with a new member PipelineEnum.IRIS_GLOBAL_SEARCH_ANSWER_PIPELINE and update all
callsites that pass the enum to _append_tokens (e.g., the call using
self.hyde_llm.tokens in GlobalSearchPipeline / global_search_pipeline.py and the
other occurrence near the answer DTO usage) so telemetry/bucketing uses
IRIS_GLOBAL_SEARCH_ANSWER_PIPELINE; add the new enum entry to the PipelineEnum
definition, run a project-wide find/replace for the old member name, and ensure
imports/types compile after the rename.
iris/models/global_search_intent/train.py (3)

200-204: Drop the no-op "sanity check" read.

(BACKBONE_DIR / "pooling_config.json").read_text() discards its result and only proves the file is readable — but you wrote it ~80 lines above with pathlib.Path.write_text, so this reads as dead code. The next line (204) authoritatively re-writes the file into FINAL_DIR, which is what actually matters for the inference dir.

Proposed cleanup
-joblib.dump(model.model_head, str(FINAL_DIR / "model_head.joblib"))
-(BACKBONE_DIR / "pooling_config.json").read_text()  # sanity check readable
-
-# Copy pooling config so the inference dir is self-contained
-(FINAL_DIR / "pooling_config.json").write_text(json.dumps(pooling_config, indent=2))
+joblib.dump(model.model_head, str(FINAL_DIR / "model_head.joblib"))
+
+# Copy pooling config so the inference dir is self-contained
+(FINAL_DIR / "pooling_config.json").write_text(json.dumps(pooling_config, indent=2))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/models/global_search_intent/train.py` around lines 200 - 204, Remove the
no-op read of the pooling config that discards its result: delete the standalone
call `(BACKBONE_DIR / "pooling_config.json").read_text()` and rely on the
existing write to FINAL_DIR via `(FINAL_DIR /
"pooling_config.json").write_text(json.dumps(pooling_config, indent=2))`; ensure
you keep the joblib.dump(model.model_head, str(FINAL_DIR / "model_head.joblib"))
line and that the variable names BACKBONE_DIR, FINAL_DIR and pooling_config
remain used so the inference directory is correctly self-contained.

18-34: Avoid pip install --upgrade as a side-effect at script load time.

Running subprocess.check_call([... "pip", "install", ..., "--upgrade"]) at the top of the file mutates the active environment every time the script is imported or executed, can silently bump versions of unrelated callers in the same venv, and is a build-/CI-hostile pattern. Pin the training extras in pyproject.toml (e.g. an [project.optional-dependencies] training group) and document pip install -e .[training] in the README, then remove this auto-installer block. This also lets you drop the nosec B404/B603 markers and the # noqa: E402 late imports below.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/models/global_search_intent/train.py` around lines 18 - 34, The
top-level dependency installation block (the pkgs list and the loop invoking
subprocess.check_call for pip installs) should be removed to avoid mutating the
runtime environment at import/execute; instead add these packages to
pyproject.toml as an optional dependency group (e.g.,
[project.optional-dependencies] training) and update README to instruct users to
run pip install -e .[training] (or similar) for training setup; also remove the
associated nosec B603/B404 and any noqa: E402 late-import comments that were
only needed because of the installer block.

105-106: Use parents=True on mkdir(exist_ok=True).

BACKBONE_DIR.mkdir(exist_ok=True) (and the same call on line 165 for FINAL_DIR) will raise FileNotFoundError if any intermediate directory (e.g. the script is invoked from a different cwd) is missing. Cheap to make robust:

Proposed fix
-BACKBONE_DIR.mkdir(exist_ok=True)
+BACKBONE_DIR.mkdir(parents=True, exist_ok=True)
-FINAL_DIR.mkdir(exist_ok=True)
+FINAL_DIR.mkdir(parents=True, exist_ok=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/models/global_search_intent/train.py` around lines 105 - 106, The mkdir
calls on BACKBONE_DIR and FINAL_DIR can raise FileNotFoundError if intermediate
directories don't exist; update their creation to call mkdir with parents=True
and keep exist_ok=True (i.e., change the BACKBONE_DIR.mkdir(...) and the
FINAL_DIR.mkdir(...) calls to include parents=True) so the entire directory path
is created atomically using the Path objects BACKBONE_DIR and FINAL_DIR.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@iris/models/global_search_intent/onnx/meta.json`:
- Line 10: The meta.json entry "onnx_file" is incorrect—update the value of the
onnx_file key in iris/models/global_search_intent/onnx/meta.json to match the
actual committed ONNX filename ("model_quantized.onnx") so metadata aligns with
the repository artifact; this keeps external tooling and maintainers consistent
with the model file that classifier._find_onnx_file() will pick up.

In `@iris/models/global_search_intent/train.py`:
- Around line 207-216: The meta dictionary creation writes eval metrics that may
be numpy scalars which json.dumps can't serialize; before serializing meta (the
variable meta) convert any numpy-typed values in metrics (returned from
trainer.evaluate()) to native Python types (e.g., via a helper that walks the
metrics dict and calls .item() for numpy scalars or uses json-safe casting) and
then write FINAL_DIR / "meta.json" using json.dumps on the cleaned meta so
serialization no longer raises; update the code around where metrics is set and
where meta is created/written to ensure all values are primitive Python types.

In `@iris/src/iris/pipeline/global_search_pipeline.py`:
- Around line 161-176: The refusal-suppression currently uses a broad re.search
on answer and clears used_sources; change it to only suppress when the entire
answer matches the refusal pattern and the answer is short/minimal (e.g., use
re.fullmatch on the existing regex and require a word-count or character-length
threshold like len(answer.split()) < 8 or len(answer) < 100), and do not clear
used_sources when suppressing — only set answer = None (keep used_sources
as-is); update the code that references answer, used_sources, and logger.info to
reflect these checks so legitimate grounded answers with embedded denial phrases
and citations aren’t discarded.

In `@iris/src/iris/pipeline/shared/global_search_intent_classifier.py`:
- Around line 107-134: The _get_classifier() function is currently cached with
`@lru_cache`(maxsize=1) and returns None on any Exception which gets permanently
memoized; update it so only successful _IntentClassifier instances are cached
and transient HF Hub failures are retried: remove or change the decorator usage
on _get_classifier(), create a small non-cached loader that attempts to
construct _IntentClassifier(model_dir) and on success stores/returns the
instance (e.g., by setting a module-level variable or a separate
`@lru_cache-wrapped` factory that only returns the instance), while preserving the
existing behavior that missing model_dir (from _model_dir()) remains a one-time
warning and returns None without being cached permanently; ensure classify()
still calls the new loader and that exceptions during _IntentClassifier(...) are
logged (logger.exception) but do not cause None to be cached for the lifetime of
the process.

In `@iris/src/iris/web/routers/pipelines.py`:
- Around line 490-499: The INFO log in the router currently emits raw user input
via dto.query[:120]; change it to avoid logging user queries at INFO by either
(a) removing dto.query from the logger.info call and only logging the decision
(intent) and the LLM branch, or (b) lowering the log level to DEBUG for the line
that includes dto.query so queries are only recorded when DEBUG is enabled
(matching the in-pipeline debug at global_search_pipeline.py:100). Update the
logger invocation in the router (the logger.info that references dto.query and
SearchIntent.TRIGGER_AI) accordingly so INFO logs contain only structural
information (intent and branch) and user input is not sent to standard log
shipping.

to call mkdir with parents=True and keep exist_ok=True (i.e., change the
BACKBONE_DIR.mkdir(...) and the FINAL_DIR.mkdir(...) calls to include
parents=True) so the entire directory path is created atomically using the Path
objects BACKBONE_DIR and FINAL_DIR.

In `@iris/src/iris/domain/status/global_search_status_update_dto.py`:
- Line 17: The field default for sources uses a mutable literal ([]), causing
RUF012; change the declaration in GlobalSearchStatusUpdateDTO from `sources:
List[LectureSearchResultDTO] = []` to use a factory: `sources:
List[LectureSearchResultDTO] = Field(default_factory=list)`, and add an import
for Field from pydantic if missing so it matches the pattern used by
GlobalSearchRequestDTO.initial_stages and silences the linter.
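The suggested declaration looks like this (the `LectureSearchResultDTO` here is a stand-in with a single field, not the real DTO):

```python
from typing import List

from pydantic import BaseModel, Field


class LectureSearchResultDTO(BaseModel):
    # stand-in for the real DTO; only a title field for the sketch
    title: str


class GlobalSearchStatusUpdateDTO(BaseModel):
    # Field(default_factory=list) gives each instance its own list and
    # silences RUF012, matching the initial_stages pattern
    sources: List[LectureSearchResultDTO] = Field(default_factory=list)
```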

In `@iris/src/iris/pipeline/global_search_pipeline.py`:
- Around line 172-174: The log line using logger.info currently tags messages
with "[ask]" which is inconsistent; update the logger.info call in
global_search_pipeline.py (the logger.info that logs "LLM refusal detected in
answer text — suppressing to null") to use the "[global-search]" tag so it
matches the router worker's convention and is greppable across the codebase.
- Around line 109-111: Replace the old enum member
PipelineEnum.IRIS_LECTURE_SEARCH_ANSWER_PIPELINE with a new member
PipelineEnum.IRIS_GLOBAL_SEARCH_ANSWER_PIPELINE and update all callsites that
pass the enum to _append_tokens (e.g., the call using self.hyde_llm.tokens in
GlobalSearchPipeline / global_search_pipeline.py and the other occurrence near
the answer DTO usage) so telemetry/bucketing uses
IRIS_GLOBAL_SEARCH_ANSWER_PIPELINE; add the new enum entry to the PipelineEnum
definition, run a project-wide find/replace for the old member name, and ensure
imports/types compile after the rename.

In `@iris/src/iris/pipeline/shared/global_search_intent_classifier.py`:
- Around line 100-101: The lookup of _LABELS[pred_class] is unsafe: change the
return to use _LABELS.get(pred_class, _LABELS[TRIGGER_AI]) after obtaining
pred_class (ensure pred_class is an int and handle empty/zero-length predict
outputs before indexing), so replace the direct index in the classify flow
(where self._head.predict(...) -> pred_class) with a .get(...) that falls back
to the TRIGGER_AI label to maintain existing bias when the head returns an
out-of-range or unexpected class id.
- Around line 70-75: Replace the hardcoded sess_opts.intra_op_num_threads = 4
with a configurable or default behavior: read an INTENT_INTRA_OP_THREADS env
var, parse it to an int (fall back to 0 when unset/invalid) and assign that
value to sess_opts.intra_op_num_threads, or simply leave it at 0 to let ONNX
Runtime pick physical cores; update the SessionOptions setup where sess_opts is
created (the block that calls ort.SessionOptions(), sets
graph_optimization_level, intra_op_num_threads, and constructs self._session via
ort.InferenceSession) so the code uses the parsed env var or 0 instead of the
hardcoded 4.
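The env-var parsing for the thread count can be sketched as a pure helper (the `INTENT_INTRA_OP_THREADS` name is the one proposed in the review, not an existing Iris setting):

```python
import os


def intra_op_threads(env=None):
    """Resolve intra-op thread count from INTENT_INTRA_OP_THREADS.

    Returns 0 (letting ONNX Runtime pick physical cores) when the
    variable is unset, empty, non-numeric, or not positive. The result
    would be assigned to sess_opts.intra_op_num_threads.
    """
    env = os.environ if env is None else env
    raw = env.get("INTENT_INTRA_OP_THREADS", "")
    try:
        value = int(raw)
    except ValueError:
        return 0
    return value if value > 0 else 0
```

In the session setup this replaces the hardcoded constant: `sess_opts.intra_op_num_threads = intra_op_threads()`.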

In `@iris/src/iris/web/status/status_update.py`:
- Around line 565-579: The done() override on GlobalSearchCallback currently
swallows unexpected kwargs and ignores delivery success; change the signature to
accept only the explicit params (answer=None, sources=None, tokens=None) instead
of **_kwargs or, if you must accept extras, explicitly assert/log when
unexpected kwargs are passed so callers are not silently dropped; capture the
boolean return from self.on_status_update() and propagate or act on it (e.g., if
delivery failed, do not permanently mark StageStateEnum.DONE or schedule a
retry/raise so the base-class retry/cleanup behavior can run) — update the
docstring to state that this method collapses Thinking + Responding into DONE
for the skip_ai path and reference the methods done() and on_status_update() and
the StageStateEnum transitions so reviewers can find the change.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c197bd37-b536-4dbf-84ca-38a84e436b69

📥 Commits

Reviewing files that changed from the base of the PR and between 253dcba and 279fd79.

⛔ Files ignored due to path filters (1)
  • iris/models/global_search_intent/training_data.csv is excluded by !**/*.csv
📒 Files selected for processing (20)
  • iris/.gitattributes
  • iris/application.example.yml
  • iris/models/global_search_intent/.gitignore
  • iris/models/global_search_intent/onnx/config.json
  • iris/models/global_search_intent/onnx/meta.json
  • iris/models/global_search_intent/onnx/model_head.joblib
  • iris/models/global_search_intent/onnx/model_quantized.onnx
  • iris/models/global_search_intent/onnx/ort_config.json
  • iris/models/global_search_intent/onnx/pooling_config.json
  • iris/models/global_search_intent/train.py
  • iris/src/iris/domain/search/lecture_search_dto.py
  • iris/src/iris/domain/search/search_intent_dto.py
  • iris/src/iris/domain/status/global_search_status_update_dto.py
  • iris/src/iris/pipeline/global_search_pipeline.py
  • iris/src/iris/pipeline/prompts/global_search_prompts.py
  • iris/src/iris/pipeline/shared/global_search_intent_classifier.py
  • iris/src/iris/retrieval/lecture/lecture_global_search_retrieval.py
  • iris/src/iris/web/routers/pipelines.py
  • iris/src/iris/web/routers/search.py
  • iris/src/iris/web/status/status_update.py

Comment thread iris/models/global_search_intent/onnx/meta.json Outdated
Comment thread iris/src/iris/pipeline/global_search_pipeline.py Outdated
Comment thread iris/src/iris/pipeline/shared/global_search_intent_classifier.py Outdated
Comment thread iris/src/iris/web/routers/pipelines.py Outdated
- meta.json: correct onnx_file value to match committed filename
- train.py: coerce numpy eval metrics to float before json.dumps
- global_search_pipeline: narrow refusal-suppression to short answers only, keep used_sources on suppression
- global_search_intent_classifier: don't cache None on transient HF Hub failure, retry on next call
- pipelines router: lower query-logging to DEBUG to avoid PII in INFO logs
@Nayer-kotry Nayer-kotry marked this pull request as ready for review April 28, 2026 02:21

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@iris/models/global_search_intent/train.py`:
- Around line 207-216: The meta["onnx_file"] selection may pick a different
.onnx than the service's _find_onnx_file logic; update the code that builds meta
in train.py to use the same selection logic as
iris.pipeline.shared.global_search_intent_classifier._find_onnx_file (either
import and call _find_onnx_file or replicate its sorting/prefer-"quantized"
candidate rules) when choosing the ONNX filename from FINAL_DIR so the recorded
meta["onnx_file"] matches the artifact the runtime will actually load.
- Around line 18-34: The script currently auto-installs packages via the pkgs
list and the subprocess.check_call loop (install block using sys.executable and
pip), which undermines reproducibility; instead, add all training dependencies
(setfit, optimum[onnxruntime], onnxruntime, datasets, scikit-learn, joblib,
pandas) with pinned versions to your Poetry pyproject.toml (preferably a
dedicated "train" group), remove the dynamic installation loop from train.py,
and replace it with a fail-fast check that imports or verifies those packages
are present and raises a clear error if missing so CI/consumers must install via
Poetry rather than the script auto-upgrading them.
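The fail-fast check can be sketched as follows (module-to-package mapping taken from the dependency list above; the helper names are hypothetical):

```python
import importlib.util
import sys

# pip/poetry install names keyed by importable module name; versions are
# pinned in pyproject's train group, not here
TRAIN_DEPS = {
    "setfit": "setfit",
    "optimum": "optimum[onnxruntime]",
    "onnxruntime": "onnxruntime",
    "datasets": "datasets",
    "sklearn": "scikit-learn",
    "joblib": "joblib",
    "pandas": "pandas",
}


def missing_deps(required):
    """Return the install names of packages that cannot be imported."""
    return [pkg for mod, pkg in required.items()
            if importlib.util.find_spec(mod) is None]


def fail_fast(required):
    """Exit with an install hint instead of pip-installing at runtime."""
    missing = missing_deps(required)
    if missing:
        sys.exit("Missing training deps: " + ", ".join(missing)
                 + ". Run: poetry install --with train")
```

Calling `fail_fast(TRAIN_DEPS)` at the top of train.py surfaces a clear error without ever mutating the environment.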

In `@iris/src/iris/web/routers/pipelines.py`:
- Around line 500-517: The current branch only calls callback.thinking() when
intent == SearchIntent.TRIGGER_AI, but the endpoint contract promises two
callbacks for all searches; update the handler so callback.thinking() is emitted
unconditionally before invoking GlobalSearchPipeline (i.e., call
callback.thinking() before or regardless of the SearchIntent check), then
proceed with intent-specific logic and finally callback.done(answer=...,
sources=...); alternatively, if you intend SKIP_AI to only send the final
webhook, update the endpoint docstring to remove the promise of an initial
status—modify the code paths around SearchIntent.TRIGGER_AI /
SearchIntent.SKIP_AI and the docstring consistently so callback.thinking() and
callback.done() behavior matches the documented contract.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 54be69be-2616-4c37-b1ac-8330d3171345

📥 Commits

Reviewing files that changed from the base of the PR and between 279fd79 and fb35d7a.

⛔ Files ignored due to path filters (1)
  • iris/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • iris/models/global_search_intent/onnx/meta.json
  • iris/models/global_search_intent/train.py
  • iris/pyproject.toml
  • iris/src/iris/pipeline/global_search_pipeline.py
  • iris/src/iris/pipeline/shared/global_search_intent_classifier.py
  • iris/src/iris/web/routers/pipelines.py
  • iris/src/iris/web/status/status_update.py
✅ Files skipped from review due to trivial changes (1)
  • iris/models/global_search_intent/onnx/meta.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • iris/src/iris/web/status/status_update.py
  • iris/src/iris/pipeline/global_search_pipeline.py

Comment thread iris/models/global_search_intent/train.py Outdated
Comment thread iris/models/global_search_intent/train.py
Comment thread iris/src/iris/web/routers/pipelines.py
Nayer-kotry and others added 5 commits April 28, 2026 05:34
- train.py: replace dynamic pip install with fail-fast import check; add
  poetry train group (setfit, optimum[onnxruntime] ^2.0, datasets,
  scikit-learn, pandas) — install with: poetry install --with train
- train.py: mirror _find_onnx_file selection logic (sort + prefer quantized)
  when recording onnx_file in meta.json
- pipelines.py: update docstring to document that thinking callback is only
  sent for TRIGGER_AI (LLM) path, not SKIP_AI
Without this, every query would log a warning after the directory-missing
path returned None without setting the instance flag.
vLLM 0.20.0 switched PyPI default to CUDA 13, which required our
TORCH_INDEX_URL fix. Pin to this known-good version to prevent
future surprise breakages from unpinned >=0.19.1 resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Nayer-kotry Nayer-kotry requested review from a team as code owners April 30, 2026 12:44
@github-actions

Athena Test Results Summary

| Tests | Passed ✅ | Skipped | Failed |
| ----- | --------- | ------- | ------ |
| Athena Test Report (10 ran) | 10 passed | 0 skipped | 0 failed |

Failing Tests Summary

| Test | Result |
| ---- | ------ |
| No test annotations available | |

@github-actions

📊 Detailed Coverage Table

Combining 3 coverage files...
Parsing test-results/programming_module_programming_llm_coverage.xml...
Parsing test-results/text_module_text_llm_coverage.xml...
Parsing test-results/modeling_module_modeling_llm_coverage.xml...
Combining duplicate packages...
Creating combined coverage file: test-results/combined_coverage.xml
✅ Combined coverage saved to test-results/combined_coverage.xml
📊 Combined 31 unique packages

📊 Combined Coverage Summary:

| Package | Line Rate | Branch Rate | Status |
| ------- | --------- | ----------- | ------ |
| athena | 35.7% | 1.5% | |
| athena.helpers | 100.0% | 100.0% | |
| athena.helpers.programming | 33.0% | 0.0% | |
| athena.helpers.text | 0.0% | 100.0% | |
| athena.models | 0.0% | 0.0% | |
| athena.schemas | 76.5% | 8.3% | |
| athena.storage | 17.6% | 0.0% | |
| llm_core | 100.0% | 100.0% | |
| llm_core.core | 23.6% | 5.7% | |
| llm_core.loaders | 79.3% | 37.5% | |
| llm_core.loaders.model_loaders | 68.5% | 37.5% | |
| llm_core.models | 66.7% | 35.7% | |
| llm_core.models.providers | 73.7% | 50.0% | |
| llm_core.utils | 52.8% | 18.5% | |
| modeling.module_modeling_llm.module_modeling_llm | 100.0% | 100.0% | |
| modeling.module_modeling_llm.module_modeling_llm.apollon_transformer | 43.2% | 8.8% | |
| modeling.module_modeling_llm.module_modeling_llm.apollon_transformer.parser | 79.2% | 60.2% | |
| modeling.module_modeling_llm.module_modeling_llm.core | 88.9% | 50.0% | |
| modeling.module_modeling_llm.module_modeling_llm.models | 100.0% | 100.0% | |
| modeling.module_modeling_llm.module_modeling_llm.prompts | 100.0% | 100.0% | |
| modeling.module_modeling_llm.module_modeling_llm.utils | 100.0% | 50.0% | |
| programming.module_programming_llm.module_programming_llm | 100.0% | 100.0% | |
| programming.module_programming_llm.module_programming_llm.helpers | 27.6% | 0.0% | |
| programming.module_programming_llm.module_programming_llm.prompts | 100.0% | 100.0% | |
| text.module_text_llm.module_text_llm | 72.7% | 12.5% | |
| text.module_text_llm.module_text_llm.default_approach | 66.4% | 36.1% | |
| text.module_text_llm.module_text_llm.default_approach.prompts | 100.0% | 100.0% | |
| text.module_text_llm.module_text_llm.default_approach.schemas | 100.0% | 100.0% | |
| text.module_text_llm.module_text_llm.divide_and_conquer | 34.0% | 0.0% | |
| text.module_text_llm.module_text_llm.helpers | 55.4% | 26.7% | |
| text.module_text_llm.module_text_llm.self_consistency | 46.2% | 0.0% | |

Total packages: 31

Note: Coverage thresholds: ✅ (≥70%), ❌ (<70%)


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
athena/athena/athena/storage/submission_storage.py (1)

62-63: ⚡ Quick win

Consider adding a debug log for disabled-DB write no-op.

store_submissions silently returns when DB is off; a debug-level breadcrumb here would make misconfiguration diagnosis easier without adding warning noise.

Proposed change
 from athena.contextvars import get_lms_url
 from athena.database import get_db, is_database_enabled
+from athena.logger import logger
 from athena.schemas import Submission
@@
 def store_submissions(submissions: List[Submission], lms_url: Optional[str] = None):
     """Stores the given submissions, all at once."""
     if not is_database_enabled():
+        logger.debug("Database disabled: skipping store_submissions for %d submissions", len(submissions))
         return
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/athena/athena/storage/submission_storage.py` around lines 62 - 63, In
store_submissions, add a debug log right before the early return when
is_database_enabled() is false (i.e., log that DB writes are disabled and the
function is a no-op); use the module logger (or existing logger variable) and
include contextual info such as the number of submissions (e.g.,
len(submissions)) or an identifier to make the breadcrumb useful for debugging
without raising warnings.
iris/models/global_search_intent/train.py (1)

39-40: ⚡ Quick win

Anchor data/artifact paths to the script directory.

Current relative paths depend on the caller’s working directory. Resolving from __file__ makes the script robust in CI and local execution.

💡 Proposed refactor
+SCRIPT_DIR = Path(__file__).resolve().parent
-df = pd.read_csv("models/global_search_intent/training_data.csv")
+df = pd.read_csv(SCRIPT_DIR / "training_data.csv")
...
-BACKBONE_DIR = Path("models/global_search_intent/backbone")
+BACKBONE_DIR = SCRIPT_DIR / "backbone"
...
-ONNX_BASE_DIR = Path("models/global_search_intent/onnx_base")
+ONNX_BASE_DIR = SCRIPT_DIR / "onnx_base"
...
-ONNX_OPT_DIR = Path("models/global_search_intent/onnx_optimized")
+ONNX_OPT_DIR = SCRIPT_DIR / "onnx_optimized"
...
-FINAL_DIR = Path("models/global_search_intent/onnx")
+FINAL_DIR = SCRIPT_DIR / "onnx"

Also applies to: 100-101, 135-136, 143-144, 159-160

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/models/global_search_intent/train.py` around lines 39 - 40, The CSV and
other artifact paths in train.py are relative to the caller's cwd; change them
to be anchored to the script directory by computing a base path from __file__
(e.g., base_dir = Path(__file__).resolve().parent) and replace direct string
literals passed to pd.read_csv (and any writes/loads later) with base_dir /
"models/global_search_intent/training_data.csv" (and the corresponding artifact
filenames referenced at the other occurrences). Update the uses of pd.read_csv
and any file write/load calls so they build paths with Path.joinpath or the /
operator using that base_dir to make df loading and artifact operations robust
in CI and local runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@iris/models/global_search_intent/train.py`:
- Around line 42-43: The f-strings use double quotes both for the string and for
the df["label"] key which raises a syntax error; update the print statements
(the two occurrences around the top prints and the later prints at lines 61–62)
to use a different quoting style for the dictionary access (e.g., df['label'])
or escape the quotes so the f-string delimiter and the key quotes don't
conflict, leaving the rest of the print formatting unchanged.

---

Nitpick comments:
In `@athena/athena/athena/storage/submission_storage.py`:
- Around line 62-63: In store_submissions, add a debug log right before the
early return when is_database_enabled() is false (i.e., log that DB writes are
disabled and the function is a no-op); use the module logger (or existing logger
variable) and include contextual info such as the number of submissions (e.g.,
len(submissions)) or an identifier to make the breadcrumb useful for debugging
without raising warnings.

In `@iris/models/global_search_intent/train.py`:
- Around line 39-40: The CSV and other artifact paths in train.py are relative
to the caller's cwd; change them to be anchored to the script directory by
computing a base path from __file__ (e.g., base_dir =
Path(__file__).resolve().parent) and replace direct string literals passed to
pd.read_csv (and any writes/loads later) with base_dir /
"models/global_search_intent/training_data.csv" (and the corresponding artifact
filenames referenced at the other occurrences). Update the uses of pd.read_csv
and any file write/load calls so they build paths with Path.joinpath or the /
operator using that base_dir to make df loading and artifact operations robust
in CI and local runs.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ea633a56-c8a0-4162-bd5d-3ec395bba250

📥 Commits

Reviewing files that changed from the base of the PR and between fb35d7a and 21cabd1.

⛔ Files ignored due to path filters (1)
  • iris/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
  • athena/athena/athena/app.py
  • athena/athena/athena/database.py
  • athena/athena/athena/endpoints.py
  • athena/athena/athena/storage/exercise_storage.py
  • athena/athena/athena/storage/feedback_storage.py
  • athena/athena/athena/storage/structured_grading_criterion_storage.py
  • athena/athena/athena/storage/submission_storage.py
  • athena/env_example/module_programming_llm.env
  • athena/env_example/module_text_llm.env
  • athena/modules/modeling/module_modeling_llm/module_modeling_llm/__main__.py
  • athena/modules/programming/module_programming_llm/module_programming_llm/__main__.py
  • athena/modules/programming/module_programming_quality_llm/module_programming_quality_llm/__main__.py
  • athena/modules/text/module_text_llm/module_text_llm/__main__.py
  • iris/models/global_search_intent/train.py
  • iris/pyproject.toml
  • iris/src/iris/pipeline/shared/global_search_intent_classifier.py
  • iris/src/iris/web/routers/pipelines.py
  • logos/logos-workernode/Dockerfile
✅ Files skipped from review due to trivial changes (3)
  • athena/env_example/module_programming_llm.env
  • logos/logos-workernode/Dockerfile
  • athena/env_example/module_text_llm.env
🚧 Files skipped from review as they are similar to previous changes (2)
  • iris/pyproject.toml
  • iris/src/iris/pipeline/shared/global_search_intent_classifier.py

Comment on lines +42 to +43
print(f" trigger_ai : {df["label"].sum()}")
print(f" skip_ai : {(df["label"] == 0).sum()}")

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify parser-level correctness after the fix.
python -m py_compile iris/models/global_search_intent/train.py

Repository: ls1intum/edutelligence

Length of output: 254


🏁 Script executed:

cat -n iris/models/global_search_intent/train.py | sed -n '35,70p'

Repository: ls1intum/edutelligence

Length of output: 1763


🏁 Script executed:

rg 'print\(f"[^"]*{[^}]*\["[^"]*"\]' iris/models/global_search_intent/train.py

Repository: ls1intum/edutelligence

Length of output: 195


🏁 Script executed:

rg 'print.*f".*\["' iris/models/global_search_intent/train.py

Repository: ls1intum/edutelligence

Length of output: 280


🏁 Script executed:

# Create a test version with the proposed fix applied
sed -i 's/df\["label"\]/df["'"'"'label'"'"']/g; s/df_eval\["label"\]/df_eval["'"'"'label'"'"']/g' iris/models/global_search_intent/train.py

# Verify the fix resolves the syntax error
python -m py_compile iris/models/global_search_intent/train.py

Repository: ls1intum/edutelligence

Length of output: 300


🏁 Script executed:

# Restore the original file from git
git checkout iris/models/global_search_intent/train.py

Repository: ls1intum/edutelligence

Length of output: 96


🏁 Script executed:

python <<'EOF'
# Read the file
with open('iris/models/global_search_intent/train.py', 'r') as f:
    content = f.read()

# Apply the fix: replace df["label"] with df['label'] and df_eval["label"] with df_eval['label']
# Only within f-strings (print statements)
fixed_content = content.replace('df["label"]', "df['label']").replace('df_eval["label"]', "df_eval['label']")

# Write the fixed content
with open('iris/models/global_search_intent/train.py', 'w') as f:
    f.write(fixed_content)

print("File updated")
EOF

# Verify the fix resolves the syntax error
python -m py_compile iris/models/global_search_intent/train.py

Repository: ls1intum/edutelligence

Length of output: 79


🏁 Script executed:

cat -n iris/models/global_search_intent/train.py | sed -n '40,62p'

Repository: ls1intum/edutelligence

Length of output: 1212


Fix invalid f-string quoting (syntax error prevents execution).

Lines 42–43 and 61–62 use double quotes for both the f-string delimiter and the df["label"] dictionary access. On the Python versions this project targets this is invalid syntax (reusing the delimiter quote inside a replacement field was only permitted in Python 3.12 by PEP 701) and causes an f-string: unmatched '[' error at parse time.

Proposed fix
-print(f"  trigger_ai   : {df["label"].sum()}")
-print(f"  skip_ai      : {(df["label"] == 0).sum()}")
+print(f"  trigger_ai   : {df['label'].sum()}")
+print(f"  skip_ai      : {(df['label'] == 0).sum()}")
...
-print(f"    trigger_ai : {df_eval["label"].sum()}")
-print(f"    skip_ai    : {(df_eval["label"] == 0).sum()}\n")
+print(f"    trigger_ai : {df_eval['label'].sum()}")
+print(f"    skip_ai    : {(df_eval['label'] == 0).sum()}\n")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-print(f"  trigger_ai   : {df["label"].sum()}")
-print(f"  skip_ai      : {(df["label"] == 0).sum()}")
+print(f"  trigger_ai   : {df['label'].sum()}")
+print(f"  skip_ai      : {(df['label'] == 0).sum()}")
🧰 Tools
🪛 Flake8 (7.3.0)

[error] 42-42: SyntaxError: f-string: unmatched '['

(E999)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/models/global_search_intent/train.py` around lines 42 - 43, The
f-strings use double quotes both for the string and for the df["label"] key
which raises a syntax error; update the print statements (the two occurrences
around the top prints and the later prints at lines 61–62) to use a different
quoting style for the dictionary access (e.g., df['label']) or escape the quotes
so the f-string delimiter and the key quotes don't conflict, leaving the rest of
the print formatting unchanged.

@github-actions github-actions Bot removed the athena label May 3, 2026
GlobalSearchPipeline was always instantiated with the default local=False,
causing it to use cloud models regardless of the user's Artemis LLM setting.