
Athena: Implement verification tests for basic approach of module_text_llm#113

Merged
maximiliansoelch merged 32 commits into main from feature/verification-tests/step-2
May 30, 2025
Contributor

@alekspetrov9e alekspetrov9e commented May 4, 2025

Motivation and Context

This PR improves the test coverage of the text feedback generation system by adding test cases that verify the LLM's ability to detect problems in simple text exercises and provide feedback on them. It is a stacked PR that extends the mock tests for the same module. The mock tests run in the pipeline and pass. The real tests can be run locally from the module_text_llm directory, using the module_text_llm virtual environment.

Description

Added three test cases that verify feedback generation for different types of programming exercises:

  • a recursive factorial function missing a base case and proper decrement
  • a string reversal function incorrectly using built-in methods
  • a list deduplication function that doesn't preserve order
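
The submissions used in the tests are not shown in this description. As a rough illustration only, the three kinds of flawed answers listed above might look like the following sketch. All function names here are hypothetical and are not taken from the PR.

```python
def factorial_missing_base_case(n):
    # Hypothetical flawed submission: there is no base case and the argument
    # is never decremented, so the recursion cannot terminate and Python
    # eventually raises RecursionError.
    return n * factorial_missing_base_case(n)


def reverse_with_builtin(s):
    # Hypothetical flawed submission: the result is correct, but it relies on
    # built-in slicing where the exercise presumably asks for a manual loop.
    return s[::-1]


def dedupe_without_order(items):
    # Hypothetical flawed submission: duplicates are removed, but a set does
    # not guarantee that the original element order is kept.
    return list(set(items))


def dedupe_preserving_order(items):
    # Shown for comparison: dict keys keep insertion order, so the first
    # occurrence of each element stays in place.
    return list(dict.fromkeys(items))
```

A verification test would then feed such a submission to the module and check that the generated feedback mentions the specific flaw.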


@alekspetrov9e alekspetrov9e changed the title Athena: Implemented verification tests for module_text_llm Athena: Implemented verification tests for basic approach of the text module May 4, 2025
@alekspetrov9e alekspetrov9e changed the title Athena: Implemented verification tests for basic approach of the text module Athena: Implemented verification tests for basic approach of the text module May 4, 2025
@github-actions github-actions Bot removed the github label May 12, 2025
Member

@maximiliansoelch maximiliansoelch left a comment


Running the tests locally fails.
The GH test action only succeeds because it does not execute any tests.
Please update the GH action file accordingly. It would also be great to have the action log the number of executed tests, including how many passed and how many failed.
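
One possible way to address both points, sketched under assumptions: if the test run writes a JUnit XML report (pytest supports this through its `--junitxml` option), a small script could count the executed, passed, and failed tests and treat a run with zero executed tests as a failure. The helper names `summarize_junit` and `check_report` are hypothetical and do not come from this PR.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Count executed, passed, failed and skipped tests in a JUnit XML report."""
    root = ET.fromstring(xml_text)
    # iter() also yields the root element, so a report with a <testsuites>
    # wrapper and a report with a bare <testsuite> root are both handled.
    # Assumption: suites are not nested inside each other.
    suites = list(root.iter("testsuite"))
    total = sum(int(s.get("tests", 0)) for s in suites)
    failed = sum(
        int(s.get("failures", 0)) + int(s.get("errors", 0)) for s in suites
    )
    skipped = sum(int(s.get("skipped", 0)) for s in suites)
    return {
        "executed": total - skipped,
        "passed": total - skipped - failed,
        "failed": failed,
        "skipped": skipped,
    }


def check_report(xml_text):
    """Print the counts and return an exit code: 0 only if tests ran and none failed."""
    summary = summarize_junit(xml_text)
    print(
        f"executed={summary['executed']} passed={summary['passed']} "
        f"failed={summary['failed']} skipped={summary['skipped']}"
    )
    # A run that executed nothing is reported as a failure, so an empty
    # test selection cannot make the action look green.
    if summary["executed"] == 0 or summary["failed"] > 0:
        return 1
    return 0
```

The action could call `check_report` on the report contents and use the return value as the process exit code.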

Comment thread athena/tests/modules/text/module_text_llm/real/test_basic_approach_real.py Outdated
Comment thread athena/tests/modules/text/module_text_llm/real/conftest.py
@maximiliansoelch maximiliansoelch requested review from a team and removed request for a team May 30, 2025 11:20
Member

@maximiliansoelch maximiliansoelch left a comment


This PR adds real E2E tests against OpenAI, which can be run locally.

There are multiple improvements for future PRs:

  • Test files should also be included in the lint config; right now all test files have many lint issues.
  • Instead of the unintuitive working directory changes in the test_modules.py script (see review comment), we should figure out a way to set up the tests properly.
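
The test_modules.py script is not shown in this conversation, so the following is only a hypothetical sketch of one way to make directory changes easier to follow: a context manager that switches the working directory for a limited block and always restores the previous one. The name `working_directory` is an assumption, not code from the PR.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily switch the process working directory, then restore it."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the original directory even if the block raised.
        os.chdir(previous)
```

With such a helper, each module's tests could run inside a `with working_directory(module_dir):` block, which keeps the directory change visibly scoped to that block.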

Comment thread athena/scripts/test_modules.py
@maximiliansoelch maximiliansoelch changed the title Athena: Implemented verification tests for basic approach of the text module Athena: Implement verification tests for basic approach of module_text_llm May 30, 2025
@maximiliansoelch maximiliansoelch merged commit 36b57ff into main May 30, 2025
19 checks passed
@maximiliansoelch maximiliansoelch deleted the feature/verification-tests/step-2 branch May 30, 2025 11:28
2 participants