feat: Add RemyxCodeExecutor for research paper execution #2141
23f15a8 to 683287c
🤖 Code Review - PR #2141: RemyxCodeExecutor
Summary
This PR adds
✅ Strengths
🔍 Issues & Recommendations
High Priority
1. API Key Inconsistency (autogen/coding/remyx_code_executor.py:127)
self.api_key = api_key or os.getenv("REMYX_API_KEY")
Issue: The code checks for
Impact: Users may be confused about which environment variable to use, leading to authentication failures.
Recommendation: self.api_key = api_key or os.getenv("REMYX_API_KEY") or os.getenv("REMYXAI_API_KEY")
2. API Key Not Used (autogen/coding/remyx_code_executor.py:119-134)
Issue: The
Recommendation: Either:
3. Missing Cleanup in
priyansh4320 left a comment
Hi @salma-remyx, can you please fix the pre-commit check?
Got it! Let me know if there's anything else to add/adjust.
if verbose:
    print("=" * 80)
    print("🔬 Interactive Research Exploration Session")
    print("=" * 80)
    print(f"📄 Paper: {self.arxiv_id or 'Custom image'}")

    if interactive:
        print("\n💬 INTERACTIVE MODE")
        print(" - Press ENTER to continue")
        print(" - Type guidance/questions")
        print(" - Type 'exit' to end")
    else:
        print("\n🤖 AUTOMATED MODE")

    print("=" * 80)
    print()
Let's use logger.log instead of print calls. Can you please make this change?
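A minimal sketch of what the requested change could look like, assuming the standard-library `logging` module. The helper name `log_session_banner` and its parameters are hypothetical and do not come from the PR; the emoji are dropped here only to keep the log lines plain.

```python
import logging
from typing import Optional

# Module-level logger, the usual convention for library code.
logger = logging.getLogger(__name__)


def log_session_banner(arxiv_id: Optional[str], interactive: bool) -> None:
    """Log the session banner instead of printing it (hypothetical helper)."""
    rule = "=" * 80
    logger.info(rule)
    logger.info("Interactive Research Exploration Session")
    logger.info(rule)
    # Lazy %-style formatting: the string is only built if the record is emitted.
    logger.info("Paper: %s", arxiv_id or "Custom image")
    if interactive:
        logger.info("INTERACTIVE MODE")
        logger.info(" - Press ENTER to continue")
        logger.info(" - Type guidance/questions")
        logger.info(" - Type 'exit' to end")
    else:
        logger.info("AUTOMATED MODE")
    logger.info(rule)
```

With this shape the `verbose` flag could map onto the logger level rather than guarding each call, so callers control output through their own logging configuration.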
    __all__.extend(["RemyxCodeExecutor", "RemyxCodeResult"])
except ImportError:
    pass
Let's log an error message before passing
# Default exploration goal
default_goal = """Perform an interactive exploration of this research paper:
**Phase 1: Understanding** (2-3 turns)
1. Examine the directory structure
2. Read README and identify key files
3. Understand the paper's implementation
**Phase 2: Experimentation** (3-5 turns)
4. Run a minimal working example
5. Experiment with different parameters
6. Generate visualizations if applicable
**Phase 3: Analysis** (2-3 turns)
7. Explain key implementation details
8. Answer any questions about the code
9. Suggest possible modifications/experiments
Work step-by-step. Wait for human guidance between phases.
Type TERMINATE when exploration is complete."""
Let's make this the default value of the goal parameter instead of creating a new variable that takes up extra memory.
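The suggestion could be sketched as follows. The constant name `DEFAULT_GOAL` is hypothetical, the goal text is abbreviated for the example, and `explore` is a stand-in that only returns the goal it would use.

```python
# Abbreviated stand-in for the full default goal text shown in the diff.
DEFAULT_GOAL = "Perform an interactive exploration of this research paper."


def explore(goal: str = DEFAULT_GOAL) -> str:
    """Hypothetical stand-in for explore(): return the goal that would be used."""
    return goal
```

One behavioral difference to keep in mind: with `goal or default_goal`, passing `None` or an empty string falls back to the default, whereas a default parameter value only applies when the argument is omitted.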
# Create system message for writer agent
system_message = f"""{paper_context}
**Your Mission:**
{goal or default_goal}
**Important Guidelines:**
- Repository is at /app with all dependencies installed
- Execute ONE command at a time - don't rush
- Use absolute paths starting with /app
- Be conversational and explain your actions
- If you encounter errors, debug step-by-step
- Wait for human feedback before major actions (if interactive mode)
- Focus on lightweight demos unless instructed otherwise
- You can install additional packages if needed
**What You Can Do:**
✓ Read and analyze code
✓ Execute Python/bash commands
✓ Modify code for experiments
✓ Generate plots and visualizations
✓ Install additional dependencies
✓ Answer questions about implementation
✓ Suggest improvements or experiments
Begin by exploring the repository structure to understand what's available."""
Let's give users the power to edit the system messages.
This will help with clients such as Ollama, where people use small models.
It will also improve the Remyx developer experience, since the system message can cater to a specific domain.
Add a new param, system_message, which is appended to the default:
system_message = f"""{paper_context}
Your Mission:
{goal or default_goal}. {system_message}"""
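A rough sketch of the proposed parameter, assuming the extra text is simply appended after the mission. The function `build_system_message` is hypothetical; the PR may assemble the message inline instead.

```python
def build_system_message(paper_context: str, goal: str, system_message: str = "") -> str:
    """Compose the agent prompt; system_message is optional user-supplied text."""
    parts = [paper_context, "**Your Mission:**", goal]
    if system_message:
        # Append custom guidance only when the caller provides it.
        parts.append(system_message)
    return "\n".join(parts)
```

For example, `build_system_message(ctx, goal, "Answer in at most three sentences.")` would keep the defaults and add one domain-specific constraint, which is the kind of control that helps with smaller local models.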
if verbose:
    print("\n" + "=" * 80)
    print("✅ Exploration Complete!")
    print("=" * 80)
    print("\n📊 Session Summary:")
    print(f" • Total messages: {len(result.chat_history)}")
    print(f" • Cost: ${result.cost['usage_including_cached_inference']['total_cost']:.4f}")

    if result.summary:
        print("\n💬 Final Status:")
        # Print first 200 chars of summary
        summary_preview = result.summary[:200] + "..." if len(result.summary) > 200 else result.summary
        print(f" {summary_preview}")

    print("\n💾 Full chat history available in returned object")
    print(" Access with: result.chat_history")
    print("=" * 80)
Same here: replace print with logging.
# Create system message
system_message = f"""{paper_context}
**Your Mission:**
{goal or default_goal}
**Guidelines:**
- Repository is at /app with all dependencies installed
- Execute ONE command at a time
- Use absolute paths starting with /app
- Be conversational and explain your actions
- Debug step-by-step if errors occur
Begin by exploring the repository structure."""
Another idea about the UX: let's keep the agent system_messages editable by devs, so that users have more control over the LLM and can add prompt grounding, expected-output examples, or custom guidelines. Let's ship defaults alongside editable system_messages.
@staticmethod
def format_chat_result(result: Any) -> str:
    """
    Format a ChatResult object into a readable summary.
    Useful for displaying exploration results later.
    Args:
        result: The ChatResult object from explore() or initiate_chat()
    Returns:
If this is a utility method, let's move it to a utils file.
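As an illustration of the move, here is a sketch of a module-level function that could live in a utils file. The body is an assumption, since the diff excerpt is cut off before the implementation; it relies only on the `chat_history` and `summary` attributes that appear elsewhere in the PR.

```python
from typing import Any


def format_chat_result(result: Any) -> str:
    """Format a ChatResult-like object into a short, readable summary.

    Hypothetical implementation: only chat_history and summary are read,
    and missing attributes are treated as empty.
    """
    history = getattr(result, "chat_history", None) or []
    summary = getattr(result, "summary", None) or ""
    lines = [f"Total messages: {len(history)}"]
    if summary:
        # Same 200-character preview rule as the session summary in the diff.
        preview = summary[:200] + "..." if len(summary) > 200 else summary
        lines.append(f"Summary: {preview}")
    return "\n".join(lines)
```

Because the function takes the result as a plain argument and touches no executor state, it can be imported and tested without constructing a `RemyxCodeExecutor`.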
The content looks awesome. Let's update it to reflect the reviewed changes.
Add a group chat test with RemyxCodeExecutor to these integration tests. Assert on the content of the result with regular expressions, e.g. re.search("content", generated_content).
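The assertion half of such a test might look like the sketch below. The group chat wiring itself is omitted, and the helper `history_matches` is hypothetical; it assumes the chat history is a list of message dicts with a `content` key, which may be `None` for some messages.

```python
import re


def history_matches(chat_history, pattern):
    """Return True if any message's content matches the regular expression."""
    return any(
        # Guard against messages whose content is missing or None.
        re.search(pattern, message.get("content") or "") is not None
        for message in chat_history
    )
```

In a test, this would be called on the history produced by the group chat, for instance `assert history_matches(result.chat_history, r"exitcode:\s*0")`, where the pattern is chosen to match whatever the executor is expected to report.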
Hi @salma-remyx, can you please address the reviewed changes?
🔄 Re-Review - PR #2141: RemyxCodeExecutor
Review Status
This is a re-review based on the previous Claude Code review from October 30, 2025. I've analyzed the current state of the code to determine which issues have been addressed and which remain outstanding.
✅ What Has Been Addressed
1. API Key Environment Variable Consistency ✅ FIXED
Previous Issue: Line 127 used REMYXAI_API_KEY but integration tests referenced inconsistent variable names.
Current Status: ✅ RESOLVED - The code now consistently uses REMYXAI_API_KEY throughout:
This is now consistent and well-documented.
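A minimal sketch of resolving the key from a single environment variable, in line with the consistent `REMYXAI_API_KEY` naming described in the re-review. The function `resolve_api_key` and its error message are assumptions for illustration, not code from the PR.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None, env_var: str = "REMYXAI_API_KEY") -> str:
    """Prefer an explicit key, fall back to the environment, fail loudly otherwise."""
    key = api_key or os.getenv(env_var)
    if not key:
        # Naming the variable in the error avoids guessing which one is expected.
        raise ValueError(f"No API key provided; pass api_key or set {env_var}.")
    return key
```

Reading exactly one variable name keeps the documentation, tests, and executor in agreement, at the cost of not accepting the older spelling.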
Codecov Report
❌ Patch coverage is
... and 19 files with indirect coverage changes
Why are these changes needed?
This PR adds RemyxCodeExecutor, enabling AG2 agents to execute code from research papers in pre-configured Docker environments. This addresses a critical pain point in AI research: reproducing experimental results from papers typically requires hours of environment setup, dependency resolution, and CUDA version matching.
Problem:
RemyxCodeExecutor provides:
explore() method where agents explain and experiment with paper code
This follows the pattern established by YepCodeCodeExecutor (#1982), extending AG2's code execution capabilities to research-specific workflows.
Checks
code-execution.mdx with usage examples
Optional-Dependencies.mdx
agentchat_remyx_executor.ipynb with real examples
test_remyx_executor.py
test_remyx_executor_integration.py (real API and Docker execution)