SWE Agent for solving SWEBench and TerminalBench #559

utsavgarg · 2025-11-05T20:36:21Z

This PR introduces the SWE Agent, a new autonomous agent designed to solve software engineering problems from the SWE-bench and TerminalBench benchmarks.

The agent operates autonomously within a Dockerized environment and has an extensible toolset that provides basic functionality for solving software engineering tasks.
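
As a rough illustration of what an "extensible toolset" can look like, here is a minimal sketch of a tool registry. It is not taken from this PR: `ToolRegistry`, `register`, `call`, and the two example tools are hypothetical names chosen for the example, and the agent's actual design is described in its README.md.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical registry that maps tool names to plain Python callables."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}

    def register(self, name: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
        """Return a decorator that adds the decorated function under `name`."""

        def decorator(func: Callable[..., str]) -> Callable[..., str]:
            if name in self._tools:
                raise ValueError(f"tool {name!r} is already registered")
            self._tools[name] = func
            return func

        return decorator

    def call(self, name: str, **kwargs: str) -> str:
        """Look up a tool by name and invoke it with keyword arguments."""
        if name not in self._tools:
            raise KeyError(f"unknown tool {name!r}")
        return self._tools[name](**kwargs)

    def names(self) -> List[str]:
        """List the registered tool names in sorted order."""
        return sorted(self._tools)


registry = ToolRegistry()


@registry.register("read_file")
def read_file(path: str) -> str:
    # Example tool: return the contents of a text file.
    with open(path, encoding="utf-8") as handle:
        return handle.read()


@registry.register("echo")
def echo(text: str) -> str:
    # Example tool: return the input unchanged.
    return text
```

In this sketch, adding a capability means writing one decorated function; the dispatch code in `call` does not change.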

For detailed information on architecture, setup, testing, and evaluation, please refer to the README.md file included in the agent's directory.

tpryan

A couple of nits. I am pulling in a Python reviewer to look at the actual code.

python/agents/swe-agent/pyproject.toml

python/agents/swe-agent/swe_agent/main.py

python/agents/swe-agent/swe_agent/orchestrator.py

python/agents/swe-agent/swe_agent/swebench_environment.py

python/agents/swe-agent/swe_agent/terminalbench_environment.py

python/agents/swe-agent/.gitignore

python/agents/swe-agent/uv.lock

python/agents/swe-agent/swe_agent/orchestrator.py

happyhuman

1- Please run pylint on all the Python files.
2- Please choose a less generic project name than swe_agent (which could mean anything). Also, the agent names use - instead of _.
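
To illustrate the naming convention in point 2, here is a small sketch that derives the importable package name from a hyphenated agent directory name, matching the layout of the paths in this PR (`python/agents/swe-agent/swe_agent/`). The helper `package_name_for` is hypothetical and not part of the repository.

```python
import re


def package_name_for(agent_dir: str) -> str:
    """Map a hyphenated agent directory name to its underscore package name."""
    # Assumed convention: lowercase alphanumeric words joined by single hyphens.
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", agent_dir):
        raise ValueError(f"agent directory {agent_dir!r} should use lowercase words joined by '-'")
    return agent_dir.replace("-", "_")
```

For example, `package_name_for("swe-agent")` yields `swe_agent`, while a directory named `swe_agent` is rejected because it uses `_` instead of `-`.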

python/agents/swe-benchmark-agent/swe_benchmark_agent/main.py

python/agents/swe-agent/swe_agent/main.py

python/agents/swe-benchmark-agent/pyproject.toml

python/agents/swe-benchmark-agent/swe_benchmark_agent/orchestrator.py

python/agents/swe-agent/swe_agent/swebench_environment.py

happyhuman · 2025-11-17T19:09:12Z

Thanks for all the changes. Can we rename this sample to something less generic? swe-agent is just too broad, and it could mean anything (e.g. Android Developer, Web Developer, etc.).

utsavgarg · 2025-11-17T20:57:58Z

> Thanks for all the changes. Can we rename this sample to something less generic? swe-agent is just too broad, and it could mean anything (e.g. Android Developer, Web Developer, etc.).

Thanks for the reviews, @happyhuman. I've renamed the agent to swe-benchmark-agent, as it demonstrates SWE capabilities through two popular SWE benchmarks.

tpryan reviewed Nov 10, 2025

python/agents/swe-agent/pyproject.toml (outdated, resolved)

python/agents/swe-agent/pyproject.toml (outdated, resolved)

happyhuman self-requested a review November 10, 2025 19:27

utsavgarg force-pushed the swe-agent branch from c026a6b to 3def24c on November 10, 2025 21:29

happyhuman reviewed Nov 10, 2025

python/agents/swe-agent/swe_agent/orchestrator.py (outdated, resolved)

happyhuman reviewed Nov 10, 2025

python/agents/swe-benchmark-agent/swe_benchmark_agent/main.py (resolved)

utsavgarg requested review from happyhuman and tpryan November 11, 2025 02:41

happyhuman reviewed Nov 12, 2025

utsavgarg force-pushed the swe-agent branch from 1995001 to c876c44 on November 12, 2025 23:17

utsavgarg force-pushed the swe-agent branch from c876c44 to 069ef57 on November 17, 2025 20:49

initial swe agent

7219b5f

utsavgarg force-pushed the swe-agent branch from 069ef57 to 7219b5f on November 17, 2025 20:59

happyhuman approved these changes Nov 17, 2025

Merge branch 'main' into swe-agent

0fe25b1

3 participants
