-
Notifications
You must be signed in to change notification settings - Fork 263
Save trajectory after each turn #585
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
for more information, see https://pre-commit.ci
|
@john-b-yang would you be able to review this? It would be nice to have harbor/terminal-bench not depend on my own fork :) |
|
Will check it out ~2 weeks from now! We have some of our things we're trying to get out. |
|
Hi @li-boxuan sorry for the late response, John & I were busy working on https://codeclash.ai/ Hmm, so so far, saving the trajectory simply wasn't part of the responsibilities of this class. Why don't you just do try:
agent.run()
finally:
save_traj(agent) |
|
(This is already what we do for the CLI of mini) |
That didn't work - at least in terminal-bench, where we kill the agent process if it times out |
|
Hi @li-boxuan I see, I guess if you SIGKILL the process, there's no way any at-exit behavior can work. We never needed that, but I'm sure you have a reason for that. I really want to keep the default class as lean as possible, so I am hesitating, but I also want to support tbench. Would the following be acceptable:
Having your own class would also make it easy to add any future adjustments you might have, I would happily support anything there as long as it doesn't interfere with the rest Let me know if you use python bindings or need the CLI |
|
@klieret Sounds great! We need CLI support (this is how we use mini-swe-agent: https://github.com/laude-institute/harbor/blob/133719fee1fa0a08357b789a30894f008d646e98/src/harbor/agents/installed/mini_swe_agent.py#L398) HarborMiniAgent might be a better name, since we have moved all harness code from original https://github.com/laude-institute/terminal-bench repository to https://github.com/laude-institute/harbor |
|
Forgot to say - feel free to close this PR in favor of yours |
|
Got it, thanks! Let me make this happen today or over the weekend :) I'll let you know here |
|
Gentle ping on this; we are using my fork in https://github.com/laude-institute/harbor/blob/main/src/harbor/agents/installed/install-mini-swe-agent.sh.j2 which means we will miss any great new improvement from mini-swe-agent team. On the other hand we have this workaround so not urgent :) |
Currently mini-swe-agent only saves trajectory after finishing the task. This is not ideal in environments where mini-swe-agent could get killed. In fact, in terminal-bench experiments, we kill agents after they timeout; without saving trajectories after each turn, no trajectory would be persisted.