@simonrosenberg (Collaborator) commented Jan 5, 2026

Summary

  • Fix SWT-Bench eval import path by adding repo src/ to sys.path (unblocks harness)
  • Wire cache/cost/timing improvements for SWT-Bench (preload cache & docker images via eval pipeline)
  • Bump pydantic-core dependency as requested
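
For reference, the import-path fix in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `add_src_to_sys_path` and the assumption that `src/` sits directly under the repo root are hypothetical.

```python
import sys
from pathlib import Path


def add_src_to_sys_path(repo_root: Path) -> Path:
    """Prepend <repo_root>/src to sys.path unless it is already listed.

    Hypothetical helper: the real harness may locate the repo root differently.
    """
    src_dir = (repo_root / "src").resolve()
    src_str = str(src_dir)
    if src_str not in sys.path:
        # Insert at the front so the repo's packages win over installed copies.
        sys.path.insert(0, src_str)
    return src_dir
```

An entry script would typically call this with something like `Path(__file__).resolve().parent` before importing any package that lives under `src/`; the membership check keeps repeated calls from adding duplicate entries.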

Testing

  • Evaluation workflow: run 20705519234 (swtbench) completed successfully; produced docker image cache and report

Fixes #234

openhands-ai bot commented Jan 13, 2026

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Pre-commit checks

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #245 at branch `feature/swtbench-latency`

Feel free to include any additional details that might help me get this PR into a better state.

Development

Successfully merging this pull request may close these issues.

Child: Optimize SWTBench evaluation latency by preloading prebaked Docker base

2 participants