@simonrosenberg
Collaborator

Summary

  • Enable Multi-SWE-bench builds in the SWE-Bench build workflow.
  • Support ByteDance-Seed/Multi-SWE-bench dataset formatting for builds.
  • Ensure image builds push to GHCR so remote runs can pull.

Validation

  • software-agent-sdk run-eval.yml run: 20567580750
  • evaluation workflow run: 20567583080
  • benchmarks build workflow run: 20567611098

Artifacts

  • gs://openhands-evaluation-results/eval-20567583080-claude-son_litellm_proxy-claude-sonnet-4-5-20250929_25-12-29-08-15.tar.gz
  • gs://openhands-evaluation-results/artifacts/eval-20567583080-claude-son/

@openhands-ai

openhands-ai bot commented Dec 29, 2025

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Pre-commit checks

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #219 at branch `add-multiswebench-to-ci`

Feel free to include any additional details that might help me get this PR into a better state.
