Pull requests: OpenHands/benchmarks

Pull requests list

Add swebenchmultimodal support to CI workflows
#289 opened Jan 9, 2026 by juanmichelini
Persist conversation events to runtime
#285 opened Jan 8, 2026 by simonrosenberg
build(deps): bump the version-all group across 1 directory with 15 updates [labels: dependencies (Pull requests that update a dependency file), python:uv (Pull requests that update python:uv code)]
#278 opened Jan 7, 2026 by dependabot (bot)
Sr/run 500 image job
#269 opened Jan 7, 2026 by simonrosenberg Draft
Require main and critic outputs
#252 opened Jan 6, 2026 by simonrosenberg
[DRAFT] latest main [label: build-swebench (Build 500 SWE-Bench Verified Image based on SDK version on this PR.)]
#248 opened Jan 5, 2026 by xingyaoww Draft
Add output_jsonl_gcs input forwarding
#237 opened Jan 3, 2026 by simonrosenberg
Add OpenAgentSafety to eval CI
#221 opened Dec 29, 2025 by simonrosenberg
Add Multi-SWE-bench image build support
#219 opened Dec 29, 2025 by simonrosenberg
Agentic code search
#141 opened Dec 8, 2025 by adityasoni9998
API-based Critic implementation [label: build-swebench-200 (Build 200 SWE-Bench Verified Image based on SDK version on this PR.)]
#117 opened Nov 26, 2025 by xingyaoww Draft