@juanmichelini
Collaborator

Summary

This PR adds a new utility function add_resolve_rate_to_predictions in benchmarks/utils/output_utils.py that enriches a predictions JSONL file with resolution status from a report JSON file.

Changes

New Function: add_resolve_rate_to_predictions

The function takes two file paths:

  1. predictions_path: Path to a JSONL file where each line is a JSON object with an instance_id field
  2. report_path: Path to a JSON file containing resolved_ids and unresolved_ids lists

For each prediction in the JSONL file:

  • If instance_id is in resolved_ids, adds {"report": {"resolved": true}}
  • If instance_id is in unresolved_ids, adds {"report": {"resolved": false}}
  • If instance_id is in neither list, no report field is added
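The rules above can be sketched as follows. This is a minimal illustration, not the code from the PR: it assumes that missing report keys are treated as empty lists, that an ID appearing in both lists counts as resolved, and that an existing report field is replaced. Writing through a temporary file before replacing the original is also an assumption, chosen here so that a failed run does not leave a half-written predictions file.

```python
import json
import os
import tempfile
from pathlib import Path


def add_resolve_rate_to_predictions(predictions_path, report_path):
    """Sketch: annotate each prediction with its resolution status, in place."""
    predictions_path = Path(predictions_path)
    report = json.loads(Path(report_path).read_text())

    # Assumption: a missing key behaves like an empty list.
    resolved = set(report.get("resolved_ids", []))
    unresolved = set(report.get("unresolved_ids", []))

    enriched = []
    with predictions_path.open() as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            prediction = json.loads(line)
            instance_id = prediction.get("instance_id")
            if instance_id in resolved:
                prediction["report"] = {"resolved": True}
            elif instance_id in unresolved:
                prediction["report"] = {"resolved": False}
            # In neither list: the record is written back unchanged.
            enriched.append(json.dumps(prediction))

    # Assumption: write to a temporary file in the same directory, then
    # swap it in, so the original survives if writing fails midway.
    fd, tmp_name = tempfile.mkstemp(dir=predictions_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        for row in enriched:
            tmp.write(row + "\n")
    os.replace(tmp_name, predictions_path)
```

Both arguments may be strings or Path objects, since each is passed through Path before use.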

Example

Given predictions:

{"instance_id": "fasterxml/jackson-databind:pr-4469", ...}
{"instance_id": "elastic/logstash:pr-15241", ...}
{"instance_id": "fasterxml/jackson-databind:pr-1234", ...}
{"instance_id": "fasterxml/jackson-databind:pr-2036", ...}

And a report with:

{
    "resolved_ids": ["fasterxml/jackson-databind:pr-2036", ...],
    "unresolved_ids": ["fasterxml/jackson-databind:pr-4469", "elastic/logstash:pr-15241", ...]
}

The predictions file is updated to:

{"instance_id": "fasterxml/jackson-databind:pr-4469", ..., "report": {"resolved": false}}
{"instance_id": "elastic/logstash:pr-15241", ..., "report": {"resolved": false}}
{"instance_id": "fasterxml/jackson-databind:pr-1234", ...}
{"instance_id": "fasterxml/jackson-databind:pr-2036", ..., "report": {"resolved": true}}

Note: pr-1234 is not updated because it's not in either list.

Tests

Added comprehensive tests in tests/test_output_utils.py covering:

  • Basic functionality with resolved and unresolved instances
  • Preservation of existing fields in predictions
  • Handling of an empty predictions file
  • Handling of empty report lists
  • Support for string paths
  • Handling of missing keys in the report
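A few of these cases could be checked roughly as below. This is a hedged sketch rather than the contents of tests/test_output_utils.py: the helper names, the file names, and the model_patch field are hypothetical, and each check takes the function under test as its enrich argument so the block stays self-contained. It also assumes that an empty predictions file stays empty after enrichment.

```python
import json
import tempfile
from pathlib import Path


def write_fixture(tmp_dir, predictions, report):
    """Write a predictions JSONL file and a report JSON file; return both paths."""
    tmp_dir = Path(tmp_dir)
    predictions_path = tmp_dir / "output.jsonl"
    report_path = tmp_dir / "report.json"
    predictions_path.write_text("".join(json.dumps(p) + "\n" for p in predictions))
    report_path.write_text(json.dumps(report))
    return predictions_path, report_path


def read_jsonl(path):
    """Parse every non-blank line of a JSONL file."""
    return [json.loads(line) for line in Path(path).read_text().splitlines() if line.strip()]


def check_preserves_fields(enrich):
    # Existing fields (here a hypothetical model_patch) must survive enrichment.
    with tempfile.TemporaryDirectory() as tmp:
        preds, report = write_fixture(
            tmp,
            [{"instance_id": "a", "model_patch": "diff"}],
            {"resolved_ids": ["a"], "unresolved_ids": []},
        )
        enrich(preds, report)
        assert read_jsonl(preds) == [
            {"instance_id": "a", "model_patch": "diff", "report": {"resolved": True}}
        ]


def check_string_paths(enrich):
    # Plain string paths should work as well as Path objects.
    with tempfile.TemporaryDirectory() as tmp:
        preds, report = write_fixture(
            tmp, [{"instance_id": "b"}], {"resolved_ids": [], "unresolved_ids": ["b"]}
        )
        enrich(str(preds), str(report))
        assert read_jsonl(preds) == [{"instance_id": "b", "report": {"resolved": False}}]


def check_empty_predictions(enrich):
    # Assumption: an empty predictions file stays empty.
    with tempfile.TemporaryDirectory() as tmp:
        preds, report = write_fixture(tmp, [], {"resolved_ids": ["a"], "unresolved_ids": []})
        enrich(preds, report)
        assert read_jsonl(preds) == []
```

In the repository, each check would be called with add_resolve_rate_to_predictions imported from benchmarks/utils/output_utils.py.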

Fixes #195

This function enriches a predictions JSONL file with resolution status
from a report JSON file. For each prediction:
- If instance_id is in resolved_ids, adds {"report": {"resolved": true}}
- If instance_id is in unresolved_ids, adds {"report": {"resolved": false}}
- If instance_id is in neither list, no report field is added

Fixes #195

Co-authored-by: openhands <[email protected]>
@juanmichelini
Collaborator Author

@simonrosenberg @xingyaoww I'm leaving this one as a draft for the moment.
I don't want to rush anything that overwrites output.jsonl, since we might lose inference results.


Development

Successfully merging this pull request may close these issues.

ALL: enrich output.jsonl from output.report.jsonl
