Add add_resolve_rate_to_predictions function to output_utils #199
Summary
This PR adds a new utility function add_resolve_rate_to_predictions in benchmarks/utils/output_utils.py that enriches a predictions JSONL file with resolution status from a report JSON file.

Changes
New Function: add_resolve_rate_to_predictions

The function takes two file paths:
- A predictions JSONL file in which each entry has an instance_id field
- A report JSON file containing resolved_ids and unresolved_ids lists

For each prediction in the JSONL file:
- If instance_id is in resolved_ids, adds {"report": {"resolved": true}}
- If instance_id is in unresolved_ids, adds {"report": {"resolved": false}}
- If instance_id is in neither list, no report field is added

Example
Given predictions:

    {"instance_id": "fasterxml/jackson-databind:pr-4469", ...}
    {"instance_id": "elastic/logstash:pr-15241", ...}
    {"instance_id": "fasterxml/jackson-databind:pr-1234", ...}
    {"instance_id": "fasterxml/jackson-databind:pr-2036", ...}

And a report with:

    {
      "resolved_ids": ["fasterxml/jackson-databind:pr-2036", ...],
      "unresolved_ids": ["fasterxml/jackson-databind:pr-4469", "elastic/logstash:pr-15241", ...]
    }

The predictions file is updated to:

    {"instance_id": "fasterxml/jackson-databind:pr-4469", ..., "report": {"resolved": false}}
    {"instance_id": "elastic/logstash:pr-15241", ..., "report": {"resolved": false}}
    {"instance_id": "fasterxml/jackson-databind:pr-1234", ...}
    {"instance_id": "fasterxml/jackson-databind:pr-2036", ..., "report": {"resolved": true}}

Note: pr-1234 is not updated because it's not in either list.

Tests
Added comprehensive tests in tests/test_output_utils.py.

Fixes #195
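
For readers skimming this PR, the enrichment behavior described under Changes can be sketched roughly as below. This is an illustrative sketch only, not the code added in this PR: the function name add_resolve_rate_to_predictions_sketch and the parameter names predictions_path and report_path are hypothetical, and it assumes the predictions file is rewritten in place with one JSON object per line.

```python
import json
from pathlib import Path


def add_resolve_rate_to_predictions_sketch(predictions_path, report_path):
    """Hypothetical sketch: annotate predictions with status from a report."""
    report = json.loads(Path(report_path).read_text())
    # Sets give constant-time membership checks for large reports.
    resolved = set(report.get("resolved_ids", []))
    unresolved = set(report.get("unresolved_ids", []))

    updated = []
    for line in Path(predictions_path).read_text().splitlines():
        if not line.strip():
            continue  # assumption: blank lines are skipped
        prediction = json.loads(line)
        instance_id = prediction.get("instance_id")
        if instance_id in resolved:
            prediction["report"] = {"resolved": True}
        elif instance_id in unresolved:
            prediction["report"] = {"resolved": False}
        # IDs found in neither list are left without a "report" field.
        updated.append(json.dumps(prediction))

    # Assumption: the predictions file is overwritten in place.
    Path(predictions_path).write_text("".join(row + "\n" for row in updated))
```

Checking resolved_ids before unresolved_ids means an ID that appears in both lists would be marked resolved; whether the actual implementation makes the same choice is not stated in this description.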