Commit 8d804f4

KennethEnevoldsen and imenelydiaker authored Apr 3, 2024
fix: Added tests for checking datasets (embeddings-benchmark#307)
* fix: Fixed hf_hub_name for WikiCitiesClustering
* Added points for this PR and a 3 other minor dataset fixes
* feat: Added tests which validated that datasets are available
* fix: Updated hf references and revisions to multiple datasets
* Added points for submission
* fix: Added suggestions from the review
* Apply suggestions from code review

  Co-authored-by: Imene Kerboua <[email protected]>
* fix: sped up async test for whether datasets exist
* fix: Updated revisions
* fix: reuploaded scandeval datasets
* fix: Applied formatter

---------

Co-authored-by: Imene Kerboua <[email protected]>
1 parent 74f33f0 commit 8d804f4

49 files changed: +212 -84 lines changed

docs/mmteb/points.md (+1 -1)
@@ -2,7 +2,7 @@
 
 | GitHub | Total points | New dataset | New task | Dataset annotations | (Bug)fixes | Running Models | Review PR | Paper Writing | Ideation | Coordination |
 | ----------------- | ------------ | ----------- | -------- | ------------------- | ---------- | -------------- | -------- | -------------- | -------- | ------------- |
-| KennethEnevoldsen | | 38 | | | 4 | | | | | |
+| KennethEnevoldsen | | 38 | | | 8 | | | | | |
 | x-tabdeveloping | | 2 | | | | | | | | |
 
 Note that coordination and ideation is not included in the total points, but is used to determine first and last authors.

mteb/abstasks/TaskMetadata.py (+6 -3)
@@ -9,7 +9,6 @@
     BeforeValidator,
     TypeAdapter,
     field_validator,
-    model_validator,
 )
 from typing_extensions import Annotated, Literal
 
@@ -172,7 +171,11 @@ def _check_dataset_path_is_specified(cls, dataset):
     @field_validator("dataset")
     def _check_dataset_revision_is_specified(cls, dataset):
         if "revision" not in dataset:
-            raise ValueError("You must explicitly specify a revision for the dataset (either a SHA or None).")
+            raise ValueError(
+                "You must explicitly specify a revision for the dataset (either a SHA or None)."
+            )
         if dataset["revision"] is None:
-            logging.warning("It is encourage to specify a dataset revision for reproducability")
+            logging.warning(
+                "It is encourage to specify a dataset revision for reproducability"
+            )
         return dataset
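
For readers who want to try the behaviour of the revision check outside the model class, here is a minimal sketch in plain Python. It is not the code from this commit: the function name `check_dataset_revision`, the module-level logger, and the example dataset path are assumptions made for illustration, and the pydantic `field_validator` decorator is deliberately left out so the snippet runs with the standard library alone. The warning text is spelled out in corrected form here; the commit itself keeps the original wording shown in the hunk above.

```python
import logging

logger = logging.getLogger(__name__)


def check_dataset_revision(dataset: dict) -> dict:
    """Hypothetical standalone version of the revision check in the hunk above.

    The key "revision" must be present. A value of None is accepted,
    but a warning is logged because the dataset is then not pinned.
    """
    if "revision" not in dataset:
        raise ValueError(
            "You must explicitly specify a revision for the dataset (either a SHA or None)."
        )
    if dataset["revision"] is None:
        logger.warning(
            "It is encouraged to specify a dataset revision for reproducibility"
        )
    return dataset


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)

    # Placeholder path and revision, not a real dataset reference.
    pinned = {"path": "example-org/example-dataset", "revision": "abc123"}
    print(check_dataset_revision(pinned))

    # Accepted, but logs a warning because no revision is pinned.
    unpinned = {"path": "example-org/example-dataset", "revision": None}
    print(check_dataset_revision(unpinned))

    # Rejected: the "revision" key is missing entirely.
    try:
        check_dataset_revision({"path": "example-org/example-dataset"})
    except ValueError as err:
        print(f"rejected: {err}")
```

The distinction the check draws is between an omitted key, which is treated as an error, and an explicit `None`, which is allowed but discouraged. That forces each task definition to state its choice about pinning rather than leave it implicit.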
