MTEB Results l3_supercat, BS512

Llama3-based models improve over l2 supercat in clustering, reranking and classification tasks, while performing slightly worse in pair classification and similarity tasks. The larger vocabulary is also 4x larger on disk.

Supercat models are trained from concatenated token embeddings from several models using the same tokenizer (eg Llama 3.1 405B, Llama3 70B, Llama guard 2)

Metric	WL64	WL128	WL256	WL512	WL1024	GloVe 300d	Komninos	all-MiniLM-L6-v2
Clustering	33.00	35.09	35.91	36.16	36.48	27.73	26.57	42.35
Reranking	51.25	52.33	52.68	52.82	52.86	43.29	44.75	58.04
Classification	53.37	56.66	58.72	59.65	59.92	57.29	57.65	63.05
Pair Classification	75.68	77.17	77.81	77.94	78.07	70.92	72.94	82.37
STS	65.65	66.51	66.93	67.20	67.15	61.85	62.46	78.90
CQA DupStack	18.67	22.51	24.16	24.92	25.10	15.47	16.79	41.32
SummEval	30.11	30.08	30.11	30.88	29.96	28.87	30.49	30.81

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RESULTS.md

RESULTS.md

MTEB Results l3_supercat, BS512

Files

RESULTS.md

Latest commit

History

RESULTS.md

File metadata and controls

MTEB Results l3_supercat, BS512