Available benchmarks

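The following table gives an overview of the benchmarks available in MTEB: the number of tasks in each benchmark, a breakdown of those tasks by type, and the domains and languages covered.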

| Name | # Tasks | Task Types | Domains | Languages |
|------|---------|------------|---------|-----------|
| CoIR | 10 | {'Retrieval': 10} | [Written, Programming] | python,c++,sql,go,eng,php,javascript,ruby,java |
| MINERSBitextMining | 7 | {'BitextMining': 7} | [Written, Social, Reviews] | sun,kaz,tzl,ido,abs,arq,yue,tam,nij,glg,slk,hsb,ber,xho,cbk,pol,uzb,ina,kab,swh,amh,fao,kzj,lfn,uig,sqi,deu,ang,ind,bug,pms,ibo,cym,eus,spa,ceb,tgl,ron,isl,ita,csb,cha,fin,est,pes,jpn,tel,tha,oci,cmn,min,fry,bbc,epo,lit,rus,bos,hrv,war,ara,bjn,mkd,srp,ast,nno,urd,pam,aze,eng,ace,bew,kor,dan,awa,mui,hye,ban,cor,ben,gle,swe,mad,bul,lat,cat,nob,fra,pcm,ell,mar,vie,tat,ukr,gsw,kat,arz,dsb,lvs,nld,tur,bel,max,nds,afr,khm,dtp,yor,ces,gla,zsm,mak,ile,nov,orv,bre,swg,rej,mhr,mon,mal,jav,heb,slv,bhp,kur,wuu,tuk,por,hun,hin,hau,yid |
| MTEB(Retrieval w/Instructions) | 3 | {'InstructionRetrieval': 3} | [Written, News] | eng |
| MTEB(Scandinavian) | 28 | {'BitextMining': 2, 'Classification': 13, 'Retrieval': 7, 'Clustering': 6} | [Encyclopaedic, Spoken, Non-fiction, Government, News, Fiction, Social, Blog, Reviews, Written, Web, Legal] | nob,fao,swe,isl,dan,nno |
| MTEB(code) | 12 | {'Retrieval': 12} | [Written, Programming] | python,c++,sql,c,go,eng,shell,typescript,php,scala,rust,swift,javascript,ruby,java |
| MTEB(deu) | 19 | {'Classification': 6, 'Clustering': 4, 'PairClassification': 2, 'Reranking': 1, 'Retrieval': 4, 'STS': 2} | [Encyclopaedic, Spoken, News, Reviews, Written, Web] | eng,deu,pol,fra |
| MTEB(eng) | 67 | {'Classification': 12, 'Retrieval': 26, 'Clustering': 11, 'Reranking': 4, 'STS': 10, 'PairClassification': 3, 'Summarization': 1} | [Encyclopaedic, Spoken, Non-fiction, Blog, News, Medical, Social, Programming, Written, Reviews, Web, Academic] | tur,fra,eng,cmn,pol,ita,nld,spa,deu,ara |
| MTEB(fra) | 26 | {'Classification': 6, 'Clustering': 7, 'PairClassification': 2, 'Reranking': 2, 'Retrieval': 5, 'STS': 3, 'Summarization': 1} | [Encyclopaedic, Spoken, Non-fiction, News, Social, Reviews, Written, Web, Legal, Academic] | eng,deu,pol,fra |
| MTEB(kor) | 6 | {'Classification': 1, 'Reranking': 1, 'Retrieval': 2, 'STS': 2} | [Encyclopaedic, Spoken, News, Reviews, Written, Web] | kor |
| MTEB(law) | 8 | {'Retrieval': 8} | [Written, Legal] | eng,deu,zho |
| MTEB(pol) | 18 | {'Classification': 7, 'Clustering': 3, 'PairClassification': 4, 'STS': 4} | [Spoken, Non-fiction, News, Fiction, Social, Written, Web, Legal, Academic] | pol,deu,eng,fra |
| MTEB(rus) | 23 | {'Classification': 9, 'Clustering': 3, 'MultilabelClassification': 2, 'PairClassification': 1, 'Reranking': 2, 'Retrieval': 3, 'STS': 3} | [Encyclopaedic, Spoken, Blog, News, Social, Reviews, Written, Web, Academic] | rus |
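
The Task Types entries in the table are written as Python dict literals, and in each row the per-type counts add up to the value in # Tasks. The following is a minimal sketch of how one might parse a Task Types cell and check that relationship. It uses only the standard library and does not call the `mteb` package; the helper names `task_type_counts` and `check_row` and the `ROWS` list are hypothetical, introduced only for illustration, with three rows copied from the table.

```python
import ast

# Three rows copied from the table: (name, "# Tasks", "Task Types" cell).
# ROWS is a hypothetical illustration, not a structure provided by mteb.
ROWS = [
    (
        "MTEB(Scandinavian)",
        28,
        "{'BitextMining': 2, 'Classification': 13, 'Retrieval': 7, 'Clustering': 6}",
    ),
    ("MTEB(kor)", 6, "{'Classification': 1, 'Reranking': 1, 'Retrieval': 2, 'STS': 2}"),
    ("MTEB(law)", 8, "{'Retrieval': 8}"),
]


def task_type_counts(cell: str) -> dict[str, int]:
    """Parse a Task Types cell (a Python dict literal) into a dict."""
    # literal_eval only accepts literals, so it is safe on untrusted text.
    return ast.literal_eval(cell)


def check_row(n_tasks: int, cell: str) -> bool:
    """Return True if the per-type counts sum to the # Tasks value."""
    return sum(task_type_counts(cell).values()) == n_tasks


for name, n_tasks, cell in ROWS:
    status = "consistent" if check_row(n_tasks, cell) else "mismatch"
    print(f"{name}: {status}")
```

A check like this can be useful when the table is regenerated, since a mismatch between the two columns would suggest that a task was added to a benchmark without updating its summary.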