Convert HALClustering to Fast #817

AlexeyVatolin · 2024-05-24T18:34:10Z

Resolves issue #814.

Checklist for adding MMTEB dataset

Reason for dataset addition:

isaac-chung

Thanks for adding this. Please add review points and it's good to merge!

AlexeyVatolin · 2024-05-24T21:28:15Z

I'll write this down for the record. To get the date range for the dataset, I re-parsed every article at https://hal.science/<hal_id> and took the publication date from each article page. hal_id is a column in the dataset.
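A minimal sketch of the last step described above, assuming the publication dates have already been collected as ISO-formatted strings. The helper names `article_url` and `date_range`, and the sample dates in the usage note, are hypothetical illustrations and are not taken from the PR; how each date is extracted from an article page is not shown here.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

# URL pattern quoted in the comment above; hal_id comes from the dataset column.
HAL_URL_TEMPLATE = "https://hal.science/{hal_id}"


def article_url(hal_id: str) -> str:
    """Build the article page URL for one hal_id (hypothetical helper)."""
    return HAL_URL_TEMPLATE.format(hal_id=hal_id)


def date_range(publication_dates: Iterable[Optional[str]]) -> Tuple[str, str]:
    """Return (earliest, latest) from ISO 'YYYY-MM-DD' strings.

    Entries that are None or empty (articles without a usable date)
    are skipped. Assumes every remaining entry is a valid ISO date.
    """
    parsed = [date.fromisoformat(d) for d in publication_dates if d]
    if not parsed:
        raise ValueError("no publication dates available")
    return min(parsed).isoformat(), max(parsed).isoformat()
```

For example, `date_range(["2015-03-01", "2009-11-20", None])` yields the pair of earliest and latest dates, which is the form a task's date metadata would typically need.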

isaac-chung

Thanks again. Just a small comment I missed last time.

mteb/tasks/Clustering/fra/HALClusteringS2S.py

isaac-chung

Task names should follow the new name as well.

results/sentence-transformers__paraphrase-multilingual-MiniLM-L12-v2/HALClusteringS2S.v2.json

results/intfloat__multilingual-e5-small/HALClusteringS2S.v2.json

AlexeyVatolin force-pushed the convert_hal_clustering_to_fast branch from 0f2fecb to a5f86e0 on May 24, 2024 18:39

Convert HALClustering to Fast

589a80e

AlexeyVatolin force-pushed the convert_hal_clustering_to_fast branch from a5f86e0 to 589a80e on May 24, 2024 18:40

isaac-chung approved these changes May 24, 2024

isaac-chung self-assigned this May 24, 2024

Add review points

95bf457

Add superseeded_by

04ae6a4

isaac-chung reviewed May 25, 2024

mteb/tasks/Clustering/fra/HALClusteringS2S.py (outdated)

imenelydiaker reviewed May 25, 2024

mteb/tasks/Clustering/fra/HALClusteringS2S.py (outdated)

imenelydiaker reviewed May 25, 2024

mteb/tasks/Clustering/fra/HALClusteringS2S.py (outdated)

AlexeyVatolin added 2 commits May 25, 2024 12:58

Rename task

546cd0a

Add review points

1bba09b

isaac-chung reviewed May 25, 2024

results/sentence-transformers__paraphrase-multilingual-MiniLM-L12-v2/HALClusteringS2S.v2.json (outdated)

results/intfloat__multilingual-e5-small/HALClusteringS2S.v2.json (outdated)

Apply suggestions from code review

12d0797

isaac-chung enabled auto-merge (squash) May 25, 2024 14:47

isaac-chung merged commit c7adcd8 into embeddings-benchmark:main May 25, 2024
7 checks passed
