-
Notifications
You must be signed in to change notification settings - Fork 247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: (1) Add StatcanDialogueDatasetRetrieval
(2) Fix DRESModel.encode_conversations
to allow list of dictionaries
#779
Conversation
@orionw @vaibhavad would love if you can review this PR! There's a few changes i want to add but feel free to leave feedback now if you wish! |
The current code looks like a great start -- the remainder todos are just filling in the things marked TBD and adding results/points. |
mteb/tasks/Retrieval/multilingual/StatcanDialogueDatasetRetrieval.py
Outdated
Show resolved
Hide resolved
mteb/tasks/Retrieval/multilingual/StatcanDialogueDatasetRetrieval.py
Outdated
Show resolved
Hide resolved
mteb/tasks/Retrieval/multilingual/StatcanDialogueDatasetRetrieval.py
Outdated
Show resolved
Hide resolved
Seems like language filtering does not work, or i'm running the wrong command: from mteb import MTEB
evaluation = MTEB(task_langs=["de"])
print(evaluation.available_tasks) # StatcanDialogueDatasetRetrieval will appear |
I don't think this is the correct API. It is show all tasks, regardless of category, language etc. Can you try |
Thanks, seems print_selected_tasks works. |
…s, alongside lists of strings (e.g. Topicoqa). Also fix batch_size parameter in encode_conversations
@vaibhavad @orionw As discussed in this comment, a fixed was needed for Let me know if this fix makes sense (and whether I should tag this PR as bug fix or something else). Let me know if this change to |
StatcanDialogueDatasetRetrieval
(2) Fix DRESModel.encode_conversations
to allow list of dictionaries
StatcanDialogueDatasetRetrieval
(2) Fix DRESModel.encode_conversations
to allow list of dictionariesStatcanDialogueDatasetRetrieval
(2) Fix DRESModel.encode_conversations
to allow list of dictionaries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have some minor comments, but I like the changes.
No need to tag the PR (not sure we use that?) but definitely add points for the dataset, 1-2 for a bugfix, and also the reviewers.
description="A Dataset for Retrieving Data Tables through Conversations with Genuine Intents, available in English and French.", | ||
dataset={ | ||
"path": "McGill-NLP/statcan-dialogue-dataset-retrieval", | ||
"revision": "v1.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be the git commit hash of the dataset, if possible.
Huh, seems tests failed after i merged changes from |
I think those are fixed by #803 if you merge in main again, agree they weren't caused by this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the tests pass and @vaibhavad approves, will enable automerge
@vaibhavad can you approve if everything has been covered? |
Approved and merged, thanks for the great work @xhluca! |
This pull request introduces two changes:
StatcanDialogueDatasetRetrieval
DRESModel.encode_conversations
to allow conversations composed of list of dictionaries, alongside lists of strings (e.g. Topicoqa). Also fixbatch_size
parameter inDRESModel.encode_conversations
(b5141e8)Checklist for adding MMTEB dataset
Reason for dataset addition:
mteb
package.mteb -m {model_name} -t {task_name}
command.sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
intfloat/multilingual-e5-small
self.stratified_subsampling() under dataset_transform()
make test
.make lint
.438.jsonl
).Other
StatcanDialogueDatasetRetrieval
(2) FixDRESModel.encode_conversations
to allow list of dictionaries #779 (comment)StatcanDialogueDatasetRetrieval
(2) FixDRESModel.encode_conversations
to allow list of dictionaries #779 (comment)evaluation = MTEB(task_langs=["en", "de"]) # Only select datasets which are "en", "de" or "en-de"
Notes
To test it, I have used the following code:
Looking at the scores (recall at k):
we can see that the recall at k from 1 to 1000 consistently, which means it is a fairly challenging task (r@10 is only 0.15) but non-random.