
Multiple generations (sequential) per question #2317

Open
IntrepidEnki opened this issue Sep 17, 2024 · 1 comment
Labels
asking questions For asking for clarification / support on library usage. feature request A feature that isn't implemented yet.

Comments

@IntrepidEnki

Hi lm-eval maintainers! I'm relatively new to using this library, so any pointers will be greatly appreciated.

I am trying to elicit two sequential answers from the model for each question. I want to do this by providing the question, recording an initial answer, then appending that answer to the original question and using that as a second prompt, and recording the second (and final) answer to do some post-processing.

After reading through the task_guide and new_task_guide I did not see anything directly related to my endeavor - is there a way to setup this workflow by modifying the relevant yaml config file?

Or is the preferred method to call the log parser from a custom filter function?

Thank you

@baberabb
Contributor

baberabb commented Sep 18, 2024

Hi! Currently the way to do this is a bit involved and requires two `lm_eval` calls (and two task YAMLs):

  1. The first call uses `--predict_only`, which logs the per-sample generations (and docs) to a file without computing metrics.
  2. The second task YAML parses that file into an HF dataset using something like:

```yaml
dataset_path: json
dataset_kwargs:
  data_files: /test.jsonl
```

and then you should be able to structure your task as normal (the `resps` and `filtered_resps` fields hold the model outputs).
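Put together, a second-pass task config might look like the sketch below. Only `dataset_path`/`dataset_kwargs` come from the comment above; the task name, file path, and `doc_to_text` template are illustrative assumptions about how the logged fields could be wired into the second prompt:

```yaml
# Hypothetical second-pass task (name and path are placeholders).
task: my_second_pass
dataset_path: json
dataset_kwargs:
  data_files: /path/to/logged_samples.jsonl   # the file written by pass one
# Assumed: build the second prompt from the original question plus the
# first answer; the exact field names depend on the logged sample schema.
doc_to_text: "{{doc.question}}\n{{filtered_resps[0]}}"
output_type: generate_until
```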

We are currently looking at ways of supporting LM-judge-like tasks, but it's very much a work in progress.
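The append-then-reprompt step the original question describes can be sketched with plain Python. The log structure here is a toy stand-in, not lm-eval's actual schema; only the `doc` and `filtered_resps` field names come from the comment above:

```python
import json
import tempfile

# Toy stand-in for the per-sample log produced by pass one with
# --predict_only; real logs carry more fields than these two.
logged = [
    {"doc": {"question": "What is 2 + 2?"}, "filtered_resps": ["4"]},
    {"doc": {"question": "Capital of France?"}, "filtered_resps": ["Paris"]},
]

# Write the toy log out as JSON Lines, one sample per line.
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False
) as f:
    for row in logged:
        f.write(json.dumps(row) + "\n")
    path = f.name

# Rebuild the second-pass prompts: original question plus first answer.
second_prompts = []
with open(path) as f:
    for line in f:
        row = json.loads(line)
        prompt = f"{row['doc']['question']}\n{row['filtered_resps'][0]}"
        second_prompts.append(prompt)

print(second_prompts)
```

The second `lm_eval` call would then read prompts shaped like these from its dataset rather than building them in ad-hoc Python.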

@baberabb baberabb added feature request A feature that isn't implemented yet. asking questions For asking for clarification / support on library usage. labels Sep 18, 2024