How to move from a continuation model to a chat model? #497

antonkratz · 2024-05-30T06:55:10Z

antonkratz
May 30, 2024

I do not understand, conceptually, how to move from a completion model to a chat model. What @karpathy describes in #481 is, if I understand correctly, similar to BLOOM, is that right? I.e. it is a model that given text, will continue that text until some stop token is reached, is that correct? But how to get from this, or something like BLOOM, to something like ChatGPT? It seems like two almost completely different tasks! One is continuation... the other task is response. Could someone please point me in the right direction how to get from a BLOOM type continuation model to a ChatGPT type model? Again, this is about implementing it on top of #481 but more importantly my question is conceptual.

karpathy · 2024-05-30T13:03:16Z

karpathy
May 30, 2024
Maintainer

It's actually very simple conceptually, you just swap out the dataset and continue training for a little bit. From one that looks like internet documents to one that looks like conversations. There's a number of these available, e.g. https://github.com/LAION-AI/Open-Assistant is one from a while back. This approach is "SFT" (Supervised Finetuning) and gets you a long way. You can then get another ~10% with RLHF (more complicated), or DPO (not complicated, works almost as well).

llm.c will probably get around to it, but the basics have to be super solid, and they are not yet.

1 reply

antonkratz May 30, 2024
Author

Thank you so much @karpathy! I am also watching your zero to hero videos right now and learning tremendously.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to move from a continuation model to a chat model? #497

{{title}}

Replies: 1 comment 1 reply

{{title}}

{{title}}

Select a reply

How to move from a continuation model to a chat model? #497

antonkratz May 30, 2024

Replies: 1 comment · 1 reply

karpathy May 30, 2024 Maintainer

antonkratz May 30, 2024 Author

antonkratz
May 30, 2024

Replies: 1 comment 1 reply

karpathy
May 30, 2024
Maintainer

antonkratz May 30, 2024
Author