
Add support for quantized qwen2-0.5b #44

Merged: 1 commit into mlc-ai:main on Jun 26, 2024
Conversation

bil-ash (Contributor) commented Jun 26, 2024

This PR adds support for quantized (q4f16_1) qwen2-0.5b. Solves issue . PR must be merged before merging this.
@Neet-Nestor

1 commit: to add support for quantized (q4f16) qwen2-0.5b
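
For context, registering a new model in a WebLLM-based app amounts to adding a model record to its app config. A minimal sketch of what such an entry might look like, assuming the ModelRecord shape of recent @mlc-ai/web-llm releases (field names may differ in the version this PR targeted); the Hugging Face URL and model_lib value are placeholders, not the actual PR diff:

```ts
import { prebuiltAppConfig, type AppConfig } from "@mlc-ai/web-llm";

// Sketch only: extends WebLLM's prebuilt config with one extra model record.
// Field names follow recent WebLLM ModelRecord conventions.
const appConfig: AppConfig = {
  ...prebuiltAppConfig,
  model_list: [
    ...prebuiltAppConfig.model_list,
    {
      // Placeholder URL: the real weights repo is whatever the PR points at.
      model: "https://huggingface.co/mlc-ai/Qwen2-0.5B-Instruct-q4f16_1-MLC",
      model_id: "Qwen2-0.5B-Instruct-q4f16_1-MLC",
      // Placeholder name for the compiled WebGPU runtime library.
      model_lib: "Qwen2-0.5B-Instruct-q4f16_1-webgpu.wasm",
    },
  ],
};
```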
Neet-Nestor (Collaborator)

Unfortunately, this won't fix the issue since the model Qwen2-0.5B-Instruct-q4f16-MLC you added is not available in WebLLM yet. Instead, we need to compile the model, upload it to Hugging Face, update WebLLM, and finally update our app.
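
This ordering matters because app-side code can only load a model_id that WebLLM itself already knows about. A minimal sketch, assuming WebLLM's CreateMLCEngine API and the model_id quoted above; a call like this fails with a model-not-found error until the entry ships in WebLLM's prebuilt model list (or is supplied via a custom appConfig):

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Fails until "Qwen2-0.5B-Instruct-q4f16-MLC" exists in WebLLM's
// prebuilt model list (or is passed in through a custom appConfig).
const engine = await CreateMLCEngine("Qwen2-0.5B-Instruct-q4f16-MLC", {
  initProgressCallback: (report) => console.log(report.text),
});

// OpenAI-style chat completion against the locally loaded model.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello from WebLLM!" }],
});
console.log(reply.choices[0]?.message.content);
```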

bil-ash (Contributor, Author) commented Jun 26, 2024

> Unfortunately, this won't fix the issue since the model Qwen2-0.5B-Instruct-q4f16-MLC you added is not available in WebLLM yet. Instead, we need to compile the model, upload it to Hugging Face, update WebLLM, and finally update our app.

I saw that, so I have also created the required PR in the web-llm repo.

Neet-Nestor (Collaborator) commented Jun 26, 2024

Thanks! Let me try.

Neet-Nestor reopened this on Jun 26, 2024
Neet-Nestor (Collaborator)

Thanks! It's working perfectly on my end. I will leave the other 2 PRs for my team to review, but I can merge this one and publish a new version of the webapp so that you can use it immediately.

Neet-Nestor merged commit 0617dbb into mlc-ai:main on Jun 26, 2024
1 check passed