Add llama-cpp-python server #452
base: main

Conversation
Reviewer's Guide by Sourcery

This PR changes the default runtime from 'llama.cpp' to 'llama-cpp-python' and adds support for the 'llama-cpp-python' server implementation. The changes involve modifying the server execution logic and updating the CLI configuration to accommodate the new runtime option.

Sequence diagram for server execution logic

```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant Model
    User->>CLI: Run command with --runtime flag
    CLI->>Model: Pass runtime argument
    alt Runtime is vllm
        Model->>CLI: Execute vllm server
    else Runtime is llama.cpp
        Model->>CLI: Execute llama-server
    else Runtime is llama-cpp-python
        Model->>CLI: Execute llama_cpp.server
    end
```
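The branching above maps directly onto the serve path. Below is a minimal sketch of that dispatch, assuming a POSIX exec-style launch; the `serve` function name, `model_path` parameter, and the exact argument lists are illustrative assumptions, not RamaLama's actual implementation:

```python
import os

def serve(args, model_path):
    # Illustrative dispatch on the --runtime value; names and argument
    # lists here are assumptions, not the actual project code.
    if args.runtime == "vllm":
        exec_args = ["vllm", "serve", "--port", str(args.port), model_path]
    elif args.runtime == "llama.cpp":
        exec_args = ["llama-server", "--port", str(args.port), "-m", model_path]
    else:  # llama-cpp-python
        # llama-cpp-python ships an OpenAI-compatible server runnable as a module
        exec_args = ["python3", "-m", "llama_cpp.server",
                     "--port", str(args.port), "--model", model_path]
    os.execvp(exec_args[0], exec_args)  # replace the current process with the server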
Updated class diagram for runtime configuration

```mermaid
classDiagram
    class CLI {
        -runtime: String
        +configure_arguments(parser)
    }
    class Model {
        +serve(args)
    }
    CLI --> Model: uses
    note for CLI "Updated default runtime to 'llama-cpp-python' and added it as a choice"
```
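To make the class diagram concrete, here is a hedged sketch of the `configure_arguments` change, assuming an argparse-based CLI; the exact signature, choice ordering, and help text are assumptions:

```python
import argparse

def configure_arguments(parser: argparse.ArgumentParser) -> None:
    # Default changed from "llama.cpp" to "llama-cpp-python", and the new
    # runtime is added to the accepted choices (per this PR's summary).
    parser.add_argument(
        "--runtime",
        default="llama-cpp-python",
        choices=["llama-cpp-python", "llama.cpp", "vllm"],
        help="specify the runtime to use for serving the model",
    )
```

With that default in place, omitting `--runtime` would presumably start `llama_cpp.server` rather than `llama-server`.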
Hey @ericcurtin - I've reviewed your changes - here's some feedback:
Overall Comments:
- Please provide justification for changing the default runtime from llama.cpp to llama-cpp-python. What are the benefits that led to this decision?
- Consider updating any additional documentation to reflect the new runtime option and explain the differences between the available runtimes.
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
We have skipped reviewing this pull request. It looks like we've already reviewed the commit 47fdf9a in this pull request.
Force-pushed from a401718 to c4f91bc.
Changed default runtime from 'llama.cpp' to 'llama-cpp-python'. Added 'llama-cpp-python' as a runtime option for better flexibility with the `--runtime` flag. Signed-off-by: Eric Curtin <[email protected]>
Force-pushed from c4f91bc to fe9c0ca.
LGTM for the code. But I think I'm missing something: is llama-cpp-python already a dependency?

@ygalblum we probably need to push some container images before merging this, but when we do that, we should be all good.
Changed default runtime from 'llama.cpp' to 'llama-cpp-python'. Added 'llama-cpp-python' as a runtime option for better flexibility with the `--runtime` flag.

Summary by Sourcery

Add 'llama-cpp-python' as a new runtime option and set it as the default runtime, enhancing flexibility in model serving.
New Features:
- Add 'llama-cpp-python' as a selectable value for the `--runtime` flag.

Enhancements:
- Change the default runtime from 'llama.cpp' to 'llama-cpp-python'.