add text-generation-inference (#465)
zhimin-z authored Jul 28, 2024
1 parent 84c3806 commit e602df9
Showing 1 changed file with 1 addition and 0 deletions.
README.md: 1 change (1 addition, 0 deletions)
@@ -219,6 +219,7 @@ This repository contains a curated list of awesome open source libraries that wi
* [S-LoRA](https://github.com/S-LoRA/S-LoRA) ![](https://img.shields.io/github/stars/S-LoRA/S-LoRA.svg?style=social) - Serving Thousands of Concurrent LoRA Adapters.
* [Tempo](https://github.com/SeldonIO/tempo) ![](https://img.shields.io/github/stars/SeldonIO/tempo.svg?style=social) - Open source SDK that provides a unified interface to multiple MLOps projects, enabling data scientists to deploy and productionise machine learning systems.
* [Tensorflow Serving](https://github.com/tensorflow/serving) ![](https://img.shields.io/github/stars/tensorflow/serving.svg?style=social) - High-performance framework for serving TensorFlow models over the gRPC protocol, able to handle 100k requests per second per core.
* [text-generation-inference](https://github.com/huggingface/text-generation-inference) ![](https://img.shields.io/github/stars/huggingface/text-generation-inference.svg?style=social) - Hugging Face's toolkit for serving large language models for text generation inference (see the usage sketch after this list).
* [TorchServe](https://github.com/pytorch/serve) ![](https://img.shields.io/github/stars/pytorch/serve.svg?style=social) - A flexible and easy-to-use tool for serving PyTorch models.
* [Transformer-deploy](https://github.com/ELS-RD/transformer-deploy/) ![](https://img.shields.io/github/stars/ELS-RD/transformer-deploy.svg?style=social) - An efficient, scalable and enterprise-grade CPU/GPU inference server for Hugging Face transformer models.
* [Triton Inference Server](https://github.com/triton-inference-server/server) ![](https://img.shields.io/github/stars/triton-inference-server/server.svg?style=social) - High-performance, open source serving software for deploying AI models from any framework on GPU and CPU while maximizing utilization.
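
For context on the entry added in this commit: text-generation-inference exposes a simple HTTP API once a server is running. The sketch below queries such a server from Python; the host, port, prompt, and sampling parameters are illustrative assumptions, not values from this repository.

```python
# Minimal sketch: query a locally running text-generation-inference server.
# Assumes the server was started separately (e.g. via the project's Docker image)
# and is listening on http://127.0.0.1:8080 -- host, port, prompt, and sampling
# parameters below are assumptions for illustration only.
import requests

TGI_URL = "http://127.0.0.1:8080/generate"  # synchronous generation endpoint

payload = {
    "inputs": "What is text-generation-inference?",
    "parameters": {"max_new_tokens": 50, "temperature": 0.7},
}

response = requests.post(TGI_URL, json=payload, timeout=60)
response.raise_for_status()

# The /generate endpoint returns a JSON object containing "generated_text".
print(response.json()["generated_text"])
```

The project also documents a streaming endpoint and an OpenAI-compatible Messages API; consult its README for the exact routes supported by the version you deploy.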
