vScheduler

A Kubernetes scheduler designed for smart scheduling with llmaz.

Plugins

vScheduler maintains multiple plugins for scheduling LLM workloads.

ResourceFungibility Plugin

A llama2-7B model can run on a single A100 GPU, but it can also run on a single A10 GPU. We call this property fungibility.

With the ResourceFungibility plugin, a workload can take advantage of this by listing up to 8 alternative GPU types.
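
The idea can be illustrated with a minimal sketch. This is not the plugin's actual code or API: the function name `pick_gpu_type`, the `free_gpus` mapping, and the assumption that alternatives are tried in the order listed are all hypothetical and only serve to show what falling back across fungible GPU types might look like.

```python
from typing import Dict, List, Optional

# Limit taken from the text above: at most 8 alternative GPU types.
MAX_ALTERNATIVES = 8


def pick_gpu_type(
    alternatives: List[str],
    free_gpus: Dict[str, int],
    needed: int = 1,
) -> Optional[str]:
    """Return the first listed GPU type with enough free devices, or None.

    `alternatives` is assumed to be ordered by preference (hypothetical).
    `free_gpus` maps a GPU type name to the number of free devices (hypothetical).
    """
    if len(alternatives) > MAX_ALTERNATIVES:
        raise ValueError(
            f"at most {MAX_ALTERNATIVES} alternative GPU types are allowed"
        )
    for gpu_type in alternatives:
        # Missing types count as having zero free devices.
        if free_gpus.get(gpu_type, 0) >= needed:
            return gpu_type
    return None


if __name__ == "__main__":
    # No A100 is free, so the model falls back to an A10.
    print(pick_gpu_type(["A100", "A10"], {"A100": 0, "A10": 2}))
```

In this sketch the llama2-7B example from above would list `["A100", "A10"]`; if no A100 is free, the function returns `"A10"` instead of leaving the workload unschedulable.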