Releases: ROCm/vllm

v0.6.3.post2+rocm

01 Nov 23:03
733f79a

What's Changed

Full Changelog: v0.6.3.post1+rocm...v0.6.3.post2+rocm

New Contributors

Full Changelog: v0.4.3_rocm...v0.6.3.post2+rocm

v0.6.3.post1+rocm

29 Oct 21:12
7aa6982
Pre-release

What's Changed

  • Upstream merge 24 10 21 by @gshtras in #240
  • Using the correct datatype on prefix prefill for fp8 kv cache by @gshtras in #242
  • Update CMakeLists.txt by @gshtras in #244
  • update block_manager usage in setup_cython by @saienduri in #243
  • [Bugfix][Kernel][Misc] Basic support for SmoothQuant, symmetric case by @rasmith in #237
  • Add fp8 support for llama model family on Navi4x by @qli88 in #245
  • Custom all reduce fix mi250 by @omirosh in #247
  • Upstream merge 24 10 28 by @gshtras in #248

New Contributors

Full Changelog: v0.6.2.post1+rocm...v0.6.3.post1+rocm

v0.6.2.post1+rocm

23 Oct 00:14
69d5e1d
Pre-release

What's Changed

New Contributors

Full Changelog: v0.6.2+rocm...v0.6.2.post1+rocm

v0.6.2+rocm

02 Oct 17:29
030374b
Pre-release

What's Changed

Full Changelog: v0.6.1.post1+rocm...v0.6.2+rocm

v0.6.1.post1+rocm

27 Sep 21:48
956b831
Pre-release

What's Changed

Full Changelog: v0.6.1_rocm...v0.6.1.post1+rocm

v0.6.1_rocm

19 Sep 15:16
a67b65b
Pre-release

What's Changed

New Contributors

Full Changelog: v0.6.0_rocm...v0.6.1_rocm

v0.6.0_rocm

05 Sep 17:10
8032519
Pre-release

What's Changed

v0.6.0

05 Sep 17:10
32e7db2
Pre-release

Full Changelog: v0.5.5...v0.6.0

v0.4.0

06 Jun 22:04
68cdb95

What's Changed

  • Features integration without fp8 by @gshtras in #7
  • Layernorm optimizations by @mawong-amd in #8
  • Bringing in the latest commits from upstream by @mawong-amd in #9
  • Bump Docker to ROCm 6.1, add gradlib for tuned gemm, include RCCL fixes by @mawong-amd in #12
  • add mi300 fused_moe tuned configs by @divakar-amd in #13
  • Correctly calculating the same value for the required cache blocks num for all torchrun processes by @gshtras in #15
  • [ROCm] adding a missing triton autotune config by @hongxiayang in #17
  • make the vllm setup mode configurable and make install mode as defaul… by @hongxiayang in #18
  • enable fused topK_softmax kernel for hip by @divakar-amd in #14
  • Fix ambiguous fma call by @cjatin in #16
  • Rccl dockerfile updates by @mawong-amd in #19
  • Dockerfile improvements: multistage by @mawong-amd in #20
  • Integrate PagedAttention Optimization custom kernel into vLLM by @lcskrishna in #22
  • Updates to custom PagedAttention for supporting context len upto 32k. by @lcskrishna in #25
  • Update max_context_len for custom paged attention. by @lcskrishna in #26
  • Update RCCL, hipBLASLt, base image in Dockerfile.rocm by @shajrawi in #24
  • Adding fp8 gemm computation by @charlifu in #29
  • fix the model loading fp8 by @charlifu in #30
  • Update linear.py by @gshtras in #32
  • Update base docker image with Pytorch 2.3 by @charlifu in #35

New Contributors

Full Changelog: v0.3.3...v0.4.0

v0.3.0

07 Feb 20:10
1af090b