chore(deps-dev): update vllm requirement from >=0.20.0 to >=0.20.1#417

Open
dependabot[bot] wants to merge 1 commit into main from dependabot/pip/vllm-gte-0.20.1

Conversation

Contributor

dependabot[bot] commented on behalf of GitHub on May 6, 2026

Updates the requirements on vllm to permit the latest version.

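For illustration only (not part of the PR), here is a minimal sketch of what raising the floor from `>=0.20.0` to `>=0.20.1` changes. It uses a simple dotted-version comparison rather than pip's full PEP 440 resolution machinery, which handles pre-releases and local versions that this sketch ignores:

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a simple dotted version string into comparable integers."""
    return tuple(int(part) for part in version.split("."))

def satisfies_floor(version: str, floor: str) -> bool:
    """True if `version` meets a '>=floor' requirement."""
    return parse(version) >= parse(floor)

# The old floor (>=0.20.0) already admitted the new patch release;
# the updated floor additionally rules out the pre-patch 0.20.0.
assert satisfies_floor("0.20.1", "0.20.0")
assert satisfies_floor("0.20.1", "0.20.1")
assert not satisfies_floor("0.20.0", "0.20.1")
```

In other words, the bump does not widen what is installable; it narrows the range so environments resolving this requirement pick up the 0.20.1 fixes listed below.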
Release notes

Sourced from vllm's releases.

v0.20.1

vLLM v0.20.1

This is a patch release on top of v0.20.0 primarily focused on DeepSeek V4 stabilization and performance improvements, along with several important bug fixes.

DeepSeek V4

  • Base model support (#41006).
  • Multi-stream pre-attention GEMM (#41061), configurable pre-attn GEMM knob (#41443), and tuned default VLLM_MULTI_STREAM_GEMM_TOKEN_THRESHOLD (#41526).
  • BF16 and MXFP8 all-to-all support for FlashInfer one-sided communication (#40960).
  • PTX cvt instruction for faster FP32->FP4 conversion (#41015).
  • Integrated tile kernels (head_compute_mix_kernel) for optimized head computation (#41255).
  • Guard megamoe flag with Pure TP (#41522).
  • Fixed persistent topk cooperative deadlock at TopK=1024 (#41189) and inter-CTA init race on RadixRowState (#41444), with temporary disable of persistent topk as a workaround (#41442).
  • Fixed import error due to AOT compile cache loading (#41090).
  • Fixed torch inductor error (#41135).
  • Fixed repeated RoPE cache initialization (#41148).
  • Fixed missing type conversion for non-streaming tool calls in DSV3.2/V4 (#41198).

Bug Fixes

  • Fixed max_num_batched_token not being captured in CUDA graph (#40734).
  • Fixed num_gpu_blocks_override not accounted for in max_model_len checks (#41069).
  • Auto-disable expandable_segments around cumem memory pool (#40812).
  • Fixed BailingMoE linear layer (#40859) and MLA RoPE rotation for BailingMoE V2.5 (#41185).
  • Fixed reasoning parser kwargs not being passed to structured output (#41199).
  • [ROCm] Fixed input_ids and expert_map args for Quark W4A8 GPT-OSS (#41165).

List of contributors

@BugenZhao, @chaunceyjiang, @gau-nernst, @ghphotoframe, @Isotr0py, @jeejeelee, @khluu, @njhill, @Rohan138, @wzhao18, @youkaichao, @ywang96, @ZJY0516, @zixi-qi, @zyongye

Commits
  • 132765e Revert "[DSv4] Use cvt PTX for FP32->FP4 conversion (#41015)"
  • 43a21e6 Temporary disable persistent topk for Hopper (#41605)
  • f98b274 [DSv4] Tune default value of VLLM_MULTI_STREAM_GEMM_TOKEN_THRESHOLD (#41526)
  • 228d225 [DSV4] Guard megamoe flag with Pure TP (#41522)
  • a4debbd Revert "Temporary disable persistent topk (#41442)"
  • 8749454 [Bugfix] Fix persistent_topk inter-CTA init race on RadixRowState (#41444)
  • 63e6293 Build Python from source instead of using deadsnakes PPA
  • 513b5f8 [DSV4] Add knob to enable pre-attn gemm (#41443)
  • 135722d Revert "."
  • 7758dac [ROCm][Bugfix][GPTOSS]: fix input_ids and expert_map args for quark w4a8 gpto...
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Updates the requirements on [vllm](https://github.com/vllm-project/vllm) to permit the latest version.
- [Release notes](https://github.com/vllm-project/vllm/releases)
- [Changelog](https://github.com/vllm-project/vllm/blob/main/RELEASE.md)
- [Commits](vllm-project/vllm@v0.20.0...v0.20.1)

---
updated-dependencies:
- dependency-name: vllm
  dependency-version: 0.20.1
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the dependencies and python labels on May 6, 2026

Labels

  • dependencies: Pull requests that update a dependency file
  • python: Pull requests that update python code
