chore(deps-dev): update vllm requirement from >=0.20.0 to >=0.20.1#417

Open
dependabot[bot] wants to merge 1 commit into main from dependabot/pip/vllm-gte-0.20.1

Conversation

Contributor

dependabot[bot] commented on behalf of GitHub on May 6, 2026

Updates the requirements on vllm to permit the latest version.

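For illustration only (not part of the PR), here is a minimal sketch of what raising the floor from `>=0.20.0` to `>=0.20.1` changes. It uses a simple dotted-version comparison rather than pip's full PEP 440 resolution machinery, which handles pre-releases and local versions that this sketch ignores:

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a simple dotted version string into comparable integers."""
    return tuple(int(part) for part in version.split("."))

def satisfies_floor(version: str, floor: str) -> bool:
    """True if `version` meets a '>=floor' requirement."""
    return parse(version) >= parse(floor)

# The old floor (>=0.20.0) already admitted the new patch release;
# the updated floor additionally rules out the pre-patch 0.20.0.
assert satisfies_floor("0.20.1", "0.20.0")
assert satisfies_floor("0.20.1", "0.20.1")
assert not satisfies_floor("0.20.0", "0.20.1")
```

In other words, the bump does not widen what is installable; it narrows the range so environments resolving this requirement pick up the 0.20.1 fixes listed below.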
Release notes

Sourced from vllm's releases.

v0.20.1

vLLM v0.20.1

This is a patch release on top of v0.20.0 primarily focused on DeepSeek V4 stabilization and performance improvements, along with several important bug fixes.

DeepSeek V4

  • Base model support (#41006).
  • Multi-stream pre-attention GEMM (#41061), configurable pre-attn GEMM knob (#41443), and tuned default VLLM_MULTI_STREAM_GEMM_TOKEN_THRESHOLD (#41526).
  • BF16 and MXFP8 all-to-all support for FlashInfer one-sided communication (#40960).
  • PTX cvt instruction for faster FP32->FP4 conversion (#41015).
  • Integrated tile kernels (head_compute_mix_kernel) for optimized head computation (#41255).
  • Guard megamoe flag with Pure TP (#41522).
  • Fixed persistent topk cooperative deadlock at TopK=1024 (#41189) and inter-CTA init race on RadixRowState (#41444), with temporary disable of persistent topk as a workaround (#41442).
  • Fixed import error due to AOT compile cache loading (#41090).
  • Fixed torch inductor error (#41135).
  • Fixed repeated RoPE cache initialization (#41148).
  • Fixed missing type conversion for non-streaming tool calls in DSV3.2/V4 (#41198).

Bug Fixes

  • Fixed max_num_batched_token not being captured in CUDA graph (#40734).
  • Fixed num_gpu_blocks_override not accounted for in max_model_len checks (#41069).
  • Auto-disable expandable_segments around cumem memory pool (#40812).
  • Fixed BailingMoE linear layer (#40859) and MLA RoPE rotation for BailingMoE V2.5 (#41185).
  • Fixed reasoning parser kwargs not being passed to structured output (#41199).
  • [ROCm] Fixed input_ids and expert_map args for Quark W4A8 GPT-OSS (#41165).

List of contributors

@BugenZhao, @chaunceyjiang, @gau-nernst, @ghphotoframe, @Isotr0py, @jeejeelee, @khluu, @njhill, @Rohan138, @wzhao18, @youkaichao, @ywang96, @ZJY0516, @zixi-qi, @zyongye

Commits
  • 132765e Revert "[DSv4] Use cvt PTX for FP32->FP4 conversion (#41015)"
  • 43a21e6 Temporary disable persistent topk for Hopper (#41605)
  • f98b274 [DSv4] Tune default value of VLLM_MULTI_STREAM_GEMM_TOKEN_THRESHOLD (#41526)
  • 228d225 [DSV4] Guard megamoe flag with Pure TP (#41522)
  • a4debbd Revert "Temporary disable persistent topk (#41442)"
  • 8749454 [Bugfix] Fix persistent_topk inter-CTA init race on RadixRowState (#41444)
  • 63e6293 Build Python from source instead of using deadsnakes PPA
  • 513b5f8 [DSV4] Add knob to enable pre-attn gemm (#41443)
  • 135722d Revert "."
  • 7758dac [ROCm][Bugfix][GPTOSS]: fix input_ids and expert_map args for quark w4a8 gpto...
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Updates the requirements on [vllm](https://github.com/vllm-project/vllm) to permit the latest version.
- [Release notes](https://github.com/vllm-project/vllm/releases)
- [Changelog](https://github.com/vllm-project/vllm/blob/main/RELEASE.md)
- [Commits](vllm-project/vllm@v0.20.0...v0.20.1)

---
updated-dependencies:
- dependency-name: vllm
  dependency-version: 0.20.1
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the dependencies and python labels on May 6, 2026

Labels

  • dependencies: Pull requests that update a dependency file
  • python: Pull requests that update python code
