vllm-project / vllm Public


Pull requests: vllm-project/vllm


622 Open 8,235 Closed


Pull requests list

[FEAT] [ROCm] [V1]: Add AITER biased group topk for DeepSeekV3

#17955 opened May 11, 2025 by vllmellm

[Bugfix] When use set rope_scaling should replace it

#17953 opened May 11, 2025 by lengrongfu

Refactor

Labels: ci/build, needs-rebase, tpu (Related to Google TPUs)

#17950 opened May 10, 2025 by yarongmu-google • Draft

[Bugfix] Add revision to transformers.Auto*.from_pretrained processors

Labels: ready (ONLY add when PR is ready to merge/full CI is needed)

#17948 opened May 10, 2025 by xinli-centml

[v1] Support multiple KV cache groups in GPU model runner

Labels: tpu (Related to Google TPUs)

#17945 opened May 10, 2025 by heheda12345

[doc] list the hf downloaded models

Labels: documentation (Improvements or additions to documentation)

#17940 opened May 10, 2025 by reidliu41

[WIP] Fix Misleading Error Messages

#17938 opened May 10, 2025 by mengbingrock

[BugFix] Correct max_model_len derivation from config.json for Mistral format

#17937 opened May 10, 2025 by princepride

[doc] update lora doc

Labels: documentation (Improvements or additions to documentation), ready (ONLY add when PR is ready to merge/full CI is needed)

#17936 opened May 10, 2025 by reidliu41

[Bugfix] Avoid repeatedly creating dummy data during engine startup

Labels: multi-modality (Related to multi-modality (#4194))

#17935 opened May 10, 2025 by DarkLight1337

[kernel] integrate permute/unpermute kernel into deepgemm moe

#17934 opened May 10, 2025 by CalebDu

[Misc][RFC] Add automated profiling sweep and heatmap visualization tools

#17933 opened May 10, 2025 by ConstBob

[WIP] automatically bind CPU OMP Threads of a rank to CPU ids of a NUMA node.

Labels: ci/build

#17930 opened May 10, 2025 by louie-tsai

[BugFix] Set default random seed to 0 for V1

#17929 opened May 10, 2025 by WoosukKwon

[Frontend] [Core] Add Tensorizer support for LoRA adapter serialization and deserialization

Labels: documentation (Improvements or additions to documentation)

#17926 opened May 9, 2025 by sangstar

[WIP][Misc] Add Ray Prometheus logger to V1

Labels: v1

#17925 opened May 9, 2025 by eicherseiji

TESTING CI test completion - no need to merge.

Labels: ci/build, needs-rebase

#17921 opened May 9, 2025 by Alexei-V-Ivanov-AMD

[Hardware][Intel-Gaudi] enable text embedding for Intel-Gaudi backend

#17920 opened May 9, 2025 by libinta

[Draft] Support PIL Image in llm.chat frontend

#17919 opened May 9, 2025 by ywang96 • Draft

use ceil_div in cutlass block scaling shape check

#17918 opened May 9, 2025 by IwakuraRein

WIP: fix_llama4_tool_call

Labels: documentation (Improvements or additions to documentation), frontend, tool-calling

#17917 opened May 9, 2025 by wukaixingxp • Draft

[Bugfix][V1] Only get input embeddings w/ multi-modal models if first PP

Labels: ready (ONLY add when PR is ready to merge/full CI is needed)

#17916 opened May 9, 2025 by jinhuang12

[Misc] Add compressed-tensors NVFP4A16 emulation support

Labels: quantization, ready (ONLY add when PR is ready to merge/full CI is needed)

#17914 opened May 9, 2025 by dsikka

[BugFix][AMD] Compatible for AITER lib after 04/20

#17912 opened May 9, 2025 by qli88

[Bugfix] add missing function params to rocm_aiter_mla.py

Labels: v1

#17911 opened May 9, 2025 by davidxia • Draft
