Pull requests: NVIDIA/TensorRT-LLM
[None][perf] Drop cubin and Eliminate ~6s FMHA JIT recompile in eager generation by aligning kernel selection with CUDA graph warmup
#13505
opened Apr 27, 2026 by
yunruis
Contributor
1 task done
[None][test] Waive failed cases for main in QA CI
#13504
opened Apr 27, 2026 by
crazydemo
Collaborator
1 task done
[None][feat] Add encoder_max_batch_size & encoder_max_num_tokens to TorchLlmArgs
#13503
opened Apr 27, 2026 by
yechank-nvidia
Collaborator
[https://nvbugs/6115036][fix] Fix NVFP4 engine size estimation and attention DP batch size in trtllm-bench
#13498
opened Apr 27, 2026 by
hyukn
Collaborator
1 task done
[https://nvbugs/6114727][fix] Unwaive deepseek r1 fp4 v2 grace_blackwell r1 fp4 v2 tep4 mtp3 1k1k
#13496
opened Apr 27, 2026 by
chenfeiz0326
Collaborator
1 task done
[Draft] Add NIXL transfer release cancellation hook
Community want to contribute
PRs initiated from Community
[https://nvbugs/6093715][fix] AutoDeploy: skip nvfp4 test pre-blackwell
#13494
opened Apr 27, 2026 by
galagam
Collaborator
1 task done
[None][fix] Update CI Agg test's mpi2 to mpix
#13491
opened Apr 27, 2026 by
chenfeiz0326
Collaborator
1 task done
[https://nvbugs/6114711][fix] Add "kimi_k2": "kimi_k2" and "kimi_k25": "kimi_k2" to `MODEL_TYPE_TO_REASONI
#13490
opened Apr 27, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[None][fix] Use one mamba slot sentinel to save memory
#13489
opened Apr 27, 2026 by
Wanli-Jiang
Collaborator
Draft
1 task done
[None][fix] Port KV cache V2 follow-up fixes
#13488
opened Apr 27, 2026 by
yizhang-nv
Member
1 task done
[None][fix] visual_gen UlyssesAttention: pass post-A2A seq_len to inner backend
#13486
opened Apr 27, 2026 by
karljang
Collaborator
2 tasks done
[https://nvbugs/6037654][fix] Set DeepEP low-latency token limit for qwen3 CI to prevent OOM
#13484
opened Apr 27, 2026 by
byshiue
Collaborator
1 task done
[None][test] refresh test constraints
#13482
opened Apr 27, 2026 by
crazydemo
Collaborator
1 task done
[TRTLLM-13429][feat] Switch DeepSeek/NemotronH/Qwen3/Qwen3.5-MoE to sharding-IR canonical models
#13478
opened Apr 26, 2026 by
greg-kwasniewski1
Collaborator
1 of 3 tasks
[None][perf] Scheme X L2-aware dispatcher and PDL launchers for sparse-attention GVR Top-K
#13477
opened Apr 26, 2026 by
longcheng-nv
Collaborator
5 tasks done
[https://nvbugs/6094068][fix] Cap Mamba cache max_batch_size for memory and add CUDA P2P check for DeepEP
#13474
opened Apr 26, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/5973199][fix] Fall back to SCM_RIGHTS for MNNVL fd exchange when pidfd_getfd is blocked
#13473
opened Apr 26, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6105768][fix] ** Runtime GPU detection inside the test function: when total_memory < 80 GiB
#13471
opened Apr 26, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6017720][fix] Fix moe backend mismatch on Blackwell in perf test.
#13470
opened Apr 26, 2026 by
dominicshanshan
Collaborator
1 task done