[None][chore] Update flashinfer-python from 0.6.6 to 0.6.8rc1 by yihwang-nv · Pull Request #13064 · NVIDIA/TensorRT-LLM

yihwang-nv · 2026-04-15T04:25:13Z

Summary

Bump flashinfer-python from 0.6.6 to 0.6.8rc1
Bump nvidia-cutlass-dsl from 4.3.4 to 4.4.2 (flashinfer-python 0.6.8rc1 requires nvidia-cutlass-dsl >=4.4.2)
Add nvidia-cutlass-dsl-libs-base 4.4.2 to poetry.lock (new transitive dependency)
Updated version pins in requirements.txt, security_scanning/pyproject.toml, security_scanning/poetry.lock, and ATTRIBUTIONS-Python.md

Test plan

pip install -r requirements.txt installs successfully
pytest tests/unittest/_torch/flashinfer/ -v
pytest tests/unittest/_torch/attention/test_flashinfer_attention.py -v
CI pre-merge passes
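
As a rough aid for the first test-plan item, the sketch below compares installed package versions against the pins named in the summary. It is not part of the PR. `EXPECTED_PINS` and `find_mismatches` are hypothetical names, and treating both packages as exact pins is an assumption based on the summary text.

```python
from importlib import metadata

# Pins copied from the PR summary. Treating both as exact pins is an assumption.
EXPECTED_PINS = {
    "flashinfer-python": "0.6.8rc1",
    "nvidia-cutlass-dsl": "4.4.2",
}


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (expected, installed_or_None)} for every unsatisfied pin.

    `lookup` is injectable so the check can be exercised without touching
    the real environment.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            have = lookup(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed at all
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in find_mismatches(EXPECTED_PINS).items():
        print(f"{name}: expected {want}, found {have}")
```

Running the script after `pip install -r requirements.txt` prints one line per package whose installed version differs from the pin, and prints nothing when both match.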

Summary by CodeRabbit

Chores

Updated CUDA-related package dependencies to newer versions for improved compatibility and performance.

Bump flashinfer-python dependency to 0.6.8rc1. Also update nvidia-cutlass-dsl from 4.3.4 to 4.4.2 (flashinfer-python 0.6.8rc1 requires nvidia-cutlass-dsl >=4.4.2). Updated version pins in requirements.txt, security_scanning/pyproject.toml, security_scanning/poetry.lock, and ATTRIBUTIONS-Python.md. Signed-off-by: Yihan Wang <yihwang@nvidia.com>

yihwang-nv · 2026-04-15T04:26:09Z

/bot run --stage-list "Build-Docker-Images"

coderabbitai · 2026-04-15T04:28:42Z

📝 Walkthrough

Walkthrough

This change updates two CUDA-related package dependencies across three files: flashinfer-python from 0.6.6 to 0.6.8rc1 and nvidia-cutlass-dsl from 4.3.4 to 4.4.2.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Dependency Version Updates: `ATTRIBUTIONS-Python.md`, `requirements.txt`, `security_scanning/pyproject.toml` | Updated `flashinfer-python` to 0.6.8rc1 and `nvidia-cutlass-dsl` to 4.4.2 across attribution records, main requirements, and security scanning configuration. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: updating flashinfer-python from 0.6.6 to 0.6.8rc1, which aligns with the primary focus of the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | The PR description clearly explains the changes (version bumps), rationale (compatibility requirements), and provides comprehensive test coverage details. |


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@requirements.txt`:
- Line 57: requirements.txt currently pins flashinfer-python==0.6.8rc1 which is
not on PyPI; either revert the pin to a published version (e.g., 0.6.7.post3) or
add the FlashInfer nightly wheel index to your install config (e.g., add
--extra-index-url https://flashinfer.ai/whl/nightly/ in CI and developer docs)
and update CI/pip config accordingly; also run compatibility tests and review
tensorrt_llm/_torch/attention_backend/flashinfer.py plus any callers for API
changes between 0.6.6/0.6.7 and the nightly to ensure no breaking changes before
keeping 0.6.8rc1.

In `@security_scanning/pyproject.toml`:
- Line 58: Replace the non-existent dependency "flashinfer-python (==0.6.8rc1)"
in pyproject.toml with an available release (e.g., "flashinfer-python
(==0.6.7.post3)"), then regenerate any lockfiles (poetry lock / pip-compile) and
verify installation to ensure the package resolves correctly.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 8c184ec0-702c-44f0-a56a-0484b3a8d283

📥 Commits

Reviewing files that changed from the base of the PR and between d09ed1e and 43dd843.

⛔ Files ignored due to path filters (1)

security_scanning/poetry.lock is excluded by !**/*.lock

📒 Files selected for processing (3)

ATTRIBUTIONS-Python.md
requirements.txt
security_scanning/pyproject.toml

tensorrt-cicd · 2026-04-15T04:31:52Z

PR_Github #43382 [ run ] triggered by Bot. Commit: 43dd843 Link to invocation

Force-reinstall nvidia-cutlass-dsl and nvidia-cutlass-dsl-libs-base in the Docker build to replace the stale 4.3.5 from the base image with 4.4.2. Add nvidia-cutlass-dsl>=4.4.2 to constraints.txt. Signed-off-by: Yihan Wang <yihwang@nvidia.com>

yihwang-nv · 2026-04-15T04:34:40Z

/bot run --stage-list "Build-Docker-Images"


yihwang-nv · 2026-04-15T04:39:27Z

/bot run --stage-list "Build-Docker-Images"

tensorrt-cicd · 2026-04-15T04:44:11Z

PR_Github #43382 [ run ] completed with state ABORTED. Commit: 43dd843

Link to invocation

tensorrt-cicd · 2026-04-15T04:46:26Z

PR_Github #43386 [ run ] triggered by Bot. Commit: f3b96ee Link to invocation

Point current_image_tags.properties to the CI tritondevel images built from PR NVIDIA#13064 (flashinfer + nvidia-cutlass-dsl upgrade). Signed-off-by: Yihan Wang <yihwang@nvidia.com>

yihwang-nv · 2026-04-15T06:17:39Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-15T06:23:25Z

PR_Github #43410 [ run ] triggered by Bot. Commit: 3ba87d1 Link to invocation

The DLFW base image (pytorch:26.02-py3) ships nvidia-cutlass-dsl 4.3.5. When pip upgrades to 4.4.2 in-place, it corrupts shared namespace dirs. Add explicit uninstall + rm -rf cleanup before tensorrt_llm wheel install. Signed-off-by: Yihan Wang <yihwang@nvidia.com>

yihwang-nv · 2026-04-15T08:54:06Z

/bot run

tensorrt-cicd · 2026-04-15T09:00:35Z

PR_Github #43459 [ run ] triggered by Bot. Commit: fcfbc7e Link to invocation

This reverts commit 0dd321a59fc067e5fd3124f1fb5c6b8aba1d7ad3.

pip installs dependency packages (nvidia-cutlass-dsl-libs-base) before uninstalling the old meta-wheel (nvidia-cutlass-dsl). Since both write to the same nvidia_cutlass_dsl/ directory, the uninstall step removes files that the deps just installed, breaking the package.

Add scripts/clean_site_packages.py that uninstalls known problematic packages and removes leftover site-packages fragments before install. Call it from test_pip_install.py before both wheel and editable installs. This avoids Docker image changes: the cleanup runs at CI test time.

Signed-off-by: Yihan Wang <yihwang@nvidia.com>
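
The cleanup approach described in this commit message could look roughly like the sketch below. It is not the actual `scripts/clean_site_packages.py`, whose contents are not shown in this thread. The function names and the glob patterns in `LEFTOVER_GLOBS` are assumptions. Only the two package names and the `nvidia_cutlass_dsl/` directory come from the commit messages.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# Package names come from the commit messages; the glob patterns are assumptions.
PACKAGES = ["nvidia-cutlass-dsl", "nvidia-cutlass-dsl-libs-base"]
LEFTOVER_GLOBS = [
    "nvidia_cutlass_dsl",
    "nvidia_cutlass_dsl-*.dist-info",
    "nvidia_cutlass_dsl_libs_base-*.dist-info",
]


def pip_uninstall(package):
    """Uninstall one package with the current interpreter's pip."""
    # check=False: a failed uninstall should not abort the rest of the cleanup.
    subprocess.run(
        [sys.executable, "-m", "pip", "uninstall", "-y", package], check=False
    )


def remove_leftovers(site_dir, patterns=LEFTOVER_GLOBS):
    """Delete paths matching the patterns and return the removed names."""
    removed = []
    for pattern in patterns:
        for path in sorted(Path(site_dir).glob(pattern)):
            if path.is_dir() and not path.is_symlink():
                shutil.rmtree(path)
            else:
                path.unlink()
            removed.append(path.name)
    return removed


def clean(site_dir, uninstall=pip_uninstall):
    """Uninstall the known packages first, then sweep leftover fragments."""
    for package in PACKAGES:
        uninstall(package)
    return remove_leftovers(site_dir)
```

Doing the uninstall and the sweep as a separate step before the install means that pip later starts from a directory with no stale files, instead of upgrading in place.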

yihwang-nv · 2026-04-15T10:01:47Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-15T10:08:07Z

PR_Github #43474 [ run ] triggered by Bot. Commit: 4ae0955 Link to invocation

tensorrt-cicd · 2026-04-15T10:08:10Z

PR_Github #43459 [ run ] completed with state ABORTED. Commit: fcfbc7e

Link to invocation

tensorrt-cicd · 2026-04-15T14:52:36Z

PR_Github #43474 [ run ] completed with state SUCCESS. Commit: 4ae0955
/LLM/main/L0_MergeRequest_PR pipeline #33993 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

…lass-dsl 4.4.2

PipelineAsync.producer_tail is decorated with @dsl_user_op in cutlass-dsl 4.4.2 and forwards loc/ip kwargs to producer_acquire. The overrides in custom_pipeline.py did not accept these, raising DSLRuntimeError in test_fp4_linear_cute_dsl.

Add loc=None, ip=None keyword-only parameters to producer_acquire, producer_commit, consumer_release, and producer_tail across PipelineTmaUmma, PipelineUmmaAsync, and PipelineCpAsyncUmma, and thread them through to the inner sync_object and cute.arch calls.

Signed-off-by: Yihan Wang <yihwang@nvidia.com>
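
The signature mismatch described in this commit message can be illustrated with plain Python stand-ins. The classes below are hypothetical and do not import or reproduce the real cutlass-dsl API. They only show why an override without `loc`/`ip` breaks once the base method forwards those keyword arguments. In this sketch the failure is an ordinary `TypeError`, whereas the commit reports a `DSLRuntimeError`.

```python
class BasePipeline:
    """Hypothetical stand-in for the upstream pipeline base class."""

    def producer_acquire(self, state, *, loc=None, ip=None):
        return ("base_acquire", state, loc, ip)

    def producer_tail(self, state, *, loc=None, ip=None):
        # Mirrors the behaviour described in the commit: loc/ip are forwarded.
        return self.producer_acquire(state, loc=loc, ip=ip)


class OldOverride(BasePipeline):
    # Override written before the base class forwarded loc/ip.
    def producer_acquire(self, state):
        return ("custom_acquire", state)


class FixedOverride(BasePipeline):
    # Accept the keyword-only parameters and thread them through.
    def producer_acquire(self, state, *, loc=None, ip=None):
        return ("custom_acquire", state, loc, ip)


if __name__ == "__main__":
    print(FixedOverride().producer_tail(0, loc="here"))
    try:
        OldOverride().producer_tail(0)
    except TypeError as exc:
        print(f"old override fails: {exc}")
```

Giving the new parameters a default of `None` keeps existing call sites that do not pass `loc` or `ip` working.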

Signed-off-by: Yihan Wang <yihwang@nvidia.com>

yihwang-nv · 2026-04-18T17:19:02Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-18T17:25:45Z

PR_Github #44129 [ run ] triggered by Bot. Commit: 82b22a8 Link to invocation

tensorrt-cicd · 2026-04-19T03:05:28Z

PR_Github #44129 [ run ] completed with state SUCCESS. Commit: 82b22a8
/LLM/main/L0_MergeRequest_PR pipeline #34556 completed with status: 'FAILURE'


yihwang-nv · 2026-04-19T10:47:05Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-19T10:54:02Z

PR_Github #44179 [ run ] triggered by Bot. Commit: 82b22a8 Link to invocation

tensorrt-cicd · 2026-04-19T11:08:09Z

PR_Github #44179 [ run ] completed with state FAILURE. Commit: 82b22a8
/LLM/main/L0_MergeRequest_PR pipeline #34606 completed with status: 'FAILURE'


yihwang-nv · 2026-04-20T02:27:14Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-20T02:33:52Z

PR_Github #44259 [ run ] triggered by Bot. Commit: 82b22a8 Link to invocation

yihwang-nv · 2026-04-20T02:47:00Z

/bot run --stage-list "Build-Docker-Images"

tensorrt-cicd · 2026-04-20T02:49:31Z

PR_Github #44259 [ run ] completed with state FAILURE. Commit: 82b22a8
/LLM/main/L0_MergeRequest_PR pipeline #34680 completed with status: 'FAILURE'


tensorrt-cicd · 2026-04-20T02:53:39Z

PR_Github #44264 [ run ] triggered by Bot. Commit: 82b22a8 Link to invocation

Signed-off-by: Yihan Wang <yihwang@nvidia.com>

yihwang-nv · 2026-04-20T05:05:33Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-20T05:11:35Z

PR_Github #44302 [ run ] triggered by Bot. Commit: 31da856 Link to invocation

tensorrt-cicd · 2026-04-20T05:28:39Z

PR_Github #44302 [ run ] completed with state FAILURE. Commit: 31da856
/LLM/main/L0_MergeRequest_PR pipeline #34724 completed with status: 'FAILURE'


yihwang-nv · 2026-04-20T05:48:11Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-20T05:54:21Z

PR_Github #44335 [ run ] triggered by Bot. Commit: 31da856 Link to invocation

tensorrt-cicd · 2026-04-20T06:32:05Z

PR_Github #44335 [ run ] completed with state FAILURE. Commit: 31da856
/LLM/main/L0_MergeRequest_PR pipeline #34753 completed with status: 'FAILURE'


yihwang-nv · 2026-04-20T06:44:23Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-20T06:51:39Z

PR_Github #44367 [ run ] triggered by Bot. Commit: 31da856 Link to invocation

tensorrt-cicd · 2026-04-20T07:25:02Z

PR_Github #44367 [ run ] completed with state FAILURE. Commit: 31da856
/LLM/main/L0_MergeRequest_PR pipeline #34784 completed with status: 'FAILURE'


Point current_image_tags.properties to the CI tritondevel images built from PR NVIDIA#13064 (flashinfer + nvidia-cutlass-dsl upgrade). Signed-off-by: Yihan Wang <yihwang@nvidia.com>

yihwang-nv · 2026-04-20T07:41:40Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-20T07:47:35Z

PR_Github #44387 [ run ] triggered by Bot. Commit: 31da856 Link to invocation

tensorrt-cicd · 2026-04-20T22:50:39Z

PR_Github #44387 [ run ] completed with state SUCCESS. Commit: 31da856
/LLM/main/L0_MergeRequest_PR pipeline #34802 completed with status: 'FAILURE'


yihwang-nv requested a review from a team as a code owner April 15, 2026 04:25

github-actions bot assigned yihwang-nv Apr 15, 2026

coderabbitai bot reviewed Apr 15, 2026


Comment thread requirements.txt Outdated

Comment thread security_scanning/pyproject.toml Outdated

yihwang-nv requested review from a team as code owners April 15, 2026 04:33

yihwang-nv requested review from niukuo and venkywonka April 15, 2026 04:33

[None][chore] Update CI image tags to PR-13064 staging images

3ba87d1

Point current_image_tags.properties to the CI tritondevel images built from PR NVIDIA#13064 (flashinfer + nvidia-cutlass-dsl upgrade). Signed-off-by: Yihan Wang <yihwang@nvidia.com>

yihwang-nv requested a review from a team as a code owner April 15, 2026 06:15

yihwang-nv added 2 commits April 15, 2026 03:01

Revert "[None][chore] Update CI image tags to PR-13064 staging images"

8243eb0

This reverts commit 0dd321a59fc067e5fd3124f1fb5c6b8aba1d7ad3.

yihwang-nv added 2 commits April 18, 2026 09:48

[None][chore] Revert security_scanning/poetry.lock to origin/main

82b22a8

Signed-off-by: Yihan Wang <yihwang@nvidia.com>

wenmingw approved these changes Apr 20, 2026


yihwang-nv added 2 commits April 20, 2026 12:53

Merge branch 'main' into yihwang-nv/update_flashinfer_0.6.8rc1

8fc097d

[None][chore] Update CI image tags to PR-13064 build 23 staging images

31da856

Signed-off-by: Yihan Wang <yihwang@nvidia.com>

Wanli-Jiang mentioned this pull request Apr 20, 2026

[None][feat] Stack PRs for sweep perfing #13205

Draft
