
[None][fix] Revert backend for Nemotron ViT to TRT-LLM#13191

Open
yechank-nvidia wants to merge 2 commits into NVIDIA:main from yechank-nvidia:revert_backend

Conversation

@yechank-nvidia
Collaborator

@yechank-nvidia yechank-nvidia commented Apr 19, 2026

Summary by CodeRabbit

  • Chores
    • Updated the default attention backend for the vision model. Users relying on previous default behavior should explicitly configure their preferred backend in settings.

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
@yechank-nvidia
Collaborator Author

/bot run

@yechank-nvidia yechank-nvidia changed the title [None][Fix] Revert backend for Nemotron ViT to TRT-LLM [None][fix] Revert backend for Nemotron ViT to TRT-LLM Apr 19, 2026
@github-actions

👎 Promotion blocked, new vulnerability found

Vulnerability report

  • encode/uvicorn — CVE-2020-7694 (HIGH): This affects all versions of package uvicorn. The request logger provided by the package is vulnerable to ANSI escape sequence injection. Whenever any HTTP request is received, the default behaviour of uvicorn is to log its details to either the console or a log file. When attackers request crafted URLs with percent-encoded escape sequences, the logging component will log the URL after it has been processed with urllib.parse.unquote, thereby converting any percent-encoded characters into their single-character equivalents, which can have special meaning in terminal emulators. By requesting URLs with crafted paths, attackers can: pollute uvicorn's access logs, jeopardising the integrity of such files; and use ANSI sequence codes to attempt to interact with the terminal emulator that displays the logs (either in real time or from a file).
  • encode/uvicorn — CVE-2020-7695 (MEDIUM): Uvicorn before 0.11.7 is vulnerable to HTTP response splitting. CRLF sequences are not escaped in the values of HTTP headers. Attackers can exploit this to add arbitrary headers to HTTP responses, or even return an arbitrary response body, whenever crafted input is used to construct HTTP headers.
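The mechanism behind CVE-2020-7694 can be sketched with a short standalone snippet (not uvicorn's actual logging code, just an illustration of the percent-decoding step the CVE describes):

```python
# Percent-decoding a crafted URL path turns %1b into a raw ESC byte,
# which terminal emulators interpret as the start of an ANSI escape
# sequence if the decoded string is written to the log unescaped.
from urllib.parse import unquote

crafted_path = "/%1b%5B31mINJECTED%1b%5B0m"  # %1b = ESC, %5B = '['
decoded = unquote(crafted_path)

assert "\x1b[31m" in decoded  # red-text ANSI sequence now embedded

# A safer logger escapes control characters before writing, so the
# ESC byte is rendered as the literal text \x1b instead of executing:
safe = decoded.encode("unicode_escape").decode("ascii")
print(safe)
```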

@2ez4bz 2ez4bz marked this pull request as ready for review April 19, 2026 16:15
@2ez4bz 2ez4bz requested review from a team as code owners April 19, 2026 16:15
@2ez4bz 2ez4bz enabled auto-merge (squash) April 19, 2026 16:16
@coderabbitai
Contributor

coderabbitai bot commented Apr 19, 2026

📝 Walkthrough

Walkthrough

The RADIOVisionModel.__init__ method's vision_attn_backend parameter default value is updated from "FLASHINFER" to "TRTLLM", changing the attention backend selected when callers do not explicitly provide this parameter.

Changes

  • Attention Backend Default (tensorrt_llm/_torch/models/modeling_radio.py): Updated the vision_attn_backend parameter default from "FLASHINFER" to "TRTLLM" in RADIOVisionModel.__init__.
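The change above can be sketched as follows. This is a simplified stand-in, not the actual RADIOVisionModel implementation (the real constructor takes many more arguments); it only illustrates the default-argument revert and how callers who relied on the old default would opt back in:

```python
# Simplified sketch of the default-argument change in this PR:
# vision_attn_backend reverts from "FLASHINFER" to "TRTLLM".
class RADIOVisionModel:
    def __init__(self, vision_attn_backend: str = "TRTLLM"):
        """Vision tower model (illustrative stub).

        Args:
            vision_attn_backend: Attention backend to use for the
                vision tower. Defaults to "TRTLLM".
        """
        self.vision_attn_backend = vision_attn_backend

# Callers relying on the previous default must now opt in explicitly:
model = RADIOVisionModel(vision_attn_backend="FLASHINFER")
print(model.vision_attn_backend)  # FLASHINFER
```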

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Description check — ⚠️ Warning: The PR description is incomplete; it contains only the placeholder '@coderabbitai summary' without providing actual details about the change, rationale, or test coverage. Resolution: replace the placeholder with a proper description explaining why the backend was reverted, what testing was done, and how this affects Nemotron ViT functionality.
✅ Passed checks (2 passed)
  • Title check — ✅ Passed: The title clearly and specifically describes the main change: reverting the vision attention backend for Nemotron ViT from FLASHINFER to TRTLLM.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tensorrt_llm/_torch/models/modeling_radio.py (1)

1013-1020: ⚠️ Potential issue | 🟡 Minor

Update stale constructor docstring default.

Line 1013 defaults vision_attn_backend to "TRTLLM", but Line 1019 still documents "FLASHINFER". Please sync docs with behavior.

Suggested patch

```diff
-            vision_attn_backend: Attention backend to use for the vision tower. Defaults to "FLASHINFER".
+            vision_attn_backend: Attention backend to use for the vision tower. Defaults to "TRTLLM".
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/_torch/models/modeling_radio.py` around lines 1013 - 1020, The
constructor docstring for the RADIO model has a stale default: update the
docstring text that currently says "Defaults to \"FLASHINFER\"" to reflect the
actual parameter default "TRTLLM" for the vision_attn_backend argument (the
vision_attn_backend parameter in the model's __init__/constructor in
modeling_radio.py); keep the rest of the docstring unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 5122d312-1b8d-45e3-a044-c216d5b863de

📥 Commits

Reviewing files that changed from the base of the PR and between 66431d8 and 76a2e36.

📒 Files selected for processing (1)
  • tensorrt_llm/_torch/models/modeling_radio.py

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
@yechank-nvidia
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #44203 [ run ] triggered by Bot. Commit: d6aaa9c Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44203 [ run ] completed with state SUCCESS. Commit: d6aaa9c
/LLM/main/L0_MergeRequest_PR pipeline #34629 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@yechank-nvidia
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #44232 [ run ] triggered by Bot. Commit: d6aaa9c Link to invocation
