[None][chore] Add Dynamo configs to TRTLLM CI - Disagg - Part 2#13168

Open
brb-nv wants to merge 3 commits into NVIDIA:main from brb-nv:user/brb/mirror-dynamo-configs-in-trtllm-disagg-part2

@brb-nv
Collaborator

@brb-nv brb-nv commented Apr 17, 2026

Description

This MR adds Dynamo configs to TRTLLM CI to catch issues early. This MR has disagg configs for gb200.

Test Coverage

N/A

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Summary by CodeRabbit

  • Tests
    • Added performance sanity test support for H200 8-GPU configurations
    • Integrated new models (Nemotron Super FP8, Qwen3 variants) into performance benchmarking
    • Added disaggregated end-to-end performance sanity benchmarks with configurable parameters for various concurrency and tensor parallel settings

@brb-nv brb-nv requested review from a team as code owners April 17, 2026 22:43
@brb-nv brb-nv requested review from mzweilz and zeroepoch April 17, 2026 22:43
Comment thread jenkins/L0_Test.groovy Outdated
@brb-nv brb-nv changed the title from "User/brb/mirror dynamo configs in trtllm disagg part2" to "[None][chore] Add Dynamo configs to TRTLLM CI - Disagg - Part 2" Apr 17, 2026
@coderabbitai
Contributor

coderabbitai bot commented Apr 17, 2026

📝 Walkthrough

This pull request adds support for H200 8-GPU performance sanity testing by introducing a new Groovy stage configuration, model path mappings for two FP8 models, a test-list definition, and three disaggregated performance sanity benchmark configurations with varying model and parameter combinations.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Jenkins and Test Configuration: `jenkins/L0_Test.groovy`, `tests/integration/defs/perf/test_perf_sanity.py` | Added H200 8-GPU post-merge perf sanity Slurm stage entry in Jenkins config and extended model-to-path mappings with super_fp8 and qwen3_32b_fp8 model entries. |
| Test List Definition: `tests/integration/test_lists/test-db/l0_dgx_h200_perf_sanity.yml` | New test-list configuration for H200 8-GPU performance sanity with hardware gating (8 GPUs, h200 wildcard, x86_64, Ubuntu) and three active perf sanity test targets under PyTorch post-merge stage. |
| Performance Sanity Benchmarks: `tests/scripts/perf-sanity/disaggregated/h200_nemotron-super-fp8_*.yaml`, `tests/scripts/perf-sanity/disaggregated/h200_qwen3-*.yaml` | Three new disaggregated performance sanity benchmark configurations for H200 with different model variants (Nemotron Super-FP8, Qwen3 235B A22B-FP8, Qwen3 32B-FP8), parameter combinations, and worker role-specific tensor/pipeline parallel, batch/sequence limits, and KV cache settings. |
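
The hardware gating described for the test list (8 GPUs, h200 wildcard, x86_64, Ubuntu) can be pictured with a small Python sketch. This is only an illustration: the `CONDITION` dictionary, its key names, and the `matches` helper are hypothetical and are not taken from the PR or from the actual test-list loader.

```python
from fnmatch import fnmatch
from typing import Any, Dict

# Hypothetical condition mirroring the gating named in the summary.
# Key names are illustrative assumptions, not the real test-db schema.
CONDITION: Dict[str, Any] = {
    "ranges": {"system_gpu_count": {"gte": 8, "lte": 8}},
    "wildcards": {
        "gpu": ["*h200*"],
        "cpu": ["x86_64"],
        "linux_distribution_name": ["ubuntu*"],
    },
}


def matches(condition: Dict[str, Any], system: Dict[str, Any]) -> bool:
    """Return True if the system satisfies every range and wildcard rule."""
    for key, bounds in condition.get("ranges", {}).items():
        value = system[key]
        if "gte" in bounds and value < bounds["gte"]:
            return False
        if "lte" in bounds and value > bounds["lte"]:
            return False
    for key, patterns in condition.get("wildcards", {}).items():
        # Compare case-insensitively so "NVIDIA H200" matches "*h200*".
        value = str(system[key]).lower()
        if not any(fnmatch(value, pattern.lower()) for pattern in patterns):
            return False
    return True


h200_node = {
    "system_gpu_count": 8,
    "gpu": "NVIDIA H200",
    "cpu": "x86_64",
    "linux_distribution_name": "ubuntu",
}
print(matches(CONDITION, h200_node))
```

Under these assumptions, a node with fewer than 8 GPUs or a GPU name that does not contain "h200" would be skipped rather than scheduled for the perf sanity tests.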

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title is partially related but vague and unclear. It mentions 'Dynamo configs' and 'disagg' but lacks specificity about what is being added and for which hardware platform. | Clarify the title to specify the primary change, such as 'Add H200 and Qwen3 performance sanity configurations for disaggregated inference' or similar to make the purpose immediately clear. |
| Description check | ❓ Inconclusive | The description is minimal and does not clearly explain the specific changes. It mentions adding Dynamo configs but refers to 'gb200' when the actual changes involve H200 and specific model configurations (Nemotron Super FP8, Qwen3 32B FP8, Qwen3 235B FP8). | Provide a more detailed description that lists the specific models and configurations being added (H200 with Nemotron Super FP8 and Qwen3 models) and clarify the purpose of these additions in the CI pipeline. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/integration/defs/perf/test_perf_sanity.py`:
- Around line 52-54: The SPDX copyright header year at the top of the modified
file needs to be updated to 2026; locate the file by spotting the dictionary
entries like "super_fp8", "qwen3_235b_a22b_fp8", or "qwen3_32b_fp8" and change
the SPDX header line (currently ending in 2025) to end in 2026 so the file
reflects the latest modification year. Ensure the header format matches existing
project headers exactly (SPDX and any NVIDIA notice) and save the file.
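
A minimal sketch of the header update this comment asks for, assuming the file starts with a line of the form `# SPDX-FileCopyrightText: Copyright (c) <start>-<end> NVIDIA CORPORATION & AFFILIATES. All rights reserved.` That header format is an assumption, and `bump_spdx_year` is a hypothetical helper, not part of the repository.

```python
import re

# Matches the first year or year range on an SPDX-FileCopyrightText line,
# e.g. "2022-2025" or "2025". The header layout is an assumption.
SPDX_RE = re.compile(
    r"^(?P<prefix>.*SPDX-FileCopyrightText:.*?\b)"
    r"(?P<start>\d{4})(?:-(?P<end>\d{4}))?"
    r"(?P<suffix>\b.*)$",
    re.MULTILINE,
)


def bump_spdx_year(text: str, year: int) -> str:
    """Extend the copyright year (range) on the SPDX line to end in `year`."""

    def repl(match: "re.Match[str]") -> str:
        start = match.group("start")
        end = match.group("end") or start
        if int(end) >= year:
            return match.group(0)  # already up to date, leave untouched
        return f"{match.group('prefix')}{start}-{year}{match.group('suffix')}"

    # Only the first SPDX copyright line is rewritten.
    return SPDX_RE.sub(repl, text, count=1)


header = (
    "# SPDX-FileCopyrightText: Copyright (c) 2022-2025 NVIDIA CORPORATION & "
    "AFFILIATES. All rights reserved.\n"
    "# SPDX-License-Identifier: Apache-2.0\n"
)
print(bump_spdx_year(header, 2026))
```

The helper is idempotent: running it again with the same year leaves the header unchanged, which makes it safe to apply across several touched files.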

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 42d06aa9-d436-4ed6-a627-847d3e7a757a

📥 Commits

Reviewing files that changed from the base of the PR and between 813d877 and f259d72.

📒 Files selected for processing (6)
  • jenkins/L0_Test.groovy
  • tests/integration/defs/perf/test_perf_sanity.py
  • tests/integration/test_lists/test-db/l0_dgx_h200_perf_sanity.yml
  • tests/scripts/perf-sanity/disaggregated/h200_nemotron-super-fp8_8k1k_con64_ctx1_tp2_gen1_tp2_eplb0_mtp0_ccb-UCX.yaml
  • tests/scripts/perf-sanity/disaggregated/h200_qwen3-235b-a22b-fp8_8k1k_con512_ctx1_tp2_gen1_tep4_eplb0_mtp0_ccb-DEFAULT.yaml
  • tests/scripts/perf-sanity/disaggregated/h200_qwen3-32b-fp8_4k1k_con128_ctx1_tp1_gen1_tp2_eplb0_mtp0_ccb-DEFAULT.yaml
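
The three benchmark file names above follow a regular pattern, so their fields can be split out mechanically. The sketch below is an illustration only: the regular expression and the field names (`seq`, `concurrency`, `ctx_parallel`, and so on) are guesses based on the file names, not definitions from the PR.

```python
import re
from typing import Dict

# Hypothetical pattern: the group names are guesses at what each segment
# of the file name encodes; the PR itself does not define them.
NAME_RE = re.compile(
    r"^(?P<gpu>[a-z0-9]+)_"
    r"(?P<model>.+?)_"
    r"(?P<seq>\d+k\d+k)_"
    r"con(?P<concurrency>\d+)_"
    r"ctx(?P<ctx_servers>\d+)_(?P<ctx_parallel>[a-z]+\d+)_"
    r"gen(?P<gen_servers>\d+)_(?P<gen_parallel>[a-z]+\d+)_"
    r"eplb(?P<eplb>\d+)_"
    r"mtp(?P<mtp>\d+)_"
    r"ccb-(?P<ccb>[A-Za-z0-9]+)\.yaml$"
)


def parse_config_name(filename: str) -> Dict[str, str]:
    """Split a perf-sanity config file name into its underscore-separated fields."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized config name: {filename}")
    return match.groupdict()


names = [
    "h200_nemotron-super-fp8_8k1k_con64_ctx1_tp2_gen1_tp2_eplb0_mtp0_ccb-UCX.yaml",
    "h200_qwen3-235b-a22b-fp8_8k1k_con512_ctx1_tp2_gen1_tep4_eplb0_mtp0_ccb-DEFAULT.yaml",
    "h200_qwen3-32b-fp8_4k1k_con128_ctx1_tp1_gen1_tp2_eplb0_mtp0_ccb-DEFAULT.yaml",
]
for name in names:
    print(parse_config_name(name))
```

A parser like this could be used to check that a new config file name is well formed before it is added to a test list, though whether the CI does anything similar is not stated in this PR.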

Comment thread tests/integration/defs/perf/test_perf_sanity.py
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
@brb-nv brb-nv force-pushed the user/brb/mirror-dynamo-configs-in-trtllm-disagg-part2 branch from f259d72 to 774b1aa April 18, 2026 00:12
@brb-nv
Collaborator Author

brb-nv commented Apr 18, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #44079 [ run ] triggered by Bot. Commit: 774b1aa Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44079 [ run ] completed with state SUCCESS. Commit: 774b1aa
/LLM/main/L0_MergeRequest_PR pipeline #34509 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@brb-nv
Collaborator Author

brb-nv commented Apr 19, 2026

/bot run --disable-fail-fast

brb-nv added 2 commits April 19, 2026 22:56
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
@brb-nv
Collaborator Author

brb-nv commented Apr 20, 2026

bot run --stage-list "DGX_H200-8_GPUs-PyTorch-PerfSanity-Post-Merge-1"


4 participants