
[Feature] Make QDP CUDA kernel build targets configurable and future-compatible #1283

Open
viiccwen wants to merge 4 commits into apache:main from viiccwen:feature/qdp-cuda-arch-targets

Conversation

Contributor

@viiccwen viiccwen commented Apr 22, 2026

Related Issues

Closes #1282

Changes

  • Bug fix
  • New feature
  • Refactoring
  • Documentation
  • Test
  • CI/CD pipeline
  • Other

Why

QDP's CUDA kernel build currently relies on a small hardcoded set of nvcc -gencode targets. That makes it awkward to keep existing GPUs working while also supporting newer architectures such as Blackwell, and it forces follow-up source edits whenever the desired target mix changes.

This change makes architecture targeting configurable and toolchain-aware so one build configuration can remain usable across current and newer NVIDIA GPU generations without changing QDP runtime behavior.

How

  • derive the default cubin/PTX target set from a project shortlist filtered by the local nvcc supported architecture lists
  • add QDP_CUDA_ARCH_LIST as an explicit override for local builds, CI, and packaging workflows
  • preserve a legacy fallback when nvcc does not expose architecture listing flags
  • validate the change locally on both Ada and Blackwell GPUs and rerun targeted QDP GPU tests
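The override path can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `ArchTarget` type, both function names, and the `80;89;120+PTX` list syntax are assumptions made for the example, since the accepted format of `QDP_CUDA_ARCH_LIST` is not spelled out in this conversation. Only the `-gencode arch=compute_XX,code=sm_XX` flag shape is standard nvcc usage.

```rust
/// One nvcc code-generation target (hypothetical model, not the PR's type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct ArchTarget {
    /// Compute capability without the dot, e.g. 89 for Ada.
    cc: u32,
    /// Also embed PTX so that newer GPUs can JIT-compile the kernels.
    ptx: bool,
}

/// Parse an override such as "80;89;120+PTX" (assumed syntax).
/// Entries may be separated by ';' or ','; surrounding spaces are ignored.
fn parse_arch_list(raw: &str) -> Result<Vec<ArchTarget>, String> {
    let mut targets = Vec::new();
    for entry in raw.split(|c: char| c == ';' || c == ',') {
        let entry = entry.trim();
        if entry.is_empty() {
            continue;
        }
        // A "+PTX" suffix requests PTX in addition to the native cubin.
        let (number, ptx) = match entry.strip_suffix("+PTX") {
            Some(rest) => (rest.trim(), true),
            None => (entry, false),
        };
        let cc: u32 = number
            .parse()
            .map_err(|_| format!("invalid CUDA arch entry: {:?}", entry))?;
        targets.push(ArchTarget { cc, ptx });
    }
    if targets.is_empty() {
        return Err("arch list is set but contains no targets".to_string());
    }
    Ok(targets)
}

/// Expand targets into nvcc command-line arguments.
fn gencode_flags(targets: &[ArchTarget]) -> Vec<String> {
    let mut flags = Vec::new();
    for t in targets {
        // Native machine code (cubin) for this architecture.
        flags.push("-gencode".to_string());
        flags.push(format!("arch=compute_{0},code=sm_{0}", t.cc));
        if t.ptx {
            // Forward-compatible PTX for the same compute capability.
            flags.push("-gencode".to_string());
            flags.push(format!("arch=compute_{0},code=compute_{0}", t.cc));
        }
    }
    flags
}
```

In a build script, something like `std::env::var("QDP_CUDA_ARCH_LIST")` would feed `parse_arch_list`, and the resulting flags would be passed to the nvcc invocation. Emitting PTX only for entries that ask for it keeps binary size down while still leaving a JIT path for GPUs newer than the toolkit.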

Checklist

  • Added or updated unit tests for all changes
  • Added or updated documentation for all changes

@viiccwen viiccwen force-pushed the feature/qdp-cuda-arch-targets branch 2 times, most recently from 19080c0 to d9b32ce Compare April 22, 2026 17:41
Contributor Author

viiccwen commented Apr 22, 2026

cc @ryankert01, @rich7420: this needs testing on your local machines.
I have already tested on both an RTX 4090 and the newer Pro 6000 Blackwell, and everything works well.

Contributor

Copilot AI left a comment


Pull request overview

This PR updates qdp-kernels’ CUDA build script to make nvcc -gencode architecture targeting configurable and to better accommodate newer NVIDIA GPU architectures without requiring source edits.

Changes:

  • Introduces an architecture-target model for nvcc -gencode flags (SM and optional PTX targets).
  • Adds QDP_CUDA_ARCH_LIST to explicitly override the CUDA target set at build time.
  • Attempts to derive default targets from nvcc’s supported architecture lists, with a legacy fallback when listing flags are unavailable.
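The toolchain-aware default could look something like the sketch below. The `SHORTLIST` and `LEGACY` values and all function names are hypothetical and chosen only for illustration; the real shortlist lives in `build.rs` and is not quoted in this thread. `--list-gpu-code` is an existing nvcc option that prints one `sm_XX` name per line; older toolkits that reject it take the fallback branch.

```rust
use std::process::Command;

/// Hypothetical project shortlist of compute capabilities.
const SHORTLIST: &[u32] = &[75, 80, 86, 89, 90, 100, 120];
/// Hypothetical legacy set used when nvcc cannot list architectures.
const LEGACY: &[u32] = &[75, 80, 86];

/// Run `nvcc <flag>` and return its stdout, or None if nvcc is missing
/// or exits with an error (for example, because the flag is unknown).
fn query_nvcc(flag: &str) -> Option<String> {
    let out = Command::new("nvcc").arg(flag).output().ok()?;
    if !out.status.success() {
        return None;
    }
    String::from_utf8(out.stdout).ok()
}

/// Extract numeric capabilities from lines such as "sm_89".
/// Suffixed variants such as "sm_90a" do not parse as integers and are skipped.
fn parse_listing(listing: &str, prefix: &str) -> Vec<u32> {
    listing
        .lines()
        .filter_map(|line| line.trim().strip_prefix(prefix))
        .filter_map(|number| number.parse().ok())
        .collect()
}

/// Keep only the shortlist entries that the local toolkit can compile.
/// Falls back to LEGACY when no listing is available or nothing matches.
fn default_targets(sm_listing: Option<&str>) -> Vec<u32> {
    let listing = match sm_listing {
        Some(text) => text,
        None => return LEGACY.to_vec(),
    };
    let supported = parse_listing(listing, "sm_");
    let picked: Vec<u32> = SHORTLIST
        .iter()
        .copied()
        .filter(|cc| supported.contains(cc))
        .collect();
    if picked.is_empty() {
        LEGACY.to_vec()
    } else {
        picked
    }
}
```

A build script would call it as `default_targets(query_nvcc("--list-gpu-code").as_deref())`. Keeping the parsing separate from the process call means the selection logic can be unit-tested without a CUDA toolkit installed.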


Comment thread qdp/qdp-kernels/build.rs
Comment thread qdp/qdp-kernels/build.rs
Comment thread qdp/qdp-kernels/build.rs
@viiccwen viiccwen force-pushed the feature/qdp-cuda-arch-targets branch 2 times, most recently from 78b1590 to 4767bd0 Compare April 23, 2026 15:01
Contributor Author

viiccwen commented Apr 23, 2026

@ryankert01 sounds like getting-started.md. I'll work on it.

Member

ryankert01 commented Apr 23, 2026

Sorry @viiccwen, I thought we had a section saying we only support some NVIDIA GPUs (30 and 40 series), but it turns out we don't have one. So I think it's not needed.

Also, just a heads-up: the 30-series GPU server that I have access to is currently compromised, so I might not be able to test this anytime soon.

Member

400Ping commented Apr 24, 2026

I agree on writing a section on the currently supported GPUs (including NVIDIA and AMD).

Contributor Author

viiccwen commented Apr 24, 2026

Got it. I think a backend-level section is more accurate than listing specific GPU models.

For NVIDIA, QDP does not maintain a fixed supported-SKU whitelist. The actual CUDA targets generated by the build depend on the installed CUDA toolkit and the local nvcc supported architectures, with QDP selecting from its default architecture shortlist.

For AMD, support similarly depends on the local ROCm environment and the Triton backend used by QDP, rather than a hardcoded model list in the repo.

Because of that, I plan to document this as a “Supported GPU Backends” section instead of listing a set of supported GPU models.

@ryankert01 ryankert01 force-pushed the feature/qdp-cuda-arch-targets branch from 57d0197 to 17f2759 Compare May 3, 2026 13:07
Member

@ryankert01 ryankert01 left a comment


Tested!

