DRA: Sharing Affinity for DRA Consumable Capacity #5981

@ashvindeodhar

Enhancement Description

KEP-5075 (Consumable Capacity) enables device sharing via allowMultipleAllocations and capacity counters. However, it assumes all claims are fungible: any claim can share the device with any other claim. Real-world sharing is often constrained. For example: "Share this NIC across 16 pods, but only if the pods request the same subnet." Today, there is no native way to express this constraint.
I'd like to propose adding a sharingAffinity field to the DRA device spec that allows drivers to declare which claim config parameters must match for sharing to be allowed.
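
To make the intended semantics concrete, here is a minimal Go sketch of the matching rule described above. It is an illustration under stated assumptions, not the proposed API: the `Device` and `Claim` types, the `SharingAffinityKeys` field name, and the `CanShare` helper are all hypothetical, and the real schema would be defined in the KEP.

```go
package main

import "fmt"

// Claim is a hypothetical, simplified claim carrying config parameters.
type Claim struct {
	Name   string
	Config map[string]string
}

// Device is a hypothetical, simplified stand-in for a shareable DRA device.
// SharingAffinityKeys models the proposed sharingAffinity field: the claim
// config parameters that must match across all claims sharing the device.
type Device struct {
	Name                string
	Capacity            int      // number of claims that may share the device
	SharingAffinityKeys []string // e.g. []string{"subnet"}
	Claims              []Claim  // claims currently allocated to the device
}

// CanShare reports whether candidate may be allocated to d: there must be a
// free slot, and every affinity key must have the same value in candidate as
// in the claims already on the device. A key missing from a config is treated
// as the empty string (an assumption made only for this sketch).
func CanShare(d Device, candidate Claim) bool {
	if len(d.Claims) >= d.Capacity {
		return false
	}
	if len(d.Claims) == 0 {
		return true // the first claim pins the affinity values
	}
	pinned := d.Claims[0].Config
	for _, key := range d.SharingAffinityKeys {
		if candidate.Config[key] != pinned[key] {
			return false
		}
	}
	return true
}

func main() {
	nic := Device{
		Name:                "nic-0",
		Capacity:            16,
		SharingAffinityKeys: []string{"subnet"},
		Claims: []Claim{
			{Name: "claim-a", Config: map[string]string{"subnet": "10.0.1.0/24"}},
		},
	}
	same := Claim{Name: "claim-b", Config: map[string]string{"subnet": "10.0.1.0/24"}}
	other := Claim{Name: "claim-c", Config: map[string]string{"subnet": "10.0.2.0/24"}}

	fmt.Println(CanShare(nic, same))  // true
	fmt.Println(CanShare(nic, other)) // false
}
```

In this sketch the first claim on an empty device always succeeds and implicitly fixes the affinity values for later claims, which matches the NIC/subnet example above.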

The current workaround is painful. Drivers must:

  1. Publish devices with capacity=1 as "placeholders"
  2. Wait for the first claim to determine the affinity value (e.g., subnet)
  3. Dynamically update device attributes and expand capacity
  4. Contract back to a placeholder once all claims are released
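
The four steps above can be sketched as a small driver-side state machine. This is a hedged illustration only: `PlaceholderDevice`, `Allocate`, and `Release` are hypothetical names, and the affinity check is collapsed into `Allocate` here, whereas in a real cluster the scheduler would filter claims against the republished device attributes.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrRejected is returned when a claim cannot share the device.
var ErrRejected = errors.New("claim rejected")

// PlaceholderDevice is a hypothetical model of the driver-side state needed
// by the workaround. It is not part of any Kubernetes API.
type PlaceholderDevice struct {
	MaxCapacity int    // real sharing limit, e.g. 16
	Capacity    int    // currently published capacity
	Affinity    string // published attribute; empty while a placeholder
	InUse       int    // number of claims currently allocated
}

// NewPlaceholder implements step 1: publish the device with capacity 1.
func NewPlaceholder(maxCapacity int) *PlaceholderDevice {
	return &PlaceholderDevice{MaxCapacity: maxCapacity, Capacity: 1}
}

// Allocate implements steps 2 and 3: the first claim determines the affinity
// value, after which the device is republished with its full capacity.
func (d *PlaceholderDevice) Allocate(affinity string) error {
	if d.InUse >= d.Capacity {
		return ErrRejected
	}
	if d.InUse == 0 {
		d.Affinity = affinity
		d.Capacity = d.MaxCapacity
	} else if affinity != d.Affinity {
		return ErrRejected
	}
	d.InUse++
	return nil
}

// Release implements step 4: contract back to a placeholder once the last
// claim is released.
func (d *PlaceholderDevice) Release() {
	if d.InUse == 0 {
		return
	}
	d.InUse--
	if d.InUse == 0 {
		d.Affinity = ""
		d.Capacity = 1
	}
}

func main() {
	d := NewPlaceholder(16)
	fmt.Println(d.Capacity) // 1

	_ = d.Allocate("10.0.1.0/24")
	fmt.Println(d.Capacity, d.Affinity) // 16 10.0.1.0/24

	fmt.Println(d.Allocate("10.0.2.0/24")) // claim rejected

	d.Release()
	fmt.Println(d.Capacity) // 1
}
```

Even in this simplified form, the device's published state changes on the first allocation and again on the last release, and each change has to be propagated before other claims can be evaluated correctly.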

This adds complexity to drivers, introduces race conditions, and requires the scheduler to re-evaluate devices after attribute changes.
Adding sharing affinity to the device spec would allow us to express the constraint natively.
I believe it would also improve scheduling throughput, since pods with matching affinity values could be scheduled immediately.

Other potential use cases:

  • GPUs: Share across pods in the same tenant/namespace
  • FPGAs: Share only if pods use the same bitstream
  • RDMA: Share only for the same partition key

/assign @ashvindeodhar
/cc @johnbelamaric @pohly @sunya-ch

  • One-line enhancement description (can be used as a release note): Enable DRA devices to constrain sharing based on matching claim config parameters (sharing affinity)
  • Kubernetes Enhancement Proposal: TBD
  • Discussion Link:
  • PRs by stage and milestone:
    • Alpha - v1.xx
      • KEP (k/enhancements) update PR(s):
      • Code (k/k) update PR(s):
      • Docs (k/website) update PR(s):

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

Labels

sig/scheduling, wg/device-management
