Merged
4 changes: 2 additions & 2 deletions docs/api-reference/operator-api.md
@@ -824,8 +824,8 @@ _Appears in:_

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `topologyName` _string_ | TopologyName is the name of the ClusterTopologyBinding resource to use for topology-aware scheduling.<br />If topologyConstraint is set, topologyName and packDomain must both be specified.<br />Immutable after creation. | | |
| `packDomain` _[TopologyDomain](#topologydomain)_ | PackDomain specifies the topology domain for grouping replicas.<br />Controls placement constraint for EACH individual replica instance.<br />Must reference a domain in the topology levels defined in the ClusterTopologyBinding CR name as set in TopologyName<br />Example: "rack" means each replica independently placed within one rack.<br />Note: Does NOT constrain all replicas to the same rack together.<br />Different replicas can be in different topology domains. | | MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-z][a-z0-9-]*$` <br /> |
| `topologyName` _string_ | TopologyName is the name of the ClusterTopologyBinding resource to use for topology-aware scheduling.<br />Setting TopologyName may be optional if the name can be inherited from a higher level scope.<br />When TopologyName is specified at a PCS/PCSG/PCLQ resource constraint, it will also be inherited<br />as the default ClusterTopologyBinding name on all sub-resources, unless overridden by another TopologyName<br />at a sub-resource.<br />For example, setting TopologyName at a PCS level makes it optional for child PCSG or PCLQ levels<br />when the sub-resources reuse the same ClusterTopologyBinding.<br />Immutable after creation. | | |
| `packDomain` _[TopologyDomain](#topologydomain)_ | PackDomain specifies the topology domain for grouping replicas.<br />Controls placement constraint for EACH individual replica instance.<br />Must reference a domain in the topology levels defined in the ClusterTopologyBinding named by TopologyName.<br />Example: "rack" means each replica independently placed within one rack.<br />Note: Does NOT constrain all replicas to the same rack together.<br />Different replicas can be in different topology domains. | | MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-z][a-z0-9-]*$` <br /> |
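
The Validation column for `packDomain` lists a length range and a pattern. As a rough illustration of what those three rules accept and reject, here is a minimal sketch. `validatePackDomain` is a hypothetical helper written for this note, not part of the operator API; the real checks are presumably enforced by the CRD schema.

```go
package main

import (
	"fmt"
	"regexp"
)

// packDomainPattern copies the Pattern shown in the Validation column.
var packDomainPattern = regexp.MustCompile(`^[a-z][a-z0-9-]*$`)

// validatePackDomain is a hypothetical helper (an assumption for this example)
// that applies the documented MinLength, MaxLength, and Pattern rules in order.
func validatePackDomain(domain string) error {
	if len(domain) < 1 || len(domain) > 63 {
		return fmt.Errorf("packDomain length %d is outside [1, 63]", len(domain))
	}
	if !packDomainPattern.MatchString(domain) {
		return fmt.Errorf("packDomain %q does not match %s", domain, packDomainPattern)
	}
	return nil
}

func main() {
	// Lowercase names starting with a letter pass; uppercase, leading digits,
	// and the empty string do not.
	for _, d := range []string{"rack", "zone-a", "Rack", "1rack", ""} {
		fmt.Printf("%q valid=%t\n", d, validatePackDomain(d) == nil)
	}
}
```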


#### TopologyDomain
72 changes: 41 additions & 31 deletions docs/proposals/244-topology-aware-scheduling/README.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion operator/api/common/constants/constants.go
@@ -103,7 +103,7 @@ const (
ConditionReasonTopologyNotFound = "TopologyNotFound"

// ConditionReasonTopologyNameMissing is the reason when a PodCliqueSet has incomplete
// topology constraints or otherwise cannot resolve an explicit topology reference.
// topology constraints or otherwise cannot resolve one effective topology reference.
ConditionReasonTopologyNameMissing = "TopologyNameMissing"

// ConditionReasonTopologyAwareSchedulingDisabled is the reason when a PodCliqueSet has topology
30 changes: 21 additions & 9 deletions operator/api/core/v1alpha1/crds/grove.io_podcliquesets.yaml
@@ -9317,7 +9317,7 @@ spec:
description: |-
PackDomain specifies the topology domain for grouping replicas.
Controls placement constraint for EACH individual replica instance.
Must reference a domain in the topology levels defined in the ClusterTopologyBinding CR name as set in TopologyName
Must reference a domain in the topology levels defined in the ClusterTopologyBinding named by TopologyName.
Example: "rack" means each replica independently placed within one rack.
Note: Does NOT constrain all replicas to the same rack together.
Different replicas can be in different topology domains.
@@ -9328,12 +9328,16 @@ spec:
topologyName:
description: |-
TopologyName is the name of the ClusterTopologyBinding resource to use for topology-aware scheduling.
If topologyConstraint is set, topologyName and packDomain must both be specified.
Setting TopologyName may be optional if the name can be inherited from a higher level scope.
When TopologyName is specified at a PCS/PCSG/PCLQ resource constraint, it will also be inherited
as the default ClusterTopologyBinding name on all sub-resources, unless overridden by another TopologyName
at a sub-resource.
For example, setting TopologyName at a PCS level makes it optional for child PCSG or PCLQ levels
when the sub-resources reuse the same ClusterTopologyBinding.
Immutable after creation.
type: string
required:
- packDomain
- topologyName
type: object
required:
- name
@@ -9975,7 +9979,7 @@ spec:
description: |-
PackDomain specifies the topology domain for grouping replicas.
Controls placement constraint for EACH individual replica instance.
Must reference a domain in the topology levels defined in the ClusterTopologyBinding CR name as set in TopologyName
Must reference a domain in the topology levels defined in the ClusterTopologyBinding named by TopologyName.
Example: "rack" means each replica independently placed within one rack.
Note: Does NOT constrain all replicas to the same rack together.
Different replicas can be in different topology domains.
@@ -9986,12 +9990,16 @@ spec:
topologyName:
description: |-
TopologyName is the name of the ClusterTopologyBinding resource to use for topology-aware scheduling.
If topologyConstraint is set, topologyName and packDomain must both be specified.
Setting TopologyName may be optional if the name can be inherited from a higher level scope.
When TopologyName is specified at a PCS/PCSG/PCLQ resource constraint, it will also be inherited
as the default ClusterTopologyBinding name on all sub-resources, unless overridden by another TopologyName
at a sub-resource.
For example, setting TopologyName at a PCS level makes it optional for child PCSG or PCLQ levels
when the sub-resources reuse the same ClusterTopologyBinding.
Immutable after creation.
type: string
required:
- packDomain
- topologyName
type: object
required:
- cliqueNames
@@ -10771,7 +10779,7 @@ spec:
description: |-
PackDomain specifies the topology domain for grouping replicas.
Controls placement constraint for EACH individual replica instance.
Must reference a domain in the topology levels defined in the ClusterTopologyBinding CR name as set in TopologyName
Must reference a domain in the topology levels defined in the ClusterTopologyBinding named by TopologyName.
Example: "rack" means each replica independently placed within one rack.
Note: Does NOT constrain all replicas to the same rack together.
Different replicas can be in different topology domains.
@@ -10782,12 +10790,16 @@ spec:
topologyName:
description: |-
TopologyName is the name of the ClusterTopologyBinding resource to use for topology-aware scheduling.
If topologyConstraint is set, topologyName and packDomain must both be specified.
Setting TopologyName may be optional if the name can be inherited from a higher level scope.
When TopologyName is specified at a PCS/PCSG/PCLQ resource constraint, it will also be inherited
as the default ClusterTopologyBinding name on all sub-resources, unless overridden by another TopologyName
at a sub-resource.
For example, setting TopologyName at a PCS level makes it optional for child PCSG or PCLQ levels
when the sub-resources reuse the same ClusterTopologyBinding.
Immutable after creation.
type: string
required:
- packDomain
- topologyName
type: object
required:
- cliques
13 changes: 9 additions & 4 deletions operator/api/core/v1alpha1/podcliqueset.go
@@ -264,13 +264,18 @@ type PodCliqueTemplateSpec struct {
// TopologyConstraint defines topology placement requirements.
type TopologyConstraint struct {
// TopologyName is the name of the ClusterTopologyBinding resource to use for topology-aware scheduling.
// If topologyConstraint is set, topologyName and packDomain must both be specified.
// Setting TopologyName may be optional if the name can be inherited from a higher level scope.
// When TopologyName is specified at a PCS/PCSG/PCLQ resource constraint, it will also be inherited
// as the default ClusterTopologyBinding name on all sub-resources, unless overridden by another TopologyName
// at a sub-resource.
// For example, setting TopologyName at a PCS level makes it optional for child PCSG or PCLQ levels
// when the sub-resources reuse the same ClusterTopologyBinding.
// Immutable after creation.
// +required
TopologyName string `json:"topologyName"`
// +optional
TopologyName string `json:"topologyName,omitempty"`
// PackDomain specifies the topology domain for grouping replicas.
// Controls placement constraint for EACH individual replica instance.
// Must reference a domain in the topology levels defined in the ClusterTopologyBinding CR name as set in TopologyName
// Must reference a domain in the topology levels defined in the ClusterTopologyBinding named by TopologyName.
// Example: "rack" means each replica independently placed within one rack.
// Note: Does NOT constrain all replicas to the same rack together.
// Different replicas can be in different topology domains.
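
The switch from `json:"topologyName"` to `json:"topologyName,omitempty"` means a child constraint that relies on inheritance serializes without a `topologyName` key at all. A minimal sketch of that effect follows. `topologyConstraint` is a local stand-in for the struct in this hunk; the `packDomain` tag and the plain `string` type for `PackDomain` are assumptions, since the hunk does not show that field's declaration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// topologyConstraint is a local stand-in for TopologyConstraint.
// Only the topologyName tag is taken from the diff; the packDomain tag
// and the string type are assumptions made for this example.
type topologyConstraint struct {
	TopologyName string `json:"topologyName,omitempty"`
	PackDomain   string `json:"packDomain"`
}

// marshal returns the JSON encoding of tc as a string.
func marshal(tc topologyConstraint) string {
	b, err := json.Marshal(tc)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Child level: no explicit name, so the key is omitted entirely.
	fmt.Println(marshal(topologyConstraint{PackDomain: "host"}))
	// Parent level: the explicit name is serialized as before.
	fmt.Println(marshal(topologyConstraint{TopologyName: "grove-topology", PackDomain: "zone"}))
}
```

With these inputs the first call yields `{"packDomain":"host"}` and the second yields `{"topologyName":"grove-topology","packDomain":"zone"}`.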
42 changes: 37 additions & 5 deletions operator/e2e/tests/topology_test.go
@@ -33,12 +33,14 @@ import (
"github.com/ai-dynamo/grove/operator/e2e/grove/topology"
"github.com/ai-dynamo/grove/operator/e2e/setup"
"github.com/ai-dynamo/grove/operator/e2e/testctx"
testutils "github.com/ai-dynamo/grove/operator/test/utils"
"github.com/ai-dynamo/grove/operator/e2e/waiter"
groveschedulerv1alpha1 "github.com/ai-dynamo/grove/scheduler/api/core/v1alpha1"
kaischedulingv2alpha2 "github.com/kai-scheduler/KAI-scheduler/pkg/apis/scheduling/v2alpha2"
"github.com/samber/lo"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/uuid"
"sigs.k8s.io/controller-runtime/pkg/client"
)

@@ -1636,12 +1638,14 @@ func Test_TAS20_PCSTopologyLevelsUnavailableCondition(t *testing.T) {
Logger.Info("TAS20: PCS TopologyLevelsUnavailable Condition test completed successfully!")
}

// Test_TAS21_ClusterTopologyValidationWebhook verifies that the ClusterTopologyBinding validating webhook
// rejects invalid topology definitions and invalid schedulerTopologyReferences.
func Test_TAS21_ClusterTopologyValidationWebhook(t *testing.T) {
// Test_TAS21_TopologyValidationWebhooks verifies validation behavior for topology-related resources:
// 1. The ClusterTopologyBinding validating webhook rejects invalid topology definitions and scheduler references.
// 2. The PodCliqueSet validating webhook allows a child topologyConstraint without topologyName when it
// can inherit from the PCS topologyConstraint and the referenced ClusterTopologyBinding exists.
func Test_TAS21_TopologyValidationWebhooks(t *testing.T) {
ctx := context.Background()

Logger.Info("1. Initialize a Grove cluster for ClusterTopologyBinding webhook validation testing")
Logger.Info("1. Initialize a Grove cluster for topology webhook validation testing")
tc, cleanup := testctx.PrepareTest(ctx, t, 0)
defer cleanup()

@@ -1722,5 +1726,33 @@ func Test_TAS21_ClusterTopologyValidationWebhook(t *testing.T) {
})
}

Logger.Info("TAS21: ClusterTopologyBinding validating webhook test completed successfully!")
topologyVerifier := topology.NewTopologyVerifier(tc.Client, Logger)

Logger.Info("2. Ensure grove-topology ClusterTopologyBinding exists for PodCliqueSet validation")
ensureGroveTopology(ctx, t, topologyVerifier)

Logger.Info("3. Create PodCliqueSet with PCS topologyName and child inherited topology constraint")
pcs := testutils.NewPodCliqueSetBuilder("tas21-pcs-optional-topology-name", "default", uuid.NewUUID()).
WithReplicas(1).
WithTopologyConstraint(&corev1alpha1.TopologyConstraint{
TopologyName: "grove-topology",
PackDomain: corev1alpha1.TopologyDomainZone,
}).
WithPodCliqueTemplateSpec(
testutils.NewPodCliqueTemplateSpecBuilder("worker").
WithReplicas(1).
WithRoleName("worker-role").
WithMinAvailable(1).
WithTopologyConstraint(&corev1alpha1.TopologyConstraint{
PackDomain: corev1alpha1.TopologyDomainHost,
}).
Build(),
).
Build()

if err := tc.Client.Create(ctx, pcs); err != nil {
t.Fatalf("Expected PodCliqueSet create to succeed with inherited optional topologyName, got: %v", err)
}

Logger.Info("TAS21: Topology validation webhook test completed successfully!")
}
@@ -25,8 +25,10 @@ import (
)

var (
// ErrTopologyNameMissing indicates that a topology constraint is incomplete and does not specify both topologyName and packDomain.
ErrTopologyNameMissing = errors.New("topology constraints require both topologyName and packDomain")
// ErrTopologyNameMissing indicates that a constrained level has no effective topologyName after inheritance is applied.
ErrTopologyNameMissing = errors.New("topology constraints require an explicit or inherited topologyName")
// ErrPackDomainMissing indicates that a topologyConstraint exists but does not specify packDomain.
ErrPackDomainMissing = errors.New("topology constraints require packDomain")
// ErrMultipleTopologyNamesUnsupported indicates that topology constraints within a single PCS reference different topology names.
ErrMultipleTopologyNamesUnsupported = errors.New("multiple topology names within a single PodCliqueSet are not supported")
)
@@ -49,25 +51,120 @@ func HasAnyTopologyConstraint(pcs *grovecorev1alpha1.PodCliqueSet) bool {
return false
}

// ResolveTopologyNameForPodCliqueSet resolves the single topologyName used by all explicit topology constraints in the PCS.
// ResolveEffectiveTopologyNameForConstraint resolves the topologyName for a single constrained level.
func ResolveEffectiveTopologyNameForConstraint(explicitTopologyName, inheritedTopologyName string) (string, error) {
if explicitTopologyName != "" {
if inheritedTopologyName != "" && explicitTopologyName != inheritedTopologyName {
return "", ErrMultipleTopologyNamesUnsupported
}
return explicitTopologyName, nil
}
if inheritedTopologyName != "" {
return inheritedTopologyName, nil
}
return "", ErrTopologyNameMissing
}

// ResolveEffectiveTopologyNameForPodCliqueSet resolves the single effective topologyName for the PCS after inheritance is applied.
// Callers that need to distinguish "no topology constraints at all" from invalid topology constraints
// must first call HasAnyTopologyConstraint.
func ResolveTopologyNameForPodCliqueSet(pcs *grovecorev1alpha1.PodCliqueSet) (string, error) {
topologyNames := sets.New[string]()
for _, tc := range getAllTopologyConstraintsInPodCliqueSet(pcs) {
if tc.TopologyName == "" || tc.PackDomain == "" {
return "", ErrTopologyNameMissing
func ResolveEffectiveTopologyNameForPodCliqueSet(pcs *grovecorev1alpha1.PodCliqueSet) (string, error) {
resolvedTopologyName := ""
recordTopologyName := func(topologyName string) error {
if resolvedTopologyName == "" {
resolvedTopologyName = topologyName
return nil
}
if resolvedTopologyName != topologyName {
return ErrMultipleTopologyNamesUnsupported
}
return nil
}

pcsEffectiveTopologyName := ""
if tc := pcs.Spec.Template.TopologyConstraint; tc != nil {
if tc.PackDomain == "" {
return "", ErrPackDomainMissing
}
effectiveTopologyName, err := ResolveEffectiveTopologyNameForConstraint(tc.TopologyName, "")
if err != nil {
return "", err
}
pcsEffectiveTopologyName = effectiveTopologyName
if err := recordTopologyName(effectiveTopologyName); err != nil {
return "", err
}
}

pcsgTopologyNameByCliqueName := make(map[string]string)
for _, pcsgConfig := range pcs.Spec.Template.PodCliqueScalingGroupConfigs {
if pcsgConfig.TopologyConstraint == nil {
continue
}
if pcsgConfig.TopologyConstraint.PackDomain == "" {
return "", ErrPackDomainMissing
}
effectiveTopologyName, err := ResolveEffectiveTopologyNameForConstraint(pcsgConfig.TopologyConstraint.TopologyName, pcsEffectiveTopologyName)
if err != nil {
return "", err
}
if err := recordTopologyName(effectiveTopologyName); err != nil {
return "", err
}
for _, cliqueName := range pcsgConfig.CliqueNames {
if _, exists := pcsgTopologyNameByCliqueName[cliqueName]; !exists {
pcsgTopologyNameByCliqueName[cliqueName] = effectiveTopologyName
}
}
topologyNames.Insert(tc.TopologyName)
}
switch topologyNames.Len() {
case 0:

for _, pclqTemplateSpec := range pcs.Spec.Template.Cliques {
if pclqTemplateSpec.TopologyConstraint == nil {
continue
}
if pclqTemplateSpec.TopologyConstraint.PackDomain == "" {
return "", ErrPackDomainMissing
}

inheritedTopologyName := pcsEffectiveTopologyName
if pcsgTopologyName, exists := pcsgTopologyNameByCliqueName[pclqTemplateSpec.Name]; exists {
inheritedTopologyName = pcsgTopologyName
}

effectiveTopologyName, err := ResolveEffectiveTopologyNameForConstraint(pclqTemplateSpec.TopologyConstraint.TopologyName, inheritedTopologyName)
if err != nil {
return "", err
}
if err := recordTopologyName(effectiveTopologyName); err != nil {
return "", err
}
}

if resolvedTopologyName == "" {
return "", ErrTopologyNameMissing
case 1:
return sets.List(topologyNames)[0], nil
default:
return "", ErrMultipleTopologyNamesUnsupported
}
return resolvedTopologyName, nil
}

// FindExplicitTopologyNameForPodCliqueSet returns one explicit topologyName from the PCS.
// It is intended for callers operating on already-validated PCS objects where all explicit topologyName
// values are expected to match. Callers that need to distinguish "no topology constraints at all" from
// missing explicit topologyName values must first call HasAnyTopologyConstraint.
func FindExplicitTopologyNameForPodCliqueSet(pcs *grovecorev1alpha1.PodCliqueSet) (string, error) {
if tc := pcs.Spec.Template.TopologyConstraint; tc != nil && tc.TopologyName != "" {
return tc.TopologyName, nil
}
for _, pcsgConfig := range pcs.Spec.Template.PodCliqueScalingGroupConfigs {
if tc := pcsgConfig.TopologyConstraint; tc != nil && tc.TopologyName != "" {
return tc.TopologyName, nil
}
}
for _, pclqTemplateSpec := range pcs.Spec.Template.Cliques {
if tc := pclqTemplateSpec.TopologyConstraint; tc != nil && tc.TopologyName != "" {
return tc.TopologyName, nil
}
}
return "", ErrTopologyNameMissing
}

// GetUniqueTopologyDomainsInPodCliqueSet returns all unique, non-empty pack domains referenced by the PCS.