Gang Scheduling
Kubernetes v1.35 [alpha] (disabled by default)
Gang scheduling ensures that a group of Pods are scheduled on an "all-or-nothing" basis. If the cluster cannot accommodate the entire group (or a defined minimum number of Pods), none of the Pods are bound to a node.
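To make the "all-or-nothing" rule concrete, here is a minimal Python sketch. It is not the scheduler's actual algorithm: it assumes a single resource dimension (CPU millicores) and a naive first-fit placement, and the names `place_gang` and `min_count` are hypothetical, chosen only for illustration.

```python
from typing import Dict, Optional


def place_gang(
    pod_cpu: Dict[str, int],
    node_free_cpu: Dict[str, int],
    min_count: int,
) -> Optional[Dict[str, str]]:
    """Return a pod -> node assignment, or None if fewer than min_count Pods fit.

    Hypothetical first-fit placement over one resource (CPU millicores);
    the real scheduler evaluates many more constraints.
    """
    free = dict(node_free_cpu)  # work on a copy so a failed attempt changes nothing
    assignment: Dict[str, str] = {}
    for pod, cpu in pod_cpu.items():
        for node, available in free.items():
            if cpu <= available:
                free[node] = available - cpu
                assignment[pod] = node
                break
    # All-or-nothing: a partial placement below the minimum is discarded entirely.
    if len(assignment) < min_count:
        return None
    return assignment


pods = {"worker-0": 2000, "worker-1": 2000, "worker-2": 2000}

# Two nodes offer enough room for all three Pods, so the whole gang is placed.
print(place_gang(pods, {"node-a": 4000, "node-b": 2000}, min_count=3))

# Only two Pods fit on a single node, so nothing is placed at all.
print(place_gang(pods, {"node-a": 4000}, min_count=3))
```

In the second call two Pods would fit, but because the minimum of three cannot be met the function returns `None` rather than a partial assignment, which mirrors the behavior described above.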
This feature depends on the Workload API.
Ensure the GenericWorkload feature gate and the scheduling.k8s.io/v1alpha1 API group are enabled in the cluster.
How it works
When the GangScheduling plugin is enabled, the scheduler alters the lifecycle for Pods belonging to a gang pod group policy within a Workload.
The process follows these steps independently for each pod group and its replica key:
1. The scheduler holds Pods in the PreEnqueue phase until:
   - The referenced Workload object is created.
   - The referenced pod group exists in a Workload.
   - The number of Pods that have been created for the specific group is at least equal to the minCount.
   Pods do not enter the active scheduling queue until all of these conditions are met.
2. Once the quorum is met, the scheduler attempts to find placements for all Pods in the group. All assigned Pods wait at the WaitOnPermit gate during this process. Note that in the Alpha phase of this feature, finding a placement is based on pod-by-pod scheduling, rather than a single-cycle approach.
3. If the scheduler finds valid placements for at least minCount Pods, it allows all of them to be bound to their assigned nodes. If it cannot find placements for the entire group within a fixed timeout of 5 minutes, none of the Pods are scheduled. Instead, they are moved to the unschedulable queue to wait for cluster resources to free up, allowing other workloads to be scheduled in the meantime.
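The steps above can be sketched as two small decision functions. This is an illustrative Python model only, not the plugin's code: the classes `Workload` and `PodGroup`, the functions `passes_pre_enqueue` and `permit_decision`, and the returned strings are all hypothetical names, and the timeout check is simplified to a single elapsed-seconds value.

```python
from dataclasses import dataclass, field
from typing import Dict

PERMIT_TIMEOUT_SECONDS = 5 * 60  # the fixed 5-minute timeout described above


@dataclass
class PodGroup:
    min_count: int  # hypothetical stand-in for the group's minCount


@dataclass
class Workload:
    pod_groups: Dict[str, PodGroup] = field(default_factory=dict)


def passes_pre_enqueue(
    workloads: Dict[str, Workload],
    workload_name: str,
    group_name: str,
    created_pods: int,
) -> bool:
    """Step 1: a Pod may enter the active queue only if all three conditions hold."""
    workload = workloads.get(workload_name)
    if workload is None:  # the referenced Workload has not been created
        return False
    group = workload.pod_groups.get(group_name)
    if group is None:  # the referenced pod group is not in the Workload
        return False
    return created_pods >= group.min_count  # quorum of created Pods


def permit_decision(placed_pods: int, min_count: int, waited_seconds: float) -> str:
    """Steps 2 and 3: decide what happens to Pods waiting at the permit gate."""
    if placed_pods >= min_count:
        return "bind"  # enough placements found: release every waiting Pod
    if waited_seconds >= PERMIT_TIMEOUT_SECONDS:
        return "unschedulable"  # timeout: no Pod is bound, all are re-queued
    return "wait"  # keep holding the Pods that already have a placement


workloads = {"training": Workload({"workers": PodGroup(min_count=4)})}

print(passes_pre_enqueue(workloads, "training", "workers", created_pods=3))  # False
print(passes_pre_enqueue(workloads, "training", "workers", created_pods=4))  # True
print(permit_decision(placed_pods=3, min_count=4, waited_seconds=60))   # wait
print(permit_decision(placed_pods=4, min_count=4, waited_seconds=60))   # bind
print(permit_decision(placed_pods=3, min_count=4, waited_seconds=300))  # unschedulable
```

Keeping the queue-admission check separate from the permit decision reflects the two distinct points in the lifecycle at which a gang can be held back: before any scheduling attempt, and after individual Pods have been assigned nodes but before they are bound.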
What's next
- Learn about the Workload API.
- See how to reference a Workload in a Pod.