Well-Known Labels, Annotations and Taints

Kubernetes reserves all labels and annotations in the kubernetes.io namespace.

This document serves both as a reference to the values and as a coordination point for assigning values.

kubernetes.io/arch

Example: kubernetes.io/arch=amd64

Used on: Node

The Kubelet populates this with runtime.GOARCH as defined by Go. This can be handy if you are mixing arm and x86 nodes.

kubernetes.io/os

Example: kubernetes.io/os=linux

Used on: Node

The Kubelet populates this with runtime.GOOS as defined by Go. This can be handy if you are mixing operating systems in your cluster (for example: mixing Linux and Windows nodes).

kubernetes.io/metadata.name

Example: kubernetes.io/metadata.name=mynamespace

Used on: Namespaces

When the NamespaceDefaultLabelName feature gate is enabled, the Kubernetes API server sets this label on all namespaces. The label value is set to the name of the namespace.

This is useful if you want to target a specific namespace with a label selector.

beta.kubernetes.io/arch (deprecated)

This label has been deprecated. Please use kubernetes.io/arch instead.

beta.kubernetes.io/os (deprecated)

This label has been deprecated. Please use kubernetes.io/os instead.

kubernetes.io/hostname

Example: kubernetes.io/hostname=ip-172-20-114-199.ec2.internal

Used on: Node

The Kubelet populates this label with the hostname. Note that the hostname can be changed from the "actual" hostname by passing the --hostname-override flag to the kubelet.

This label is also used as part of the topology hierarchy. See topology.kubernetes.io/zone for more information.

controller.kubernetes.io/pod-deletion-cost

Example: controller.kubernetes.io/pod-deletion-cost=10

Used on: Pod

This annotation is used to set Pod Deletion Cost which allows users to influence ReplicaSet downscaling order. The annotation parses into an int32 type.

beta.kubernetes.io/instance-type (deprecated)

Note: Starting in v1.17, this label is deprecated in favor of node.kubernetes.io/instance-type.

node.kubernetes.io/instance-type

Example: node.kubernetes.io/instance-type=m3.medium

Used on: Node

The Kubelet populates this with the instance type as defined by the cloudprovider. This will be set only if you are using a cloudprovider. This setting is handy if you want to target certain workloads to certain instance types, but typically you want to rely on the Kubernetes scheduler to perform resource-based scheduling. You should aim to schedule based on properties rather than on instance types (for example: require a GPU, instead of requiring a g2.2xlarge).

failure-domain.beta.kubernetes.io/region (deprecated)

See topology.kubernetes.io/region.

Note: Starting in v1.17, this label is deprecated in favor of topology.kubernetes.io/region.

failure-domain.beta.kubernetes.io/zone (deprecated)

See topology.kubernetes.io/zone.

Note: Starting in v1.17, this label is deprecated in favor of topology.kubernetes.io/zone.

topology.kubernetes.io/region

Example:

topology.kubernetes.io/region=us-east-1

See topology.kubernetes.io/zone.

topology.kubernetes.io/zone

Example:

topology.kubernetes.io/zone=us-east-1c

Used on: Node, PersistentVolume

On Node: The kubelet or the external cloud-controller-manager populates this with the information as provided by the cloudprovider. This will be set only if you are using a cloudprovider. However, you should consider setting this on nodes if it makes sense in your topology.

On PersistentVolume: topology-aware volume provisioners will automatically set node affinity constraints on PersistentVolumes.

A zone represents a logical failure domain. It is common for Kubernetes clusters to span multiple zones for increased availability. While the exact definition of a zone is left to infrastructure implementations, common properties of a zone include very low network latency within a zone, no-cost network traffic within a zone, and failure independence from other zones. For example, nodes within a zone might share a network switch, but nodes in different zones should not.

A region represents a larger domain, made up of one or more zones. It is uncommon for Kubernetes clusters to span multiple regions, While the exact definition of a zone or region is left to infrastructure implementations, common properties of a region include higher network latency between them than within them, non-zero cost for network traffic between them, and failure independence from other zones or regions. For example, nodes within a region might share power infrastructure (e.g. a UPS or generator), but nodes in different regions typically would not.

Kubernetes makes a few assumptions about the structure of zones and regions:

  1. regions and zones are hierarchical: zones are strict subsets of regions and no zone can be in 2 regions
  2. zone names are unique across regions; for example region "africa-east-1" might be comprised of zones "africa-east-1a" and "africa-east-1b"

It should be safe to assume that topology labels do not change. Even though labels are strictly mutable, consumers of them can assume that a given node is not going to be moved between zones without being destroyed and recreated.

Kubernetes can use this information in various ways. For example, the scheduler automatically tries to spread the Pods in a ReplicaSet across nodes in a single-zone cluster (to reduce the impact of node failures, see kubernetes.io/hostname). With multiple-zone clusters, this spreading behavior also applies to zones (to reduce the impact of zone failures). This is achieved via SelectorSpreadPriority.

SelectorSpreadPriority is a best effort placement. If the zones in your cluster are heterogeneous (for example: different numbers of nodes, different types of nodes, or different pod resource requirements), this placement might prevent equal spreading of your Pods across zones. If desired, you can use homogenous zones (same number and types of nodes) to reduce the probability of unequal spreading.

The scheduler (through the VolumeZonePredicate predicate) also will ensure that Pods, that claim a given volume, are only placed into the same zone as that volume. Volumes cannot be attached across zones.

If PersistentVolumeLabel does not support automatic labeling of your PersistentVolumes, you should consider adding the labels manually (or adding support for PersistentVolumeLabel). With PersistentVolumeLabel, the scheduler prevents Pods from mounting volumes in a different zone. If your infrastructure doesn't have this constraint, you don't need to add the zone labels to the volumes at all.

node.kubernetes.io/windows-build

Example: node.kubernetes.io/windows-build=10.0.17763

Used on: Node

When the kubelet is running on Microsoft Windows, it automatically labels its node to record the version of Windows Server in use.

The label's value is in the format "MajorVersion.MinorVersion.BuildNumber".

service.kubernetes.io/headless

Example: service.kubernetes.io/headless=""

Used on: Service

The control plane adds this label to an Endpoints object when the owning Service is headless.

kubernetes.io/service-name

Example: kubernetes.io/service-name="nginx"

Used on: Service

Kubernetes uses this label to differentiate multiple Services. Used currently for ELB(Elastic Load Balancer) only.

endpointslice.kubernetes.io/managed-by

Example: endpointslice.kubernetes.io/managed-by="controller"

Used on: EndpointSlices

The label is used to indicate the controller or entity that manages an EndpointSlice. This label aims to enable different EndpointSlice objects to be managed by different controllers or entities within the same cluster.

endpointslice.kubernetes.io/skip-mirror

Example: endpointslice.kubernetes.io/skip-mirror="true"

Used on: Endpoints

The label can be set to "true" on an Endpoints resource to indicate that the EndpointSliceMirroring controller should not mirror this resource with EndpointSlices.

service.kubernetes.io/service-proxy-name

Example: service.kubernetes.io/service-proxy-name="foo-bar"

Used on: Service

The kube-proxy has this label for custom proxy, which delegates service control to custom proxy.

experimental.windows.kubernetes.io/isolation-type

Example: experimental.windows.kubernetes.io/isolation-type: "hyperv"

Used on: Pod

The annotation is used to run Windows containers with Hyper-V isolation. To use Hyper-V isolation feature and create a Hyper-V isolated container, the kubelet should be started with feature gates HyperVContainer=true and the Pod should include the annotation experimental.windows.kubernetes.io/isolation-type=hyperv.

Note: You can only set this annotation on Pods that have a single container.

ingressclass.kubernetes.io/is-default-class

Example: ingressclass.kubernetes.io/is-default-class: "true"

Used on: IngressClass

When a single IngressClass resource has this annotation set to "true", new Ingress resource without a class specified will be assigned this default class.

kubernetes.io/ingress.class (deprecated)

Note: Starting in v1.18, this annotation is deprecated in favor of spec.ingressClassName.

alpha.kubernetes.io/provided-node-ip

Example: alpha.kubernetes.io/provided-node-ip: "10.0.0.1"

Used on: Node

The kubelet can set this annotation on a Node to denote its configured IPv4 address.

When kubelet is started with the "external" cloud provider, it sets this annotation on the Node to denote an IP address set from the command line flag (--node-ip). This IP is verified with the cloud provider as valid by the cloud-controller-manager.

batch.kubernetes.io/job-completion-index

Example: batch.kubernetes.io/job-completion-index: "3"

Used on: Pod

The Job controller in the kube-controller-manager sets this annotation for Pods created with Indexed completion mode.

kubectl.kubernetes.io/default-container

Example: kubectl.kubernetes.io/default-container: "front-end-app"

The value of the annotation is the container name that is default for this Pod. For example, kubectl logs or kubectl exec without -c or --container flag will use this default container.

endpoints.kubernetes.io/over-capacity

Example: endpoints.kubernetes.io/over-capacity:warning

Used on: Endpoints

In Kubernetes clusters v1.21 (or later), the Endpoints controller adds this annotation to an Endpoints resource if it has more than 1000 endpoints. The annotation indicates that the Endpoints resource is over capacity.

The taints listed below are always used on Nodes

node.kubernetes.io/not-ready

Example: node.kubernetes.io/not-ready:NoExecute

The node controller detects whether a node is ready by monitoring its health and adds or removes this taint accordingly.

node.kubernetes.io/unreachable

Example: node.kubernetes.io/unreachable:NoExecute

The node controller adds the taint to a node corresponding to the NodeCondition Ready being Unknown.

node.kubernetes.io/unschedulable

Example: node.kubernetes.io/unschedulable:NoSchedule

The taint will be added to a node when initializing the node to avoid race condition.

node.kubernetes.io/memory-pressure

Example: node.kubernetes.io/memory-pressure:NoSchedule

The kubelet detects memory pressure based on memory.available and allocatableMemory.available observed on a Node. The observed values are then compared to the corresponding thresholds that can be set on the kubelet to determine if the Node condition and taint should be added/removed.

node.kubernetes.io/disk-pressure

Example: node.kubernetes.io/disk-pressure:NoSchedule

The kubelet detects disk pressure based on imagefs.available, imagefs.inodesFree, nodefs.available and nodefs.inodesFree(Linux only) observed on a Node. The observed values are then compared to the corresponding thresholds that can be set on the kubelet to determine if the Node condition and taint should be added/removed.

node.kubernetes.io/network-unavailable

Example: node.kubernetes.io/network-unavailable:NoSchedule

This is initially set by the kubelet when the cloud provider used indicates a requirement for additional network configuration. Only when the route on the cloud is configured properly will the taint be removed by the cloud provider.

node.kubernetes.io/pid-pressure

Example: node.kubernetes.io/pid-pressure:NoSchedule

The kubelet checks D-value of the size of /proc/sys/kernel/pid_max and the PIDs consumed by Kubernetes on a node to get the number of available PIDs that referred to as the pid.available metric. The metric is then compared to the corresponding threshold that can be set on the kubelet to determine if the node condition and taint should be added/removed.

node.cloudprovider.kubernetes.io/uninitialized

Example: node.cloudprovider.kubernetes.io/uninitialized:NoSchedule

Sets this taint on a node to mark it as unusable, when kubelet is started with the "external" cloud provider, until a controller from the cloud-controller-manager initializes this node, and then removes the taint.

node.cloudprovider.kubernetes.io/shutdown

Example: node.cloudprovider.kubernetes.io/shutdown:NoSchedule

If a Node is in a cloud provider specified shutdown state, the Node gets tainted accordingly with node.cloudprovider.kubernetes.io/shutdown and the taint effect of NoSchedule.

Last modified April 01, 2021 at 8:09 PM PST: Improve docs for NamespaceDefaultLabelName (6c935b6f9)