Understand Pressure Stall Information (PSI) Metrics
Kubernetes v1.34 [beta]
As a beta feature, Kubernetes lets you configure the kubelet to collect Linux kernel Pressure Stall Information (PSI) for CPU, memory, and I/O. The information is collected at the node, pod, and container level. This behavior is controlled by the KubeletPSI feature gate, which is enabled by default.
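If feature gate defaults are overridden in your environment, you can set the gate explicitly in the kubelet configuration file. A minimal sketch, showing only the relevant fields:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletPSI: true

Restart the kubelet after changing its configuration for the setting to take effect.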
PSI metrics are exposed through two different sources:
- The kubelet's Summary API, which provides PSI data at the node, pod, and container level.
- The /metrics/cadvisor endpoint on the kubelet, which exposes PSI metrics in the Prometheus format.
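For example, to look at the node-level PSI data from the Summary API, you can proxy through the API server and filter the output with jq. The field paths below (.node.cpu.psi and so on) are assumed from the Summary API's PSI additions; adjust them if your Kubernetes version structures the output differently:

# Replace <node-name> with the name of a node in your cluster
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" | \
  jq '{cpu: .node.cpu.psi, memory: .node.memory.psi, io: .node.io.psi}'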
Requirements
Pressure Stall Information requires the following on your Linux nodes:
- The Linux kernel must be version 4.20 or newer.
- The kernel must be compiled with the CONFIG_PSI=y option. Most modern distributions enable this by default. You can check your kernel's configuration by running zgrep CONFIG_PSI /proc/config.gz.
- Some Linux distributions may compile PSI into the kernel but disable it by default. If so, you need to enable it at boot time by adding the psi=1 parameter to the kernel command line.
- The node must be using cgroup v2.
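You can verify these prerequisites directly on a node. For example:

# Kernel version (must be 4.20 or newer)
uname -r

# PSI interface files exposed by the kernel; should list: cpu io memory
ls /proc/pressure

# cgroup version in use; should print: cgroup2fs
stat -fc %T /sys/fs/cgroup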
Understanding PSI Metrics
Pressure Stall Information (PSI) metrics are provided for three resources: CPU, memory, and I/O. They are categorized into two main types of pressure: some and full.
- some: This value indicates that some tasks (one or more) are stalled on a resource. For example, if some tasks are waiting for I/O, this metric will increase. This can be an early indicator of resource contention.
- full: This value indicates that all non-idle tasks are stalled on a resource simultaneously. This signifies a more severe resource shortage, where the entire system is unable to make progress.
Each pressure type provides four metrics: avg10, avg60, avg300, and total. The avg values represent the percentage of wall-clock time that tasks were stalled, calculated as moving averages over the last 10 seconds, 60 seconds, and 300 seconds. The total value is a cumulative counter, in microseconds, showing the total time tasks have been stalled.
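These values mirror the kernel's PSI interface. On a node, you can read the same data directly from files such as /proc/pressure/cpu (system-wide) or a cgroup's cpu.pressure, memory.pressure, and io.pressure files. The numbers below are illustrative:

# cat /proc/pressure/cpu
some avg10=1.53 avg60=0.87 avg300=0.42 total=157656722
full avg10=0.00 avg60=0.00 avg300=0.00 total=0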
Example Scenarios
You can use a simple Pod with a stress-testing tool to generate resource pressure and observe the PSI metrics. The following examples use the agnhost container image, which includes the stress tool.
Generating CPU Pressure
Create a Pod that generates CPU pressure using the stress utility. This workload will put a heavy load on one CPU core.
Create a file named cpu-pressure-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: cpu-pressure-pod
spec:
  restartPolicy: Never
  containers:
  - name: cpu-stress
    image: registry.k8s.io/e2e-test-images/agnhost:2.47
    args:
    - "stress"
    - "--cpus"
    - "1"
Apply it to your cluster: kubectl apply -f cpu-pressure-pod.yaml
Observing CPU Pressure
After the Pod is running, you can observe the CPU pressure through either the Summary API or the Prometheus metrics endpoint.
Using the Summary API:
Watch the summary stats for your node. In a separate terminal, run:
# Replace <node-name> with the name of a node in your cluster
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" | jq '.pods[] | select(.podRef.name | contains("cpu-pressure-pod"))'
You will see the some PSI metrics for CPU increase in the Summary API output. The avg10 value for some pressure should rise above zero, indicating that tasks are spending time stalled on the CPU.
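To narrow the output to just the pod's CPU PSI block, you can extend the jq filter. The .cpu.psi path is assumed from the Summary API's PSI fields:

# Replace <node-name> with the name of a node in your cluster
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" | \
  jq '.pods[] | select(.podRef.name == "cpu-pressure-pod") | .cpu.psi'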
Using the Prometheus metrics endpoint:
Query the /metrics/cadvisor endpoint to see the container_pressure_cpu_waiting_seconds_total metric.
# Replace <node-name> with the name of the node where the pod is running
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics/cadvisor" | \
grep 'container_pressure_cpu_waiting_seconds_total{container="cpu-stress",pod="cpu-pressure-pod"}'
The output should show an increasing value, indicating that the container is spending time stalled waiting for CPU resources.
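If you scrape this endpoint with Prometheus, you can turn the cumulative counter into a stall ratio. A sketch of such a query, assuming the default cAdvisor label names:

# Fraction of wall-clock time (0 to 1) the container spent stalled on CPU over the last minute
rate(container_pressure_cpu_waiting_seconds_total{pod="cpu-pressure-pod", container="cpu-stress"}[1m])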
Cleanup
Clean up the Pod when you are finished:
kubectl delete pod cpu-pressure-pod
Generating Memory Pressure
This example creates a Pod that continuously writes to files in the container's writable layer, causing the kernel's page cache to grow and forcing memory reclamation, which generates pressure.
Create a file named memory-pressure-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: memory-pressure-pod
spec:
  restartPolicy: Never
  containers:
  - name: memory-stress
    image: registry.k8s.io/e2e-test-images/agnhost:2.47
    command: ["/bin/sh", "-c"]
    args:
    - "i=0; while true; do dd if=/dev/zero of=testfile.$i bs=1M count=50 >/dev/null 2>&1; i=$(((i+1)%5)); sleep 0.1; done"
    resources:
      limits:
        memory: "200M"
      requests:
        memory: "200M"
Apply it to your cluster: kubectl apply -f memory-pressure-pod.yaml
Observing Memory Pressure
Using the Summary API:
In the summary output, you will observe an increase in the full PSI metrics for memory, indicating that the system is under significant memory pressure.
# Replace <node-name> with the name of a node in your cluster
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" | jq '.pods[] | select(.podRef.name | contains("memory-pressure-pod"))'
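As with the CPU example, you can narrow the output to the pod's memory PSI block (the .memory.psi path is assumed from the Summary API's PSI fields):

# Replace <node-name> with the name of a node in your cluster
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" | \
  jq '.pods[] | select(.podRef.name == "memory-pressure-pod") | .memory.psi'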
Using the Prometheus metrics endpoint:
Query the /metrics/cadvisor endpoint to see the container_pressure_memory_waiting_seconds_total metric.
# Replace <node-name> with the name of the node where the pod is running
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics/cadvisor" | \
grep 'container_pressure_memory_waiting_seconds_total{container="memory-stress",pod="memory-pressure-pod"}'
In the output, you will observe an increasing value for the metric, indicating that the system is under significant memory pressure.
Cleanup
Clean up the Pod when you are finished:
kubectl delete pod memory-pressure-pod
Generating I/O Pressure
This Pod generates I/O pressure by repeatedly writing a file to disk and using sync to flush the data from memory, which creates I/O stalls.
Create a file named io-pressure-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: io-pressure-pod
spec:
  restartPolicy: Never
  containers:
  - name: io-stress
    image: registry.k8s.io/e2e-test-images/agnhost:2.47
    command: ["/bin/sh", "-c"]
    args:
    - "while true; do dd if=/dev/zero of=testfile bs=1M count=128 >/dev/null 2>&1; sync; rm testfile >/dev/null 2>&1; done"
Apply this to your cluster: kubectl apply -f io-pressure-pod.yaml
Observing I/O Pressure
Using the Summary API:
You will see the some PSI metrics for I/O increase as the Pod continuously writes to disk.
# Replace <node-name> with the name of a node in your cluster
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" | jq '.pods[] | select(.podRef.name | contains("io-pressure-pod"))'
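You can also select just the pod's I/O PSI block (the .io.psi path is assumed from the Summary API's PSI fields):

# Replace <node-name> with the name of a node in your cluster
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" | \
  jq '.pods[] | select(.podRef.name == "io-pressure-pod") | .io.psi'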
Using the Prometheus metrics endpoint:
Query the /metrics/cadvisor endpoint to see the container_pressure_io_waiting_seconds_total metric.
# Replace <node-name> with the name of the node where the pod is running
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics/cadvisor" | \
grep 'container_pressure_io_waiting_seconds_total{container="io-stress",pod="io-pressure-pod"}'
You will see the metric's value increase as the Pod continuously writes to disk.
Cleanup
Clean up the Pod when you are finished:
kubectl delete pod io-pressure-pod
What's next
The task pages for Troubleshooting Clusters discuss how to use a metrics pipeline that relies on this data.