API-initiated Eviction
API-initiated eviction is the process by which you use the Eviction API to create an Eviction object that triggers graceful pod termination.
You can request eviction by calling the Eviction API directly, or programmatically using a client of the API server, like the kubectl drain command. This creates an Eviction object, which causes the API server to terminate the Pod.
API-initiated evictions respect your configured PodDisruptionBudgets and terminationGracePeriodSeconds.
Using the API to create an Eviction object for a Pod is like performing a policy-controlled DELETE operation on the Pod.
Calling the Eviction API
You can use a Kubernetes language client to access the Kubernetes API and create an Eviction object. To do this, you POST the attempted operation, similar to the following example:
Note: policy/v1 Eviction is available in Kubernetes v1.22 and later. Use policy/v1beta1 with earlier releases.
{
  "apiVersion": "policy/v1",
  "kind": "Eviction",
  "metadata": {
    "name": "quux",
    "namespace": "default"
  }
}
For releases before v1.22, use policy/v1beta1:
{
  "apiVersion": "policy/v1beta1",
  "kind": "Eviction",
  "metadata": {
    "name": "quux",
    "namespace": "default"
  }
}
Alternatively, you can attempt an eviction operation by accessing the API using curl or wget, similar to the following example:
curl -v -H 'Content-type: application/json' https://your-cluster-api-endpoint.example/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json
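The same request can be assembled with Python's standard library. This is a minimal sketch rather than an official client: the helper names build_eviction, eviction_url, and build_eviction_request are hypothetical, and authentication and TLS configuration are left out, as they are in the curl example.

```python
import json
import urllib.request


def build_eviction(name: str, namespace: str, api_version: str = "policy/v1") -> dict:
    """Return an Eviction body shaped like the JSON examples on this page."""
    return {
        "apiVersion": api_version,
        "kind": "Eviction",
        "metadata": {"name": name, "namespace": namespace},
    }


def eviction_url(api_server: str, namespace: str, name: str) -> str:
    """Return the URL of the Pod's eviction subresource."""
    base = api_server.rstrip("/")
    return f"{base}/api/v1/namespaces/{namespace}/pods/{name}/eviction"


def build_eviction_request(api_server: str, name: str, namespace: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST that asks the API server to evict a Pod."""
    body = json.dumps(build_eviction(name, namespace)).encode("utf-8")
    return urllib.request.Request(
        eviction_url(api_server, namespace, name),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example values taken from the curl command above.
request = build_eviction_request(
    "https://your-cluster-api-endpoint.example", "quux", "default"
)
```

Passing the prepared request to urllib.request.urlopen would send it. A real cluster typically also expects credentials, for example a bearer token header, which this sketch does not set.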
How API-initiated eviction works
When you request an eviction using the API, the API server performs admission checks and responds in one of the following ways:
- 200 OK: the eviction is allowed, the Eviction subresource is created, and the Pod is deleted, similar to sending a DELETE request to the Pod URL.
- 429 Too Many Requests: the eviction is not currently allowed because of the configured PodDisruptionBudget. You may be able to attempt the eviction again later. You might also see this response because of API rate limiting.
- 500 Internal Server Error: the eviction is not allowed because there is a misconfiguration, like if multiple PodDisruptionBudgets reference the same Pod.
If the Pod you want to evict isn't part of a workload that has a PodDisruptionBudget, the API server always returns 200 OK and allows the eviction.
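A client usually needs to turn these status codes into a decision. The sketch below shows one way to do that. The EvictionOutcome enum and interpret_eviction_status function are hypothetical names, and treating every 2xx code as success (rather than only the 200 OK listed above) is an assumption of this sketch.

```python
from enum import Enum


class EvictionOutcome(Enum):
    EVICTED = "evicted"              # the eviction was allowed
    RETRY_LATER = "retry_later"      # blocked by a PodDisruptionBudget or rate limiting
    MISCONFIGURED = "misconfigured"  # for example, overlapping PodDisruptionBudgets
    UNEXPECTED = "unexpected"        # anything this page does not describe


def interpret_eviction_status(status_code: int) -> EvictionOutcome:
    """Map an HTTP status from the Eviction API to a client-side decision."""
    # Assumption: any 2xx code counts as success, not only 200 OK.
    if 200 <= status_code < 300:
        return EvictionOutcome.EVICTED
    if status_code == 429:
        return EvictionOutcome.RETRY_LATER
    if status_code == 500:
        return EvictionOutcome.MISCONFIGURED
    return EvictionOutcome.UNEXPECTED
```

Only RETRY_LATER is worth retrying automatically. A MISCONFIGURED result generally needs someone to inspect the PodDisruptionBudgets before another attempt can succeed.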
If the API server allows the eviction, the Pod is deleted as follows:
- The Pod resource in the API server is updated with a deletion timestamp, after which the API server considers the Pod resource to be terminated. The Pod resource is also marked with the configured grace period.
- The kubelet on the node where the local Pod is running notices that the Pod resource is marked for termination and starts to gracefully shut down the local Pod.
- While the kubelet is shutting the Pod down, the control plane removes the Pod from Endpoint and EndpointSlice objects. As a result, controllers no longer consider the Pod as a valid object.
- After the grace period for the Pod expires, the kubelet forcefully terminates the local Pod.
- The kubelet tells the API server to remove the Pod resource.
- The API server deletes the Pod resource.
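Because the Pod resource is removed only at the end of this sequence, a client that needs to know when termination has finished can poll for the Pod until the API server no longer returns it. The following is a rough sketch under stated assumptions: wait_for_pod_removal is a hypothetical helper, get_pod_status stands in for whatever function performs a GET on the Pod and returns the HTTP status code, and a 404 response is taken to mean the resource is gone. It does not check whether a replacement Pod with the same name has been created in the meantime.

```python
import time
from typing import Callable


def wait_for_pod_removal(
    get_pod_status: Callable[[], int],
    timeout_seconds: float,
    poll_interval_seconds: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until a GET on the Pod returns 404; return False if the timeout passes first."""
    deadline = clock() + timeout_seconds
    while True:
        if get_pod_status() == 404:
            return True  # the API server has deleted the Pod resource
        if clock() + poll_interval_seconds > deadline:
            return False  # another poll would exceed the timeout
        sleep(poll_interval_seconds)
```

A reasonable timeout is somewhat longer than the Pod's terminationGracePeriodSeconds, since the kubelet may wait for the full grace period before it forcefully terminates the Pod.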
Troubleshooting stuck evictions
In some cases, your applications may enter a broken state, where the Eviction API will only return 429 or 500 responses until you intervene. This can happen if, for example, a ReplicaSet creates pods for your application but new pods do not enter a Ready state. You may also notice this behavior in cases where the last evicted Pod had a long termination grace period.
If you notice stuck evictions, try one of the following solutions:
- Abort or pause the automated operation causing the issue. Investigate the stuck application before you restart the operation.
- Wait a while, then directly delete the Pod from your cluster control plane instead of using the Eviction API.
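Automation that calls the Eviction API can guard against this situation by giving up after a deadline instead of retrying forever. The sketch below illustrates the idea. The function name evict_until_deadline, its parameters, and the returned strings are hypothetical, attempt_eviction stands in for any function that POSTs the Eviction and returns the HTTP status code, and counting every 2xx code as success is an assumption.

```python
import time
from typing import Callable


def evict_until_deadline(
    attempt_eviction: Callable[[], int],
    deadline_seconds: float,
    retry_interval_seconds: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry while the API answers 429; return "evicted", "stuck", or "error"."""
    give_up_at = clock() + deadline_seconds
    while True:
        status = attempt_eviction()
        if 200 <= status < 300:  # assumption: any 2xx code means the eviction was allowed
            return "evicted"
        if status != 429:
            return "error"  # for example 500: retrying will not help until someone intervenes
        if clock() + retry_interval_seconds > give_up_at:
            return "stuck"  # still blocked at the deadline: pause and investigate
        sleep(retry_interval_seconds)
```

A "stuck" or "error" result is the point at which to pause the automated operation and investigate the application, in line with the first option above, rather than deleting the Pod automatically.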
What's next
- Learn how to protect your applications with a Pod Disruption Budget.
- Learn about Node-pressure Eviction.
- Learn about Pod Priority and Preemption.