Kubernetes Probes: Startup, Liveness, Readiness

Levent Ogut

Kubernetes has been disruptive because of the scalability, velocity, portability, and observability it adds to cloud deployments. While it brings a whole ecosystem of features and options and eases complex deployments, it also has its own challenges. One of its most valuable features is high availability. Kubernetes offers many high availability options; in this article, we discuss those that apply to the application or microservice itself.

Pods, the smallest deployable units in Kubernetes, are scheduled once the declarative configuration is applied. The kube-scheduler is responsible for choosing a node for each pod; once the pod is bound to a node, it runs in a controlled environment, and its pod conditions determine whether it is deemed ready for service. By using startup, readiness, and liveness probes, we can control when a pod should be deemed started, ready for service, or live. We will explore these conditions and triggers.

Pod and Container Status

Pods have phases and conditions; containers have states. These status properties can and will be changed based on probe results, so let's explore them.

Pod Phases

The pod status object includes a phase field, which tells Kubernetes and us where in its execution cycle a pod is.

  • Pending: The pod has been accepted by the cluster, but one or more of its containers have not been set up yet.
  • Running: The pod has been bound to a node and all of its containers have been created; at least one container is running, starting, or restarting.
  • Succeeded: All of the containers exited with a status code of zero; they will not be restarted.
  • Failed: All containers have terminated, and at least one container exited with a non-zero status code.
  • Unknown: The state of the pod cannot be determined.

    Pod Conditions

    As well as pod phases, there are pod conditions. These also give information about the state the pod is in.

  • PodScheduled: A node has been selected for the pod, and scheduling is complete.
  • ContainersReady: All the containers in the pod are ready.
  • Initialized: All init containers have completed successfully.
  • Ready: The pod is able to serve requests; hence it should be included in the service and load balancers.

    We can view the pod conditions with the kubectl describe pods <POD_NAME> command.

    kubectl describe pods <POD_NAME>
    

    Sample output is as follows:

    ...
    Conditions:
      Type              Status
      Initialized       True
      Ready             True
      ContainersReady   True
      PodScheduled      True
    ...
    

    Container States

    A container is in one of three states.

  • Waiting: The container is not running yet; it is still completing the operations it needs for a successful startup, such as pulling the image.
  • Running: The container is executing.
  • Terminated: The container started execution and then finished, either successfully or with a failure.

    Exploring Status on Pod Object

    We can see the pod conditions and container states in a Pod object by running the kubectl get pods <POD_NAME> -o yaml command.

      conditions:
        - lastProbeTime: null
          lastTransitionTime: "2021-02-08T11:11:53Z"
          status: "True"
          type: Initialized
        - lastProbeTime: null
          lastTransitionTime: "2021-02-08T11:14:20Z"
          status: "True"
          type: Ready
        - lastProbeTime: null
          lastTransitionTime: "2021-02-08T11:14:20Z"
          status: "True"
          type: ContainersReady
        - lastProbeTime: null
          lastTransitionTime: "2021-02-08T11:11:52Z"
          status: "True"
          type: PodScheduled
      containerStatuses:
        - containerID: containerd://7fc67a850ba439f64ecb51a129a2d7dcbc4a3402b253daa3a6827787f7c80e40
          image: docker.io/library/nginx:latest
          imageID: docker.io/library/nginx@sha256:10b8cc432d56da8b61b070f4c7d2543a9ed17c2b23010b43af434fd40e2ca4aa
          lastState:
            terminated:
              containerID: containerd://c4416e69b7348a7e7be3f7046dc9745dfb38ba537e5b8c06da5020c67b12b3d8
              exitCode: 137
              finishedAt: "2021-02-08T11:14:52Z"
              reason: Error
              startedAt: "2021-02-08T11:14:05Z"
          name: nginx
          ready: true
          restartCount: 1
          started: true
          state:
            running:
              startedAt: "2021-02-08T11:16:28Z"
      hostIP: x.x.x.x
      phase: Running
      podIP: 10.1.239.205
      podIPs:
        - ip: 10.1.239.205
      qosClass: BestEffort
      startTime: "2021-02-08T11:11:53Z"
    

    If you prefer JSON, you can use kubectl get pods <POD_NAME> -o jsonpath='{.status}' | jq

    {
      "conditions": [
        {
          "lastProbeTime": null,
          "lastTransitionTime": "2021-02-08T11:11:53Z",
          "status": "True",
          "type": "Initialized"
        },
        {
          "lastProbeTime": null,
          "lastTransitionTime": "2021-02-08T11:14:20Z",
          "status": "True",
          "type": "Ready"
        },
        {
          "lastProbeTime": null,
          "lastTransitionTime": "2021-02-08T11:14:20Z",
          "status": "True",
          "type": "ContainersReady"
        },
        {
          "lastProbeTime": null,
          "lastTransitionTime": "2021-02-08T11:11:52Z",
          "status": "True",
          "type": "PodScheduled"
        }
      ],
      "containerStatuses": [
        {
          "containerID": "containerd://7fc67a850ba439f64ecb51a129a2d7dcbc4a3402b253daa3a6827787f7c80e40",
          "image": "docker.io/library/nginx:latest",
          "imageID": "docker.io/library/nginx@sha256:10b8cc432d56da8b61b070f4c7d2543a9ed17c2b23010b43af434fd40e2ca4aa",
          "lastState": {
            "terminated": {
              "containerID": "containerd://c4416e69b7348a7e7be3f7046dc9745dfb38ba537e5b8c06da5020c67b12b3d8",
              "exitCode": 137,
              "finishedAt": "2021-02-08T11:14:52Z",
              "reason": "Error",
              "startedAt": "2021-02-08T11:14:05Z"
            }
          },
          "name": "nginx",
          "ready": true,
          "restartCount": 1,
          "started": true,
          "state": {
            "running": {
              "startedAt": "2021-02-08T11:16:28Z"
            }
          }
        }
      ],
      "hostIP": "x.x.x.x",
      "phase": "Running",
      "podIP": "10.1.239.205",
      "podIPs": [
        {
          "ip": "10.1.239.205"
        }
      ],
      "qosClass": "BestEffort",
      "startTime": "2021-02-08T11:11:53Z"
    }
    

    Probes in Kubernetes

    Kubernetes provides probes (health checks) to monitor and act on the state or condition of pods, and to make sure only healthy pods serve traffic.

    The kubelet is the component responsible for running the health checks and updating the API server with the results.

    Probe Handlers

    There are three available handlers that can cover almost any scenario.

    Exec Action

    ExecAction executes a command inside the container; the probe succeeds if the command exits with status code 0. Because we can run any executable, this handler can cover almost any check: it might be a script that runs several curl requests to determine the status, or an executable that connects to an external dependency. Make sure that the executable does not create zombie processes.

    TCP Socket Action

    TCPSocketAction attempts to open a TCP connection to the defined port; the probe succeeds if the port is open. It is mostly used for endpoints that do not speak HTTP.

    HTTP Get Action

    HTTPGetAction sends an HTTP GET request to the defined path and port. The HTTP response code determines whether the probe is successful: a code of at least 200 and below 400 counts as success, and any other code counts as failure.
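To make the three handlers concrete, here is a minimal sketch of a Pod that uses one handler per probe. The pod name, image, file path, HTTP path, and port are all assumptions chosen for illustration; substitute whatever your application actually exposes.

```yaml
# Illustrative only: name, image, paths, and port are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: probe-handlers-demo
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      ports:
        - containerPort: 8080
      # Exec handler: succeeds when the command exits with status code 0.
      startupProbe:
        exec:
          command: ["cat", "/tmp/started"]
      # TCP socket handler: succeeds when a connection to the port can be opened.
      livenessProbe:
        tcpSocket:
          port: 8080
      # HTTP GET handler: succeeds on a response code from 200 to 399.
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
```

Any handler can be used with any probe type; the pairing above is arbitrary and only meant to show the syntax of each handler.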

    Common Probe Parameters

    Each type of probe has common configurable fields:

  • initialDelaySeconds: Seconds to wait after the container has started before probes begin. (default: 0)
  • periodSeconds: How often, in seconds, the probe is performed. (default: 10)
  • timeoutSeconds: Seconds after which a probe without a response times out. (default: 1)
  • successThreshold: How many consecutive successes are needed to transition from a failed to a healthy state. (default: 1; must be 1 for liveness and startup probes)
  • failureThreshold: How many consecutive failures are needed to transition from a healthy to a failed state. (default: 3)

    As you can see, we can configure probes in detail. For successful probe configuration, we need to analyze the requirements and dependencies of our application/microservice.
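The fragment below sketches how the common fields fit together on a single probe. It is meant to sit under a container entry in a pod spec; the path, port, and the chosen values are assumptions, not recommendations.

```yaml
# Fragment of a container spec; endpoint and values are hypothetical.
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5   # wait 5 s after the container starts before the first probe
  periodSeconds: 10        # then probe every 10 s
  timeoutSeconds: 2        # a probe with no answer within 2 s counts as a failure
  successThreshold: 2      # 2 consecutive successes to become Ready again after failing
  failureThreshold: 3      # 3 consecutive failures to be marked not Ready
```

With these values, a pod that stops answering is marked not Ready after three failed probes in a row, that is, roughly 30 seconds after it starts failing. A successThreshold above 1 is only valid for readiness probes.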

    Startup Probes

    If your process requires time to get ready (reading a file, parsing a large configuration, preparing some data, and so on), you should use startup probes. While a startup probe is running, liveness and readiness probes are held back until it succeeds. If the probe keeps failing until failureThreshold is exceeded, the container is restarted so the operation can start over. You need to adjust failureThreshold and periodSeconds (and initialDelaySeconds, if used) to make sure the process has sufficient time to complete: the container has roughly failureThreshold × periodSeconds seconds to start. Otherwise, you can find your pod in a loop of restarts.
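As a sketch of how the startup budget is usually expressed, the manifest below gives a slow-starting container up to 30 × 10 = 300 seconds to start before the liveness probe takes over. The pod name, image, endpoint, and port are assumptions for illustration.

```yaml
# Illustrative only: name, image, path, and port are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: slow-start-demo
spec:
  containers:
    - name: app
      image: registry.example.com/slow-app:1.0   # placeholder image
      ports:
        - containerPort: 8080
      # Startup budget: failureThreshold x periodSeconds = 30 x 10 = 300 s.
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30
        periodSeconds: 10
      # Runs only after the startup probe has succeeded once.
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
```

Keeping the liveness probe tight while giving the startup probe a generous budget lets a hung container be detected quickly without killing one that is merely still starting.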

    Readiness Probes

    If you want to control the traffic sent to the pod, you ought to use readiness probes. A readiness probe sets the pod's Ready condition, which decides whether the pod is included in the service and load balancers. When the probe succeeds enough times (successThreshold), the pod can receive traffic and is included; when it fails enough times (failureThreshold), the pod is taken out of the service but is not restarted. If your process needs to take itself out of service temporarily, for example for maintenance or while loading a large amount of data, again, you ought to use readiness probes, so that the pod can signal to the kubelet via the readiness probe that it wants out of the service for a while.
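The relationship between a readiness probe and a Service can be sketched as follows. The names, label, image, endpoint, and ports are assumptions; the point is only that the Service selects the pod by label, and the pod receives traffic through it only while its readiness probe passes.

```yaml
# Illustrative only: names, label, image, path, and ports are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0   # placeholder image
      ports:
        - containerPort: 8080
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web        # matches the pod label above
  ports:
    - port: 80
      targetPort: 8080
```

If the application starts answering /ready with an error code, the pod is removed from the Service's endpoints after three consecutive failures, while the container itself keeps running.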

    Liveness Probes

    If your container cannot crash by itself when an unexpected error occurs, use liveness probes. Liveness probes can work around some of the bugs the process might have, such as a deadlock. The kubelet restarts the container, subject to the pod's restartPolicy, once the liveness probe has failed failureThreshold times in a row.

    If your process handles these errors by exiting, you don't strictly need liveness probes; however, they are still useful as a safeguard against unknown bugs until they are fixed.
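A small way to watch a liveness probe in action is a container that deliberately becomes unhealthy. The sketch below assumes a busybox image and a marker file at /tmp/healthy, both chosen only for illustration: the file exists for the first 30 seconds and is then removed, so the exec probe starts failing.

```yaml
# Illustrative only: name, image tag, and file path are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-demo
spec:
  containers:
    - name: liveness
      image: busybox:1.36
      args:
        - /bin/sh
        - -c
        - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
      livenessProbe:
        exec:
          command: ["cat", "/tmp/healthy"]   # exits non-zero once the file is gone
        initialDelaySeconds: 5
        periodSeconds: 5
        # failureThreshold defaults to 3 consecutive failures.
```

After the file is removed, three consecutive probe failures lead the kubelet to restart the container, which shows up as an increasing restartCount in the pod status.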

    Example: Kubernetes API

    Kubernetes API includes health check endpoints as well: healthz (deprecated), readyz, livez.

    Let's look at the readyz endpoint, which is designed to be used with readiness probes.

    kubectl get --raw='/readyz?verbose'
    

    The health of the individual checks is combined into an overall health status.

    [+]ping ok
    [+]log ok
    [+]etcd ok
    [+]informer-sync ok
    [+]poststarthook/start-kube-apiserver-admission-initializer ok
    [+]poststarthook/generic-apiserver-start-informers ok
    [+]poststarthook/start-apiextensions-informers ok
    [+]poststarthook/start-apiextensions-controllers ok
    [+]poststarthook/crd-informer-synced ok
    [+]poststarthook/bootstrap-controller ok
    [+]poststarthook/scheduling/bootstrap-system-priority-classes ok
    [+]poststarthook/start-cluster-authentication-info-controller ok
    [+]poststarthook/aggregator-reload-proxy-client-cert ok
    [+]poststarthook/start-kube-aggregator-informers ok
    [+]poststarthook/apiservice-registration-controller ok
    [+]poststarthook/apiservice-status-available-controller ok
    [+]poststarthook/kube-apiserver-autoregistration ok
    [+]autoregister-completion ok
    [+]poststarthook/apiservice-openapi-controller ok
    [+]shutdown ok
    healthz check passed
    

    Let's look at the livez endpoint.

    kubectl get --raw='/livez?verbose'
    

    Again, the health of the individual checks is combined into an overall health status.

    [+]ping ok
    [+]log ok
    [+]etcd ok
    [+]poststarthook/start-kube-apiserver-admission-initializer ok
    [+]poststarthook/generic-apiserver-start-informers ok
    [+]poststarthook/start-apiextensions-informers ok
    [+]poststarthook/start-apiextensions-controllers ok
    [+]poststarthook/crd-informer-synced ok
    [+]poststarthook/bootstrap-controller ok
    [+]poststarthook/scheduling/bootstrap-system-priority-classes ok
    [+]poststarthook/start-cluster-authentication-info-controller ok
    [+]poststarthook/aggregator-reload-proxy-client-cert ok
    [+]poststarthook/start-kube-aggregator-informers ok
    [+]poststarthook/apiservice-registration-controller ok
    [+]poststarthook/apiservice-status-available-controller ok
    [+]poststarthook/kube-apiserver-autoregistration ok
    [+]autoregister-completion ok
    [+]poststarthook/apiservice-openapi-controller ok
    healthz check passed
    

    Conclusion

    We have explored Kubernetes probes; they are an essential part of the high availability equation. On the other hand, a misconfiguration can adversely affect the availability of our applications and microservices. It is of paramount importance to configure probes appropriately and to test different scenarios to find the optimal values; we also need to think about the stability of external dependencies and whether to include checks on them in the probe response endpoint. We have seen that the readiness probe's action is to remove the pod from, or include it in, the service and load balancers, while the liveness probe's action is to restart the container once failures exceed the threshold. You can find links to previous articles detailing Readiness, Liveness, and Startup Probes in the further reading section.

    Further Reading
