Kubernetes Liveness Probes - Examples & Common Pitfalls

Levent Ogut

Kubernetes has disrupted traditional deployment methods and has become very popular. Although it is a great platform to deploy to, it brings complexity and challenges as well. Kubernetes manages nodes and workloads seamlessly, and one of the great features of this containerized deployment platform is self-healing. For self-healing at the container level, we need health checks, called probes in Kubernetes, unless we rely solely on container exit codes.

Liveness probes check whether a container is healthy; if the container is deemed unhealthy, the kubelet restarts it. This action is different from that of readiness probes, which I discussed in my previous post.

Let's look at the components of the probes and dive into how to configure and troubleshoot Liveness Probes.


Probes

Probes are health checks that are executed by kubelet.

All probes have five parameters that are crucial to configure.

  • initialDelaySeconds: Time to wait after the container starts before the first probe is executed (default: 0)
  • periodSeconds: Probe execution frequency (default: 10)
  • timeoutSeconds: Time to wait for the reply (default: 1)
  • successThreshold: Number of consecutive successful probe executions to mark the container healthy (default: 1)
  • failureThreshold: Number of consecutive failed probe executions to mark the container unhealthy (default: 3)

You need to analyze your application's behavior to set these probe parameters; the sketch below shows where these fields sit in a container spec.

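As a minimal sketch (the container name, image, port, and /healthz endpoint are placeholders, and the values are only illustrative), the five parameters sit directly under the probe definition in the container spec:

    containers:
      - name: my-app                  # placeholder container name
        image: my-app:latest          # placeholder image
        livenessProbe:
          initialDelaySeconds: 10     # wait 10s after the container starts
          periodSeconds: 10           # probe every 10s
          timeoutSeconds: 1           # each probe must answer within 1s
          successThreshold: 1         # one success marks the container healthy
          failureThreshold: 3         # three consecutive failures trigger a restart
          httpGet:
            path: /healthz            # illustrative health endpoint
            port: 8080
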
There are three types of probes:

Exec Probe

The exec probe executes a command inside the container without a shell. The command's exit status determines the health state - zero is healthy; anything else is unhealthy.

    livenessProbe:
      initialDelaySeconds: 1
      periodSeconds: 5
      timeoutSeconds: 1
      successThreshold: 1
      failureThreshold: 1
      exec:
        command:
          - cat
          - /etc/nginx/nginx.conf
    

TCP Probe

The TCP probe checks whether a TCP connection can be opened on the specified port. An open port is deemed a success; a closed port or a connection reset is deemed a failure.

    livenessProbe:
      initialDelaySeconds: 1
      periodSeconds: 5
      timeoutSeconds: 1
      successThreshold: 1
      failureThreshold: 1
      tcpSocket:
        host:
        port: 80
    

HTTP Probe

The HTTP probe makes an HTTP call, and the status code determines the health state: any code from 200 (inclusive) up to 400 (exclusive) is deemed a success. Any other status code is deemed unhealthy.

Here are the additional parameters that HTTP probes accept.

  • host: IP address to connect to (default: pod IP)
  • scheme: HTTP scheme (default: HTTP)
  • path: HTTP path to call
  • httpHeaders: Any custom headers you want to send
  • port: Connection port

Tip: If a Host header is required, set it via httpHeaders.

An example of an HTTP probe:

    livenessProbe:
      initialDelaySeconds: 1
      periodSeconds: 2
      timeoutSeconds: 1
      successThreshold: 1
      failureThreshold: 1
      httpGet:
        host:
        scheme: HTTP
        path: /
        httpHeaders:
          - name: Host
            value: myapplication1.com
        port: 80
    

Liveness Probes in Kubernetes

Kubelet executes liveness probes to see if the container needs a restart. For example, let's say we have a microservice written in Go, and this microservice has a bug in some part of the code that causes the runtime to freeze. To avoid being stuck on the bug, we can configure a liveness probe to determine whether the microservice is in a frozen state. This way, the microservice container will be restarted and come back to a pristine condition.

If your application exits gracefully when it encounters such an issue, you won't necessarily need to configure liveness probes, but there can still be bugs you don't know about. In that case, the container will be restarted as per the pod's configured (or default) restart policy.
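
For reference, the restart policy is set at the pod spec level; a minimal sketch with placeholder names (note that pods managed by a Deployment only allow Always):

    spec:
      restartPolicy: Always        # default; bare pods also accept OnFailure or Never
      containers:
        - name: my-app             # placeholder container name
          image: my-app:latest     # placeholder image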

Common Pitfalls for Liveness Probes

Probes determine health only from the probe responses; they are not aware of the system dynamics of our microservice/application. If, for any reason, probe replies are delayed for longer than periodSeconds times failureThreshold, the microservice/application will be marked unhealthy, and a restart of the container will be triggered. Hence, it is important to configure the parameters according to the application's behavior.
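
As a rough worked example (the values and the /healthz endpoint below are only illustrative, not a recommendation), the comments show how long an unresponsive container survives before it is restarted:

    # Worst case after startup: roughly periodSeconds x failureThreshold = 5s x 3 = 15s
    # of unresponsiveness before kubelet restarts the container.
    livenessProbe:
      initialDelaySeconds: 10   # first probe runs 10s after the container starts
      periodSeconds: 5          # probes run every 5s
      timeoutSeconds: 1         # each probe has 1s to answer
      failureThreshold: 3       # 3 consecutive failures mark the container unhealthy
      httpGet:
        path: /healthz          # illustrative endpoint
        port: 8080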

Cascading Failures

Similar to readiness probes, liveness probes can also create a cascading failure if you misconfigure them. If the health endpoint has external dependencies, or any other condition can prevent an answer from being delivered, it can create a cascading failure; therefore, it is of paramount importance to configure the probe with this behavior in mind.

Crash Loop

Let's assume that our application needs to read a large amount of data into its cache once in a while; unresponsiveness during this time might cause a false positive, because the probe might fail even though the application is healthy. In this case, failure of the liveness probe will restart the container, and most probably it will enter a continuous cycle of restarts. In such a scenario, a readiness probe might be more suitable: the pod will only be removed from service while it executes the maintenance task, and once it is ready to take traffic again, it can start responding to the probes (see the sketch below).
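
A minimal sketch of that alternative, with placeholder names and an illustrative /ready endpoint - a readiness probe with a larger failure budget takes the pod out of service during the cache load instead of restarting it:

    containers:
      - name: my-app              # placeholder container name
        image: my-app:latest      # placeholder image
        readinessProbe:
          periodSeconds: 5
          timeoutSeconds: 1
          failureThreshold: 6     # tolerates roughly 30s of unresponsiveness
          httpGet:
            path: /ready          # illustrative readiness endpoint
            port: 8080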

The liveness endpoints on our microservice - the ones the probes will hit - should check only the absolute minimum requirements that show the application is running. This way, liveness checks succeed, the container is not restarted, and service traffic flows as it should.
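
One common pattern, sketched here with hypothetical /healthz and /ready paths, is to point the liveness probe at a shallow endpoint that only confirms the process can serve requests, and leave dependency checks to the readiness probe:

    livenessProbe:
      httpGet:
        path: /healthz   # shallow check: the process is up and can answer HTTP
        port: 8080
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /ready     # deeper check: dependencies such as databases or caches
        port: 8080
      periodSeconds: 10
      failureThreshold: 3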

Example: Sample Nginx Deployment

We will deploy Nginx as a sample app. Below are the deployment and service configurations.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: k8s-probes
      labels:
        app: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
            - name: nginx
              image: nginx
              ports:
                - containerPort: 80
              livenessProbe:
                initialDelaySeconds: 1
                periodSeconds: 2
                timeoutSeconds: 1
                successThreshold: 1
                failureThreshold: 1
                httpGet:
                  host:
                  scheme: HTTP
                  path: /
                  httpHeaders:
                    - name: Host
                      value: myapplication1.com
                  port: 80
    

Write this configuration to a file called k8s-probes-deployment.yaml, and apply it with the kubectl apply -f k8s-probes-deployment.yaml command.

    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: nginx
      name: nginx
      namespace: default
    spec:
      ports:
        - name: nginx-http-port
          port: 80
      selector:
        app: nginx
      sessionAffinity: None
      type: NodePort
    

Also, write this configuration to a file called k8s-probes-svc.yaml and apply it with this command:

    kubectl apply -f k8s-probes-svc.yaml
    

Troubleshooting Liveness Probes

Liveness probe results are not exposed through a dedicated endpoint; we should use the kubectl describe pods <POD_NAME> command to see the events and the current status.

    kubectl get pods
    

Here we can see our pod is in a running state, and it is ready to receive traffic.

    NAME                         READY   STATUS    RESTARTS   AGE
    k8s-probes-7d979f58c-vd2rv   1/1     Running   0          6s
    

Let's check the applied configuration.

     kubectl describe pods k8s-probes-7d979f58c-vd2rv | grep Liveness
    

Here we can see the parameters we have configured.

        Liveness:       http-get http://:80/ delay=5s timeout=1s period=5s #success=1 #failure=1
    

Let's look at the events:

    Events:
      Type    Reason     Age   From               Message
      ----    ------     ----  ----               -------
      Normal  Scheduled  45s   default-scheduler  Successfully assigned default/k8s-probes-7d979f58c-vd2rv to k8s-probes
      Normal  Pulling    44s   kubelet            Pulling image "nginx"
      Normal  Pulled     43s   kubelet            Successfully pulled image "nginx" in 1.117208685s
      Normal  Created    43s   kubelet            Created container nginx
      Normal  Started    43s   kubelet            Started container nginx
    

As you can see, there is no indication of failure or success; for successful probe executions, no event is recorded.

Now let's change livenessProbe.httpGet.path to "/do-not-exists", re-apply the deployment, and take a look at the pod status.
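
For reference, only the path changes in the deployment's probe; the modified fragment would look like this (re-applied with kubectl apply -f k8s-probes-deployment.yaml):

    livenessProbe:
      initialDelaySeconds: 1
      periodSeconds: 2
      timeoutSeconds: 1
      successThreshold: 1
      failureThreshold: 1
      httpGet:
        scheme: HTTP
        path: /do-not-exists   # this path does not exist, so Nginx returns 404
        httpHeaders:
          - name: Host
            value: myapplication1.com
        port: 80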

    kubectl get pods
    

After changing the path, liveness probes will fail, and the container will be restarted.

    NAME                          READY   STATUS    RESTARTS   AGE
    k8s-probes-595bcfdf57-428jt   1/1     Running   4          74s
    

We can see that the container has been restarted four times.

Let's look at the events.

    ...
    Events:
      Type     Reason     Age                From               Message
      ----     ------     ----               ----               -------
      Normal   Scheduled  53s                default-scheduler  Successfully assigned default/k8s-probes-595bcfdf57-428jt to k8s-probes
      Normal   Pulled     50s                kubelet            Successfully pulled image "nginx" in 1.078926208s
      Normal   Pulled     42s                kubelet            Successfully pulled image "nginx" in 978.826238ms
      Normal   Pulled     32s                kubelet            Successfully pulled image "nginx" in 971.627126ms
      Normal   Pulling    23s (x4 over 51s)  kubelet            Pulling image "nginx"
      Normal   Pulled     22s                kubelet            Successfully pulled image "nginx" in 985.155098ms
      Normal   Created    22s (x4 over 50s)  kubelet            Created container nginx
      Normal   Started    22s (x4 over 50s)  kubelet            Started container nginx
      Warning  Unhealthy  13s (x4 over 43s)  kubelet            Liveness probe failed: HTTP probe failed with statuscode: 404
      Normal   Killing    13s (x4 over 43s)  kubelet            Container nginx failed liveness probe, will be restarted
      Warning  BackOff    13s                kubelet            Back-off restarting failed container
    

As you can see above, "Liveness probe failed: HTTP probe failed with statuscode: 404" indicates the probe failed with HTTP status code 404; the status code will also aid in troubleshooting. Just after that, kubelet informs us that it will restart the container.

Conclusion

Kubernetes liveness probes are life savers when our application is in an undetermined state; they return the application to a pristine condition by restarting the container. However, it is very important that they are configured correctly. Of course, there is no single correct way; it all depends on your application and how you want Kubernetes to act in each particular failure scenario. Set the values accordingly and test them against live case scenarios.
