Using Kubernetes Ephemeral Containers for Troubleshooting

Levent Ogut

Dec 1, 2021


Containers and the ecosystem around them changed how engineers deploy, maintain, and troubleshoot workloads. But debugging an application on a Kubernetes cluster can be daunting, because the tools you need are often missing from the container. Many engineers build containers from distroless images, which are based on slimmed-to-the-bone distributions with no package manager and no shell. At the extreme, some teams use scratch as a base image and add only the files the application needs to run. Some reasons why this practice is common are:

A smaller attack surface.
Faster image scanning.
A smaller image size.
Faster build and CI/CD cycles.
Fewer dependencies.

These stripped-down base images don’t include the tools you would use to troubleshoot an application or its dependencies. This is where the Kubernetes ephemeral containers feature shines. An ephemeral container lets you run a separate image that includes all the debugging tools you might need. When you need to debug, you add the ephemeral container to the running pod of your choice.

Normally, you cannot add a container to a running pod; you have to update the pod spec, which causes the pod to be re-created. An ephemeral container, however, can be added to an existing pod, which lets you troubleshoot a live issue.

Ephemeral containers are an alpha feature in Kubernetes 1.22 and earlier, so the official recommendation is not to use them in production environments on those versions.

#Configuration of Ephemeral Containers

Ephemeral containers share the same spec as regular containers. However, some fields are disabled, and some behaviors are changed. Some of the significant changes are listed below; check the ephemeral container spec for a complete list.

They are never restarted automatically.
Resource definitions (requests and limits) are not allowed.
Ports are not allowed.
Startup, liveness, and readiness probes are not allowed.

#Enabling Ephemeral Containers in Your Cluster

As this feature is in the alpha state in Kubernetes versions 1.22 and older, it needs to be explicitly enabled using feature gates. If you are using Kubernetes 1.23 or newer, ephemeral containers are enabled by default, so you can skip to the next section.

Most, if not all, managed cloud Kubernetes providers might not allow you to change the feature gates flag; please check with your provider.

First, let’s check if the ephemeral containers feature is enabled or not. To do that, run the following command.

$ kubectl debug -it <POD_NAME> --image=busybox

If the feature is not enabled, you will see a message similar to the following.

Defaulting debug container name to debugger-wg54p.
error: ephemeral containers are disabled for this cluster (error from server: "the server could not find the requested resource").
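If you want to script this check, one option is to match the error text shown above. The following is only a minimal sketch: the function name `ephemeral_containers_disabled` is hypothetical, and matching on the error string is an assumption that may break if the wording differs in other kubectl versions.

```shell
# Hypothetical helper: returns 0 (true) when the given kubectl output
# contains the "disabled" error shown above, and 1 otherwise.
ephemeral_containers_disabled() {
  case "$1" in
    *"ephemeral containers are disabled for this cluster"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Sample output copied from the error message above.
sample='Defaulting debug container name to debugger-wg54p.
error: ephemeral containers are disabled for this cluster (error from server: "the server could not find the requested resource").'

if ephemeral_containers_disabled "$sample"; then
  echo "ephemeral containers are disabled"
else
  echo "ephemeral containers look enabled"
fi
```

In practice, you would pass the captured output of the kubectl debug command instead of the sample text. Keep in mind that when the feature is enabled, that command really does add a debug container to the pod.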

To enable the feature, append EphemeralContainers=true to the --feature-gates= flag in the arguments of the kubelet, kube-apiserver, kube-controller-manager, kube-proxy, and kube-scheduler.

The following is an example of an existing flag definition:

...
--feature-gates=RemoveSelfLink=false
...

Add EphemeralContainers=true, using a comma as the separator:

...
--feature-gates=RemoveSelfLink=false,EphemeralContainers=true
...
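If you maintain several component configurations, the append step can be scripted. The sketch below assumes you already have the current value of the flag as a string; the function name `append_feature_gate` is hypothetical, and it only handles the simple case of adding a gate that is not yet listed.

```shell
# Hypothetical helper: append a gate to a comma-separated feature-gates
# value unless that exact gate is already present. Prints the new value.
append_feature_gate() {
  current="$1"
  gate="$2"
  case ",$current," in
    *",$gate,"*) printf '%s\n' "$current" ;;        # gate already listed
    ",,")        printf '%s\n' "$gate" ;;           # flag value was empty
    *)           printf '%s\n' "$current,$gate" ;;  # append after a comma
  esac
}

# Reproduces the change shown above.
echo "--feature-gates=$(append_feature_gate 'RemoveSelfLink=false' 'EphemeralContainers=true')"
```

The helper does not rewrite an existing EphemeralContainers=false entry; if the gate is explicitly disabled, edit that entry by hand.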

Finally, restart the relevant services for the changes to take effect.

For more information on feature gates and arguments, you can refer to the feature gates docs.

#Using Ephemeral Containers

Now that your cluster supports the Ephemeral Containers feature, let’s try it. To create ephemeral containers, you will use the debug subcommand of the kubectl command-line tool.

First, let’s create a deployment that we can use to simulate our application using nginx as an image.

$ kubectl create deployment nginx-deployment --image=nginx

The API server response should be successful.

deployment.apps/nginx-deployment created

Next, get the name of the pod that you want to debug.

$ kubectl get pods

NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-66b6c48dd5-frsv9   1/1     Running   6          62d
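Copying the pod name by hand works, but it can also be pulled out of the table. The following sketch assumes the default column layout shown above (NAME first, STATUS third); the function name `first_running_pod` is hypothetical.

```shell
# Hypothetical helper: read `kubectl get pods` table output on stdin and
# print the NAME of the first pod whose STATUS column is "Running".
first_running_pod() {
  awk 'NR > 1 && $3 == "Running" { print $1; exit }'
}

# Sample table copied from the output above.
sample='NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-66b6c48dd5-frsv9   1/1     Running   6          62d'

printf '%s\n' "$sample" | first_running_pod
```

Against a live cluster, you could pipe the real command into the helper, for example `POD_NAME="$(kubectl get pods | first_running_pod)"`. Parsing the human-readable table is fragile, so treat this as a convenience for interactive use rather than for automation.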

The following command will create a new ephemeral container in the pod nginx-deployment-66b6c48dd5-frsv9. The ephemeral container’s image will be busybox. The -i and -t flags allow us to attach to the newly created container.

$ kubectl debug -it pods/nginx-deployment-66b6c48dd5-frsv9 --image=busybox

Defaulting debug container name to debugger-r44v5.
If you don't see a command prompt, try pressing enter.
/ #

Now we can quickly start debugging.

/ # ping 8.8.8.8

PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=112 time=9.797 ms
64 bytes from 8.8.8.8: seq=1 ttl=112 time=9.809 ms
^C

/ # nc --help

BusyBox v1.34.1 (2021-11-11 01:55:05 UTC) multi-call binary.

Usage: nc [OPTIONS] HOST PORT  - connect
nc [OPTIONS] -l -p PORT [HOST] [PORT]  - listen
...

When you run the kubectl describe pods <POD_NAME> command, you can see a new section, “Ephemeral Containers,” which lists the ephemeral containers and their attributes.

$ kubectl describe pods nginx-deployment-66b6c48dd5-frsv9

Name:         nginx-deployment-66b6c48dd5-frsv9
Namespace:    default
Priority:     0
Node:         node1/x.x.x.x
Start Time:   Mon, 30 Aug 2021 21:50:17 +0200
Labels:       app=nginx
              pod-template-hash=66b6c48dd5
Annotations:  <none>
Status:       Running
IP:           10.0.0.110
IPs:
  IP:           10.0.0.110
Controlled By:  ReplicaSet/nginx-deployment-66b6c48dd5
Containers:
  nginx:
    Container ID:   containerd://6367af3713afb85ecb1e1a057ba9db4e3b2c48f39fee6a248cd2811e198001aa
    Image:          nginx:1.14.2
...
...
Ephemeral Containers:
  debugger-thwrn:
    Container ID:   containerd://eec23aa9ee63d96b82970bb947b29cbacc30685bbc3418ba840dee109f871bf0
    Image:          busybox
    Image ID:       docker.io/library/busybox@sha256:e7157b6d7ebbe2cce5eaa8cfe8aa4fa82d173999b9f90a9ec42e57323546c353
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 15 Nov 2021 20:28:57 +0100
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>

Process namespace sharing has long been an excellent troubleshooting option, and it can be used together with kubectl debug. Process namespace sharing cannot be enabled on an existing pod, so a copy of the target pod must be created.

The --share-processes flag enables process namespace sharing when used with --copy-to. Together, these flags copy the existing pod spec into a new pod that has process namespace sharing enabled.

$ kubectl debug -it nginx-deployment-66b6c48dd5-frsv9 --image=busybox --share-processes --copy-to=debug-pod

Let’s run the ps command to see the running processes.

/ # ps aux

As you might expect, you can see /pause from the pod’s infrastructure (pause) container, the nginx processes from the nginx container, and the sh and ps processes from the busybox debug container.

PID   USER     TIME  COMMAND
    1 root      0:00 /pause
    6 root      0:00 nginx: master process nginx -g daemon off;
   11 101       0:00 nginx: worker process
   12 root      0:00 sh
   17 root      0:00 ps aux

With process namespace sharing, the filesystems of the other containers are accessible too, which is very useful for debugging.

You can reach a container’s filesystem through the /proc/<PID>/root link of one of its processes. From the above output, we know that the nginx master process has PID 6.

/ # ls /proc/6/root/etc/nginx

Here we can see the Nginx directory structure and configuration files on the target container.

conf.d          koi-utf         mime.types      nginx.conf      uwsgi_params
fastcgi_params  koi-win         modules         scgi_params     win-utf
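Instead of reading the PID off the ps output by eye, you can derive the /proc/<PID>/root path from it. This is a sketch that assumes the BusyBox ps layout shown above, with the PID in the first column; the function name `proc_root_for` is hypothetical.

```shell
# Hypothetical helper: read BusyBox `ps` output on stdin and print the
# /proc/<PID>/root path of the first process whose line contains the
# given text.
proc_root_for() {
  awk -v pat="$1" 'NR > 1 && index($0, pat) { print "/proc/" $1 "/root"; exit }'
}

# Sample process list copied from the ps output above.
sample='PID   USER     TIME  COMMAND
    1 root      0:00 /pause
    6 root      0:00 nginx: master process nginx -g daemon off;
   11 101       0:00 nginx: worker process
   12 root      0:00 sh
   17 root      0:00 ps aux'

printf '%s\n' "$sample" | proc_root_for 'nginx: master'
```

Inside the debug container, this could be combined with the listing above as `ls "$(ps aux | proc_root_for 'nginx: master')/etc/nginx"`. The helper returns only the first match, and in a live pipeline the awk process itself can also appear in the ps output, so check the result before relying on it.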

#Conclusion

The ephemeral containers feature certainly brings a lot of opportunities, and process namespace sharing allows advanced debugging capabilities. If you work with applications running in Kubernetes clusters, it would be worth your time to experiment with these features. It’s not hard to imagine some teams even automating workflows using these tools, like fixing other containers automatically when their readiness probes fail.