Kubernetes CRDs = Huge Pain In Multi-Tenant Clusters

Volodymyr Grin

Kubernetes multi-tenancy provides a number of business and technical advantages over single-tenant clusters. However, multi-tenancy also brings several challenges and pain points with it, one of which is handling Kubernetes custom resource definitions (CRDs).

In this post, I will explain some of the biggest headaches typically experienced when dealing with CRDs in a multi-tenant Kubernetes environment as well as ways to minimize these issues.

#Custom Resource Definitions & Kubernetes Multi-Tenancy

Let us first go over some key ideas behind custom resource definitions (CRDs), multi-tenant Kubernetes clusters, and how CRDs are handled in a multi-tenant architecture.

#What are CRDs?

A custom resource is one way to extend the built-in objects provided by the Kubernetes API as well as to introduce your own API objects. For example, you can create an object that keeps track of data from specific events in your CI/CD pipeline by defining it as a custom resource, which you can then use and manage like native Kubernetes objects, e.g. using commands such as kubectl create or kubectl apply.

A custom resource definition (CRD) lets you define your custom object type and specify important attributes like the object’s name, group, versions, kind, scope, etc. The scope of a CRD, for example, determines whether the objects are cluster-wide or namespaced. The Kubernetes API server uses the CRD to create REST endpoints for managing these custom objects. In addition, the Kubernetes API server handles the life cycle of resources defined in the CRD, while the actual business logic for a custom resource is handled by a so-called controller.
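To make the attributes above more concrete, here is a minimal sketch of what a CRD for the CI/CD example could look like, built as a Python dictionary and printed as JSON (kubectl accepts JSON manifests as well as YAML). The group `ci.example.com`, the kind `PipelineEvent`, and the two schema fields are made-up names for illustration, not something from a real project.

```python
import json


def build_crd(group, kind, plural, scope="Namespaced"):
    """Return a minimal apiextensions.k8s.io/v1 CRD manifest as a dict."""
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        # The name of a CRD has to be <plural>.<group>.
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": scope,  # "Namespaced" or "Cluster"
            "names": {"kind": kind, "plural": plural, "singular": kind.lower()},
            "versions": [{
                "name": "v1",
                "served": True,
                "storage": True,
                # Hypothetical schema: two string fields describing an event.
                "schema": {"openAPIV3Schema": {
                    "type": "object",
                    "properties": {"spec": {
                        "type": "object",
                        "properties": {
                            "pipeline": {"type": "string"},
                            "status": {"type": "string"},
                        },
                    }},
                }},
            }],
        },
    }


crd = build_crd("ci.example.com", "PipelineEvent", "pipelineevents")
print(json.dumps(crd, indent=2))
```

Saving the output to a file and running kubectl apply -f on it would register the new type, assuming you have cluster-level permissions to create CRDs.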

There’s more to CRDs than what I have summarized here, but the key thing to keep in mind is that CRDs provide a powerful way for you to extend Kubernetes and to add new capabilities that meet your use case in addition to what Kubernetes offers out of the box.

#What is Multi-Tenancy?

In Kubernetes, multi-tenancy is a system architecture where several users, applications, or workloads form so-called tenants and co-exist alongside each other in a shared cluster. In this model, different tenants share a cluster’s control plane and its resources. Multi-tenant Kubernetes clusters generally fall into two categories: soft multi-tenancy (with a weaker level of isolation among tenants, often within the same organization) and hard multi-tenancy (very strict tenant isolation, often including actors from different organizations who should not be aware of each other at all while operating in the same cluster).

Compared to managing multiple, individual single-tenant clusters, multi-tenancy reduces the overhead for cluster management and minimizes resource fragmentation. This especially makes sense as your organization reaches a point when the number of Kubernetes users has grown large enough that you begin seeing the cost and complexity of managing individual clusters become more unwieldy.

That being said, multi-tenancy does come with its own set of limitations and challenges, too. You need to implement the right level of isolation among tenants to minimize the impact of a compromised tenant on the rest of the users or workloads. You also need to ensure that compute, networking, and other resources are shared fairly across all tenants in the cluster so that each tenant gets access to what it needs to complete its tasks.

Multi-tenant clusters typically address these challenges with a combination of tools and best practices such as resource quotas, network policies, pod security policies (removed in Kubernetes 1.25 in favor of Pod Security Admission), RBAC, as well as more cutting-edge solutions like virtual clusters.
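As a rough illustration of two of these guardrails, the sketch below generates a ResourceQuota and a NetworkPolicy for a tenant namespace. The namespace name, object names, and quota values are assumptions chosen for the example; real limits depend on your cluster and tenants.

```python
import json


def tenant_guardrails(namespace, cpu="4", memory="8Gi", pods="20"):
    """Return a ResourceQuota and a NetworkPolicy for one tenant namespace."""
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": namespace},
        "spec": {"hard": {
            "requests.cpu": cpu,
            "requests.memory": memory,
            "pods": pods,
        }},
    }
    policy = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "same-namespace-only", "namespace": namespace},
        "spec": {
            "podSelector": {},  # applies to every pod in the namespace
            "policyTypes": ["Ingress"],
            # Only allow ingress traffic from pods in the same namespace.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }
    return [quota, policy]


for manifest in tenant_guardrails("team-a"):
    print(json.dumps(manifest, indent=2))
```

Both objects are namespaced, which is why they fit well into namespace-based multi-tenancy. Note that a NetworkPolicy only has an effect if the cluster’s network plugin enforces it.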

#CRDs in Multi-Tenant Clusters

Users and workloads in a multi-tenant cluster share cluster-wide components and resources like API server extensions, controllers, and custom resource definitions (CRDs). But handling the twin challenges of proper tenant isolation and fair resource allocation can significantly limit or even prevent tenants from making full use of CRDs. In many cases, tenants are not allowed to add or manage their own CRDs and RBAC rules, which can be very limiting when installing new Helm charts or working on more complicated applications.

The reason for this limitation is that it is best practice in multi-tenant Kubernetes clusters to restrict individual users or workloads from accessing non-namespaced resources. These are resources whose scope isn’t limited to a specific namespace and are instead available at the cluster level. This is why one of the limitations of namespace-based multi-tenancy is that tenants are not able to manage their own CRDs which are, by design, cluster-wide objects.

Although the custom resources themselves may be namespaced, the CRD for these custom resources is always cluster-wide. There have been attempts and hacks to create namespaced CRDs, but so far no viable implementation is available.
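The difference in scope shows up directly in the REST paths of the Kubernetes API. The sketch below builds the path of a CRD object and the collection path of its custom resources; the group `ci.example.com` and the plural `pipelineevents` are hypothetical names used only for illustration.

```python
def crd_path(plural, group):
    """Path of the CRD object itself: cluster-scoped, no namespace segment."""
    return (
        "/apis/apiextensions.k8s.io/v1/customresourcedefinitions/"
        f"{plural}.{group}"
    )


def custom_resource_path(group, version, plural, namespace=None):
    """Collection path for the custom resources defined by a CRD."""
    if namespace is None:
        # Cluster-scoped custom resources (or a list across all namespaces).
        return f"/apis/{group}/{version}/{plural}"
    return f"/apis/{group}/{version}/namespaces/{namespace}/{plural}"


print(crd_path("pipelineevents", "ci.example.com"))
print(custom_resource_path("ci.example.com", "v1", "pipelineevents", "team-a"))
```

The first path never contains a namespace, so access to it cannot be granted with a namespaced Role. The second one does, which is why tenants can usually be allowed to work with the custom resources but not with their definition.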

#Pain Points of Using CRDs in Multi-Tenant Clusters

Now that we’ve learned how CRDs work and why their usage can sometimes be difficult or even impossible in certain multi-tenant setups, let us take a look at a few specific issues that tend to spring up as a result of this.

#Isolation Issues & Security Vulnerabilities

Because of their cluster-wide scope, CRDs can cause problems regarding proper tenant isolation. Let’s say your setup allows tenants to create and define their own CRDs to implement custom APIs. You now have to figure out a way to make sure that one tenant’s CRD does not conflict with a CRD of the same name that someone else in the cluster already relies on.
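One simple way to catch such conflicts early is to compare the CRD names a tenant wants to install with the ones already present in the cluster. The sketch below assumes you keep track of which tenant owns which CRD (for example via a label); Kubernetes does not record this by itself, so the `owner` field and all names are hypothetical.

```python
def find_collisions(existing, requested):
    """Return names of requested CRDs that already belong to another owner."""
    owners = {crd["name"]: crd["owner"] for crd in existing}
    return sorted(
        crd["name"]
        for crd in requested
        if crd["name"] in owners and owners[crd["name"]] != crd["owner"]
    )


existing = [
    {"name": "backups.storage.example.com", "owner": "team-a"},
    {"name": "pipelineevents.ci.example.com", "owner": "team-b"},
]
requested = [
    {"name": "backups.storage.example.com", "owner": "team-c"},
    {"name": "reports.analytics.example.com", "owner": "team-c"},
]

print(find_collisions(existing, requested))
# prints ['backups.storage.example.com']
```

Because a CRD name is unique across the whole cluster, a second tenant applying a CRD with the same name would overwrite the first one rather than get its own copy.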

Moreover, CRDs can potentially introduce security vulnerabilities in a multi-tenant cluster. In fact, CVE-2019-11247, disclosed in 2019, pointed out a security weakness that could compromise multi-tenant clusters: the API server allowed access to cluster-scoped custom resources through requests made as if those resources were namespaced, so a user whose RBAC permissions covered only one namespace could read or modify these resources for the whole cluster. Although this has been addressed in newer Kubernetes versions, it still highlights the added challenge of securely using CRDs in a multi-tenant setting.

Going back to the issue of making sure that tenants stay nicely isolated from each other, we will also see how some approaches for solving this lead to the next two pain points of using CRDs in multi-tenant Kubernetes clusters.

#Availability & Restrictions

As global resources, CRDs can cause issues across the entire cluster if they’re modified or changed in an incompatible way. That’s why, as we have seen earlier, admins of multi-tenant clusters typically limit tenants’ access to objects in the tenants’ namespaces. We have also already learned that this practice, in turn, prevents tenants from installing or managing their own CRDs which are non-namespaced resources.

But there are plenty of use cases where tenants need to access or even add new CRDs. In these setups, users and workloads can only use custom resource definitions that a cluster admin has installed and made accessible for them.

#Additional Admin Overhead

Using multi-tenancy reduces the headache of having to manage individual Kubernetes clusters but it does not completely take away the burden of administering the shared, multi-tenant cluster. In fact, using CRDs in a multi-tenant setting adds some extra administrative responsibilities and challenges.

Assigning RBAC roles and permissions that allow tenants in a specific namespace to access custom resources can steadily get tedious as more CRDs are installed or created in the cluster.
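To get a feeling for how quickly this adds up, here is a small sketch that generates a Role and a RoleBinding for every combination of tenant namespace and CRD. The tenant names, group names, CRDs, and the `-editor` naming scheme are assumptions made up for this example.

```python
RBAC_API = "rbac.authorization.k8s.io"
DEFAULT_VERBS = ("get", "list", "watch", "create", "update", "delete")


def custom_resource_rbac(namespace, tenant_group, api_group, plural,
                         verbs=DEFAULT_VERBS):
    """Return a Role and RoleBinding granting access to one custom resource."""
    role_name = f"{plural}-editor"
    role = {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [{
            "apiGroups": [api_group],
            "resources": [plural],
            "verbs": list(verbs),
        }],
    }
    binding = {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role_name}-binding", "namespace": namespace},
        "subjects": [
            {"kind": "Group", "name": tenant_group, "apiGroup": RBAC_API},
        ],
        "roleRef": {"apiGroup": RBAC_API, "kind": "Role", "name": role_name},
    }
    return [role, binding]


# Hypothetical tenants (namespace -> user group) and installed CRDs.
tenants = {"team-a": "team-a-devs", "team-b": "team-b-devs"}
crds = [("ci.example.com", "pipelineevents"), ("storage.example.com", "backups")]

manifests = [
    manifest
    for namespace, group in tenants.items()
    for api_group, plural in crds
    for manifest in custom_resource_rbac(namespace, group, api_group, plural)
]
print(len(manifests))  # prints 8
```

Two tenants and two CRDs already result in eight RBAC objects, and the number grows with every additional tenant or CRD. Generating these objects from a script or template at least keeps them consistent.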

Also, as the number of CRDs added to the cluster grows, name clashes become more likely. For internally-defined custom resources, this can be avoided by adding unique prefixes to the resource names. For third-party CRDs, however, this can be difficult to keep track of or to enforce (e.g. different versions of cert-manager that are used by tenants may require different versions of the same CRDs).
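A basic check for the cert-manager type of problem could look like the following sketch. It takes a list of which tenant expects which version of which CRD and reports every CRD where the tenants disagree. The tenants and the version values are illustrative assumptions, not taken from a real cluster.

```python
from collections import defaultdict


def version_conflicts(requirements):
    """Find CRDs that tenants expect in different versions.

    requirements is a list of (tenant, crd_name, version) tuples.
    Returns {crd_name: {tenant: version}} for conflicting CRDs only.
    """
    wanted = defaultdict(dict)
    for tenant, crd_name, version in requirements:
        wanted[crd_name][tenant] = version
    return {
        name: by_tenant
        for name, by_tenant in wanted.items()
        if len(set(by_tenant.values())) > 1
    }


requirements = [
    ("team-a", "certificates.cert-manager.io", "v1alpha2"),
    ("team-b", "certificates.cert-manager.io", "v1"),
    ("team-a", "pipelineevents.ci.example.com", "v1"),
    ("team-b", "pipelineevents.ci.example.com", "v1"),
]

conflicts = version_conflicts(requirements)
print(conflicts)
```

In this example, only the cert-manager CRD is reported because both tenants agree on the version of the other one. How to resolve such a conflict is a separate question, since only one definition per CRD name can exist in the cluster.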

#Reducing the Pain of CRDs in Multi-Tenant Kubernetes

As with most things in Kubernetes, using CRDs in a multi-tenant environment adds a unique set of technical challenges. But there are several approaches and tools you can use to make these difficulties more manageable.

#Don’t use CRDs if you can avoid them

First of all, consider alternatives to CRDs: One way to minimize the pain of CRDs in multi-tenant Kubernetes clusters is to avoid them altogether. The Kubernetes ecosystem offers a number of alternatives you can use to potentially achieve the same things as with CRDs. Even the official Kubernetes documentation provides a comparison table as a guide for choosing between creating a Kubernetes API aggregation vs. a stand-alone API outside of Kubernetes. This docs page also recommends using built-in Kubernetes objects whenever suitable and explains why, for example, simple ConfigMaps can solve a lot of problems that some folks may want to address with CRDs, although there is often no real need for this level of sophistication.

So, think twice before even creating your own CRDs or installing someone else’s CRDs into your cluster. Sometimes, a simple ConfigMap may be good enough and will definitely be much easier to handle long-term.
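As a sketch of what this could look like, the following snippet stores the data of a CI/CD pipeline event in a plain ConfigMap instead of a custom resource. The field names, the object name, and the namespace are hypothetical.

```python
import json


def pipeline_event_configmap(namespace, name, event):
    """Store a pipeline event in a namespaced ConfigMap instead of a CRD."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        # ConfigMap values have to be strings, so convert everything.
        "data": {key: str(value) for key, value in event.items()},
    }


event = {"pipeline": "build-and-test", "status": "succeeded", "duration": 312}
configmap = pipeline_event_configmap("team-a", "pipeline-event-0001", event)
print(json.dumps(configmap, indent=2))
```

A ConfigMap is a namespaced, built-in object, so a tenant can create it with regular namespace permissions and no cluster-wide definition is needed. The trade-off is that you lose things like schema validation and a dedicated API endpoint.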

#Consider Separate Clusters or Virtual Clusters

Another way to avoid CRD-related pain points in your multi-tenancy setup is to consider other deployment architectures. If your use case really requires true isolation, then running several single-tenant clusters might be the right option for you, especially if the number of clusters you need to manage is something you can handle. This will, however, be much more expensive and may have other downsides.

If you’re looking for the best of both worlds between single- and multi-tenant clusters, you should probably look into cutting-edge deployment options such as virtual Kubernetes clusters to help you avoid the limitations of pure namespace-based multi-tenancy.

#Don’t Re-Invent The Wheel - Build On Existing Solutions

Use best practices as a starting point: We have seen how some of the most widely-recommended best practices can lead to unwanted results, especially when applied without background and context. But that’s not to say best practices have zero value. Best practices and tips work well as starting points to refine your own implementation. After all, not every use case or deployment setup is the same.

Adopt a multi-tenancy implementation solution that works best for you: It can take a significant amount of time and effort to implement an effective multi-tenant Kubernetes deployment that really meets your requirements. This is where existing multi-tenancy solutions come in handy. These tools not only reduce the pain of using CRDs in a multi-tenant Kubernetes cluster but they also help you solve key challenges of multi-tenancy like user management, access control, tenant isolation, and fair resource allocation.

For example, Loft offers a complete range of solutions for multi-tenant Kubernetes platforms. Once installed to a Kubernetes cluster, Loft enables self-service creation of namespaces and virtual clusters, effectively solving the biggest pain points of using CRDs in multi-tenant clusters.

#Conclusion

Handling custom resource definitions (CRDs) can be a source of technical headaches in multi-tenant Kubernetes setups. The approaches we typically follow in solving the problems of a multi-tenant setup also tend to make CRDs hard to use in such environments.

Specifically, CRDs can lead to tenant isolation issues, potential security vulnerabilities, resource availability and restriction issues, and additional admin overhead. However, with the right multi-tenancy solution for your use case, these pain points become opportunities to help you build a more robust Kubernetes platform for your organization.

Photo by Karolina Grabowska from Pexels