Kubernetes: Why Boring Clusters Are Better Clusters
The Kubernetes and cloud-native ecosystem has become very complex over time. The best illustration of this is probably the CNCF Landscape, which currently contains about 1,500 cards. This sheer number of cloud-native projects and tools has recently led some people to complain about the overwhelming complexity of the ecosystem, while others have taken it with humor and started creating memes about it.
However, the CNCF Landscape is not the only source of confusion: there are also CNCF sandbox, incubating, and graduated projects, as well as many startups and other vendors developing solutions outside the CNCF.
Therefore, you have almost endless options for tools, add-ons, and configurations, and I expect there will be even more in the future. All of this can make it difficult to get even simple things done right and to stay focused, which Kelsey Hightower jokingly summarized in a tweet:
“By 2025 it will take 64 CPUs and 1TB RAM to deploy a modern “Hello, World!” application.”
Still, I believe that just because there are many things you CAN do does not mean that you SHOULD do all of them. Often, it is better, and sufficient, to keep things simple so you can focus on what you actually need. To that end, I will describe some advantages that simple Kubernetes cluster setups have over more elaborate ones.
#Easier to Manage
The first advantage of a plain Kubernetes cluster is that it is easier to manage. This starts with the initial setup, which is much faster and easier if you do not have to install many things.
It continues during normal operation of the cluster: the more tools you have installed, the more upgrades you have to perform, and this effort can be expected to grow roughly in proportion to the number of tools. You also need to monitor the tools continuously. This includes not only checking that they still work but also watching for critical upgrades, fixes, and potential bugs.
Finally, a complex setup also means that there are many more ways for things to go wrong. For example, a tool you install may require a certain configuration, and you then need to make sure that this configuration change does not conflict with any other tool or cause negative side effects. The same goes for Kubernetes version upgrades, which become more complex if not every tool is compatible with the new version.
Therefore, take care that the tools that are supposed to help you manage and control your cluster do not become a burden instead.
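To get a feeling for how this adds up, here is a rough back-of-the-envelope sketch in Python. It is not a measurement: the function name `maintenance_surface` is made up for this illustration, and the pair count assumes the worst case in which any two tools could conflict with each other.

```python
from math import comb


def maintenance_surface(num_tools: int) -> dict:
    """Rough counts of what has to be tracked for a cluster with num_tools add-ons."""
    return {
        # One upgrade stream per tool, plus Kubernetes itself.
        "upgrade_streams": num_tools + 1,
        # Worst-case assumption: any two tools might conflict,
        # so every pair of tools is a candidate for a check.
        "tool_pairs_to_check": comb(num_tools, 2),
    }


for n in (2, 5, 10, 20):
    print(n, maintenance_surface(n))
```

Under these assumptions, the number of upgrade streams grows linearly, while the number of tool pairs that might interact grows much faster: ten tools already give 45 pairs. In practice most pairs will never conflict, but you usually do not know in advance which ones will.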
#Better Scalability
One central advantage of Kubernetes is its scalability. Since it was built for this from the start, “pure” and simple Kubernetes is already quite scalable. Of course, if you want to push its scalability to the limit, you may need some advanced configuration, but for many standard scenarios this may not be essential.
Here, too many add-ons and a very complex configuration can become the bigger problem, because scaling often means running many clusters. In that case, you either need to replicate your setup and install all tools on every cluster, or configure the tools so that they work across clusters. Both approaches can be challenging and may lead to more effort, and thus cost, than just adding another simple cluster.
Hyper-complex cluster setups can destroy part of the core value of Kubernetes.
#Smaller Attack Surface
If your system is simpler, it may also be more secure because the attack surface is smaller. Every tool you install may contain bugs or other vulnerabilities that could be exploited. In contrast, a simple setup with just basic and stable Kubernetes features offers attackers fewer options to bring down or break into your system. (Of course, there are also many security tools and add-ons that actually make your system more secure and are worth installing if needed.)
The risk of bugs in the software you use naturally increases with the amount of software. On top of that, the probability that you miss an upgrade or security patch is also higher if you have to monitor many tools for such upgrades and patches at the same time.
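The second effect can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each tool has the same independent chance of a patch slipping through in a given period. Both the 5% figure and the function name `p_missed_patch` are made up for this example.

```python
def p_missed_patch(num_tools: int, p_miss_per_tool: float) -> float:
    """Probability of missing at least one patch across num_tools tools.

    Assumes every tool has the same, independent chance of a missed patch.
    """
    return 1.0 - (1.0 - p_miss_per_tool) ** num_tools


# 5% per tool is an arbitrary illustrative value, not a measured rate.
for n in (1, 5, 10, 20):
    print(f"{n:>2} tools: {p_missed_patch(n, 0.05):.0%}")
```

Even with a small per-tool chance, the overall risk adds up quickly under these assumptions: with ten tools, the chance of missing at least one patch is already about 40%.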
Running more tools and add-ons increases the attack surface of your system and thus the chance for vulnerabilities.
#Fewer Human Errors
Even if all the software you use is perfectly stable, the likelihood that you misconfigure something or miss an important update grows with the complexity of the Kubernetes cluster. It is simply much more difficult for any individual to understand many tools, their configurations, and their dependencies in depth, especially if the tools are new or were all introduced at the same time.
Additionally, it is much harder to trace an error and resolve an issue when there are endless possible causes. Finally, restoring a complex system usually takes more time than restoring a simple one, which matters when something does go wrong.
The more things you do, the more likely it is that you make mistakes and do things you do not fully understand.
#What should you do now?
While I try to make a case for simple Kubernetes clusters in this article, that does not mean that you should use absolutely plain clusters. There are certainly many tools and even highly advanced configurations that can be very useful. I also think that the CNCF Landscape is a great concept: it provides an easy overview of the available solutions in different categories and gives smaller niche providers and startups the opportunity to get visibility so that (hopefully) the best solution wins. (I am even involved in two open-source projects listed on the Landscape: the Kubernetes multi-tenancy extension kiosk and the cloud-native development tool DevSpace.)
So, what should you do in practice now?
I think the first step should be to evaluate what you really need before you start. This may even include the question of whether you need Kubernetes at all. Only when you know which problem you want to solve can you start to look for potential solutions, and at this point the CNCF Landscape, Kubernetes blogs, and many other resources will be very helpful. In other words, look for solutions only once you have a problem. Do not adopt tools because they are currently hyped or sound interesting and then go looking for a “problem” they solve. (Of course, it may make sense to implement solutions for problems that you do not have yet but expect to have, such as scalability, if they would be difficult to implement later.)
In general, you should only do things and use tools that you understand. Here, it often pays off to take some time to evaluate and learn about them first and then implement them correctly from the beginning. During this process, you may even learn that the solution you thought was perfect is in fact not right for your use case and you may avoid potentially costly mistakes.
Another way to approach the complexity challenge is to relocate the complexity: if you have a multi-tenant Kubernetes system, not everything that one tenant needs is relevant to the others, so it should not have to be installed or configured for the whole cluster. Instead, you can enable tenants to have independent setups by encapsulating tenants and workloads in virtual clusters. This allows for different and complex configurations of the individual virtual clusters while the underlying physical cluster can remain relatively basic and simple, which combines the described advantages of simplicity with the diverse needs of the tenants.
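A toy model in Python can show what is being relocated. It does not use any real virtual cluster API. The tenant names, the add-on labels, and the function `unneeded_addons_per_tenant` are all hypothetical. The sketch only computes what each tenant would have to live with in a single shared cluster, where everything any tenant needs is installed cluster-wide.

```python
def unneeded_addons_per_tenant(tenant_needs: dict) -> dict:
    """For one shared cluster, list the add-ons each tenant carries but does not need."""
    # In a shared cluster, everything any tenant needs is installed cluster-wide.
    installed_cluster_wide = set().union(*tenant_needs.values())
    return {
        tenant: installed_cluster_wide - needs
        for tenant, needs in tenant_needs.items()
    }


# Hypothetical tenants and add-on labels, for illustration only.
tenant_needs = {
    "team-a": {"service-mesh", "gpu-operator"},
    "team-b": {"ingress-controller"},
    "team-c": {"service-mesh"},
}

for tenant, extra in unneeded_addons_per_tenant(tenant_needs).items():
    print(tenant, sorted(extra))
```

In this model, every tenant is exposed to add-ons that only other tenants asked for. With one virtual cluster per tenant, each of these sets would ideally be empty, because a tenant's add-ons live inside its own virtual cluster and the underlying cluster only carries what everyone shares.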
The cloud-native ecosystem is certainly one of the most vibrant and exciting tech communities, but it can also easily overwhelm people because so many things happen at the same time and new tools emerge at an impressive pace. While this progress is generally a good thing, you need to make sure that you don’t get lost.
Therefore, it is often better and sufficient to start simple and progress with things that you really need so you can run an easily manageable, scalable, secure, and stable system.