Kubernetes Custom Resource Definitions (CRDs)
Kubernetes custom resource definitions (CRDs) let you add new object types to the Kubernetes API. Kubernetes comes with many different objects that represent the most common application components, such as pods, jobs, ConfigMaps, and secrets. But what if you want to express application-specific data, such as a DatabaseConnection or AuthToken, while preserving its structure and supporting custom behavior? This is where CRDs come in.
CRDs extend the API with support for arbitrary data types. Each CRD you create gets its own API endpoints that you can use to query, create, and edit instances of that resource. Custom resources are fully supported within kubectl, so you can run commands like kubectl get backgroundjobs to interact with your application's objects.
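To make the idea of per-resource endpoints concrete, here is a minimal sketch of how the REST path for a custom resource is assembled from its group, version, and plural name. The helper name custom_resource_path is hypothetical, and the crds.example.com group is simply the example group used later in this article.

```python
def custom_resource_path(group, version, plural, namespace=None, name=None):
    """Build the API server path for a custom resource collection or object.

    Hypothetical helper for illustration; it only formats a string and
    does not contact a cluster.
    """
    parts = ["/apis", group, version]
    if namespace is not None:
        # Namespaced resources live under /namespaces/<namespace>/
        parts += ["namespaces", namespace]
    parts.append(plural)
    if name is not None:
        # Appending a name addresses a single object instead of the list
        parts.append(name)
    return "/".join(parts)


# Path that lists BackgroundJob objects in the default namespace
print(custom_resource_path("crds.example.com", "v1", "backgroundjobs", namespace="default"))
```

This is the kind of path that kubectl requests on your behalf when you run a command such as kubectl get backgroundjobs.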
In this article, you'll learn why CRDs are useful and how they relate to controller and operator extensions. Controllers are used to implement custom control loop mechanisms, such as crontabs and job queues, while operators are Kubernetes-specific middleware for individual apps like databases and observability stacks. Both depend heavily on CRDs.
After covering the theory, you'll also see how to register your own CRD and create object instances with kubectl.
Understanding Kubernetes Custom Resources
A custom resource is data stored in Kubernetes that doesn't match an object kind included in the default distribution. You may have already used custom resources provided by popular community projects. For example, cert-manager automates SSL certificate management using Certificate and Issuer resources. Certificates represent real SSL certificates; you can obtain one by creating a CertificateRequest, another CRD provided by cert-manager.
You can use custom resources to encapsulate data required by your own applications, too. They store and retrieve structured data via dedicated API endpoints. Compared to generic solutions such as ConfigMaps, custom resources offer clearer intent, better separation of responsibilities, and an improved management experience when you're creating many instances of a particular data structure.
They're also the foundation for extending Kubernetes with your own controllers and operators.
Custom resources aren't the right choice for every scenario, though. For example, you don't need to create custom resources for arbitrary config values used by your app. In this situation, a plain ConfigMap will be easier to work with. Custom resources should be reserved for unique functionality that's scoped to the namespace or cluster level. They're ideal for data that fits the Kubernetes declarative operation model, requires its own API, and will be managed with ecosystem tools such as kubectl and the Kubernetes dashboard.
CRDs, Controllers, and Operators
Custom resources are usually encountered alongside controllers and operators. A Kubernetes controller monitors specific resource types and carries out actions that achieve desired state changes. The built-in ReplicaSet controller creates pods in response to new ReplicaSet manifests being added to your cluster, while cert-manager's controller obtains an SSL certificate when you create a CertificateRequest object.
CRDs are rarely used without an accompanying controller. On their own, CRD instances are simple blobs of data in your cluster. The presence of custom objects used in this way is a good sign that a ConfigMap would be more appropriate for the situation.
Processing CRDs with Controllers
Kubernetes controllers are loops that take actions in response to specific events occurring. The controller's cycle has three main phases:
- Observe: The controller determines the cluster's desired state by monitoring for Kubernetes events that describe changes.
- Analyze: The observed state is compared to the known existing state. This uncovers discrepancies such as new objects that aren't in the old state or fields that have had their values updated.
- Act: The controller performs all the actions necessary to transition the cluster into the desired state.
Creating controllers for your CRDs lets you process their data and carry out tasks inside your cluster. Take the BackgroundJob CRD mentioned in the introduction: you could write a controller that automatically runs a command in a container whenever a new BackgroundJob object is created.
You'd write a simple YAML manifest similar to this:
apiVersion: crds.example.com/v1
kind: BackgroundJob
metadata:
  name: demo-job
spec:
  image: busybox:latest
  command: "echo hello-world"
Applying it to your cluster triggers the following cycle in the controller:
- Observe: The controller watches for Kubernetes events relating to BackgroundJob objects.
- Analyze: The demo-job object doesn't appear in the cluster's current state. The controller establishes that it needs to run a new job to achieve the desired state.
- Act: The controller starts a new pod running the busybox:latest image and executes the specified command. The cluster's actual state now matches the desired state you've declared.
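The cycle above can be sketched as a single reconcile pass over in-memory data. This is a simplified illustration, not a real controller: the reconcile function, the desired_jobs and running_pods dictionaries, and the action strings are all hypothetical, and a real controller would call the Kubernetes API instead of mutating a dictionary.

```python
def reconcile(desired_jobs, running_pods):
    """Run one observe-analyze-act pass for BackgroundJob objects.

    desired_jobs: dict of job name -> spec, standing in for observed objects
    running_pods: dict of job name -> pod details, standing in for actual state
    Returns a list describing the actions taken in this pass.
    """
    actions = []
    # Observe: iterate over the declared BackgroundJob objects
    for name, spec in desired_jobs.items():
        # Analyze: a job with no matching pod is a discrepancy
        if name not in running_pods:
            # Act: record a pod for the job (a real controller would create one)
            running_pods[name] = {"image": spec["image"], "command": spec["command"]}
            actions.append(f"start pod for {name}")
    return actions


desired = {"demo-job": {"image": "busybox:latest", "command": "echo hello-world"}}
pods = {}
print(reconcile(desired, pods))  # the first pass starts a pod for demo-job
print(reconcile(desired, pods))  # the second pass finds nothing left to do
```

Running the pass twice shows why the loop is safe to repeat: once the actual state matches the desired state, there are no further actions.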
Controllers extend Kubernetes with new behavior but retain the same observe-analyze-act cycle used by its own resources. Objects including deployments, jobs, DaemonSets, and ReplicaSets are managed by controllers that work in this way, watching for events and then applying changes that create the new state.
CRDs and controllers let you implement your own higher-level resources that modify your cluster's state and implement particular behaviors. It's this characteristic that defines when CRDs should be used—if your data is only consumed within your application and isn't supposed to cause a change in your cluster's state, it can exist as plain config data in a ConfigMap instead.
Controllers and the Operator Pattern
Operators are application-specific Kubernetes extensions. They provide controllers and CRDs that automate tasks in your cluster, such as deploying apps and performing maintenance activities like backups and migrations. The documentation describes operators as extensions that seek to "capture the key aim of a human operator who is managing a service or set of services."
Take the example of a database server. This scenario can be difficult to configure in Kubernetes because you need to set up persistent volumes to store your data, StatefulSets to reliably replicate the database instance, and services to handle networking. These implementation details require Kubernetes-specific knowledge that takes you away from the "key aim" of deploying a functioning database.
Operators neatly address the problem by extending your cluster with custom behaviors that link controllers and CRDs. A database operator could provide a DatabaseConnection CRD that lets you supply familiar configuration parameters such as the database engine, schema, and user credentials. Adding a new DatabaseConnection object to your cluster would prompt the operator's controllers to create the persistent volumes, StatefulSets, and services required for your database deployment.
Operators distill Kubernetes-specific behavior back to application requirements. The DatabaseConnection operator and CRD permit you to deploy a database while knowing only its engine, schema, and user, without having to understand any Kubernetes concepts. They differ from plain controllers by possessing domain-specific knowledge that automates key tasks.
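As a rough illustration of that translation, the sketch below maps a DatabaseConnection spec onto the kinds of objects a database operator might create. Everything here is hypothetical: the plan_database_resources function, the object names, and the choice of a PersistentVolumeClaim to request storage are assumptions made for the example, not the behavior of any particular operator.

```python
def plan_database_resources(name, spec):
    """Return descriptions of child objects for one DatabaseConnection.

    Hypothetical planning step: it only builds plain dictionaries and
    never talks to a cluster.
    """
    return [
        # Storage for the database files (assumed to be requested via a claim)
        {"kind": "PersistentVolumeClaim", "name": f"{name}-data"},
        # The database pods themselves, configured from the user's spec
        {
            "kind": "StatefulSet",
            "name": name,
            "engine": spec["engine"],
            "defaultSchema": spec["defaultSchema"],
        },
        # Stable network endpoint in front of the database pods
        {"kind": "Service", "name": name},
    ]


plan = plan_database_resources(
    "demo-database",
    {"engine": "postgres", "defaultSchema": "demo-database", "rootUser": "root"},
)
for resource in plan:
    print(resource["kind"], resource["name"])
```

The user supplies only application-level settings, while the operator's knowledge of Kubernetes objects lives in the planning code.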
Implementing a Custom Resource Definition
Adding your own custom resources is easier than you might think. CRDs are created as CustomResourceDefinition objects in a YAML manifest, just like other Kubernetes objects. A CRD's spec declares the name it'll be exposed as in an API and the properties that the CRD instances will possess.
To follow along with this tutorial, you'll need kubectl installed with a functioning connection to a Kubernetes cluster.
To implement the DatabaseConnection resource discussed above, copy the following YAML and save it to a new file called dbcon.yaml in your working directory:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databaseconnections.crds.example.com
spec:
  group: crds.example.com
  scope: Namespaced
  names:
    plural: databaseconnections
    singular: databaseconnection
    kind: DatabaseConnection
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                defaultSchema:
                  type: string
                rootUser:
                  type: string
                rootPassword:
                  type: string
You can also find the manifest in this article's GitHub repository.
The manifest defines a new resource type inside the crds.example.com API group. There are a few details to note before you continue:
- The spec.scope field declares that DatabaseConnection objects will be scoped to namespaces. To create cluster-level resources, set this field to Cluster instead of Namespaced.
- The resource's API names are set within spec.names. This affects the resource's API endpoints and kubectl commands, as well as the value of the kind field when you create new object instances. The example lets you run kubectl get databaseconnection and kubectl get databaseconnections, as well as other similar commands, while using kind: DatabaseConnection in the YAML manifests of your object instances.
- All Kubernetes object APIs are versioned, so you can introduce changes without breaking existing objects. A single version is defined for this CRD within its versions field. The served field controls whether the version is currently exposed to clients, while storage: true identifies the single version that is currently used for object storage.
- The properties of DatabaseConnection objects are defined in OpenAPI v3 format within the schema field. This manifest states that DatabaseConnection objects will have engine, defaultSchema, rootUser, and rootPassword fields.
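Two of the rules behind these fields are easy to check before you apply a manifest: the CRD's metadata.name must be the plural name followed by the group, and exactly one version must be marked as the storage version. The sketch below checks both against a dictionary that mirrors the manifest. The check_crd_names helper and its messages are hypothetical, and the sketch is not a substitute for the API server's own validation.

```python
def check_crd_names(crd):
    """Return a list of problems found in a CRD manifest given as a dict."""
    problems = []
    spec = crd["spec"]

    # metadata.name must take the form <plural>.<group>
    expected_name = f"{spec['names']['plural']}.{spec['group']}"
    if crd["metadata"]["name"] != expected_name:
        problems.append(f"metadata.name should be {expected_name}")

    # Exactly one version may be flagged as the storage version
    storage_versions = [v["name"] for v in spec["versions"] if v.get("storage")]
    if len(storage_versions) != 1:
        problems.append("exactly one version must set storage: true")

    return problems


crd = {
    "metadata": {"name": "databaseconnections.crds.example.com"},
    "spec": {
        "group": "crds.example.com",
        "names": {"plural": "databaseconnections", "kind": "DatabaseConnection"},
        "versions": [{"name": "v1", "served": True, "storage": True}],
    },
}
print(check_crd_names(crd))  # an empty list means no problems were found
```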
Register your CRD with the Kubernetes API using kubectl apply:
$ kubectl apply -f dbcon.yaml
customresourcedefinition.apiextensions.k8s.io/databaseconnections.crds.example.com created
Provisioning the new API endpoints for the resource can take a few moments to complete. You can check progress by running kubectl describe on your new CRD and inspecting the end of the output:
$ kubectl describe crd databaseconnections.crds.example.com
...
Status:
  Accepted Names:
    Kind:       DatabaseConnection
    List Kind:  DatabaseConnectionList
    Plural:     databaseconnections
    Singular:   databaseconnection
  Conditions:
    Last Transition Time:  2022-11-14T16:02:17Z
    Message:               no conflicts found
    Reason:                NoConflicts
    Status:                True
    Type:                  NamesAccepted
    Last Transition Time:  2022-11-14T16:02:18Z
    Message:               the initial names have been accepted
    Reason:                InitialNamesAccepted
    Status:                True
    Type:                  Established
  Stored Versions:
    v1
Events:  <none>
Seeing Type: Established under the Conditions list means your CRD is ready to use. You can check that it's applied correctly by using kubectl to list matching object instances:
$ kubectl get databaseconnections
No resources found in default namespace.
There are no objects yet, but the resource type has been recognized. Trying to use an unregistered type results in an error:
$ kubectl get databaseconnections2
error: the server doesn't have a resource type "databaseconnections2"
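If you want to script the readiness check described earlier, the same conditions are available in the CRD's status when it is fetched as JSON or YAML. The sketch below inspects a status dictionary shaped like that data. The is_established helper is hypothetical, and the sample values are copied from the describe output shown earlier rather than read from a live cluster.

```python
def is_established(crd_status):
    """Return True if the status contains an Established condition set to True."""
    for condition in crd_status.get("conditions", []):
        # Condition statuses are strings ("True", "False", "Unknown"), not booleans
        if condition.get("type") == "Established" and condition.get("status") == "True":
            return True
    return False


status = {
    "conditions": [
        {"type": "NamesAccepted", "status": "True", "reason": "NoConflicts"},
        {"type": "Established", "status": "True", "reason": "InitialNamesAccepted"},
    ],
    "storedVersions": ["v1"],
}
print(is_established(status))
```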
Creating Objects Using Your CRD
You're now ready to create some objects using the resource type provided by your CRD. Copy the following YAML to demo-db.yaml in your working directory:
apiVersion: crds.example.com/v1
kind: DatabaseConnection
metadata:
  name: demo-database
spec:
  engine: postgres
  defaultSchema: demo-database
  rootUser: root
  rootPassword: pass
The manifest sets a value for each field defined in the CRD's schema. The apiVersion is set to crds.example.com/v1 because the DatabaseConnection CRD was defined within the crds.example.com API group. v1 indicates that the object's spec uses schema version v1, which was created earlier. Within the spec field, you should set the properties included in the CRD's schema.
Use kubectl to add the object to your cluster:
$ kubectl apply -f demo-db.yaml
databaseconnection.crds.example.com/demo-database created
Repeat the kubectl get command to confirm that the object has been created:
$ kubectl get databaseconnections
NAME            AGE
demo-database   5m28s
Next, use the kubectl describe command to view the demo-database object's details:
$ kubectl describe databaseconnection demo-database
Name:         demo-database
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  crds.example.com/v1
Kind:         DatabaseConnection
Metadata:
  Creation Timestamp:  2022-11-14T16:26:32Z
  Generation:          1
  ...
Spec:
  Default Schema:  demo-database
  Engine:          postgres
  Root Password:   pass
  Root User:       root
Events:            <none>
The properties set in the spec are visible on the created object.
You've now successfully used a CRD to store your own structured data in your Kubernetes cluster. The API is managing DatabaseConnection objects with the specialist schema you've defined.
These objects don't have any effect on your cluster's state on their own, however. In a real scenario, you'd need to package your DatabaseConnection CRD as part of an operator that also includes controllers to observe your objects and modify the state.
With such an operator installed, applying a new DatabaseConnection object would launch a database deployment for you. This happens because the operator's controllers watch for changes to DatabaseConnection objects and respond by creating resources in your cluster. The added resources allow the cluster to attain the new ideal state expressed by the DatabaseConnection object.
A storage controller could provision persistent volumes, for example, while a separate replication controller initializes a StatefulSet to run a primary database node and multiple read-only replicas. Collectively, the controllers have application-specific knowledge that automates the database deployment task for you. This means they're adhering to the operator pattern.
The CRD acts as the frontend to this automated system. You need only create a DatabaseConnection object to launch a fresh database server. If you weren't using CRDs, controllers, and the operator pattern, you'd have to manually assemble all the Kubernetes components, such as StatefulSets, volumes, services, and ConfigMaps, to bring up your containers each time.
Implementing this functionality is out of scope for this tutorial, but you can find detailed information on writing controllers and operators within the Kubernetes documentation.
CRD Schema Validation
CRDs support comprehensive schema validation controls to check whether your objects are valid. The DatabaseConnection example above requires the engine, defaultSchema, and user account properties to be strings whenever they're set, for example, but much more complicated rules are also supported using OpenAPI v3 validations.
Here's a more complex version of DatabaseConnection that adds a new replicaCount field accepting values between 1 and 10. It also marks all fields except replicaCount as required and constrains engine to only support mysql and postgres as its values. Save the manifest to dbcon-validated.yaml in your working directory:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databaseconnections.crds.example.com
spec:
  group: crds.example.com
  scope: Namespaced
  names:
    plural: databaseconnections
    singular: databaseconnection
    kind: DatabaseConnection
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                  enum:
                    - mysql
                    - postgres
                replicaCount:
                  type: integer
                  minimum: 1
                  maximum: 10
                defaultSchema:
                  type: string
                rootUser:
                  type: string
                rootPassword:
                  type: string
              required:
                - engine
                - defaultSchema
                - rootUser
                - rootPassword
Apply the updated CRD to your cluster:
$ kubectl apply -f dbcon-validated.yaml
Next, save the following invalid DatabaseConnection to invalid-db.yaml:
apiVersion: crds.example.com/v1
kind: DatabaseConnection
metadata:
  name: demo-database
spec:
  engine: redis
  defaultSchema: demo-database
  rootUser: root
  rootPassword: pass
You'll see an error if you try to apply this manifest to your cluster:
$ kubectl apply -f invalid-db.yaml
The DatabaseConnection "demo-database" is invalid: spec.engine: Unsupported value: "redis": supported values: "mysql", "postgres"
The engine field is set to redis, which is unsupported by the CRD's schema. The validation constraints have prevented incorrect data from being added to your cluster.
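To show what these rules amount to, here is a small sketch that applies the same constraints (required fields, types, the engine enum, and the replicaCount range) to a spec held in a dictionary. It is a hypothetical, simplified stand-in for the API server's validation: the SPEC_SCHEMA structure, the validate_spec function, and the error wording are all invented for this example and do not match the server's real messages.

```python
# Simplified copy of the constraints from the CRD's spec schema (assumed layout)
SPEC_SCHEMA = {
    "required": ["engine", "defaultSchema", "rootUser", "rootPassword"],
    "properties": {
        "engine": {"type": "string", "enum": ["mysql", "postgres"]},
        "replicaCount": {"type": "integer", "minimum": 1, "maximum": 10},
        "defaultSchema": {"type": "string"},
        "rootUser": {"type": "string"},
        "rootPassword": {"type": "string"},
    },
}


def _has_type(value, type_name):
    """Check a value against the two schema types used in this example."""
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, int) and not isinstance(value, bool)
    return True


def validate_spec(spec, schema=SPEC_SCHEMA):
    """Return a list of validation errors; an empty list means the spec passes."""
    errors = []
    for field in schema["required"]:
        if field not in spec:
            errors.append(f"spec.{field}: required")
    for field, value in spec.items():
        rules = schema["properties"].get(field)
        if rules is None:
            continue  # fields not in the schema are ignored in this sketch
        if not _has_type(value, rules["type"]):
            errors.append(f"spec.{field}: expected {rules['type']}")
            continue
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"spec.{field}: unsupported value {value!r}")
        if "minimum" in rules and value < rules["minimum"]:
            errors.append(f"spec.{field}: must be at least {rules['minimum']}")
        if "maximum" in rules and value > rules["maximum"]:
            errors.append(f"spec.{field}: must be at most {rules['maximum']}")
    return errors


invalid = {
    "engine": "redis",
    "defaultSchema": "demo-database",
    "rootUser": "root",
    "rootPassword": "pass",
}
print(validate_spec(invalid))
```

Because the checks run before anything is stored, invalid objects never reach the controllers that consume them.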
Conclusion
Custom resource definitions (CRDs) are a mechanism for registering your own object types with the Kubernetes API. They'll appear as standalone endpoints in the API and in tools like kubectl. Controllers and operators use CRDs to extend Kubernetes with new behavior. A controller will observe your objects, analyze the changes compared to the cluster's current state, and apply actions that transition the cluster into the new desired state. Operators combine controllers and CRDs with domain-specific knowledge to automate key tasks inside your cluster.
Although CRDs, controllers, and operators facilitate powerful Kubernetes customizations, they have some limitations that make them unsuitable for certain situations. CRDs can be challenging to manage in multitenant environments, for example, because they apply to the entire cluster, not just individual namespaces. This compromises tenant isolation.
Loft mitigates this problem by providing self-service virtual clusters that operate fully independently of each other. CRDs deployed into one virtual cluster won't affect any others. Teams can work more efficiently using CRDs without causing knock-on effects on their neighbors. Loft's solution also supports multicloud, multicluster, SSO integration, and precise role-based access control, so you can create a productive Kubernetes platform while maintaining guardrails to prevent misuse.