In recent years, Kubernetes has become the preferred choice for companies looking to run and manage large containerized workloads in an automated way.
Kubernetes is an open source solution that was built to address the inherent shortcomings of running containers at scale in areas such as deployments, scaling of workloads, managing the underlying infrastructure, load balancing, and other networking components.
As a system, Kubernetes consists of three main planes:
- The control plane: This is the brain behind the entire operation, made up of components such as the API server, scheduler, and controller manager.
- The data plane: This is the memory of the whole system; the etcd datastore holds all cluster state and configuration.
- The worker plane: This consists of the nodes responsible for running the workloads.
Together, these planes make up a Kubernetes cluster and can either coexist on a single machine or be distributed across multiple machines and environments.
Creating and administering Kubernetes clusters involves complex lifecycle management of the infrastructure. To optimize this lifecycle, you can make use of infrastructure as code (IaC). IaC enables teams to write and execute code that defines, updates, and destroys infrastructure. It presents a model that allows teams to treat all aspects of their architecture as software, including the hardware. Terraform is an open source, cloud-agnostic IaC provisioning tool that’s used by companies like Uber, Slack, Udemy, and Twitch.
Both Terraform and Kubernetes support a declarative system of operation. A declarative model enables you to simply define or state the desired outcome of a system in order for it to be produced.
The alternative approach is a procedural system that requires you to define all the intermediary steps to produce the desired outcome. An example of the latter would be bash scripting. This means both Terraform and Kubernetes automate the process of producing the desired state outcome based on what you declare and abstract away the steps in between.
Using Terraform, you can automate the process of provisioning your Kubernetes cluster in a cloud environment. Some benefits of this approach include having a reliable and repeatable workflow once you have a stable version of your Terraform source code. In addition to this, the IaC is a self-documenting representation of the live state of your cluster. Lastly, it also supports having immutable infrastructure, which reduces the risks of configuration drift in your live environments.
This article covers ten essentials that you’ll need to get started with Kubernetes and Terraform. The list starts with general prerequisites and works through tools that support best practices for operating at scale. Both Terraform and Kubernetes have paradigms that support optimal implementations, and this article combines a few distinct models from each domain into a complementary workflow.
1. Cloud Provider Accounts
Terraform is a tool that integrates with your cloud provider account in order to provision the resources that you declare. First, you should create an account with the cloud provider with which you intend to provision your Kubernetes cluster. Most cloud providers offer a free tier for a limited amount of time or complimentary credits.
Once you’ve created the account, you should download the cloud provider's CLI tool and configure it with your existing cloud profile. Below are examples of how to accomplish this with the Google Cloud Platform (GCP) and Microsoft Azure.
Authenticate to GCP Using the CLI
If you’re using Terraform on your workstation, you’ll need to install the Google Cloud SDK and authenticate using User Application Default Credentials by running the command gcloud auth application-default login.
Authenticate to Azure Using the CLI
After you install the Azure CLI, you can use the CLI to authenticate to your account. Run the following command and your default browser will open for you to sign in to Azure with a Microsoft account:
az login
You can then view account details with the following command:
az account list
This will output something similar to the following:
[
  {
    "cloudName": "AzureCloud",
    "id": "00000000-0000-0000-0000-000000000000",
    "isDefault": true,
    "name": "PAYG Subscription",
    "state": "Enabled",
    "tenantId": "00000000-0000-0000-0000-000000000000",
    "user": {
      "name": "user@example.com",
      "type": "user"
    }
  }
]
The id field is the subscription_id. The subscription ID is a GUID that uniquely identifies your subscription to Azure services. If you have multiple subscriptions, you can set the one you want to use with the following command:
az account set --subscription="SUBSCRIPTION_ID"
2. The Terraform CLI Tool
To use Terraform to provision infrastructure in the cloud, you need to download and install the CLI tool. The website offers options for downloading Terraform on different operating systems. Once you've completed the process, you can run terraform --version to ensure that the installation worked as expected.
3. The Kubectl CLI Tool
You’ll also need the official CLI tool, kubectl, to communicate with your upstream Kubernetes cluster in the cloud. kubectl reads cluster connection details from a kubeconfig file, where all your cluster configuration contexts are stored; this file is populated when you configure access to a cluster (for example, with aws eks update-kubeconfig for Amazon EKS). When deploying Kubernetes resource manifests generated by Terraform, you’ll need to specify the kubeconfig file path in your Kubernetes provider.
4. A Remote State Management Strategy
Terraform stores the state of your live infrastructure in a JSON configuration file. Whenever you execute Terraform commands, the CLI tool will use this configuration file as a source of truth or representation of the live state of the cluster before processing any new executions. By default, this JSON configuration file is generated and stored in the local directory of the Terraform source code that has been used to initialize the project.
For individual projects, this method works well. However, if you’re working in a team, configuring a central cloud storage location for this state file is a more optimal strategy. A common implementation of this is storing the state file in an AWS S3 bucket and specifying the details of the bucket in your Terraform source code. An example of this can be seen below:
- Create an S3 Bucket for State Backend
aws s3api create-bucket --bucket <bucket-name> --region <aws-region> --create-bucket-configuration LocationConstraint=<aws-region>
- Create a File with State Backend Configuration
terraform {
  backend "s3" {
    bucket  = "globally-unique-bucket-name"
    key     = "globally-unique-bucket-name/terraform.tfstate"
    region  = "aws-region"
    encrypt = true
  }
}
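Once the state lives in S3, other Terraform projects on the team can read it through the terraform_remote_state data source. The following is a sketch; the bucket, key, region, and output names are placeholders matching the backend example above:

```hcl
# Read outputs from the shared cluster state in another Terraform project.
# Bucket, key, and region values are placeholders for illustration.
data "terraform_remote_state" "cluster" {
  backend = "s3"
  config = {
    bucket = "globally-unique-bucket-name"
    key    = "globally-unique-bucket-name/terraform.tfstate"
    region = "aws-region"
  }
}

# Reference an output exported by the cluster project, e.g. a VPC ID.
output "cluster_vpc_id" {
  value = data.terraform_remote_state.cluster.outputs.vpc_id
}
```

This only works for values the cluster project explicitly exposes as outputs, which keeps the shared state a deliberate, narrow interface between teams.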
5. State Management Locking
State locking is a measure used to safeguard against conflicting changes to the state of your infrastructure when multiple team members run a terraform apply command at the same time. In this approach, the state is locked and only permits changes from the first user to run the command. With the S3 backend, state locking is typically implemented with an Amazon DynamoDB table. DynamoDB is a managed key-value store that supports strongly consistent reads and conditional writes.
To use DynamoDB for locking with Terraform, you have to create a DynamoDB table with a primary key called LockID. An example implementation can be seen below:
- Create DynamoDB Table
aws dynamodb create-table --table-name <table-name> --attribute-definitions AttributeName=LockID,AttributeType=S --key-schema AttributeName=LockID,KeyType=HASH --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1
- Specify DynamoDB Table In Backend State Configuration
terraform {
  backend "s3" {
    bucket         = "globally-unique-bucket-name"
    key            = "globally-unique-bucket-name/terraform.tfstate"
    dynamodb_table = "dynamodb-table-name"
    region         = "aws-region"
    encrypt        = true
  }
}
6. The Cloud Provider API for Terraform
To provision resources in a public cloud environment, you need to configure the Terraform provider for the relevant cloud platform. In addition to this, you can optionally specify the account profile that Terraform should use; otherwise, Terraform falls back to the default profile configured with your CLI tool. An example of how to do this with an AWS account is demonstrated below:
provider "aws" {
  region  = "aws-region"
  profile = "aws-profile"
}

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }

  backend "s3" {
    ...
  }
}
Once you’ve done this, you need to declare the relevant cloud resources to be created for your desired hosted cluster, whether it's Amazon EKS, Azure AKS, or GCP GKE.
The resources you declare for the creation of the cluster will depend on the cloud provider's Terraform API. Some cloud providers require more resources to be specified in comparison to others. Below is an example of some resource definitions for the creation of an Amazon EKS cluster:
resource "aws_eks_cluster" "main" {
  name     = var.eks_cluster_name
  role_arn = aws_iam_role.eks_cluster.arn

  vpc_config {
    security_group_ids      = [aws_security_group.eks_cluster.id, aws_security_group.eks_nodes.id]
    endpoint_private_access = var.endpoint_private_access
    endpoint_public_access  = var.endpoint_public_access
    subnet_ids              = var.eks_cluster_subnet_ids
  }

  depends_on = [
    aws_iam_role_policy_attachment.aws_eks_cluster_policy
  ]
}

resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = var.node_group_name
  node_role_arn   = aws_iam_role.eks_nodes.arn
  subnet_ids      = var.private_subnet_ids
  ami_type        = var.ami_type
  disk_size       = var.disk_size
  instance_types  = var.instance_types

  scaling_config {
    desired_size = var.pvt_desired_size
    max_size     = var.pvt_max_size
    min_size     = var.pvt_min_size
  }

  tags = {
    Name = var.node_group_name
  }

  depends_on = [
    aws_iam_role_policy_attachment.aws_eks_worker_node_policy,
    aws_iam_role_policy_attachment.aws_eks_cni_policy,
    aws_iam_role_policy_attachment.ec2_read_only,
  ]
}
The complete Terraform source code for the above snippets can be found in this GitHub repository.
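To make a cluster like this easy to consume from kubectl or from other Terraform projects, you can export a few outputs alongside the resources. This is a sketch assuming the aws_eks_cluster resource is named main as in the snippet above:

```hcl
# Expose connection details for the cluster defined above.
output "cluster_name" {
  value = aws_eks_cluster.main.name
}

output "cluster_endpoint" {
  value = aws_eks_cluster.main.endpoint
}

# Base64-encoded certificate authority data, used to build a kubeconfig.
output "cluster_ca_certificate" {
  value = aws_eks_cluster.main.certificate_authority[0].data
}
```

These outputs are exactly what the Kubernetes and Helm providers in the following sections need in order to authenticate to the new cluster.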
7. The Kubernetes Provider
Cluster provisioning is just the first step in your Kubernetes lifecycle management with Terraform. The next step is deploying the Kubernetes resources that represent your container workloads and cluster configuration.
To do this with Terraform, you can make use of the Kubernetes provider. This will allow you to automatically generate Kubernetes resources using Terraform source code. In order for this to work, you need to ensure that your Terraform project is configured with your Kubernetes cluster configuration:
provider "kubernetes" {
  config_path    = "~/.kube/config"
  config_context = "arn:aws:eks:eu-west-1:<aws-account-id>:cluster/my-cluster"
}

resource "kubernetes_namespace" "dev" {
  metadata {
    name = "my-dev-namespace"
  }
}
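Beyond namespaces, the provider covers the full range of Kubernetes resources. As a hedged sketch, a minimal Deployment placed in the namespace above might look like the following; the application name and image are illustrative:

```hcl
# A minimal Deployment in the namespace created above.
# "my-app" and the nginx image are placeholder values.
resource "kubernetes_deployment" "app" {
  metadata {
    name      = "my-app"
    namespace = kubernetes_namespace.dev.metadata[0].name
  }

  spec {
    replicas = 2

    selector {
      match_labels = { app = "my-app" }
    }

    template {
      metadata {
        labels = { app = "my-app" }
      }

      spec {
        container {
          name  = "my-app"
          image = "nginx:1.21"
        }
      }
    }
  }
}
```

Because the namespace is referenced through the kubernetes_namespace resource rather than a string, Terraform orders the creation of the two resources correctly on its own.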
8. The Helm Provider
Deploying raw manifests to a Kubernetes cluster generally works well at a small scale. However, as you start to manage a number of different resources (Deployments, Services, ConfigMaps, Secrets, Jobs, ServiceAccounts, Roles, RoleBindings, etc.), this becomes especially challenging for application installations, management, and versioning.
A common practice is to make use of Helm, which is a package manager for Kubernetes resources that simplifies the process of managing, installing, and versioning your resource manifests at scale. The Terraform Helm Provider enables you to generate and execute Helm Chart releases to your cluster:
provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)

    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      args        = ["eks", "get-token", "--cluster-name", data.aws_eks_cluster.cluster.name]
      command     = "aws"
    }
  }
}

resource "helm_release" "application_name" {
  name       = "application-name"
  repository = "https://my-application-charts"
  chart      = "application-chart"

  values = [
    file("${path.module}/values.yaml")
  ]
}
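Individual chart values can also be overridden inline with set blocks, which take precedence over entries in the values file. As a sketch, the same release with one inline override; the replicaCount value name is hypothetical and depends on the chart:

```hcl
resource "helm_release" "application_name" {
  name       = "application-name"
  repository = "https://my-application-charts"
  chart      = "application-chart"

  values = [
    file("${path.module}/values.yaml")
  ]

  # Inline overrides win over entries in values.yaml.
  set {
    name  = "replicaCount"
    value = "2"
  }
}
```

Keeping shared defaults in values.yaml and reserving set blocks for environment-specific overrides keeps the release definitions easy to diff across environments.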
9. The Kustomize Provider
When managing releases to your Kubernetes clusters, either with Helm or raw manifests, you'll be required to make modifications to these files. This may be relatively straightforward for a local cluster. However, when you're trying to adopt an optimal and scalable approach to building and delivering your resources, you’ll need a tool that helps you automatically manage these modifications and configurations.
Kustomize is a configuration management tool that you can use for resource templating. For example, if you're building and pushing a new container image in your CI/CD pipeline, you can make use of Kustomize to update the relevant YAML resources to reflect that change in your pipeline process.
Kustomize has a Terraform provider that you can use for your templating and configuration management purposes. Similar to the Helm and Kubernetes providers, you would have to specify the path to your kube config file and optionally set the cluster context.
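As a hedged sketch, with the community kustomization provider (kbst/kustomization) you build the manifests from a kustomize directory and apply each rendered resource; the kubeconfig and overlay paths below are placeholders:

```hcl
provider "kustomization" {
  kubeconfig_path = "~/.kube/config"
}

# Render all manifests under the given kustomize overlay directory.
data "kustomization_build" "current" {
  path = "kustomize/overlays/dev"
}

# Apply each rendered manifest as its own Terraform-tracked resource.
resource "kustomization_resource" "from_build" {
  for_each = data.kustomization_build.current.ids
  manifest = data.kustomization_build.current.manifests[each.value]
}
```

Tracking each manifest individually means Terraform can show a per-resource plan for changes produced by your overlays.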
10. The Cluster Management Tool
Lastly, another important aspect of the Kubernetes cluster management lifecycle is having a tool that will simplify the process of managing your cluster, as well as offer in-depth visibility. For such a task, you can make use of Rancher, which is an open source Kubernetes cluster management platform. Rancher also has a Terraform provider that you can use. It has a sleek UI with a number of rich features that support the management of your downstream Kubernetes clusters.
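As a sketch, the rancher2 provider is configured with your Rancher server URL and an API token, after which clusters and related resources can be managed as code; the URL, token, and cluster name below are placeholders:

```hcl
provider "rancher2" {
  api_url   = "https://rancher.example.com/v3"
  token_key = "token-xxxxx:secret"
}

# Illustrative: register a downstream cluster under Rancher management.
resource "rancher2_cluster" "imported" {
  name        = "my-downstream-cluster"
  description = "Cluster managed through Terraform"
}
```

In practice the token would come from a variable or secret store rather than being hardcoded in the configuration.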
Conclusion
In this article, you learned how to use Terraform to manage your Kubernetes clusters, from the underlying infrastructure to the containerized workloads. As highlighted above, the lifecycle of managing, configuring, and optimizing Kubernetes clusters is complex and arduous. Following a manual approach to managing the infrastructure is a recipe for misconfiguration and inefficient developer workflows.
Using an IaC tool like Terraform brings several benefits into the picture by treating all aspects of the infrastructure as software. This article covered ten essential elements that will help you successfully adopt a model of managing Kubernetes with Terraform.