Terraform for Kubernetes: 10 Essentials You Need to Get Started

Lukonde Mwila

In recent years, Kubernetes has become the preferred choice for companies looking to run and manage large containerized workloads in an automated way.

Kubernetes is an open source solution that was built to address the inherent shortcomings of running containers at scale in areas such as deployments, scaling of workloads, managing the underlying infrastructure, load balancing, and other networking components.

As a system, Kubernetes consists of three main planes:

  • The control plane: This is the brain behind the entire operation.
  • The data plane: This is the memory of the whole system; it stores all of the cluster's state and configuration (in etcd).
  • The worker plane: This is responsible for running the workloads.

Together, these planes make up a Kubernetes cluster and can either coexist on a single machine or be distributed across a number of platforms.

    Creating and administering Kubernetes clusters involves complex lifecycle management of the infrastructure. To optimize this lifecycle, you can make use of infrastructure as code (IaC). IaC enables teams to write and execute code that defines, updates, and destroys infrastructure. It presents a model that allows teams to treat all aspects of their architecture as software, including the hardware. Terraform is an open source, cloud-agnostic IaC provisioning tool that’s used by companies like Uber, Slack, Udemy, and Twitch.

    Both Terraform and Kubernetes support a declarative system of operation. A declarative model enables you to simply define or state the desired outcome of a system in order for it to be produced.

    The alternative approach is a procedural system that requires you to define all the intermediary steps to produce the desired outcome. An example of the latter would be bash scripting. This means both Terraform and Kubernetes automate the process of producing the desired state outcome based on what you declare and abstract away the steps in between.
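    As an illustration of the declarative model, the following minimal Terraform sketch states only the desired end result; the steps to get there are Terraform's job. The bucket name is a placeholder:

    ```hcl
    # Declarative: describe the desired end state, not the steps.
    # Terraform determines whether to create, update, or leave the bucket alone.
    resource "aws_s3_bucket" "example" {
      bucket = "my-example-bucket" # placeholder name
    }

    # A procedural bash script, by contrast, would have to check whether
    # the bucket already exists, create it if not, and handle errors itself.
    ```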

    Using Terraform, you can automate the process of provisioning your Kubernetes cluster in a cloud environment. Some benefits of this approach include having a reliable and repeatable workflow once you have a stable version of your Terraform source code. In addition to this, the IaC is a self-documenting representation of the live state of your cluster. Lastly, it also supports having immutable infrastructure, which reduces the risks of configuration drift in your live environments.

    This article covers ten essentials that you’ll need to get started with Kubernetes and Terraform. The list starts with general prerequisites and works through tools that support best practices in operating at scale. Both Terraform and Kubernetes have paradigms that support optimal implementations, and this article combines a few distinct models from each domain into a complementary workflow.

    1. Cloud Provider Accounts

    Terraform is a tool that integrates with your cloud provider account in order to provision the resources that you declare. First, you should create an account with the cloud provider with which you intend to provision your Kubernetes cluster. Most cloud providers offer a free tier for a limited amount of time or complimentary credits.

    Once you’ve created the account, you should download the cloud provider's CLI tool and configure it with your existing cloud profile. Below are examples of how to accomplish this with the Google Cloud Platform (GCP) and Microsoft Azure.

    Authenticate to GCP Using the CLI

    If you’re using Terraform on your workstation, you’ll need to install the Google Cloud SDK and authenticate using User Application Default Credentials by running the following command:

    gcloud auth application-default login
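    With those credentials in place, the Google provider needs little more than a project and region. A minimal sketch, with placeholder values:

    ```hcl
    # Minimal Google provider configuration. Terraform reuses the Application
    # Default Credentials created by "gcloud auth application-default login".
    provider "google" {
      project = "my-gcp-project-id" # placeholder project ID
      region  = "europe-west1"      # placeholder region
    }
    ```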

    Authenticate to Azure Using the CLI

    After you install the Azure CLI, you can use the CLI to authenticate to your account. Run the following command and your default browser will open for you to sign in to Azure with a Microsoft account:

    az login
    

    You can then view account details with the following command:

    az account list
    

    This will output something similar to the following:

    [
      {
        "cloudName": "AzureCloud",
        "id": "00000000-0000-0000-0000-000000000000",
        "isDefault": true,
        "name": "PAYG Subscription",
        "state": "Enabled",
        "tenantId": "00000000-0000-0000-0000-000000000000",
        "user": {
          "name": "user@example.com",
          "type": "user"
        }
      }
    ]
    

    The id field is your Subscription ID, a GUID that uniquely identifies your subscription to Azure services. If you have multiple subscriptions, you can set the one you want to use with the following command:

    az account set --subscription="SUBSCRIPTION_ID"
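    Once the subscription is set, the AzureRM provider picks up credentials from your CLI session; you can also pin the subscription explicitly. A minimal sketch, with a placeholder value:

    ```hcl
    # Minimal AzureRM provider configuration. Terraform reuses the
    # credentials from your "az login" session.
    provider "azurerm" {
      features {}

      # Optional: pin the subscription explicitly (placeholder value).
      subscription_id = "00000000-0000-0000-0000-000000000000"
    }
    ```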
    

    2. The Terraform CLI Tool

    To use Terraform to provision infrastructure in the cloud, you need to download and install the CLI tool. The website offers options for downloading Terraform on different operating systems. Once you've completed the process, you can run terraform --version to ensure that the installation worked as expected.

    3. The Kubectl CLI Tool

    You’ll also need the official CLI tool, kubectl, to communicate with your upstream Kubernetes cluster in the cloud. Your cluster connection details are stored in the kubeconfig file (typically ~/.kube/config); cloud provider commands such as aws eks update-kubeconfig add a context for your cluster to this file. When deploying Kubernetes resource manifests generated by Terraform, you’ll need to specify the kubeconfig file path in your Kubernetes provider.

    4. A Remote State Management Strategy

    Terraform stores the state of your live infrastructure in a JSON configuration file. Whenever you execute Terraform commands, the CLI tool will use this configuration file as a source of truth or representation of the live state of the cluster before processing any new executions. By default, this JSON configuration file is generated and stored in the local directory of the Terraform source code that has been used to initialize the project.

    For individual projects, this method works well. However, if you’re working in a team, configuring a central cloud storage location for this state file is a more optimal strategy. A common implementation of this is storing the state file in an AWS S3 bucket and specifying the details of the bucket in your Terraform source code. An example of this can be seen below:

    First, create an S3 bucket for the state backend:

    aws s3api create-bucket --bucket <bucket-name> --region <aws-region> --create-bucket-configuration LocationConstraint=<aws-region>

    Then create a file with the state backend configuration:

    terraform {
      backend "s3" {
        bucket  = "globally-unique-bucket-name"
        key     = "globally-unique-bucket-name/terraform.tfstate"
        region  = "aws-region"
        encrypt = true
      }
    }

    5. State Management Locking

    State locking is a measure used to safeguard against conflicting changes to the state of your infrastructure when multiple team members run a terraform apply command at the same time. In this approach, the state will be locked and only permit state changes from the first user to run the execution command. State locking is typically implemented with the use of an Amazon DynamoDB table. DynamoDB is a managed key-value store that supports strongly consistent reads and conditional writes.

    To use DynamoDB for locking with Terraform, you have to create a DynamoDB table with a primary key called LockID. An example implementation can be seen below:

    First, create the DynamoDB table:

    aws dynamodb create-table --table-name <table-name> --attribute-definitions AttributeName=LockID,AttributeType=S --key-schema AttributeName=LockID,KeyType=HASH --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1

    Then specify the DynamoDB table in the backend state configuration:

    terraform {
      backend "s3" {
        bucket         = "globally-unique-bucket-name"
        key            = "globally-unique-bucket-name/terraform.tfstate"
        dynamodb_table = "dynamodb-table-name"
        region         = "aws-region"
        encrypt        = true
      }
    }

    6. The Cloud Provider API for Terraform

    To provision resources in a public cloud environment, you need to specify the API for the relevant cloud provider. In addition to this, you can optionally configure the account profile that Terraform should use. Alternatively, Terraform will use the default profile configured with your CLI tool. An example of how to execute this with an AWS account is demonstrated below:

    provider "aws" {
      region = "aws-region"
      profile = "aws-profile"
    }
    
    terraform {
      required_providers {
        aws = {
          source = "hashicorp/aws"
          version = "~> 3.0"
        }
      }
    
      backend "s3" {
        ...
      }
    }
    

    Once you’ve done this, you need to declare the relevant cloud resources to be created for your desired hosted cluster, whether it's Amazon EKS, Azure AKS, or GCP GKE.

    The resources you declare for the creation of the cluster will depend on the cloud provider's Terraform API. Some cloud providers require more resources to be specified in comparison to others. Below is an example of some resource definitions for the creation of an Amazon EKS cluster:

    resource "aws_eks_cluster" "main" {
      name     = var.eks_cluster_name
      role_arn = aws_iam_role.eks_cluster.arn
    
      vpc_config {
        security_group_ids      = [aws_security_group.eks_cluster.id, aws_security_group.eks_nodes.id]
        endpoint_private_access = var.endpoint_private_access
        endpoint_public_access  = var.endpoint_public_access
        subnet_ids = var.eks_cluster_subnet_ids
      }
    
      depends_on = [
        aws_iam_role_policy_attachment.aws_eks_cluster_policy
      ]
    }
    
    resource "aws_eks_node_group" "main" {
      cluster_name    = aws_eks_cluster.main.name
      node_group_name = var.node_group_name
      node_role_arn   = aws_iam_role.eks_nodes.arn
      subnet_ids      = var.private_subnet_ids
    
      ami_type       = var.ami_type
      disk_size      = var.disk_size
      instance_types = var.instance_types
    
      scaling_config {
        desired_size = var.pvt_desired_size
        max_size     = var.pvt_max_size
        min_size     = var.pvt_min_size
      }
    
      tags = {
        Name = var.node_group_name
      }
    
      depends_on = [
        aws_iam_role_policy_attachment.aws_eks_worker_node_policy,
        aws_iam_role_policy_attachment.aws_eks_cni_policy,
        aws_iam_role_policy_attachment.ec2_read_only,
      ]
    }
    

    The complete Terraform source code for the above snippets can be found in this GitHub repository.

    7. The Kubernetes Provider

    Cluster provisioning is just the first step in your Kubernetes lifecycle management with Terraform. The next step is deploying the Kubernetes resources that represent your cluster's container workloads and configurations.

    To do this with Terraform, you can make use of the Kubernetes provider. This will allow you to automatically generate Kubernetes resources using Terraform source code. In order for this to work, you need to ensure that your Terraform project is configured with your Kubernetes cluster configuration:

    provider "kubernetes" {
      config_path    = "~/.kube/config"
      config_context = "arn:aws:eks:eu-west-1:<aws-account-id>:cluster/my-cluster"
    }
    
    resource "kubernetes_namespace" "dev" {
      metadata {
        name = "my-dev-namespace"
      }
    }
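    Beyond namespaces, the same provider can manage workloads directly. Below is a sketch of a Deployment managed through the Kubernetes provider; the application name and image are placeholders:

    ```hcl
    # A minimal Deployment managed by Terraform (placeholder names/image).
    resource "kubernetes_deployment" "example" {
      metadata {
        name      = "example-app"
        namespace = "my-dev-namespace"
      }

      spec {
        replicas = 2

        selector {
          match_labels = { app = "example-app" }
        }

        template {
          metadata {
            labels = { app = "example-app" }
          }

          spec {
            container {
              name  = "example-app"
              image = "nginx:1.25" # placeholder image
            }
          }
        }
      }
    }
    ```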
    

    8. The Helm Provider

    Deploying raw manifests to a Kubernetes cluster generally works well at a small scale. However, as you start to manage a number of different resources (Deployments, Services, ConfigMaps, Secrets, Jobs, ServiceAccounts, Roles, RoleBindings, etc.), this becomes especially challenging for application installations, management, and versioning.

    A common practice is to make use of Helm, which is a package manager for Kubernetes resources that simplifies the process of managing, installing, and versioning your resource manifests at scale. The Terraform Helm Provider enables you to generate and execute Helm Chart releases to your cluster:

    provider "helm" {
      kubernetes {
        host                   = data.aws_eks_cluster.cluster.endpoint
        cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
        exec {
          api_version = "client.authentication.k8s.io/v1beta1"
          args        = ["eks", "get-token", "--cluster-name", data.aws_eks_cluster.cluster.name]
          command     = "aws"
        }
      }
    }
    
    resource "helm_release" "application_name" {
      name       = "application-name"
      repository = "https://my-application-charts"
      chart      = "application-chart"
    
      values = [
        file("${path.module}/values.yaml")
      ]
    }
    

    9. The Kustomize Provider

    When managing releases to your Kubernetes clusters, either with Helm or raw manifests, you'll be required to make modifications to these files. This may be relatively straightforward for a local cluster. However, when you're trying to adopt an optimal and scalable approach to building and delivering your resources, you’ll need a tool that helps you automatically manage these modifications and configurations.

    Kustomize is a configuration management tool that you can use for resource templating. For example, if you're building and pushing a new container image in your CI/CD pipeline, you can make use of Kustomize to update the relevant YAML resources to reflect that change in your pipeline process.

    Kustomize has a Terraform provider that you can use for your templating and configuration management purposes. Similar to the Helm and Kubernetes providers, you would have to specify the path to your kube config file and optionally set the cluster context.
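    As an illustration, the community Kustomization provider (kbst/kustomization) follows this pattern. The sketch below assumes a Kustomize overlay at the given path; the path and resource names are placeholders:

    ```hcl
    provider "kustomization" {
      kubeconfig_path = "~/.kube/config"
    }

    # Build the manifests from a Kustomize overlay (placeholder path).
    data "kustomization_build" "app" {
      path = "kustomize/overlays/dev"
    }

    # Apply each generated manifest as a resource tracked in Terraform state.
    resource "kustomization_resource" "app" {
      for_each = data.kustomization_build.app.ids
      manifest = data.kustomization_build.app.manifests[each.value]
    }
    ```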

    10. The Cluster Management Tool

    Lastly, another important aspect of the Kubernetes cluster management lifecycle is having a tool that will simplify the process of managing your cluster, as well as offer in-depth visibility. For such a task, you can make use of Rancher, which is an open source Kubernetes cluster management platform. Rancher also has a Terraform provider that you can use. It has a sleek UI with a number of rich features that support the management of your downstream Kubernetes clusters.
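    For instance, the rancher2 provider lets you manage Rancher resources from the same Terraform workflow. A minimal sketch, where the API URL, token variable, and cluster ID are all placeholders:

    ```hcl
    provider "rancher2" {
      api_url   = "https://rancher.example.com/v3" # placeholder Rancher URL
      token_key = var.rancher_api_token            # placeholder variable
    }

    # Example: create a Rancher project in an existing downstream cluster.
    resource "rancher2_project" "dev" {
      cluster_id = "c-xxxxx" # placeholder cluster ID
      name       = "dev-project"
    }
    ```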

    Conclusion

    In this article, you learned how to use Terraform to manage your Kubernetes clusters, from the underlying infrastructure to the containerized workloads. As highlighted above, the lifecycle of managing, configuring, and optimizing Kubernetes clusters is complex and arduous. Following a manual approach to managing the infrastructure is a recipe for misconfiguration and inefficient developer workflows.

    Using an IaC tool like Terraform brings several benefits into the picture by treating all aspects of the infrastructure as software. This article covered ten essential elements that will help you successfully adopt a model of managing Kubernetes with Terraform.

