Aside from multi-tenancy, one of the most popular benefits of vCluster is cost savings; as you can imagine, running several individual clusters for multiple teams can quickly rack up costs. We have covered cost optimization in previous articles, but you might still wonder, “How much would I actually save?”
In this article, we will use OpenCost to quantify and visualize the cost savings from using vCluster.
What is OpenCost?
OpenCost is an open-source tool for measuring and allocating cloud infrastructure and container costs in real time. What really sets OpenCost apart is its vendor neutrality and its ability to calculate the actual cost of your workloads, including workloads running inside a virtual cluster.
Virtual Clusters: Introducing vCluster
If you are unfamiliar with it, vCluster is an open-source tool for creating and managing virtual Kubernetes clusters. Virtual Kubernetes clusters provide true isolation and allow teams to run their workloads in a separate environment while sharing underlying resources.
Beyond multi-tenancy, many teams turn to vCluster for cost savings. Being able to visualize costs in real time helps you understand where your expenses come from, which in turn supports capacity planning and cost optimization efforts.
Prerequisites
This article assumes some familiarity with AWS as well as Kubernetes. Additionally, you will need the following:
- kubectl installed.
- The AWS CLI installed and configured.
- eksctl installed and configured.
- The vCluster CLI installed.
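Before proceeding, you can quickly confirm each tool is installed and on your PATH (the exact output will vary by version):
# Each command prints the installed version of the corresponding CLI
kubectl version --client
aws --version
eksctl version
vcluster version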
Creating an EKS Cluster
Begin by creating a new EKS cluster that will house your OpenCost deployment as well as your virtual clusters. Run the following commands to create it:
export cluster_name=vCluster-eks-demo
export region=us-east-1
eksctl create cluster --name $cluster_name --region $region
Output is similar to:
2024-10-03 07:37:03 [ℹ] eksctl version 0.189.0-dev+c9afc4260.2024-08-19T12:43:03Z
2024-10-03 07:37:03 [ℹ] using region us-east-1
2024-10-03 07:37:05 [ℹ] setting availability zones to [us-east-1f us-east-1a]
2024-10-03 07:37:05 [ℹ] subnets for us-east-1f - public:192.168.0.0/19 private:192.168.64.0/19
2024-10-03 07:37:05 [ℹ] subnets for us-east-1a - public:192.168.32.0/19 private:192.168.96.0/19
2024-10-03 07:37:05 [ℹ] nodegroup "ng-4866cef6" will use "" [AmazonLinux2/1.30]
2024-10-03 07:37:05 [ℹ] using Kubernetes version 1.30
2024-10-03 07:37:05 [ℹ] creating EKS cluster "vCluster-eks-demo" in "us-east-1" region with managed nodes
2024-10-03 07:37:05 [ℹ] will create 2 separate CloudFormation stacks for cluster itself and the initial managed nodegroup
2024-10-03 07:37:05 [ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-east-1 --cluster=vCluster-eks-demo'
2024-10-03 07:37:05 [ℹ] Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "vCluster-eks-demo" in "us-east-1"
2024-10-03 07:37:05 [ℹ] CloudWatch logging will not be enabled for cluster "vCluster-eks-demo" in "us-east-1"
2024-10-03 07:37:05 [ℹ] you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=us-east-1 --cluster=vCluster-eks-demo'
2024-10-03 07:37:05 [ℹ] default addons vpc-cni, kube-proxy, coredns were not specified, will install them as EKS addons
2024-10-03 07:37:05 [ℹ]
2 sequential tasks: { create cluster control plane "vCluster-eks-demo",
    2 sequential sub-tasks: {
        2 sequential sub-tasks: {
            1 task: { create addons },
            wait for control plane to become ready,
        },
        create managed nodegroup "ng-4866cef6",
    }
}
2024-10-03 07:37:05 [ℹ] building cluster stack "eksctl-vCluster-eks-demo-cluster"
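Cluster creation typically takes 15 to 20 minutes. Once eksctl finishes, confirm the worker nodes joined the cluster (eksctl updates your kubeconfig automatically):
# Both nodes should report a Ready status
kubectl get nodes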
Installing Prometheus
OpenCost depends on Prometheus to collect the metrics used to calculate the cost of your workloads, so we will install Prometheus first. Run the following command to install it using Helm:
helm install prometheus --repo https://prometheus-community.github.io/helm-charts prometheus --create-namespace \
  --namespace prometheus-system \
  --set prometheus-pushgateway.enabled=false \
  --set alertmanager.enabled=false \
  --set server.persistentVolume.enabled=false \
  -f https://raw.githubusercontent.com/opencost/opencost/develop/kubernetes/prometheus/extraScrapeConfigs.yaml
In the command above, we install Prometheus using the official chart and add the extra scrape configs OpenCost needs.
Output is similar to:
NAME: prometheus
LAST DEPLOYED: Thu Oct 2 07:37:50 2024
NAMESPACE: prometheus-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The Prometheus server can be accessed via port 80 on the following DNS name from within your cluster:
prometheus-server.prometheus-system.svc.cluster.local
Get the Prometheus server URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace prometheus-system -l "app.kubernetes.io/name=prometheus,app.kubernetes.io/instance=prometheus" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace prometheus-system port-forward $POD_NAME 9090
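Before moving on, verify the Prometheus server pod is running, since OpenCost cannot compute costs without its metrics:
kubectl get pods --namespace prometheus-system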
Installing OpenCost
With Prometheus deployed, we can install OpenCost. Run the following command:
helm install opencost --repo https://opencost.github.io/opencost-helm-chart opencost --create-namespace --namespace opencost
This deploys OpenCost using Helm and creates the required namespace.
Output is similar to:
NAME: opencost
LAST DEPLOYED: Thu Oct 2 07:39:37 2024
NAMESPACE: opencost
STATUS: deployed
REVISION: 1
TEST SUITE: None
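You can verify the OpenCost pod came up cleanly with:
kubectl get pods --namespace opencost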
Now that we have OpenCost installed, let's simulate a real-world scenario. Imagine our organization has two teams: the Payments team and the Security team. Both teams require new clusters for their specific needs.
For the Security team:
vcluster create security
Output is similar to:
07:51:05 info Creating namespace vcluster-security
07:51:06 info Create vcluster security...
07:51:06 info execute command: helm upgrade security /var/folders/gw/gd4m32rs5k5chjbf761w3zrw0000gn/T/vcluster-0.20.1.tgz-1247927701 --create-namespace --kubeconfig /var/folders/gw/gd4m32rs5k5chjbf761w3zrw0000gn/T/2781682868 --namespace vcluster-security --install --repository-config='' --values /var/folders/gw/gd4m32rs5k5chjbf761w3zrw0000gn/T/3242186727
07:51:20 done Successfully created virtual cluster security in namespace vcluster-security
For the Payments team:
vcluster create payments --namespace billing
Output is similar to:
07:46:20 info Creating namespace billing
07:46:22 info Create vcluster payments...
07:46:22 info execute command: helm upgrade payments /var/folders/gw/gd4m32rs5k5chjbf761w3zrw0000gn/T/vcluster-0.20.0.tgz-10175546 --create-namespace --kubeconfig /var/folders/gw/gd4m32rs5k5chjbf761w3zrw0000gn/T/35315025 --namespace billing --install --repository-config='' --values /var/folders/gw/gd4m32rs5k5chjbf761w3zrw0000gn/T/1381834732
07:46:36 done Successfully created virtual cluster payments in namespace billing
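Both virtual clusters should now be visible from the host cluster:
# Lists every vcluster along with its namespace and status
vcluster list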
Now, let's say the Payments team wants to test a new billing service. (We'll use Google's microservices demo to simulate this scenario.) First, connect to the Payments vCluster and deploy the microservices:
vcluster connect payments
Output is similar to:
08:05:07 done vCluster is up and running
08:05:08 info Starting background proxy container...
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/microservices-demo/refs/heads/main/release/kubernetes-manifests.yaml
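With the manifests applied, you can watch the demo services come up inside the virtual cluster (it may take a few minutes for every pod to reach the Running state):
kubectl get pods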
For the Security team, let's say they want to test some new Kubernetes network policies. To generate data for their tests, we'll deploy a stress pod that consumes a predictable amount of resources:
vcluster connect security
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: stress-pod
  labels:
    app: test
    app.kubernetes.io/name: test-pod
spec:
  containers:
  - name: stress
    image: docker.io/polinux/stress-ng:latest
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: "50m"
        memory: "50Mi"
    command: ["stress-ng"]
    args: ["--cpu", "1", "--vm", "1", "--vm-bytes", "250M", "--timeout", "7200"]
EOF
Applying the manifest above deploys the stress-ng pod into your cluster. The args specify how much CPU and memory the application should consume, and the --timeout flag controls how long it runs (7200 seconds, or two hours, here).
Output is similar to:
pod/stress-pod created
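You can confirm the pod is up before moving on:
kubectl get pod stress-pod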
At this point, disconnect from the virtual cluster so kubectl targets the host cluster again, then expose the OpenCost UI:
vcluster disconnect
kubectl port-forward --namespace opencost service/opencost 9003 9090
Output is similar to:
Forwarding from 127.0.0.1:9003 -> 9003
Forwarding from [::1]:9003 -> 9003
Forwarding from 127.0.0.1:9090 -> 9090
Forwarding from [::1]:9090 -> 9090
In a few seconds, the UI will be available at http://localhost:9090.
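Alongside the UI, OpenCost serves an allocation API on the forwarded port 9003, which is handy for scripting. As a rough sketch (the window and aggregate parameters below follow the OpenCost API documentation; adjust them for your deployment), you can pull per-namespace costs with curl:
# Cost allocation for the last 7 days, aggregated by namespace
curl -s 'http://localhost:9003/allocation/compute?window=7d&aggregate=namespace'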
Doing the Math
OpenCost makes calculating the cost of your workloads intuitive; the formula to keep in mind here is:
Daily Cost = Total Allocated Cost / Number of Days in the Window
To generate meaningful data for this post, the workloads were left running for a while before taking these readings. Let's walk through the numbers.
By default, OpenCost shows cost allocation by namespace for the last seven days. Looking at the data from our billing team's workloads, you can see that over the past week:
- The billing namespace has consumed $0.50 worth of CPU and $0.06 worth of RAM.
- The total cost for the billing namespace is $0.57 (the per-resource figures above are rounded), which represents 11.8% of the total cluster cost.
Let's plug these numbers into our formula to understand the daily cost:
$0.57 total / 7 days ≈ $0.08 per day
With a good idea of the daily cost, we can estimate that this billing microservice will cost roughly $2.40 ($0.08 * 30) after thirty days. It's important to reiterate that we are just calculating this ourselves because we don't have that much data. OpenCost will show you this information directly as it accumulates more data over time.
You might have also noticed a significant portion of the costs attributed to an "idle" namespace. This represents $1.53 or 32.0% of the total costs, which is a substantial amount. The idle namespace in OpenCost refers to resources that have been allocated to the cluster but are not currently being used by any workloads.
In our case, $1.37 of CPU and $0.16 of RAM are idle.
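If you want to inspect idle costs programmatically, the same allocation endpoint accepts an includeIdle parameter (again, treat the exact parameters as an assumption to verify against your OpenCost version's API docs):
# Include the idle allocation in the response
curl -s 'http://localhost:9003/allocation/compute?window=7d&aggregate=namespace&includeIdle=true'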
Counting the Cost
Well, we've done the math, but just how much are you saving?
Cost with vCluster
Assuming our two-node EKS cluster costs about $150 per month (a conservative estimate for small to medium instances), we can calculate our total costs:
- Base cluster cost: $150/month
- Billing workload cost: $2.40/month (as calculated earlier)
Total cost with vClusters: $152.40/month ($150 base + $2.40 workload)
Cost without vCluster
If we consider the alternative where we spin up separate clusters for the billing and security teams:
- Billing team cluster + workload costs: $152.40/month
- Security team cluster: $150/month
- Main cluster: $150/month
Total cost without vCluster: $452.40/month, roughly $300/month more than the shared setup.
While the demo in this article is a bit contrived, it's not hard to see how teams with anywhere from a few to hundreds of production applications can quickly visualize which services cost the most, identify where optimizations can be made, and right-size workloads appropriately.
Conclusion
Visualizing workload costs is essential for teams to understand the financial impact of their applications. In this post, we looked at how we can use OpenCost to visualize cost savings from using vCluster. Beyond providing strong isolation, vCluster allows tenants to access a full Kubernetes cluster without having to provision or manage a separate cluster, which can lead to huge cost savings, as you have just seen.
Looking to learn more about multi-tenancy? Check out these blogs: