At Solo.io, we listen to the community and try out the best technologies to help teams meet their goals. This includes both working on open source projects and providing support and products that help you make better use of those technologies.
Gloo Mesh is one of those products. It is a good example of how to keep the complexity of managing application networking across your whole infrastructure to a minimum. In practice, this means multi-cluster architectures.
In such scenarios, how can you verify that a multi-cluster configuration is correct in a local environment before moving to a more extensive environment?
Let’s put it in context. Your team (or the development team) wants to release a new feature. They want to cause some chaos in the system. Gloo Mesh offers this functionality and many others through policies (failover, fault injection, outlier detection, retries, timeouts, mirroring, rate limiting, and more). But you, as an operator of the platform and Gloo Mesh, may not be sure which configuration is correct. You need to investigate first in a development or testing environment.
In a simulated production scenario that uses three clusters (one for management and two for workloads), the first concern is obvious: Cost. Deploying three clusters in public clouds is expensive.
The second concern: networking. Let’s say you decide to investigate first in your local environment. Deploying three entire clusters on your own workstation is not easy. You can opt for solutions like multiple kind (Kubernetes-in-Docker) or k3d clusters. Both deploy clusters in containers on top of the host machine. One cluster, one container. If you try one of these approaches, you will probably have to tweak the network between the containers and the host machine.
The third concern: CPU. To deploy things in your own local environment, you need to make sure you have enough “muscle”.
Now… What if we start considering “A cluster within a cluster”?
vcluster
I hope you saw the iconic movie Inception. I enjoyed it a lot and I watch it again from time to time. The idea was pretty catchy: “A dream within a dream”.
Virtualization technology follows the same idea. If you are familiar with Docker, years ago there was the need for docker-in-docker. Nowadays it is a very common approach in CI/CD pipelines. Say for example that tasks are running in a container but you need to test an application already embedded in another container. This would be a use case of docker-in-docker.
Given that idea, what stops us from trying cluster-in-cluster? This is where vcluster comes in. vcluster allows you to create and manage virtual Kubernetes clusters. A virtual cluster is basically a control plane that runs in a namespace on a shared host cluster. Here is a visualization:
In the picture we can see that Gloo Mesh, which before required three clusters to simulate a production-ready environment, now just needs one cluster with three virtual clusters.
Quick benefits:
- Cost effective: Now, your cost is only one cluster. It is true that it needs to be bigger than before, but you’re saving money by deploying one cluster instead of three.
- Time-saving: when you work in your local environment, you do not want to spend time creating new clusters. If you use kind, it can take several minutes to get three new clusters. With vcluster, you can get your three new clusters in about 20 seconds.
Let’s prove all this in a workshop.
Hands on!
In this workshop, in a matter of seconds, you will deploy Istio in the two workload clusters, a demo application to use in your labs, and Gloo Mesh to test the application networking capabilities (multi-cluster traffic, traffic splitting, fault injection, etc.). All this is based on just one host Kubernetes cluster containing three virtual clusters.
Your architecture will look like this:
Prerequisites
- A Kubernetes cluster which will be the host cluster (kind, k3s, k0s, etc.)
- vcluster CLI. This has been tested with version 0.10.2
- Helm v3
- kubectl
- meshctl
Getting Started
Let’s check on how long it takes you to deploy everything. The test was made using a virtual machine with only three CPUs. Therefore, you will also deploy components with minimum resources.
Start by setting up some environment variables:
# Context name for the host cluster
export MAIN_CONTEXT=$(kubectl config current-context)
# Context names for the gloo mesh clusters (vclusters)
export MGMT_CLUSTER=devmgmt
export CLUSTER_1=devcluster1
export CLUSTER_2=devcluster2
Install environments
First, let’s create the management cluster:
cat << EOF > vcluster-values.yaml
isolation:
  enabled: false
  limitRange:
    enabled: false
  podSecurityStandard: privileged
  resourceQuota:
    enabled: false
rbac:
  clusterRole:
    create: true
syncer:
  resources:
    limits:
      cpu: 100m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
  extraArgs:
  - --fake-nodes=false
  - --sync-all-nodes
vcluster:
  resources:
    limits:
      cpu: 200m
      memory: 2Gi
    requests:
      cpu: 100m
      memory: 256Mi
  extraArgs:
  - --kubelet-arg=allowed-unsafe-sysctls=net.ipv4.*
  - --kube-apiserver-arg=feature-gates=EphemeralContainers=true
  - --kube-scheduler-arg=feature-gates=EphemeralContainers=true
  - --kubelet-arg=feature-gates=EphemeralContainers=true
  image: rancher/k3s:v1.22.5-k3s1
EOF
vcluster create $MGMT_CLUSTER -n $MGMT_CLUSTER --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT
vcluster connect $MGMT_CLUSTER -n $MGMT_CLUSTER --kube-config-context-name $MGMT_CLUSTER --update-current --context $MAIN_CONTEXT
kubectl --context $MGMT_CLUSTER get namespaces
Next, the workload cluster 1:
vcluster create $CLUSTER_1 -n $CLUSTER_1 --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT
vcluster connect $CLUSTER_1 -n $CLUSTER_1 --kube-config-context-name $CLUSTER_1 --update-current --context $MAIN_CONTEXT
kubectl --context $CLUSTER_1 get namespaces
And finally, the workload cluster 2:
vcluster create $CLUSTER_2 -n $CLUSTER_2 --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT
vcluster connect $CLUSTER_2 -n $CLUSTER_2 --kube-config-context-name $CLUSTER_2 --update-current --context $MAIN_CONTEXT
kubectl --context $CLUSTER_2 get namespaces
This is it! Three clusters in around 20 seconds. If you’re interested to know more, at the end of this post, you can find a more in-depth explanation of what you have deployed with vcluster and some tips to remember.
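The three create/connect sequences above differ only in the cluster name, so they can be collapsed into a loop. The following is a minimal sketch under that assumption. The helper names run, create_vcluster and create_all_vclusters and the DRY_RUN variable are introduced here for illustration and are not part of the vcluster CLI. By default the sketch only prints the commands, so you can review them before running anything.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the vcluster CLI.
# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# create_vcluster NAME
# Same create/connect/check sequence as above; the namespace and the
# kubeconfig context are both named after the vcluster.
create_vcluster() {
  local name="$1"
  run vcluster create "$name" -n "$name" --upgrade --connect=false --expose \
    -f vcluster-values.yaml --context "$MAIN_CONTEXT"
  run vcluster connect "$name" -n "$name" --kube-config-context-name "$name" \
    --update-current --context "$MAIN_CONTEXT"
  run kubectl --context "$name" get namespaces
}

# Creates the management cluster first, then the two workload clusters.
create_all_vclusters() {
  local name
  for name in "$MGMT_CLUSTER" "$CLUSTER_1" "$CLUSTER_2"; do
    create_vcluster "$name"
  done
}
```

With the environment variables from the Getting Started section exported, calling create_all_vclusters prints the nine commands, and DRY_RUN=0 create_all_vclusters executes them.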
Now it is time to install Gloo Mesh and register the workload clusters. Istio will be deployed in the workload clusters right after that.
Install Gloo Mesh
You will need a license key:
export GLOO_MESH_LICENSE_KEY=<license_key>
And you need to define the Gloo Mesh version:
export GLOO_MESH_VERSION=2.0.9
Gloo Mesh can be installed through Helm charts. However, to not overflow this post with code, you will use the meshctl CLI:
meshctl install --kubecontext $MGMT_CLUSTER --license $GLOO_MESH_LICENSE_KEY --version $GLOO_MESH_VERSION
Verify all pods are running:
kubectl get pods -n gloo-mesh --context $MGMT_CLUSTER
And you will see something like:
NAME READY STATUS RESTARTS AGE
gloo-mesh-mgmt-server-778d45c7b5-5d9nh 1/1 Running 0 41s
gloo-mesh-redis-844dc4f9-jnb4j 1/1 Running 0 41s
gloo-mesh-ui-749dc7875c-4z77k 3/3 Running 0 41s
prometheus-server-86854b778-r6r52 2/2 Running 0 41s
Register workload clusters
Gloo Mesh relies on an agent-based approach. Therefore, when registering a workload cluster, you will need to tell the agent how to communicate with the management server.
Note that in EKS the load balancer does not report an IP but a hostname. If you're using EKS, adjust the jsonpath in the following commands accordingly (hostname instead of ip).
MGMT_SERVER_NETWORKING_DOMAIN=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CLUSTER -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
MGMT_SERVER_NETWORKING_PORT=$(kubectl -n gloo-mesh get service gloo-mesh-mgmt-server --context $MGMT_CLUSTER -o jsonpath='{.spec.ports[?(@.name=="grpc")].port}')
MGMT_SERVER_NETWORKING_ADDRESS=${MGMT_SERVER_NETWORKING_DOMAIN}:${MGMT_SERVER_NETWORKING_PORT}
echo $MGMT_SERVER_NETWORKING_ADDRESS
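If you do not know in advance whether the load balancer will report an IP or a hostname, one option is to read both fields and use whichever is set. The sketch below illustrates that idea. The function name lb_address is hypothetical and introduced only for this example, and the kubectl wiring is shown as comments because it needs a running cluster.

```shell
#!/usr/bin/env bash
# lb_address IP HOSTNAME PORT
# Hypothetical helper: prints "<ip-or-hostname>:<port>", preferring the IP
# when both are set, and fails if the load balancer has no address yet.
lb_address() {
  local host="${1:-$2}"
  if [ -z "$host" ]; then
    echo "load balancer has no external address yet" >&2
    return 1
  fi
  printf '%s:%s\n' "$host" "$3"
}

# Example wiring (requires kubectl and the contexts defined earlier):
#   ip=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CLUSTER \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
#   hostname=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CLUSTER \
#     -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
#   MGMT_SERVER_NETWORKING_ADDRESS=$(lb_address "$ip" "$hostname" "$MGMT_SERVER_NETWORKING_PORT")
```

Failing early on an empty address avoids registering an agent against a relay address such as ":9900", which is easy to miss in the echo output.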
Register the workload cluster. This will deploy the agent as well:
meshctl cluster register \
--remote-context=$CLUSTER_1 \
--relay-server-address $MGMT_SERVER_NETWORKING_ADDRESS \
--kubecontext $MGMT_CLUSTER \
$CLUSTER_1
And you will see:
Registering cluster
📃 Copying root CA relay-root-tls-secret.gloo-mesh to remote cluster from management cluster
📃 Copying bootstrap token relay-identity-token-secret.gloo-mesh to remote cluster from management cluster
💻 Installing relay agent in the remote cluster
Finished installing chart 'gloo-mesh-agent' as release gloo-mesh:gloo-mesh-agent
📃 Creating remote.cluster KubernetesCluster CRD in management cluster
⌚ Waiting for relay agent to have a client certificate
Checking...
Checking...
🗑 Removing bootstrap token
✅ Done registering cluster!
Register second workload cluster:
meshctl cluster register \
--remote-context=$CLUSTER_2 \
--relay-server-address $MGMT_SERVER_NETWORKING_ADDRESS \
--kubecontext $MGMT_CLUSTER \
$CLUSTER_2
Check that the resource is created in the management cluster:
kubectl get kubernetescluster -n gloo-mesh --context $MGMT_CLUSTER
And you will see:
NAME AGE
devcluster1 27s
devcluster2 23s
Install Istio
Istio by default requires some resources. In your local environment, you might not have the resources to deploy three fully functional clusters and two Istio service meshes. Therefore, we need to reduce the resources Istio requests. That’s fine as this is just a development environment.
NOTE: This post is using Istio v1.12.6:
export ISTIO_VERSION=1.12.6
Install Istio’s CRDs:
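# Add the Istio Helm repository (needed once for the istio/* charts used below)
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
# Install Istio CRDS cluster1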
helm upgrade --install istio-base istio/base \
-n istio-system \
--version $ISTIO_VERSION \
--kube-context $CLUSTER_1 \
--create-namespace
# Install Istio CRDS cluster2
helm upgrade --install istio-base istio/base \
-n istio-system \
--version $ISTIO_VERSION \
--kube-context $CLUSTER_2 \
--create-namespace
Install Istiod:
cat << EOF > istiod-common-values.yaml
meshConfig:
  accessLogFile: /dev/stdout
  defaultConfig:
    holdApplicationUntilProxyStarts: true
    envoyMetricsService:
      address: gloo-mesh-agent.gloo-mesh:9977
    envoyAccessLogService:
      address: gloo-mesh-agent.gloo-mesh:9977
    proxyMetadata:
      ISTIO_META_DNS_CAPTURE: "true"
      ISTIO_META_DNS_AUTO_ALLOCATE: "true"
pilot:
  autoscaleEnabled: false
  replicaCount: 1
  env:
    PILOT_SKIP_VALIDATE_TRUST_DOMAIN: "true"
  resources:
    requests:
      cpu: 10m
      memory: 2048Mi
    limits:
      cpu: 10m
      memory: 2048Mi
EOF
# Install istiod cluster1
helm upgrade --install istiod istio/istiod \
-f istiod-common-values.yaml \
--set global.meshID=mesh1 \
--set global.multiCluster.clusterName=$CLUSTER_1 \
--set meshConfig.trustDomain=$CLUSTER_1 \
--set meshConfig.defaultConfig.proxyMetadata.GLOO_MESH_CLUSTER_NAME=$CLUSTER_1 \
--namespace istio-system \
--version $ISTIO_VERSION \
--kube-context $CLUSTER_1
# Install istiod cluster2
helm upgrade --install istiod istio/istiod \
-f istiod-common-values.yaml \
--set global.meshID=mesh1 \
--set global.multiCluster.clusterName=$CLUSTER_2 \
--set meshConfig.trustDomain=$CLUSTER_2 \
--set meshConfig.defaultConfig.proxyMetadata.GLOO_MESH_CLUSTER_NAME=$CLUSTER_2 \
--namespace istio-system \
--version $ISTIO_VERSION \
--kube-context $CLUSTER_2
Install ingress gateways:
cat << EOF > istio-ingress-common-values.yaml
replicaCount: 1
autoscaling:
  enabled: false
name: istio-ingressgateway
securityContext: # runAsRoot
  runAsUser: 1337
  runAsGroup: 1337
  runAsNonRoot: true
  fsGroup: 1337
labels:
  istio: ingressgateway
service:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
    name: http2
  - port: 443
    targetPort: 8443
    name: https
resources:
  limits:
    cpu: 10m
    memory: 128Mi
  requests:
    cpu: 10m
    memory: 128Mi
EOF
# Install Istio Ingress Gateway Cluster 1
helm upgrade --install istio-ingressgateway istio/gateway \
-f istio-ingress-common-values.yaml \
--namespace istio-gateways \
--version $ISTIO_VERSION \
--kube-context $CLUSTER_1 \
--create-namespace
# Install Istio Ingress Gateway Cluster 2
helm upgrade --install istio-ingressgateway istio/gateway \
-f istio-ingress-common-values.yaml \
--namespace istio-gateways \
--version $ISTIO_VERSION \
--kube-context $CLUSTER_2 \
--create-namespace
Install east-west gateways:
cat << EOF > istio-eastwest-common-values.yaml
replicaCount: 1
autoscaling:
  enabled: false
name: istio-eastwestgateway
securityContext: # runAsRoot
  runAsUser: 1337
  runAsGroup: 1337
  runAsNonRoot: true
  fsGroup: 1337
labels:
  istio: eastwestgateway
service:
  type: LoadBalancer
  ports:
  - name: tcp-status-port
    port: 15021
    targetPort: 15021
  - name: tls
    port: 15443
    targetPort: 15443
resources:
  requests:
    cpu: 10m
    memory: 128Mi
  limits:
    cpu: 10m
    memory: 128Mi
EOF
# Install Istio Eastwest Gateway Cluster 1
helm upgrade --install istio-eastwestgateway istio/gateway \
-f istio-eastwest-common-values.yaml \
--namespace istio-gateways \
--version $ISTIO_VERSION \
--kube-context $CLUSTER_1
# Install Istio Eastwest Gateway Cluster 2
helm upgrade --install istio-eastwestgateway istio/gateway \
-f istio-eastwest-common-values.yaml \
--namespace istio-gateways \
--version $ISTIO_VERSION \
--kube-context $CLUSTER_2
Deploy Applications
In workload cluster 1:
kubectl --context ${CLUSTER_1} create ns bookinfo
export bookinfo_yaml=https://raw.githubusercontent.com/istio/istio/1.11.4/samples/bookinfo/platform/kube/bookinfo.yaml
kubectl --context ${CLUSTER_1} label namespace bookinfo istio-injection=enabled
kubectl --context ${CLUSTER_1} apply -f ${bookinfo_yaml} -l 'app,version notin (v3)' -n bookinfo
kubectl --context ${CLUSTER_1} apply -f ${bookinfo_yaml} -l 'account' -n bookinfo
And in workload cluster 2:
kubectl --context ${CLUSTER_2} create ns bookinfo
kubectl --context ${CLUSTER_2} label namespace bookinfo istio-injection=enabled
kubectl --context ${CLUSTER_2} apply -f ${bookinfo_yaml} -n bookinfo
Define your workspace (this is an abstraction provided by Gloo Mesh to organize workloads regardless of their physical location):
kubectl apply --context $MGMT_CLUSTER -n gloo-mesh -f- <<EOF
apiVersion: admin.gloo.solo.io/v2
kind: Workspace
metadata:
  name: developers
  namespace: gloo-mesh
spec:
  workloadClusters:
  - name: '*'
    namespaces:
    - name: '*'
EOF
kubectl apply --context $CLUSTER_1 -n gloo-mesh -f- <<EOF
apiVersion: admin.gloo.solo.io/v2
kind: WorkspaceSettings
metadata:
  name: developers
  namespace: gloo-mesh
spec:
  options:
    serviceIsolation:
      enabled: false
    federation:
      enabled: false
EOF
Expose the application:
kubectl --context ${CLUSTER_1} apply -f - <<EOF
apiVersion: networking.gloo.solo.io/v2
kind: VirtualGateway
metadata:
  name: north-south-gw
  namespace: istio-gateways
spec:
  workloads:
  - selector:
      labels:
        istio: ingressgateway
      cluster: ${CLUSTER_1}
  listeners:
  - http: {}
    port:
      number: 80
    allowedRouteTables:
    - host: '*'
EOF
kubectl --context ${CLUSTER_1} apply -f - <<EOF
apiVersion: networking.gloo.solo.io/v2
kind: RouteTable
metadata:
  name: productpage
  namespace: bookinfo
  labels:
    expose: "true"
spec:
  hosts:
  - '*'
  virtualGateways:
  - name: north-south-gw
    namespace: istio-gateways
    cluster: ${CLUSTER_1}
  workloadSelectors: []
  http:
  - name: productpage
    matchers:
    - uri:
        prefix: /
    forwardTo:
      destinations:
      - ref:
          name: productpage
          namespace: bookinfo
        port:
          number: 9080
EOF
Verify the Environment
Next, let’s create a bit of traffic and see what the UI displays. First, send some requests to the application through the ingress gateway:
export ENDPOINT_HTTP_GW_CLUSTER1=$(kubectl --context ${CLUSTER_1} -n istio-gateways get svc istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].*}'):80
for i in {0..100}; do curl -s -o /dev/null -w "%{http_code}\n" $ENDPOINT_HTTP_GW_CLUSTER1/productpage; done
You should see:
❯ for i in {0..100}; do curl -s -o /dev/null -w "%{http_code}\n" $ENDPOINT_HTTP_GW_CLUSTER1/productpage; done
200
200
200
200
200
200
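Once you start testing policies such as fault injection or rate limiting, the responses will no longer be all 200, and a tally per status code is easier to read than a long column of numbers. The following is a small sketch of that idea. The function name count_codes is hypothetical and only used for this illustration.

```shell
#!/usr/bin/env bash
# count_codes: hypothetical helper that reads one HTTP status code per line
# on stdin and prints "<code> <count>" for each distinct code, sorted by code.
count_codes() {
  awk 'NF { seen[$1]++ } END { for (code in seen) print code, seen[code] }' | sort
}

# Example wiring (needs the ingress gateway deployed in the workshop):
#   for i in {0..100}; do
#     curl -s -o /dev/null -w "%{http_code}\n" $ENDPOINT_HTTP_GW_CLUSTER1/productpage
#   done | count_codes
```

The counting is done in awk rather than with uniq -c so that the output has no padding and is easy to compare between runs.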
Now, port-forward the Gloo Mesh UI:
kubectl --context $MGMT_CLUSTER port-forward svc/gloo-mesh-ui -n gloo-mesh 8090
Go to: http://localhost:8090/ and you will see all the information about your clusters and your workspaces.
You can also see the observability graph, which helps you understand how traffic flows through your own system.
That is all! You have achieved full control of the network in a matter of minutes in your local environment.
Now, you can test any capability that Gloo Mesh offers, including:
- Any of the policies that Gloo Mesh offers (such as failover, fault injection, outlier detection, retries, timeouts, traffic control, mirroring, rate limiting, and header and payload transformation)
- Access control
- Isolation of the services
- WAF
- Authentication with OIDC
- Authorization with OPA
Tips for vcluster
Interested in learning more about vcluster? Here is a simple diagram of how vcluster works:
In the workshop you have deployed three vclusters. If you run:
kubectl --context $MAIN_CONTEXT get sts -A
You will see:
NAMESPACE NAME READY AGE
devmgmt devmgmt 1/1 3h7m
devcluster2 devcluster2 1/1 3h2m
devcluster1 devcluster1 1/1 3h3m
Each of these StatefulSets belongs to one vcluster. Its attached volume stores all the data of that vcluster.
Looking closer, you will find that one of the containers of those StatefulSets is an entire k3s, a lightweight Kubernetes distribution. You could also use any of the other supported Kubernetes distributions: eks, k0s and vanilla k8s.
The other container is a syncer, an application which copies the pods that are created within the vcluster to the underlying host cluster. This is the reason you can see all the resources if you are the admin of the “host” cluster, and only your resources if you are the admin of the vcluster.
You can think of the StatefulSet as the control plane of a vcluster. This is why you need to be careful about how you deploy its pods.
Let’s see it in the environment you just created. In your vcluster, you will see:
kubectl --context $MGMT_CLUSTER get pod -l app=gloo-mesh-mgmt-server -A
NAMESPACE NAME READY STATUS
gloo-mesh gloo-mesh-mgmt-server-9fb55d686-w4n4l 1/1 Running
But in the host cluster you will see:
kubectl --context $MAIN_CONTEXT get pod -A -l vcluster.loft.sh/namespace=gloo-mesh
NAMESPACE NAME
devcluster1 gloo-mesh-agent-df8c8c49d-jlhkh-x-gloo-mesh-x-devcluster1
devcluster2 gloo-mesh-agent-76b5b44b4f-56r5l-x-gloo-mesh-x-devcluster2
devmgmt gloo-mesh-mgmt-server-9fb55d686-w4n4l-x-gloo-mesh-x-devmgmt
devmgmt gloo-mesh-redis-794d79b7df-rlr99-x-gloo-mesh-x-devmgmt
devmgmt gloo-mesh-ui-cc98c5fc-tzq4s-x-gloo-mesh-x-devmgmt
devmgmt prometheus-server-647b488bb-r6hfc-x-gloo-mesh-x-devmgmt
Check the names. That is the translation layer that vcluster makes for you.
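The pattern visible in the listing above is the pod name, then the namespace inside the vcluster, then the vcluster name, joined with -x-. The sketch below reproduces only that observed pattern so you can map names in both directions when reading host cluster output. Both function names are hypothetical, and the sketch assumes short names: vcluster may shorten names that would exceed Kubernetes length limits, which is not handled here.

```shell
#!/usr/bin/env bash
# host_pod_name POD NAMESPACE VCLUSTER
# Hypothetical helper mirroring the pattern seen in the listing above:
#   <pod>-x-<namespace inside the vcluster>-x-<vcluster name>
host_pod_name() {
  printf '%s-x-%s-x-%s\n' "$1" "$2" "$3"
}

# virtual_pod_name HOST_POD_NAME
# Strips the trailing "-x-<namespace>-x-<vcluster>" part to recover the
# pod name as seen from inside the vcluster.
virtual_pod_name() {
  printf '%s\n' "${1%-x-*-x-*}"
}
```

For example, host_pod_name gloo-mesh-mgmt-server-9fb55d686-w4n4l gloo-mesh devmgmt yields the name shown for the management server in the host cluster listing.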
There are a couple of things to keep in mind when working with vclusters:
- Reserve enough resources for those StatefulSet pods: It is good practice to have nodes with resources dedicated solely to these pods and to make sure that the pods are scheduled on those nodes. The intention is that the StatefulSet pods (the vcluster control planes) never run out of resources, which would dramatically impact the performance of the vcluster. To do this, you can use taints on the nodes together with tolerations and node selectors on the pods.
- Logs and Kubernetes metadata: Log aggregation tools like Fluent Bit and Grafana Promtail rely on the Kubernetes structure and naming convention. Log folders and files follow the Kubernetes structure given by the host cluster. As the commands above show, the same pod has different names in the vcluster and in the host. Therefore, if you deploy one of these observability tools in the vcluster, the structure it expects will not match the one in the host cluster. The consequence is that the tool will not be able to leverage the Kubernetes metadata, nor collect the logs from the applications in that cluster. At the time of writing this post, the Loft Labs team is addressing this issue.
The last interesting point to mention is the capability to pause and resume individual vclusters (StatefulSets). If you do not want to destroy the entire environment created in the workshop, you can just run:
vcluster pause $MGMT_CLUSTER -n $MGMT_CLUSTER --context $MAIN_CONTEXT
vcluster pause $CLUSTER_1 -n $CLUSTER_1 --context $MAIN_CONTEXT
vcluster pause $CLUSTER_2 -n $CLUSTER_2 --context $MAIN_CONTEXT
And whenever you want to keep working on the tests you can do:
vcluster resume $MGMT_CLUSTER -n $MGMT_CLUSTER --context $MAIN_CONTEXT
vcluster resume $CLUSTER_1 -n $CLUSTER_1 --context $MAIN_CONTEXT
vcluster resume $CLUSTER_2 -n $CLUSTER_2 --context $MAIN_CONTEXT
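Since pausing and resuming apply the same command to each of the three vclusters, a small wrapper can save some typing. This is a sketch only: vcluster_all and DRY_RUN are names introduced here for illustration, not features of the vcluster CLI, and by default the function prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# vcluster_all ACTION
# Hypothetical helper: applies "pause" or "resume" to every vcluster of the
# workshop. DRY_RUN=1 (the default in this sketch) only prints the commands.
vcluster_all() {
  local action="$1" name
  case "$action" in
    pause|resume) ;;
    *) echo "usage: vcluster_all pause|resume" >&2; return 2 ;;
  esac
  for name in "$MGMT_CLUSTER" "$CLUSTER_1" "$CLUSTER_2"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf '%s\n' "vcluster $action $name -n $name --context $MAIN_CONTEXT"
    else
      vcluster "$action" "$name" -n "$name" --context "$MAIN_CONTEXT"
    fi
  done
}
```

Run DRY_RUN=0 vcluster_all pause at the end of the day and DRY_RUN=0 vcluster_all resume when you come back to your tests.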
Conclusions
Technology changes fast. Not many years ago, we were working with monoliths. Nowadays, you can have clusters deployed within other clusters.
Through this workshop, you were able to:
- Deploy all the components of Gloo Mesh in your local environment or in a cheap remote environment.
- Create a basic setup to test all the Gloo Mesh capabilities for handling east-west and north-south traffic between your services.
- Reduce cost of deploying multiple clusters with vcluster. You just need one actual cluster.
- Reduce time of testing things out in a local environment.
This makes your projects considerably more efficient, which in the end translates into higher productivity.
As a final comment, being able to test things in your local environment while reproducing heavy remote environments is one of the goals of DevOps practices.
If you want to talk more about all these tools, you can find me easily in these Slack workspaces: solo.io, istio and loft.sh