Development Environments with vcluster

Antonio Berben

Jul 18, 2022

15 min read

#vcluster Series

At Solo.io, we listen to the community and try out the best technologies to help teams meet their goals. This includes both working on open source projects and providing support and products that help you get more out of those technologies.

Gloo Mesh is one of those products. It is a good example of how to keep the complexity of managing application networking across your infrastructure to a minimum. In practice, that means multi-cluster architectures.

In such scenarios, how can you verify that a multi-cluster configuration is correct in a local environment before moving to a more extensive environment?

Let’s put it in context. Your team (or the development team) wants to release a new feature. They want to cause some chaos in the system. Gloo Mesh offers this functionality and many others through policies (failover, fault injection, outlier detection, retries, timeouts, mirroring, rate limiting, and more). But you, as an operator of the platform and Gloo Mesh, may not be sure which configuration is correct. You need to investigate first in a development or testing environment.

In a simulated production scenario that uses three clusters (one for management and two for workloads), the first concern is obvious: Cost. Deploying three clusters in public clouds is expensive.

The costs are all multiplied by 3 for 3 clusters

The second concern: networking. Let’s say you decide to investigate first in your local environment. Deploying three entire clusters on your own workstation is not easy. You can opt for solutions like running multiple kind (Kubernetes in Docker) or k3d clusters. Both deploy clusters in containers on top of the host machine. One cluster, one container. If you try one of these approaches, you will probably have to tweak the network between the containers and the host machine.

The third concern: CPU. To deploy things in your own local environment, you need to make sure you have enough “muscle”.

Now… What if we start considering “A cluster within a cluster”?

#vcluster

I hope you saw the iconic movie Inception. I enjoyed it a lot and I watch it again from time to time. The idea was pretty catchy: “A dream within a dream”.

Virtualization technology follows the same idea. If you are familiar with Docker, years ago there was the need for docker-in-docker. Nowadays it is a very common approach in CI/CD pipelines. Say for example that tasks are running in a container but you need to test an application already embedded in another container. This would be a use case of docker-in-docker.

Given that idea, what stops us from trying cluster-in-cluster? This is where vcluster comes in to offer some benefits. vcluster allows you to create and manage virtual Kubernetes clusters. A virtual cluster is basically a control plane that runs in a namespace on a shared host cluster. Here is a visualization:

Drawing of a management cluster and two workload clusters

In the picture we can see that Gloo Mesh, which before required three clusters to simulate a production-ready environment, now just needs one cluster with three virtual clusters.

Quick benefits:

Cost effective: Now, your cost is only one cluster. It is true that it needs to be bigger than before, but you’re saving money by deploying one cluster instead of three.
Time-saving: When you work in your local environment, you do not want to spend time creating new clusters. If you use kind, it can take several minutes to get three new clusters. With vcluster, you can get your three new clusters in about 20 seconds.

Let’s prove all this in a workshop.

#Hands on!

In this workshop, in a matter of seconds, you will deploy Istio in the two workload clusters, a demo application to use in your labs, and Gloo Mesh to test the application networking capabilities (multi-cluster traffic, traffic splitting, fault injection, etc.). All this is based on just one host Kubernetes cluster containing three virtual clusters.

Your architecture will look like this:

Architecture diagram

#Prerequisites

A Kubernetes cluster which will be the host cluster (kind, k3s, k0s, etc.)
vcluster CLI. This has been tested with version 0.10.2
Helm v3
kubectl
meshctl
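
Before starting, it can help to confirm that the command-line tools from the list above are on your PATH. The following is a minimal sketch, not part of the original workshop: the function name missing_tools is hypothetical, and it only checks that each command exists, not which version is installed.

```shell
# Print the name of every command from the argument list that is not on PATH.
# Prints nothing when all of them are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The CLI names below come from the prerequisites list.
missing=$(missing_tools vcluster helm kubectl meshctl)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All prerequisites found"
fi
```

If anything is reported as missing, install it before moving on.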

#Getting Started

Let’s see how long it takes you to deploy everything. This walkthrough was tested on a virtual machine with only three CPUs, so the components are deployed with minimal resources.

Start by setting up some environment variables:

# Context name for the host cluster
export MAIN_CONTEXT=$(kubectl config current-context)

# Context names for the gloo mesh clusters (vclusters)
export MGMT_CLUSTER=devmgmt
export CLUSTER_1=devcluster1
export CLUSTER_2=devcluster2

#Install environments

First, let’s create the management cluster:

cat << EOF > vcluster-values.yaml
isolation:
  enabled: false
  limitRange:
    enabled: false
  podSecurityStandard: privileged
  resourceQuota:
    enabled: false
rbac:
  clusterRole:
    create: true
syncer:
  resources:
    limits:
      cpu: 100m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
  extraArgs:
  - --fake-nodes=false
  - --sync-all-nodes
vcluster:
  resources:
    limits:
      cpu: 200m
      memory: 2Gi
    requests:
      cpu: 100m
      memory: 256Mi
  extraArgs:
  - --kubelet-arg=allowed-unsafe-sysctls=net.ipv4.*
  - --kube-apiserver-arg=feature-gates=EphemeralContainers=true
  - --kube-scheduler-arg=feature-gates=EphemeralContainers=true
  - --kubelet-arg=feature-gates=EphemeralContainers=true
  image: rancher/k3s:v1.22.5-k3s1
EOF


vcluster create $MGMT_CLUSTER -n $MGMT_CLUSTER --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT

vcluster connect $MGMT_CLUSTER -n $MGMT_CLUSTER --kube-config-context-name $MGMT_CLUSTER --update-current --context $MAIN_CONTEXT

kubectl --context $MGMT_CLUSTER get namespaces

Next, the workload cluster 1:

vcluster create $CLUSTER_1 -n $CLUSTER_1 --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT

vcluster connect $CLUSTER_1 -n $CLUSTER_1 --kube-config-context-name $CLUSTER_1 --update-current --context $MAIN_CONTEXT

kubectl --context $CLUSTER_1 get namespaces

And finally, the workload cluster 2:

vcluster create $CLUSTER_2 -n $CLUSTER_2 --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT

vcluster connect $CLUSTER_2 -n $CLUSTER_2 --kube-config-context-name $CLUSTER_2 --update-current --context $MAIN_CONTEXT

kubectl --context $CLUSTER_2 get namespaces
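
The three create/connect/verify sequences above differ only in the cluster name, so they could be wrapped in a small function. This is a sketch under assumptions: the helper names run and create_vcluster, the DRY_RUN variable, and the fallback context name host are hypothetical. The vcluster and kubectl flags are exactly the ones used above. As written, the loop only prints the commands.

```shell
# Run a command, or only print it when DRY_RUN=1 (useful for a review first).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create one vcluster and add a kubeconfig context named after it.
# $1 = vcluster name (also used as its namespace), $2 = host cluster context.
create_vcluster() {
  run vcluster create "$1" -n "$1" --upgrade --connect=false --expose \
    -f vcluster-values.yaml --context "$2"
  run vcluster connect "$1" -n "$1" --kube-config-context-name "$1" \
    --update-current --context "$2"
  run kubectl --context "$1" get namespaces
}

# Preview the commands for the three clusters. Set DRY_RUN=0 to execute them.
DRY_RUN=1
for name in "${MGMT_CLUSTER:-devmgmt}" "${CLUSTER_1:-devcluster1}" "${CLUSTER_2:-devcluster2}"; do
  create_vcluster "$name" "${MAIN_CONTEXT:-host}"
done
```

Printing first makes it easy to compare the generated commands with the ones shown above before running anything against the host cluster.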

This is it! Three clusters in around 20 seconds. If you’re interested in learning more, the end of this post has a more in-depth explanation of what you have deployed with vcluster, along with some tips to remember.

Now it is time to install Gloo Mesh in the management cluster. Istio follows later in the workload clusters.

#Install Gloo Mesh

You will need a license key:

# Replace <license_key> with your own key
export GLOO_MESH_LICENSE_KEY="<license_key>"

And you need to define the Gloo Mesh version:

export GLOO_MESH_VERSION=2.0.9

Gloo Mesh can be installed through Helm charts. However, to keep this post from overflowing with code, you will use the meshctl CLI:

meshctl install --kubecontext $MGMT_CLUSTER --license $GLOO_MESH_LICENSE_KEY --version $GLOO_MESH_VERSION

Verify all pods are running:

kubectl get pods -n gloo-mesh --context $MGMT_CLUSTER

And you will see something like:

NAME                                     READY   STATUS    RESTARTS   AGE
gloo-mesh-mgmt-server-778d45c7b5-5d9nh   1/1     Running   0          41s
gloo-mesh-redis-844dc4f9-jnb4j           1/1     Running   0          41s
gloo-mesh-ui-749dc7875c-4z77k            3/3     Running   0          41s
prometheus-server-86854b778-r6r52        2/2     Running   0          41s
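
If you script this step, you may want to wait until the pods are ready instead of checking by hand. Here is a minimal sketch of a generic retry helper. The function name retry is hypothetical and not part of any of the tools used here. The kubectl wait call in the comment is standard kubectl.

```shell
# Retry a command until it succeeds.
# $1 = maximum attempts, $2 = seconds to sleep between attempts, rest = command.
# The body runs in a subshell, so "exit" only leaves the function.
retry() (
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && exit 0
    i=$((i + 1))
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  exit 1
)

# Example against the management cluster (needs a running cluster):
#   retry 30 2 kubectl --context "$MGMT_CLUSTER" -n gloo-mesh \
#     wait --for=condition=Ready pod --all --timeout=10s
retry 3 0 true && echo "ready"
```

The retry is useful because kubectl wait returns an error when the pods have not been created yet.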

#Register workload clusters

Gloo Mesh relies on an agent-based approach. Therefore, when registering a workload cluster, you will need to tell the agent how to communicate with the management server.

Note that on EKS the service does not return an IP but a hostname. If you’re using EKS, adjust the following commands accordingly by reading hostname instead of ip in the jsonpath expression.
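
One way to cover both cases is to query both fields and keep whichever is set. This is a sketch only: the helper name first_non_empty is hypothetical, and lb.example.com is a placeholder hostname. The commented kubectl calls mirror the command that follows.

```shell
# Print the first non-empty argument and return 0, or return 1 if all are empty.
# Pass the load balancer IP first and the hostname second.
first_non_empty() {
  for value in "$@"; do
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1
}

# Usage against the management server service (needs a running cluster):
#   ip=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CLUSTER \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
#   host=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CLUSTER \
#     -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
#   MGMT_SERVER_NETWORKING_DOMAIN=$(first_non_empty "$ip" "$host")
first_non_empty "" "lb.example.com"
```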

MGMT_SERVER_NETWORKING_DOMAIN=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CLUSTER -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

MGMT_SERVER_NETWORKING_PORT=$(kubectl -n gloo-mesh get service gloo-mesh-mgmt-server --context $MGMT_CLUSTER -o jsonpath='{.spec.ports[?(@.name=="grpc")].port}')


MGMT_SERVER_NETWORKING_ADDRESS=${MGMT_SERVER_NETWORKING_DOMAIN}:${MGMT_SERVER_NETWORKING_PORT}
echo $MGMT_SERVER_NETWORKING_ADDRESS

meshctl cluster register \
  --remote-context=$CLUSTER_1 \
  --relay-server-address $MGMT_SERVER_NETWORKING_ADDRESS \
  --kubecontext $MGMT_CLUSTER \
  $CLUSTER_1

And you will see:

Registering cluster
📃 Copying root CA relay-root-tls-secret.gloo-mesh to remote cluster from management cluster
📃 Copying bootstrap token relay-identity-token-secret.gloo-mesh to remote cluster from management cluster
💻 Installing relay agent in the remote cluster
Finished installing chart 'gloo-mesh-agent' as release gloo-mesh:gloo-mesh-agent
📃 Creating remote.cluster KubernetesCluster CRD in management cluster
⌚ Waiting for relay agent to have a client certificate
         Checking...
         Checking...
🗑 Removing bootstrap token
✅ Done registering cluster!

Then register the second workload cluster:

meshctl cluster register \
  --remote-context=$CLUSTER_2 \
  --relay-server-address $MGMT_SERVER_NETWORKING_ADDRESS \
  --kubecontext $MGMT_CLUSTER \
  $CLUSTER_2

Check that the resources are created in the management cluster:

kubectl get kubernetescluster -n gloo-mesh --context $MGMT_CLUSTER

And you will see:

NAME           AGE
devcluster1    27s
devcluster2    23s

#Install Istio

Istio by default requires some resources. In your local environment, you might not have the resources to deploy three fully functional clusters and two Istio service meshes. Therefore, we need to reduce the resources that Istio requests. That’s fine, as this is just a development environment.

NOTE: This post is using Istio v1.12.6:

export ISTIO_VERSION=1.12.6

Install Istio’s CRDs:

# Add the Istio Helm repository (skip if it is already configured)
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update

# Install Istio CRDs cluster1
helm upgrade --install istio-base istio/base \
 -n istio-system \
 --version $ISTIO_VERSION \
 --kube-context $CLUSTER_1 \
 --create-namespace

# Install Istio CRDs cluster2
helm upgrade --install istio-base istio/base \
 -n istio-system \
 --version $ISTIO_VERSION \
 --kube-context $CLUSTER_2 \
 --create-namespace

Install Istiod:

cat << EOF > istiod-common-values.yaml
meshConfig:
 accessLogFile: /dev/stdout
 defaultConfig:
   holdApplicationUntilProxyStarts: true
   envoyMetricsService:
     address: gloo-mesh-agent.gloo-mesh:9977
   envoyAccessLogService:
     address: gloo-mesh-agent.gloo-mesh:9977
   proxyMetadata:
     ISTIO_META_DNS_CAPTURE: "true"
     ISTIO_META_DNS_AUTO_ALLOCATE: "true"
pilot:
 autoscaleEnabled: false
 replicaCount: 1
 env:
   PILOT_SKIP_VALIDATE_TRUST_DOMAIN: "true"
 resources:
   requests:
     cpu: 10m
     memory: 2048Mi
   limits:
     cpu: 10m
     memory: 2048Mi
EOF

# Install istiod cluster1
helm upgrade --install istiod istio/istiod \
 -f istiod-common-values.yaml \
 --set global.meshID=mesh1 \
 --set global.multiCluster.clusterName=$CLUSTER_1 \
 --set meshConfig.trustDomain=$CLUSTER_1 \
 --set meshConfig.defaultConfig.proxyMetadata.GLOO_MESH_CLUSTER_NAME=$CLUSTER_1 \
 --namespace istio-system \
 --version $ISTIO_VERSION \
 --kube-context $CLUSTER_1

# Install istiod cluster2
helm upgrade --install istiod istio/istiod \
 -f istiod-common-values.yaml \
 --set global.meshID=mesh1 \
 --set global.multiCluster.clusterName=$CLUSTER_2 \
 --set meshConfig.trustDomain=$CLUSTER_2 \
 --set meshConfig.defaultConfig.proxyMetadata.GLOO_MESH_CLUSTER_NAME=$CLUSTER_2 \
 --namespace istio-system \
 --version $ISTIO_VERSION \
 --kube-context $CLUSTER_2

Install ingress gateways:

cat << EOF > istio-ingress-common-values.yaml
replicaCount: 1
autoscaling:
  enabled: false
name: istio-ingressgateway
securityContext: # runAsRoot
  runAsUser: 1337
  runAsGroup: 1337
  runAsNonRoot: true
  fsGroup: 1337
labels:
 istio: ingressgateway
service:
 type: LoadBalancer
 ports:
 - port: 80
   targetPort: 8080
   name: http2
 - port: 443
   targetPort: 8443
   name: https
resources:
 limits:
   cpu: 10m
   memory: 128Mi
 requests:
   cpu: 10m
   memory: 128Mi
EOF

# Install Istio Ingress Gateway Cluster 1
helm upgrade --install istio-ingressgateway istio/gateway \
 -f istio-ingress-common-values.yaml \
 --namespace istio-gateways \
 --version $ISTIO_VERSION \
 --kube-context $CLUSTER_1 \
 --create-namespace

# Install Istio Ingress Gateway Cluster 2
helm upgrade --install istio-ingressgateway istio/gateway \
 -f istio-ingress-common-values.yaml \
 --namespace istio-gateways \
 --version $ISTIO_VERSION \
 --kube-context $CLUSTER_2 \
 --create-namespace

Install east-west gateways:

cat << EOF > istio-eastwest-common-values.yaml
replicaCount: 1
autoscaling:
  enabled: false
name: istio-eastwestgateway
securityContext: # runAsRoot
  runAsUser: 1337
  runAsGroup: 1337
  runAsNonRoot: true
  fsGroup: 1337
labels:
 istio: eastwestgateway
service:
 type: LoadBalancer
 ports:
 - name: tcp-status-port
   port: 15021
   targetPort: 15021
 - name: tls
   port: 15443
   targetPort: 15443
resources:
 requests:
   cpu: 10m
   memory: 128Mi
 limits:
   cpu: 10m
   memory: 128Mi
EOF

# Install Istio Eastwest Gateway Cluster 1
helm upgrade --install istio-eastwestgateway istio/gateway \
 -f istio-eastwest-common-values.yaml \
 --namespace istio-gateways \
 --version $ISTIO_VERSION \
 --kube-context $CLUSTER_1

# Install Istio Eastwest Gateway Cluster 2
helm upgrade --install istio-eastwestgateway istio/gateway \
 -f istio-eastwest-common-values.yaml \
 --namespace istio-gateways \
 --version $ISTIO_VERSION \
 --kube-context $CLUSTER_2

#Deploy Applications

In workload cluster 1:

kubectl --context ${CLUSTER_1} create ns bookinfo
export bookinfo_yaml=https://raw.githubusercontent.com/istio/istio/1.11.4/samples/bookinfo/platform/kube/bookinfo.yaml
kubectl --context ${CLUSTER_1} label namespace bookinfo istio-injection=enabled

kubectl --context ${CLUSTER_1} apply -f ${bookinfo_yaml} -l 'app,version notin (v3)' -n bookinfo

kubectl --context ${CLUSTER_1} apply -f ${bookinfo_yaml} -l 'account' -n bookinfo

And in workload cluster 2:

kubectl --context ${CLUSTER_2} create ns bookinfo

kubectl --context ${CLUSTER_2} label namespace bookinfo istio-injection=enabled

kubectl --context ${CLUSTER_2} apply -f ${bookinfo_yaml} -n bookinfo

Define your workspace (this is an abstraction provided by Gloo Mesh to organize workloads regardless of their physical location):

kubectl apply --context $MGMT_CLUSTER -n gloo-mesh -f- <<EOF
apiVersion: admin.gloo.solo.io/v2
kind: Workspace
metadata:
  name: developers
  namespace: gloo-mesh
spec:
  workloadClusters:
  - name: '*'
    namespaces:
    - name: '*'
EOF

kubectl apply --context $CLUSTER_1 -n gloo-mesh -f- <<EOF
apiVersion: admin.gloo.solo.io/v2
kind: WorkspaceSettings
metadata:
 name: developers
 namespace: gloo-mesh
spec:
 options:
   serviceIsolation:
     enabled: false
   federation:
     enabled: false
EOF

Expose the application:

kubectl --context ${CLUSTER_1} apply -f - <<EOF
apiVersion: networking.gloo.solo.io/v2
kind: VirtualGateway
metadata:
  name: north-south-gw
  namespace: istio-gateways
spec:
  workloads:
    - selector:
        labels:
          istio: ingressgateway
        cluster: ${CLUSTER_1}
  listeners:
    - http: {}
      port:
        number: 80
      allowedRouteTables:
        - host: '*'
EOF


kubectl --context ${CLUSTER_1} apply -f - <<EOF
apiVersion: networking.gloo.solo.io/v2
kind: RouteTable
metadata:
  name: productpage
  namespace: bookinfo
  labels:
    expose: "true"
spec:
  hosts:
    - '*'
  virtualGateways:
    - name: north-south-gw
      namespace: istio-gateways
      cluster: ${CLUSTER_1}
  workloadSelectors: []
  http:
    - name: productpage
      matchers:
      - uri:
          prefix: /
      forwardTo:
        destinations:
          - ref:
              name: productpage
              namespace: bookinfo
            port:
              number: 9080
EOF

#Verify the Environment

Next, let’s create a bit of traffic and see what the UI displays. First, get the address of the ingress gateway in workload cluster 1 and send some requests to the application:

export ENDPOINT_HTTP_GW_CLUSTER1=$(kubectl --context ${CLUSTER_1} -n istio-gateways get svc istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].*}'):80

for i in {0..100}; do curl -s -o /dev/null -w "%{http_code}\n" $ENDPOINT_HTTP_GW_CLUSTER1/productpage; done

You should see:

❯ for i in {0..100}; do curl -s -o /dev/null -w "%{http_code}\n" $ENDPOINT_HTTP_GW_CLUSTER1/productpage; done
200
200
200
200
200
200
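
With 101 requests, scanning the output line by line gets tedious. A small summary helper could count how often each status code appears. This is a sketch, and the function name count_codes is hypothetical. It only uses awk and sort.

```shell
# Read one HTTP status code per line on stdin and print "<code> <count>" pairs,
# sorted by code. Empty lines are ignored.
count_codes() {
  awk 'NF { seen[$1]++ } END { for (code in seen) print code, seen[code] }' | sort
}

# Usage with the same curl loop as above (needs the running environment):
#   for i in {0..100}; do
#     curl -s -o /dev/null -w "%{http_code}\n" $ENDPOINT_HTTP_GW_CLUSTER1/productpage
#   done | count_codes
printf '200\n200\n503\n200\n' | count_codes
```

For the sample input in the last line, this prints "200 3" followed by "503 1".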

Now, port-forward the UI:

kubectl --context $MGMT_CLUSTER port-forward svc/gloo-mesh-ui -n gloo-mesh 8090

Go to: http://localhost:8090/ and you will see all the information about your clusters and your workspaces.

Gloo Mesh UI

The UI also includes an observability graph that helps you understand your own system:

Gloo Mesh graph

That is all! You have achieved full control of the network in a matter of minutes in your local environment.

Now, you can test any capability that Gloo Mesh offers, including:

Any of the policies that Gloo Mesh offers (such as failover, fault injection, outlier detection, retries, timeouts, traffic control, mirroring, rate limiting, and header and payload transformation)
Access control
Isolation of the services
WAF
Authentication with OIDC
Authorization with OPA

#Tips for vcluster

Interested in learning more about vcluster? Here is a simple diagram of how vcluster works:

vcluster architecture diagram

In the workshop you have deployed three vclusters. If you run:

kubectl --context $MAIN_CONTEXT get sts -A

You will see:

NAMESPACE     NAME          READY   AGE
devmgmt       devmgmt       1/1     3h7m
devcluster2   devcluster2   1/1     3h2m
devcluster1   devcluster1   1/1     3h3m

Each of these StatefulSets belongs to one vcluster. All the data for that vcluster is stored in the attached volume.

Looking closer, you will find that one of the containers in each StatefulSet pod is an entire k3s, a lightweight Kubernetes distribution. You could also use any of the other supported Kubernetes distributions: EKS, k0s, and vanilla k8s.

The other container is a syncer, an application which copies the pods that are created within the vcluster to the underlying host cluster. This is the reason you can see all the resources if you are the admin of the “host” cluster, and only your resources if you are the admin of the vcluster.

You can think of the StatefulSet as the control plane of a vcluster. This is why you need to be careful about how its pods are deployed.

Let’s see it in the environment you just created. In your vcluster, you will see:

kubectl --context $MGMT_CLUSTER get pod -l app=gloo-mesh-mgmt-server -A

NAMESPACE     NAME                                    READY   STATUS
gloo-mesh     gloo-mesh-mgmt-server-9fb55d686-w4n4l   1/1     Running

But in the host cluster you will see:

kubectl --context $MAIN_CONTEXT get pod -A -l vcluster.loft.sh/namespace=gloo-mesh

NAMESPACE     NAME
devcluster1   gloo-mesh-agent-df8c8c49d-jlhkh-x-gloo-mesh-x-devcluster1
devcluster2   gloo-mesh-agent-76b5b44b4f-56r5l-x-gloo-mesh-x-devcluster2
devmgmt       gloo-mesh-mgmt-server-9fb55d686-w4n4l-x-gloo-mesh-x-devmgmt
devmgmt       gloo-mesh-redis-794d79b7df-rlr99-x-gloo-mesh-x-devmgmt
devmgmt       gloo-mesh-ui-cc98c5fc-tzq4s-x-gloo-mesh-x-devmgmt
devmgmt       prometheus-server-647b488bb-r6hfc-x-gloo-mesh-x-devmgmt

Check the names. That is the translation layer that vcluster makes for you.
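
The pattern visible in the output above is the pod name, the namespace inside the vcluster, and the vcluster name, joined by -x-. A minimal sketch of that mapping follows. The function name host_pod_name is hypothetical, and the sketch only reproduces the pattern seen in this output. vcluster may treat other cases, such as very long names, differently.

```shell
# Build the host-side pod name from the pod name inside the vcluster ($1),
# its namespace inside the vcluster ($2), and the vcluster name ($3).
host_pod_name() {
  printf '%s-x-%s-x-%s\n' "$1" "$2" "$3"
}

host_pod_name gloo-mesh-mgmt-server-9fb55d686-w4n4l gloo-mesh devmgmt
# prints: gloo-mesh-mgmt-server-9fb55d686-w4n4l-x-gloo-mesh-x-devmgmt
```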

There are a couple of things to keep in mind when working with vclusters:

Reserve enough resources for those StatefulSet pods: It is good practice to have nodes with resources dedicated solely to these pods and to make sure that the pods are scheduled on those nodes. The goal is that the StatefulSet pods (the vcluster control planes) never run out of resources, which would dramatically impact the performance of the vcluster. To do this, you can use taints and node selectors.

Logs and Kubernetes metadata: Log aggregation tools like Fluent Bit and Grafana Promtail rely on the Kubernetes structure and naming convention. Log folders and files follow the Kubernetes structure given by the host cluster.

As the commands above show, the same pod has different names in the vcluster and in the host. Therefore, if you deploy one of these observability tools in the vcluster, the structures it expects will not match the ones in the host cluster. The consequence is that the tool will not be able to use the Kubernetes metadata or the logs from the applications in that vcluster. At the time of writing this post, the Loft Labs team is addressing this issue.
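
For the first tip, dedicating a host node usually means labeling and tainting it. The sketch below only prints the two kubectl commands so you can review them. The function name dedicate_node_cmds, the key/value pair dedicated=vcluster, and the names node-1 and host are hypothetical. The vcluster StatefulSet pods would also need a matching node selector and toleration, set through the vcluster values, which is not shown here.

```shell
# Print the kubectl commands that would label and taint one host node so that
# only pods tolerating the taint are scheduled there.
# $1 = node name, $2 = host cluster context.
# The key/value "dedicated=vcluster" is a hypothetical choice for this example.
dedicate_node_cmds() {
  printf 'kubectl --context %s label nodes %s dedicated=vcluster\n' "$2" "$1"
  printf 'kubectl --context %s taint nodes %s dedicated=vcluster:NoSchedule\n' "$2" "$1"
}

dedicate_node_cmds node-1 host
```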

The last interesting point to mention is the ability to pause and resume individual vclusters (StatefulSets). If you do not want to destroy the entire environment created in the workshop, you can just run:

vcluster pause $MGMT_CLUSTER -n $MGMT_CLUSTER  --context $MAIN_CONTEXT
vcluster pause $CLUSTER_1 -n $CLUSTER_1  --context $MAIN_CONTEXT
vcluster pause $CLUSTER_2 -n $CLUSTER_2  --context $MAIN_CONTEXT

And whenever you want to get back to your tests, run:

vcluster resume $MGMT_CLUSTER -n $MGMT_CLUSTER  --context $MAIN_CONTEXT
vcluster resume $CLUSTER_1 -n $CLUSTER_1  --context $MAIN_CONTEXT
vcluster resume $CLUSTER_2 -n $CLUSTER_2  --context $MAIN_CONTEXT
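
Since pausing and resuming always apply to all three vclusters here, the commands could be wrapped in one function. This is a sketch: the function name vcluster_each, the DRY_RUN variable, and the context name host are hypothetical. The vcluster pause and resume invocations are the ones shown above. As written, the last lines only print the commands.

```shell
# Print (DRY_RUN=1) or run "vcluster pause|resume" for every vcluster name given.
# $1 = pause or resume, $2 = host cluster context, remaining args = vcluster names.
vcluster_each() {
  action=$1
  host_context=$2
  shift 2
  case "$action" in
    pause|resume) ;;
    *) echo "unknown action: $action" >&2; return 1 ;;
  esac
  for name in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "vcluster $action $name -n $name --context $host_context"
    else
      vcluster "$action" "$name" -n "$name" --context "$host_context"
    fi
  done
}

# Preview only. Set DRY_RUN=0 and pass $MAIN_CONTEXT to run against your host.
DRY_RUN=1
vcluster_each pause host devmgmt devcluster1 devcluster2
```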

#Conclusions

Technology changes fast. Not many years ago, we were working with monoliths. Nowadays, you can have clusters deployed within other clusters.

Through this workshop, you were able to:

Deploy all the components of Gloo Mesh in your local environment or in a cheap remote environment.
Create a basic setup to test all Gloo Mesh capabilities for handling east-west and north-south traffic between your services.
Reduce the cost of deploying multiple clusters with vcluster. You just need one actual cluster.
Reduce the time it takes to test things out in a local environment.

This makes your projects more efficient, which ultimately translates into higher productivity.

As a final comment, being able to test things in your local environment while reproducing heavy remote environments is one of the goals of DevOps practices.

If you want to talk more about all these tools, you can find me easily in these Slack workspaces: solo.io, istio and loft.sh