AI pipelines streamline the process of building and deploying machine learning (ML) workflows, enhancing scalability and portability. Kubernetes plays a pivotal role in managing these pipelines, allowing for effortless deployment, scaling, and operations of application containers across clusters of hosts. Kubeflow, built on Kubernetes, simplifies the deployment of ML workflows even further. As a cloud-native platform inspired by Google’s internal ML pipelines, Kubeflow provides an array of components, including a central dashboard and notebooks, to manage and scale machine learning models effectively and make AI pipelines more accessible and efficient.
You may have already noticed a pattern here. When it comes to deploying ML workflows, scalability and efficiency are paramount. This is where vCluster comes into play.
In this tutorial, you’ll discover how vCluster can take the benefits of building AI pipelines in Kubernetes to the next level by leveraging a single Kubernetes cluster to quickly launch virtual environments for testing, development, or production.
#Benefits of vCluster for AI Pipelines
You can think of vCluster as a powerful open source tool for creating virtual Kubernetes clusters. These virtual clusters are fully functional Kubernetes clusters that run isolated from the host cluster. The benefits of this approach include:
- Better cost control: You can launch or destroy virtual clusters in seconds. This allows your team to easily create ML development and testing environments without scaling the entire physical cluster, thus allowing enhanced resource utilization and incurring lower overall costs.
- Security: Virtual clusters provide stronger workload isolation and finer-grained access control, so only authorized users can access them.
- Platform agnosticism: Whether your organization sets up clusters locally or remotely, and whether it's a multicluster or hybrid deployment, your team can deploy virtual clusters with the same ease.
In other words, vCluster allows you to optimize the resources of your Kubernetes cluster and create isolated and secure environments where you can build AI pipelines. On top of all this, setting up a virtual cluster is very straightforward.
#Setting Up vCluster for Kubernetes-Based AI Pipelines
This section will cover the three-step process for using vCluster for ML pipelines: installing the vcluster CLI, configuring vCluster for Kubernetes-based AI pipelines, and deploying virtual clusters within the Kubernetes cluster.
#Installing the vCluster CLI
Before installing the vCluster CLI, ensure you have:
- kubectl and Helm command line tools installed on your local workstation
- A kubeconfig file with the appropriate credentials to access the Kubernetes cluster where you want to deploy virtual clusters
Once you have those, you can download and install the latest version of vCluster using the appropriate script for your architecture. You can also download the vCluster binary manually from the GitHub repository if you want to build it from source or install beta versions.
For example, if you use a Mac, you can install vCluster using Homebrew:
brew install vcluster
#Configuring vCluster for Kubernetes-Based AI Pipelines
Now that you have the vcluster CLI, you can start to configure vCluster according to your project needs.

vCluster uses Helm charts to deploy virtual clusters, meaning you can edit values.yaml to customize its configuration. Start by adding the loft-sh Helm repository to your machine:
```shell
helm repo add loft-sh https://charts.loft.sh
helm repo update
```
Then, search for available vCluster charts:
```shell
$ helm search repo loft-sh | grep vcluster
NAME                   CHART VERSION   APP VERSION   DESCRIPTION
...
loft-sh/vcluster       0.15.7          0.15.7        vcluster - Virtual Kubernetes Clusters
loft-sh/vcluster-eks   0.15.7          0.15.7        vcluster - Virtual Kubernetes Clusters (eks)
loft-sh/vcluster-k0s   0.15.7          0.15.7        vcluster - Virtual Kubernetes Clusters (k0s)
loft-sh/vcluster-k8s   0.15.7          0.15.7        vcluster - Virtual Kubernetes Clusters (k8s)
...
```
Let’s say you choose the default chart, loft-sh/vcluster, which uses K3s. In that case, you can download and extract the chart by running this command:
helm pull loft-sh/vcluster && tar -xvf vcluster-0.15.7.tgz
You can now navigate to the vcluster directory and edit values.yaml. While each project’s requirements are different, there is some common ground. For example, depending on the components you use in your ML pipeline, you may need to edit values.yaml to enable services like CoreDNS, ServiceLB, or Metrics Server, which are disabled by default:
```yaml
# Virtual cluster (K3s) configuration
vcluster:
  # Image to use for the virtual cluster
  image: rancher/k3s:v1.27.3-k3s1
  command:
    - /bin/k3s
  baseArgs:
    - server
    - --write-kubeconfig=/data/k3s-config/kube-config.yaml
    - --data-dir=/data
    - --disable=traefik,servicelb,metrics-server,local-storage,coredns
    - --disable-network-policy
    - --disable-agent
    - --disable-cloud-controller
    - --flannel-backend=none
```
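For instance, suppose your pipeline needs Metrics Server and ServiceLB inside the virtual cluster. One way to sketch that (an illustrative edit, not a tested configuration) is to remove those components from the K3s --disable list, while leaving CoreDNS to the chart's own coredns section:

```yaml
# Sketch of values.yaml: re-enable servicelb and metrics-server by
# removing them from the --disable list; coredns stays disabled here
# because the chart deploys its own CoreDNS
vcluster:
  image: rancher/k3s:v1.27.3-k3s1
  command:
    - /bin/k3s
  baseArgs:
    - server
    - --write-kubeconfig=/data/k3s-config/kube-config.yaml
    - --data-dir=/data
    - --disable=traefik,local-storage,coredns
    - --disable-network-policy
    - --disable-agent
    - --disable-cloud-controller
    - --flannel-backend=none
```

Whether a given K3s component behaves as expected inside a virtual cluster depends on your setup, so treat this as a starting point and verify against the vCluster documentation.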
Another crucial aspect in production environments is high availability (HA). vCluster offers high-availability support for both K3s and vanilla Kubernetes. Implementing high availability with vCluster also involves enabling external data storage for K3s, which is covered in detail in the documentation.
Here is an example of HA configuration for K3s:
```yaml
# Enable HA mode
enableHA: true

# Scale up K3s replicas
replicas: 2

# Set external data store endpoint
vcluster:
  env:
    - name: K3S_DATASTORE_ENDPOINT
      value: mysql://username:password@tcp(hostname:3306)/database-name

# Disable persistent storage as all data (including bootstrap data)
# is stored in the external data store
storage:
  persistence: false

# Scale up CoreDNS replicas
coredns:
  replicas: 2
```
As you can see, the example uses MySQL as the external data store endpoint for the cluster. You may also notice that K3s and CoreDNS replicas are scaled up. Speaking of scaling virtual clusters, the number of replicas is not the only aspect to consider. When it comes to running ML pipelines, you must ensure that the virtual cluster has sufficient resources. The following are the default values that vCluster uses for K3s-based virtual clusters:
```yaml
# Virtual cluster (K3s) configuration
vcluster:
  ...
  env: []
  resources:
    limits:
      memory: 2Gi
    requests:
      cpu: 200m
      memory: 256Mi
  ...

# Storage settings for the virtual cluster
storage:
  # If this is disabled, vCluster will use an emptyDir instead
  # of a PersistentVolumeClaim
  persistence: true
  # Size of the persistent volume claim
  size: 5Gi
  ...
```
By editing these values, you can increase memory, CPU, and storage as needed. When assigning resources to your virtual cluster, keep in mind that vCluster has a low overhead on the host by design, so you don’t have to worry about wasting resources. Moreover, you can quickly launch, update, and destroy virtual clusters, allowing you to experiment and adjust parameters more freely.
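For example, a virtual cluster serving a busy training pipeline could be given more headroom than the defaults. The figures below are illustrative assumptions, not recommendations; note that they size the virtual cluster's control plane pod, while the pipeline workloads themselves are scheduled on the host cluster's nodes:

```yaml
# Sketch of values.yaml: larger control plane resources and storage
# for an ML pipeline; all numbers are illustrative
vcluster:
  resources:
    limits:
      memory: 8Gi
    requests:
      cpu: "1"
      memory: 2Gi

storage:
  persistence: true
  # More room for K3s state under heavier API traffic
  size: 20Gi
```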
Aside from resource allocation, another aspect to consider is maintaining consistency across pipelines when configuring vCluster.
In that sense, following GitOps best practices is recommended. A possible strategy would be to use Terraform’s Helm provider to deploy virtual clusters using Helm. Another strategy would be to implement vCluster’s Cluster API provider to create virtual clusters programmatically. In both cases, all changes could be tracked using Git, allowing your team to roll back any settings quickly. Beyond helping maintain consistency across pipelines, these tools allow your team to automate the deployment of ML pipelines.
Overall, vCluster provides great flexibility to customize virtual clusters. As you’ll see below, this flexibility will allow you to deploy virtual clusters with different configurations and thus optimize resources.
#Deploying a Virtual Cluster Using the vCluster CLI
Deploying a virtual cluster with the vcluster CLI is as simple as running:
vcluster create my-vcluster
However, that would deploy K3s using the default values from values.yaml. Suppose you need to create a virtual cluster that will only run the experimental phase of your ML pipeline. You could use the following command:
vcluster create testing-pipeline-01 -f testing-config-01.yaml
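As an illustration, testing-config-01.yaml could trade durability for speed, since an experimentation cluster is typically short-lived. The values below are hypothetical:

```yaml
# testing-config-01.yaml (sketch): a lightweight, ephemeral virtual
# cluster for experiments; its state is lost when it is deleted
storage:
  # Use an emptyDir instead of a PersistentVolumeClaim
  persistence: false

vcluster:
  resources:
    limits:
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
```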
This command refers to the configuration file testing-config-01.yaml (a customized values.yaml) that Helm will use to set up the virtual cluster. Furthermore, nothing prevents you from deploying other Kubernetes distributions using the --distro flag:
vcluster create prod-pipeline-02 -f prod-config-02.yaml --distro k8s
Simply put, deploying a virtual cluster is a trivial task once you properly configure it by following the discussed recommendations.
#Building AI Model Pipelines with Kubeflow and vCluster
vCluster’s convenience and versatility are applicable to countless use cases. However, you can make use of your newfound knowledge for a specific scenario: building an AI model pipeline with vCluster.
Kubeflow’s modular architecture allows data scientists, ML engineers, and operations teams to build and deploy portable and scalable ML workflows using Kubeflow Pipelines (KFP), along with the tools that they consider necessary, such as Jupyter, PyTorch, TensorFlow, and Katib, among others. In other words, Kubeflow leaves it up to each organization which components to use based on their particular use case.
vCluster fits perfectly into this context, as it can be configured to create tailored development, test, and production environments for machine learning systems.
To this end, you could deploy a virtual cluster with the required resources:
vcluster create kubeflow -f kubeflow.yaml
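The contents of kubeflow.yaml are up to you. A minimal sketch (assumed values, sized more generously because a full Kubeflow deployment runs many components) might look like this:

```yaml
# kubeflow.yaml (sketch): extra headroom for a full Kubeflow install;
# all values are assumptions to adapt to your environment
vcluster:
  resources:
    limits:
      memory: 4Gi
    requests:
      cpu: 500m
      memory: 1Gi

storage:
  persistence: true
  size: 10Gi

# Kubeflow relies heavily on in-cluster DNS
coredns:
  replicas: 2
```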
Then, connect to the newly created kubeflow virtual cluster (for example, by running vcluster connect kubeflow) and install Kubeflow with a single command:
while ! kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
Now, if you only need to create a basic AI model pipeline with Kubeflow, you could install a lightweight standalone version of Kubeflow Pipelines instead of a full-fledged Kubeflow deployment.
The procedure would be similar. First, deploy a customized virtual cluster:
vcluster create kfp -f kfp.yaml
Then, connect to the virtual cluster (for instance, with vcluster connect kfp) and run the following commands:
```shell
export PIPELINE_VERSION=2.0.1
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION"
```
In both cases, you can configure the ingress resource and load balancer service to expose the virtual cluster externally.
To recap, the way to build AI pipelines within a virtual cluster is the same as for a non-virtualized Kubernetes cluster, which is a big plus. No steep learning curve, no extra work. The only thing engineers have to do is configure the virtual cluster according to their needs, which they can easily do with Helm.
All in all, vCluster’s role in building next-generation AI pipelines is promising. Its incredible versatility allows MLOps teams to use a comprehensive Kubernetes-native platform like Kubeflow as well as any other tooling they require to develop cutting-edge ML models. Moreover, vCluster’s ability to replicate the behavior of “real” Kubernetes clusters is significant as AI development moves towards more complex, distributed systems that demand robust pipeline versioning, enhanced collaboration features, and the integration of more sophisticated AI models. Overall, vCluster makes it easy to create development and testing environments that can be provisioned instantly without scaling the entire physical cluster, thus saving time and money.
That said, the most exciting aspect is that vCluster is also future-proof. As the ability to build and manage AI pipelines across different cloud environments becomes increasingly vital, your organization can count on vCluster. Regardless of whether you adopt a multicloud or hybrid cloud strategy, your organization will be able to deploy virtual clusters without breaking a sweat.