Checklist for Platform Engineers

The way we do business has changed drastically over the past few decades as more industries have gone digital. Customers now expect fault-tolerant, highly scalable systems, quick responses to defects, and frequent upgrades, all of which call for faster delivery and higher productivity from our development teams. Meanwhile, many organizations are still migrating from legacy infrastructure to cloud-based architecture. With our developers and DevOps teams already working so hard, how can we achieve this?

#Rise of Platform Engineering

Traditionally, an IT organization would be divided into development and infrastructure teams. The development team would reach out to the infrastructure team for new hardware as needed. The infrastructure team would provision the hardware and sometimes double as the operations team, supporting deployment and monitoring of the product.

As technology has become more sophisticated, businesses have moved to virtualization and cloud-based solutions, such as infrastructure as a service (IaaS), platform as a service (PaaS), or serverless computing, each with its own trade-offs. Most systems combine several of these components in their architecture rather than relying on just one.

Managing that mix adds to the development team’s responsibilities, which is where platform engineering comes in.

#Why Do We Need Platform Engineers?

Platform engineering teams automate and standardize workflows, using tools or internal developer platforms that make software easier to ship and maintain. How is this different from what the DevOps team is doing? DevOps teams typically work project by project, whereas platform engineering teams build shared platforms that developers and DevOps teams across the organization can use to perform these tasks without much hassle.

That said, the platform should be built with certain goals in mind:

  • Improve developer productivity: provide a standard approach to raise requests, streamline DevOps processes, and allow developers to use these services as a self-service portal. This simplifies the base processes for each team, reduces ambiguities, and allows automation of repetitive tasks, increasing developer satisfaction and decreasing time to market.
  • Monitor costs: display transparent cost allocations across units as well as invoices from service providers. Cost monitoring allows management to make better financial decisions and standardize the tools used across various teams. This is especially important for start-ups.
  • Maintain security: provide consistent policy enforcement and compliance checks. These platforms let teams define benchmarks or checkpoints that every release must pass before it hits the market, so that only stable, vetted versions reach customers.
  • Oversee standards: maintain certain architecture standards. Each organization has its own set of standard practices. A platform should ensure these practices are followed to the best of the teams’ abilities.
  • Manage operations: provide a dashboard to centrally monitor and manage the systems for performance analysis, cost monitoring, and troubleshooting.

#Kubernetes and Platform Engineering

Suppose your platform team is working on an internal Kubernetes platform. You will want to automate the developers’ most common tasks to speed up delivery. This self-service model considerably reduces the time developers spend repeating the same tasks in multiple environments when they could be developing another feature.

You will likely want to implement certain restrictions, limits, quotas, or security policies for your Kubernetes clusters. This could help with auditing or monitoring tasks, or with standardizing a quota for certain resources. Tools like the Open Policy Agent (OPA), jsPolicy, or Kyverno can be used based on your needs. OPA policies are written in its own language, Rego, whereas Kyverno policies are written in YAML and jsPolicy policies in JavaScript. Many developers are more comfortable with YAML or JavaScript, so Kyverno or jsPolicy might be preferred.
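To make the idea of a policy concrete, here is a minimal sketch in plain Python of the kind of rule such a tool enforces: every container must declare CPU and memory limits. It does not use the API of OPA, Kyverno, or jsPolicy; the function name `find_missing_limits` and the sample pod are assumptions made for illustration only.

```python
from typing import Any


def find_missing_limits(pod: dict[str, Any]) -> list[str]:
    """Return the names of containers that lack a CPU or memory limit."""
    violations = []
    for container in pod.get("spec", {}).get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        if "cpu" not in limits or "memory" not in limits:
            violations.append(container.get("name", "<unnamed>"))
    return violations


# Hypothetical pod manifest, already parsed into a dictionary.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo"},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx",
                "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
            },
            {"name": "sidecar", "image": "busybox"},  # no limits declared
        ]
    },
}

print(find_missing_limits(pod))  # ['sidecar']
```

A real policy engine would run a check like this at admission time and reject or flag the offending manifest; the sketch only shows the shape of the rule.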

You will also want cost-related insights or metrics for future decision-making, especially if you are working with a start-up organization. Kubecost, which provides real-time data for Kubernetes users, may be a good option for you.
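As a rough illustration of what per-team cost allocation involves, the sketch below totals cost per namespace from usage records. It is not how Kubecost works internally; the hourly rates, the record fields, and the function name `cost_by_namespace` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical hourly rates, not real provider prices.
CPU_RATE_PER_CORE_HOUR = 0.04
MEM_RATE_PER_GIB_HOUR = 0.005


def cost_by_namespace(usage: list[dict]) -> dict[str, float]:
    """Sum the cost of each usage record per namespace, rounded to cents."""
    totals: dict[str, float] = defaultdict(float)
    for record in usage:
        cost = (
            record["cpu_core_hours"] * CPU_RATE_PER_CORE_HOUR
            + record["mem_gib_hours"] * MEM_RATE_PER_GIB_HOUR
        )
        totals[record["namespace"]] += cost
    return {namespace: round(total, 2) for namespace, total in totals.items()}


# Hypothetical usage records for two teams sharing one cluster.
usage = [
    {"namespace": "team-a", "cpu_core_hours": 100, "mem_gib_hours": 200},
    {"namespace": "team-b", "cpu_core_hours": 50, "mem_gib_hours": 400},
    {"namespace": "team-a", "cpu_core_hours": 25, "mem_gib_hours": 0},
]

print(cost_by_namespace(usage))
```

Grouping by namespace works well when each team or tenant owns its own namespace, which is one reason the namespace is a common unit for both cost reporting and access control.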

Kubernetes was originally designed with a single tenant per cluster in mind. Sharing clusters, though, offers greater flexibility, simplifies infrastructure, and improves cost-efficiency, so it often makes sense to run a multi-tenant system. To keep tenants separate and prevent a compromised tenant from affecting others, you can give each tenant its own namespace and restrict access to it with role-based access control (RBAC), typically alongside resource quotas. Tools that assist with multi-tenancy in Kubernetes include kiosk and loft.
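The namespace-plus-RBAC approach can be sketched as a small generator that builds the Kubernetes objects for one tenant: a Namespace, a ResourceQuota, and a RoleBinding that grants a group the built-in `edit` ClusterRole inside that namespace only. The function name `tenant_manifests`, the tenant and group names, and the quota values are assumptions for illustration. Tools such as kiosk or loft handle this differently and with more safeguards.

```python
import json

RBAC_API_GROUP = "rbac.authorization.k8s.io"


def tenant_manifests(tenant: str, group: str, cpu: str, memory: str) -> list[dict]:
    """Build Namespace, ResourceQuota, and RoleBinding objects for one tenant."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": tenant, "labels": {"tenant": tenant}},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{tenant}-quota", "namespace": tenant},
        "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory}},
    }
    binding = {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{tenant}-edit", "namespace": tenant},
        "subjects": [{"kind": "Group", "name": group, "apiGroup": RBAC_API_GROUP}],
        # "edit" is one of the default user-facing ClusterRoles; binding it with
        # a RoleBinding limits its effect to this tenant's namespace.
        "roleRef": {"kind": "ClusterRole", "name": "edit", "apiGroup": RBAC_API_GROUP},
    }
    return [namespace, quota, binding]


# Hypothetical tenant; the names and quota values are placeholders.
manifests = tenant_manifests("team-a", "team-a-developers", cpu="8", memory="16Gi")
for manifest in manifests:
    print(json.dumps(manifest, indent=2))
```

Generating the three objects together keeps tenant onboarding a single repeatable step, which fits the self-service model described earlier.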

#What Makes a Good Platform?

Here are some elements of good platform engineering to keep in mind so you can meet and overcome any possible challenges.

#Set Boundaries

It is important to set clear boundaries about the specific duties of the platform team and of the development team. If those responsibilities are not properly communicated from the start, the teams might blame each other for workflow issues.

#Empower Developers

Platforms are meant to increase productivity and decrease developer effort. If the developers are dependent on the platform teams and frequently require their input, that can hurt the development process. You can give the development team more independence by setting a clear separation of responsibilities between the teams and using proper documentation, among other things. Exposing APIs and automating steps also help reduce human effort and error.

#Remember Less Is More

To optimize your platform use, you want to standardize tools and streamline processes. This is why the platform team needs to define non-negotiables and choices of tools or stacks.

You might choose AWS or Azure for infrastructure provisioning, or vcluster, DevSpace, or Tilt for a container platform. You might provide templates for CI/CD pipelines with various alternatives and certain mandatory steps. The mandatory steps can also serve as compliance checks. For example, the build could fail unless the code has more than 75 percent test coverage, or fail if more than x third-party dependencies have vulnerabilities at or above a certain severity level.
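A mandatory pipeline step of this kind might look like the following sketch. The `BuildReport` fields and the `gate_failures` function are hypothetical, and the vulnerability threshold is left as a required argument because the right value of x depends on your organization; the value used in the example is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class BuildReport:
    """Hypothetical summary produced by earlier pipeline steps."""
    coverage_percent: float
    vulnerable_dependencies: int  # dependencies at or above the chosen severity


def gate_failures(
    report: BuildReport, max_vulnerable: int, min_coverage: float = 75.0
) -> list[str]:
    """Return the reasons the build should fail; an empty list means it passes."""
    failures = []
    # Coverage must be strictly greater than the minimum.
    if report.coverage_percent <= min_coverage:
        failures.append(
            f"coverage {report.coverage_percent}% is not above {min_coverage}%"
        )
    if report.vulnerable_dependencies > max_vulnerable:
        failures.append(
            f"{report.vulnerable_dependencies} vulnerable dependencies "
            f"exceed the limit of {max_vulnerable}"
        )
    return failures


# Placeholder threshold: allow at most 2 vulnerable dependencies.
report = BuildReport(coverage_percent=71.5, vulnerable_dependencies=3)
for reason in gate_failures(report, max_vulnerable=2):
    print("FAIL:", reason)
```

Returning every failure reason, rather than stopping at the first, lets developers fix all the issues in one pass instead of rerunning the pipeline for each one.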

#Automate, Automate, Automate

End users, who in this case are developers, want tools that help them work as quickly and independently as possible and deliver as much value to the customer as they can. This is where self-service and automation become a necessity. Without them, the cycle time can easily go from hours to days, and even to weeks for organizations working on legacy systems.

#Require Reliability

If the platform is not stable, you might anger not just your clients but their clients as well. The developer teams should never be blocked from deploying a critical fix due to the instability of the platform. Since multiple development teams will use this platform, its reliability becomes even more critical.


If you are migrating from legacy infrastructure to a cloud-based architecture, if you want to speed up delivery and reduce time to market, or if you are trying to bridge the gap between hardware and software infrastructure, you need platform engineering. Chances are, you are already doing platform engineering without using the term.

The main idea behind platform engineering is speeding up workflow and delivery, so automating mundane, repetitive tasks on your platform is crucial. You should also be able to monitor product costs and performance on your platform to help you make informed decisions.

If these practices are implemented well, your customers will consistently receive a secure, high-quality product.
