From Infrastructure to Workload: Using GitOps and CNCF Principles to Bridge the Gap

written by Marius Bledea (DevOps Engineer) and Lucas Donca (DevOps Engineer) in the January 2023 issue of Today Software Magazine.

At present, GitOps mainly addresses the workload side of the Kubernetes cluster, successfully reconciling the actual state with the desired state.

GitOps is a software engineering practice in which Git is used as the single source of truth for declarative infrastructure and application code. The system's intended state is stored in Git; when updates are made and committed to the repository, an automated system applies those modifications to the target environment.

This strategy makes it possible to keep the system in a known-good state and to make modifications that can be readily audited and undone as needed. GitOps can be used to manage both applications and infrastructure, and it works especially well in Cloud-native systems, where the infrastructure is dynamic and ephemeral.
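As a minimal illustration of "desired state stored in Git", consider an ordinary Kubernetes manifest committed to a repository (the names, namespace, and image below are hypothetical); a GitOps agent then continuously reconciles the cluster against whatever is on the main branch:

```yaml
# deploy/app.yaml -- lives in Git; the GitOps agent applies it to the cluster
# and reverts any drift, so the repository remains the single source of truth.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app            # hypothetical application name
  namespace: demo
spec:
  replicas: 3               # desired state: three replicas
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          # pinned image tag keeps every change auditable in Git history
          image: registry.example.com/demo-app:1.4.2
```

Rolling back then amounts to a `git revert`: the agent sees the previous manifest and restores the earlier known-good state.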

The Cloud Native Computing Foundation (CNCF) is a nonprofit organization that works to encourage the adoption of Cloud-native technology. It offers a framework for Cloud-native concepts and hosts open-source projects, including Kubernetes, Prometheus, and Envoy.

According to the CNCF, the following principles are central to Cloud-native computing:

  • Containerization: packaging applications into containers to increase portability and enable resource isolation.

  • Dynamic orchestration: automating the deployment, scaling, and administration of containerized applications using technologies like Kubernetes.

  • Microservices: breaking large applications into smaller, independent components that can be developed, tested, and deployed independently.

  • Continuous delivery: automating the build, test, and release processes to enable fast and reliable delivery of software updates.

But for applications to be 100% compatible with Cloud environments, the infrastructure on which they run must also be able to reconcile its current state with the desired state.

This article addresses the following issues:

  • the high cost of development infrastructure that runs non-stop;

  • long waiting times for the Dev, QA, and Ops teams;

  • the increasing complexity of the infrastructure.

A new shift-left paradigm aims to identify and correct errors as early as possible in the development cycle to prevent costly or time-consuming interventions.

Most often, one encounters Dev/QA/Staging environments that attempt to simulate production. The underlying infrastructure runs round-the-clock, which raises costs and adds to the time each member of the aforementioned teams must wait before they can perform their regular tasks.

It is crucial for developers and testers to put more emphasis on producing high-quality work than on worrying about the underlying infrastructure. From a cost viewpoint, however, how closely development, testing, and pre-production environments should resemble production is frequently a topic of discussion. Some contend that, to guarantee the program will operate as intended when deployed, these environments must be as close to production as is reasonable.

A 1:1 replica of the production environment that is deployed on demand, event-based, is frequently more cost-effective than an asymmetrical one that runs 24/7.

Let's take a scenario in which a QA team member needs to test a hot fix for production. It goes without saying that they require a production-like environment to do this. Why shouldn't they be able to build the infrastructure as needed, conduct the tests, gather the data, and have the underlying infrastructure thrown away afterwards?

Similar situations arise for developers, business processes, and other team members. The tool of choice at the moment is Crossplane, which runs natively on Kubernetes and benefits from the Kube API and its underlying RBAC.

Development teams can gain several advantages from an infrastructure controlled by a central control plane and deployed through pipelines. One advantage of such a system is the ability to create infrastructure with a predetermined time to live (TTL) and role-based access control (RBAC): infrastructure can be created and deleted automatically according to predefined rules, and access to it can be restricted by role. This ensures that access to infrastructure is appropriately restricted and that it exists only for as long as it is needed.
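The RBAC side of this can be sketched with standard Kubernetes objects (all names, the API group, and the resource kind below are hypothetical); the TTL side would be enforced by a separate controller or pipeline step that deletes resources once their deadline passes:

```yaml
# Allow the QA team to manage only the platform's environment-claim
# resources in their own namespace -- nothing else in the cluster.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: qa-env-requester          # hypothetical role name
  namespace: qa
rules:
  - apiGroups: ["example.org"]    # hypothetical API group exposed by the platform
    resources: ["environmentclaims"]
    verbs: ["get", "list", "create", "delete"]
---
# Bind the role to the QA group coming from the identity provider.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: qa-env-requester-binding
  namespace: qa
subjects:
  - kind: Group
    name: qa-team                 # hypothetical IdP group
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: qa-env-requester
  apiGroup: rbac.authorization.k8s.io
```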

The ability to readily monitor such an infrastructure is another advantage. Using a central control plane makes it possible to keep an eye on the infrastructure's health and track changes over time. Teams can use this to guarantee that the infrastructure operates as intended and to identify and troubleshoot problems more effectively.

Overall, a central control plane that maintains infrastructure through pipelines and permits the creation of infrastructure with predefined TTL and RBAC gives development teams increased productivity, security, and observability.

A proper procedure that adheres to the GitOps principles makes it much easier to avoid situations where people without a good understanding of the system access the cluster console's web user interface and cause infrastructure inconsistencies. Infrastructure DevOps engineers can then focus on delivering high-quality internal developer platforms (IDPs), which reduces the time spent on change and deployment ceremonies.

Crossplane is an open-source multi-Cloud control plane that lets users automate and manage infrastructure across various Cloud providers through a single interface. It extends the Kubernetes API with Custom Resource Definitions (CRDs), which allow users to define their own custom resources that stand in for Cloud-specific resources such as virtual machines or databases.
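For example, with Crossplane's classic AWS provider installed, a managed database becomes an ordinary Kubernetes object. Treat the following as a sketch: exact API versions and field names vary between provider releases, and the instance name is hypothetical:

```yaml
# A Cloud database declared as a Kubernetes resource via Crossplane.
apiVersion: database.aws.crossplane.io/v1beta1  # provided by provider-aws; may differ by version
kind: RDSInstance
metadata:
  name: example-db
spec:
  forProvider:
    region: eu-central-1
    dbInstanceClass: db.t3.micro
    engine: mysql
    allocatedStorage: 20
    masterUsername: admin
    skipFinalSnapshotBeforeDeletion: true  # acceptable for throwaway environments
  writeConnectionSecretToRef:   # credentials are written to a normal Secret
    name: example-db-conn
    namespace: crossplane-system
```

Because this is just another Kubernetes object, it flows through the same GitOps pipelines, RBAC rules, and reconciliation loop as any application manifest.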

Crossplane's Composite Resource Definition (XRD) API lets users define their own resources in a way that is decoupled from the implementations of particular Cloud service providers. This means that users can define their preferred infrastructure in a Cloud-agnostic manner and then use Crossplane to deploy and operate it on any supported Cloud platform.
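A minimal XRD might look like the following sketch (the group `example.org`, the kinds, and the `storageGB` field are hypothetical; separate Compositions, not shown, map this abstract type onto each provider's concrete resources):

```yaml
# Defines a Cloud-agnostic "XDatabase" type plus a namespaced claim
# ("DatabaseClaim") that application teams can request directly.
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: xdatabases.example.org
spec:
  group: example.org
  names:
    kind: XDatabase
    plural: xdatabases
  claimNames:
    kind: DatabaseClaim
    plural: databaseclaims
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                storageGB:        # the only knob exposed to consumers
                  type: integer
              required: ["storageGB"]
```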

One notable difference between Crossplane and Terraform is that Terraform is a standalone program that can be used in any context, while Crossplane is integrated with the Kubernetes API and intended to be used in a Kubernetes environment. Crossplane can therefore make use of Kubernetes' features and capabilities, such as containerization, orchestration, and continuous reconciliation, whereas Terraform is concerned purely with infrastructure management.

[Figure: Crossplane components]

Through Composite Resources (XRs) and Composite Resource Definitions (XRDs), the Crossplane API separates the resources users define from the Cloud providers' concrete implementations: users describe their desired infrastructure in a Cloud-agnostic way, and Crossplane deploys and manages it on any supported Cloud platform.
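From the consumer's side, requesting infrastructure then reduces to a small claim object. In this hypothetical sketch (the group, kind, and label are assumptions continuing the XRD idea above), switching Cloud providers means matching a different Composition, not rewriting the claim:

```yaml
# A namespaced claim a developer commits to Git; Crossplane resolves it
# to real Cloud resources via whichever Composition matches the selector.
apiVersion: example.org/v1alpha1
kind: DatabaseClaim
metadata:
  name: my-db
  namespace: team-a
spec:
  storageGB: 20
  compositionSelector:
    matchLabels:
      provider: aws    # swap this label to target another Cloud provider
```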

Scenario: we will approach the architecture in two stages. The first stage is the Cloud-agnostic infrastructure, which enables the second, client-controlled stage, through which various applications (in this example, WordPress) can be easily deployed.

[Figure: Scenario architecture]

First stage:

Through GitHub Actions and Terraform, we automate the creation of an ArgoCD instance, which in turn easily installs Crossplane. From here, according to the customer's preference, we can choose any public Cloud service provider on which a Kubernetes cluster will be deployed.
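The pipeline side of this stage could be sketched roughly as follows (the workflow name, directory layout, and secret names are hypothetical; the Terraform code in `infra/` is assumed to install ArgoCD):

```yaml
# .github/workflows/bootstrap.yaml -- hypothetical bootstrap pipeline
name: bootstrap-control-plane
on:
  push:
    branches: [main]
    paths: ["infra/**"]      # run only when infrastructure code changes
jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: hashicorp/setup-terraform@v2
      - name: Terraform apply
        working-directory: infra   # Terraform code that creates ArgoCD
        run: |
          terraform init
          terraform apply -auto-approve
        env:
          # Cloud credentials are assumed to exist as repository secrets
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
```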

Second stage:

The client-controlled part is hosted by the Kubernetes cluster mentioned above, with an ArgoCD instance through which the WordPress application is deployed.
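An ArgoCD Application for this deployment could look like the sketch below (the repository URL and path are hypothetical); with automated sync, the cluster self-heals toward whatever the Git repository declares:

```yaml
# ArgoCD Application that keeps WordPress in sync with its Git definition.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: wordpress
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/gitops-apps.git  # hypothetical repo
    targetRevision: main
    path: apps/wordpress
  destination:
    server: https://kubernetes.default.svc   # the in-cluster API server
    namespace: wordpress
  syncPolicy:
    automated:
      prune: true      # delete resources removed from Git
      selfHeal: true   # revert manual drift in the cluster
```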

Building an application that is 100% Cloud-native will be future-proof as long as it is Cloud-agnostic, so the developer teams can focus on the quality of the work they have to deliver.

