A declarative GitOps approach to telecom software deployment - Ericsson

2023-02-28

A declarative, GitOps-based approach to software delivery and deployment enables communication service providers to increase automation and speed up the introduction of new features and updates – including security fixes – into their networks.

Highly automated deployment of new network and BSS/OSS functions would be a game changer for the telecom industry, enabling communication service providers to continuously optimize security, maximize customer satisfaction and seize on new and emerging business opportunities without delay. Our research indicates that the most promising way to achieve the necessary automation of telecom software (SW) delivery and operation processes is through the adoption of a declarative, GitOps-based approach.

This article outlines the evolution from today’s imperative pipelines that automate individual procedures to a holistic declarative approach for delivery and deployment automation and network management. To separate the concerns between network-function operations and their realization, our concept splits life-cycle management into a functional domain and a realization domain, and introduces new event-driven automation on top of the declarative deployment functionality.

Authors: Peter Wörndle, Stephen Terrill, Torsten Dinsing

API – Application Programming Interface
BSS – Business Support Systems
CaaS – Containers-as-a-Service
CI/CD – Continuous Integration/Continuous Deployment
CNA – Cloud-native Application
CNF – Cloud-native Network Function
CSP – Communication Service Provider
E2E – End-to-End
LCM – Life-cycle Management
NF – Network Function
OSS – Operations Support Systems
SW – Software


An always-up-to-date software (SW) base will be a huge advantage for communication service providers (CSPs) by making it possible to rapidly introduce new features and updates that open up new business opportunities, improve efficiency and proactively address security threats. To make it happen, greater automation of SW delivery and operational processes will be essential.

Current telecom SW delivery and operation processes – in which even the smallest updates must undergo the same semi-frequent, complex processes as large packages that deliver new functionality – are simply too complicated, manual and time-consuming to support an always-up-to-date SW base. To ensure efficient delivery and deployment of network and operations support systems/business support systems (OSS/BSS) functions in the future, CSPs need to evolve their processes and tools to fit the new application architectures and incorporate the use of new technologies.

At Ericsson, we believe that the most effective approach to achieving the necessary automation of telecom SW delivery and operation processes is by adopting continuous integration/continuous deployment (CI/CD) best practices for SW delivery pipelines. Further, we recommend a microservice-based architecture of componentized cloud-native network functions (CNFs), which enables more targeted SW modifications, while also making it possible to maintain a larger number of SW artifacts. Evolving the application architecture and the underlying platform in this way also opens up the opportunity to use cloud-native best practices for life-cycle management (LCM).

With a DevOps mindset as our starting point, the automation efforts we present in this article focus on the implementation of a continuous SW flow and the creation of feedback loops between organizations that develop SW and those that operate it. Our concept separates LCM into the functional and realization domains and introduces new automation on top of the declarative deployment functionality, which makes it possible to separate the concerns between the network function (NF) operations and their realization. This approach is already widely adopted in various IT domains and ecosystems and therefore provides a solid baseline for adoption in the telecom industry.

GitOps is a way of working that applies development best practices to software automation. It uses a declarative approach that forms the basis of continuous everything, bringing benefits such as a version-controlled change history, automated reconciliation of the deployed state and simplified recovery through rollback to an earlier desired state.

As an emerging best practice, DevOps is starting to replace existing processes. During the past decade, there has been a technology evolution among CSPs from physical NFs toward virtual NFs, CNFs and cloud-native applications (CNAs). This has brought with it new LCM characteristics that have enabled and driven a coevolution from existing ways of working [1] to new practices. Most notably, development teams are now able to observe SW performance in live systems and quickly update CNAs in the case of unwanted behavior [2].

When CNAs are developed according to the 12-factor application principles [3], microservice in-service SW updates and the LCM of the underlying Kubernetes cluster and infrastructure do not impact the service provided by the running NFs or applications. This is because, in the cloud-native paradigm, a product is composed of microservices that each have their own independent life cycles. As a logical consequence of this, the life cycle of the CNA becomes decoupled from the life cycle of the service it provides. The service will continue while the underlying deployed SW can evolve and change.

Each microservice in a CNA is represented by a number of accompanying artifacts such as the container image, Helm charts, Flux manifests and a deployment configuration in the form of a values.yaml file. These artifacts make it possible to deploy a microservice initially as part of a CNF. The same mechanism can be applied to post-deployment configuration, which is commonly referred to as Day-1/n configuration. The relationships and customizations of these artifacts can be formulated in a declarative description that explains what should be deployed in a Kubernetes cluster.
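To make this concrete – a minimal sketch in which every name, version and value is hypothetical rather than taken from the article – a Flux HelmRelease can tie a microservice’s Helm chart and its values.yaml-style deployment configuration together into a single declarative description:

```yaml
# Hypothetical example: declares that the "amf" chart from an onboarded
# Helm repository should be deployed with a specific values overlay.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: amf
  namespace: core-network
spec:
  interval: 10m            # how often Flux re-evaluates the desired state
  chart:
    spec:
      chart: amf
      version: "1.4.x"     # semver range; the newest matching chart wins
      sourceRef:
        kind: HelmRepository
        name: vendor-artifacts
  values:                  # Day-1/n configuration, as in values.yaml
    replicaCount: 3
```

Applying this manifest to a cluster running Flux is all that is needed; the operator resolves the chart, renders it with the values and keeps the deployment in sync.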

There are two key characteristics that enable the evolution of DevOps best practices and CI/CD pipelines to new GitOps ways of working: a declarative description of the desired system state kept in a version-controlled repository, and automated reconciliation of the actual system state against that description.

The separation between the realization domain (how the service is realized) and the functional domain (the service itself) that is illustrated in Figure 1 makes it possible to decouple their life cycles and therefore also decouple the life-cycle automation between them. Most importantly, this means that the SW life-cycle automation in the realization domain becomes a more generic issue that can be solved with applicable tooling from the cloud-native ecosystem. (Note that this approach is equally applicable to NFs and applications from OSS/BSS, which do not differ on the technology side.)

Figure 1: Separation of functional management from realization management

The functional domain is responsible for modeling and managing the functionality of a network service, determining which functions will be necessary to provide it. The realization domain, on the other hand, is responsible for the LCM of the artifacts that realize the functional components. Examples of artifacts in the realization domain are container images, Helm charts, Kubernetes manifests and configuration artifacts. The realization domain also manages the resources that the artifacts need in terms of compute, memory, storage and networking.

Figure 2 depicts our proposal for an architecture based on the separation of realization and functional domain concerns. The demarcation between the domains is represented by the runtime repository, which contains the desired system state for the realization domain. The deployment operator ensures that the desired system state is applied to the target system: a Kubernetes cluster, a CNF or any other application. It enables application deployment through automatic reconciliation between the declarative description (desired deployment) and the actual deployment on the infrastructure, so that both are in constant sync.

Figure 2: Target architecture for declarative deployments

The role of management, orchestration and assurance tools in this architecture is to determine the optimal desired system state based on the desired state of the entire network supporting the network services, optimization targets and the present network state. Whenever a change of the desired system state is needed, the tools ensure a syntactically and semantically correct representation of the new desired state and store it in the runtime repository.

The architecture also establishes a clear demarcation line between vendor deliverables and the functional and realization domains for operations. The main role of the SW supplier is to establish a constant stream of updated and validated combinations of artifacts into the CSP artifact repositories, enabling the CSP to access the latest NF SW at any time. Delivery pipelines and onboarding services ensure the authenticity and integrity of the onboarded artifacts and allow the service delivery organization(s) to supply complementary artifacts.

The onboarding pipeline is triggered by the availability of new or changed artifacts in the artifact repository and facilitates their processing in the management, orchestration and assurance tools, or alternatively, makes them available in the runtime repository directly.

It is the responsibility of the management and orchestration system to manage the deployed services and resources over their entire life cycle, ensuring that they meet the necessary service requirements efficiently. As shown in Figure 1, in a declarative management model the functional LCM is separated from the realization LCM. In this scenario, the management and orchestration system is responsible for the functional LCM and provides the realization management with the desired state of the deployed system.

The management and orchestration tools gradually resolve a high-level service model or service intent, which can be declaratively described, until a realizable desired state is reached. This desired state is then pushed into a runtime repository, from which the deployment operator reconciles the state and starts modifying the target systems to reach the desired state. The runtime repository reflects the demarcation between the management aspect of the functional domain and the realization domain.

As deployment operators act on specific target systems with limited context, the management and orchestration tools need to determine context information, such as placement or initial size and size limits of a CNF, and encode this information into the desired state.

The dynamic nature of telecom systems requires constant reevaluation of the desired system state against the key operational characteristics of the network. With the support of the assurance systems, the management and orchestration tools provide the control loop that evaluates and determines the desired state. To support this, the management and orchestration system places policies into the repositories and the deployment operator so that it is notified of significant state changes – for example, when reconciliation fails or when the actual system state drifts from the desired state.

There are two separate repositories in our proposed architecture: the artifact repository and the runtime repository.

An artifact repository is a harmonized landing point for vendor artifacts that continuously receives all the latest releases. Depending on the type of artifacts in question, different repository types are used, including Open Container Initiative-compliant registries (to store container images and Helm charts), Git repositories (to store text-based artifacts) and object stores (to store arbitrary larger binary files).

Vendor-specific delivery pipelines are required to adapt to the vendor’s delivery process. A CSP may use different instances of the same artifact repository implementation for different vendors or purposes. Meanwhile, a vendor can make different artifact versions available to CSPs, such as offering prerelease SW for early trials or testing purposes. Tagging features in the artifact repositories can be used to differentiate between the different versions.

In addition to artifacts released by a vendor’s research and development CI/CD process, artifacts created as part of an individual customer project are typically used in a delivery process as well. The artifact repositories can also serve as a landing point for predefined configuration files and customer-specific adaptations, for example. To avoid the addition of complex manual processes, it is possible to automate customization and configuration generation efforts as part of the delivery process, represented by separate CI/CD pipelines.

A runtime repository contains the desired state for the systems in the network. The deployment operator continuously monitors the contents of the runtime repositories and triggers actions if the target system does not reflect the desired state. This is called a reconciliation process.

The desired state is encoded in a declarative format – that is, it describes how the systems should be deployed and configured. It can apply to several areas, such as the container infrastructure (CaaS), the NF and application SW deployed on it, and their configuration.

The deployment operators consuming these state representations do not have to be specific to the area but are specific to the system on which they actuate. For example, a deployment operator actuating on a Kubernetes cluster can consume information for any of the three areas, given that all the areas can be configured using the Kubernetes application programming interface (API).

The runtime repository is version-controlled to provide a change history for the entire system and to allow for auditing. The declarative nature of the approach makes it possible to simplify recovery use cases by restoring an earlier version and then letting the system reconcile to the state described in the earlier version.

A single repository is often used to provide a single source of truth for the entire desired state of the network. Depending on the operational needs, multiple runtime repositories are sometimes used to separate concerns between the areas. In this case, the collection of all runtime repositories represents the single source of truth.

Git is a prominent implementation for a runtime repository that provides version control and allows for a multiuser workflow. Today’s Git solutions such as GitLab or GitHub provide sophisticated workflows on top of the basic Git operations, enabling review, testing and approval flows. Due to its popularity as a runtime repository, a large ecosystem of agents, controllers and other tools has developed around Git. Operations that use Git as a single source of truth in combination with agents and controllers are referred to as GitOps.

The reconciliation of the information stored in the runtime repository with the target system is a critical function in a declarative system. The deployment operator realizes this task by executing a control loop that compares the information in the runtime repository with the actual system state and by continuously working to modify the system state to match the single source of truth.

There are two reasons why a mismatch between the desired system state in the runtime repository and the actual system state can occur: either the desired state has been changed in the runtime repository, or the actual system state has changed – for example, as the result of a failure or a manual intervention on the target system.

In either case, the deployment operator will take corrective actions to align the actual system state with the desired state. As the corrective actions are specific to the target system, deployment operators are often designated to a specific target system.

Flux [4] is an example of a deployment operator that operates on one or more Kubernetes clusters as a target system. It can use several types of sources but is mostly used in combination with a Git repository as a runtime repository. Flux can actuate on any object that can be described as a Kubernetes object.
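A minimal sketch of this setup, with all repository URLs, paths and names hypothetical: a Flux GitRepository source points at the runtime repository, and a Flux Kustomization reconciles one of its folders into the cluster:

```yaml
# Hypothetical Flux configuration: watch a Git runtime repository and
# reconcile the manifests under ./clusters/prod into the cluster.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: runtime-repo
  namespace: flux-system
spec:
  interval: 1m                    # polling interval for new commits
  url: https://git.example.com/csp/runtime-repo
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: prod-cluster
  namespace: flux-system
spec:
  interval: 10m                   # periodic drift detection
  sourceRef:
    kind: GitRepository
    name: runtime-repo
  path: ./clusters/prod
  prune: true                     # delete objects removed from the repo
```

With prune enabled, deleting a manifest from the repository also removes the corresponding object from the cluster, so the repository remains the single source of truth in both directions.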

Kubernetes is a prime example of a management system that follows a declarative model. The Kubernetes API uses the Kubernetes Resource Model [6] to describe any resource accessible through the Kubernetes API server. The declarative resource descriptions are stored in the internal Kubernetes state database. Kubernetes controllers monitor specific resource types in a continuous loop. Whenever the desired resource state in the database is changed, the controllers act by creating or terminating pods or by modifying network policies.

Custom resources make it possible to introduce new declarative resource descriptions for arbitrary objects in Kubernetes. Custom controllers, or Kubernetes operators, monitor the custom resources and take arbitrary actions based on the information stored in the desired resource state.
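As an illustration of this extension mechanism – the resource type and all of its fields below are invented for the example – a custom resource definition and an instance of it could look as follows:

```yaml
# Hypothetical custom resource definition: lets operators describe a
# "PacketGateway" declaratively; a custom controller would reconcile it.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: packetgateways.cnf.example.com
spec:
  group: cnf.example.com
  scope: Namespaced
  names:
    kind: PacketGateway
    plural: packetgateways
    singular: packetgateway
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                capacityGbps:
                  type: integer
---
# An instance of the custom resource – the desired state the custom
# controller continuously works to realize.
apiVersion: cnf.example.com/v1alpha1
kind: PacketGateway
metadata:
  name: pgw-east
spec:
  capacityGbps: 40
```

Because the instance is just another Kubernetes object, it can be stored in the runtime repository and reconciled by the same deployment operator as any built-in resource.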

The declarative, repository-based management approach provides efficient automation for configuration and SW handling of a target system. In many deployments, the approach is enhanced by further automation capabilities to facilitate review, security scanning, approval processes, health checks and testing at different stages. Figure 3 illustrates our concept for a declarative delivery and deployment automation flow.

Figure 3: Declarative delivery and deployment automation flow

Compared to the current imperative end-to-end (E2E) pipelines that describe long processes, a repository-based management approach enables smaller imperative actions (automation jobs) that can be triggered by different events in the orchestration and SW LCM flow. Some automation jobs will be generic to the target system, such as a security-scanning job that is triggered when new SW artifacts become available. Other automation jobs may be specific to a target system, such as a health-check job for a specific function. Smaller imperative automation jobs specific to a target system can be provided by the SW supplier, while others, specific to the CSP’s operational procedures, will be developed by the CSP itself.

Health checks are also used at different stages in the SW life-cycle process, such as checking system health before modifying the desired state and verifying system health after reconciliation. In the first case, the orchestration system issues the event that triggers the health check job prior to modifying the runtime repository. In the second case, the health-check triggering event is issued by the deployment operator that successfully completes the reconciliation.
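One possible realization of the post-reconciliation check is Flux’s declarative health checks, which make the Kustomization wait for the listed workloads to become ready after the desired state has been applied (a sketch; all names and paths are hypothetical):

```yaml
# Hypothetical: after reconciling ./core-network, Flux waits up to the
# timeout for the listed Deployment to become ready and reports the
# Kustomization as failed otherwise.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: core-network
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: runtime-repo
  path: ./core-network
  prune: true
  timeout: 5m
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: amf
      namespace: core-network
```

The resulting ready/failed status is itself observable and can serve as the event that triggers further, function-specific health-check jobs.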

Events can trigger different types of deployment automation jobs or chains of automation jobs (pipelines). Event-to-action resolution is required to address the CSP’s operational needs with respect to flexibility and configurability. CDEvents [5], a common specification that provides a standard way to describe Continuous Delivery events, is a powerful enabler for flexible event systems.
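For illustration only – the exact schema is defined by the CDEvents specification, and all identifiers below are hypothetical – an artifact-published event carries a context that names the event type and source, and a subject that identifies the artifact:

```yaml
# Hypothetical CDEvent announcing a newly published artifact; an event
# listener could map it to an onboarding or security-scanning job.
context:
  version: "0.4.1"                          # CDEvents spec version
  id: 271069a8-fc18-44f1-b38f-9d70a1695819  # unique event id
  source: /vendor/delivery-pipeline
  type: dev.cdevents.artifact.published.0.1.1
  timestamp: "2023-02-28T12:00:00Z"
subject:
  id: pkg:helm/vendor-artifacts/amf@1.4.2   # artifact identified by purl
  source: /vendor/delivery-pipeline
  type: artifact
  content: {}
```

Because the event type is standardized, the event-to-action resolution can be configured per type rather than per vendor pipeline.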

Further input to the management and orchestration system comes from the network and OSS/BSS functions in the form of logs and performance data. This input can be used to further adjust the declarative description of the desired state to meet expectations of the functional domain. This data may potentially be shared with the SW vendor in an anonymized form as part of a DevOps feedback loop.

A common pattern in a repository-based management is to apply a multistage deployment or rollout process. Each stage relies on the same basic architecture with a single source of truth. If the same repository is used for several stages, replicating configurations between environments can become as simple as copying files from one folder to another. Typically, the promotion in between stages is automated and follows a dedicated approval process. Like other DevOps processes, every stage covers different aspects of the rollout, such as testing, for example. It is crucial that the further the process advances, the closer the environment is to the production environment in terms of hardware, surrounding applications, SW version and so on.
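Under a single-repository setup, the stages can be sketched (paths and names hypothetical) as deployment-operator configurations that reconcile different folders of the same runtime repository, so that promotion amounts to copying files between folders plus the approval workflow:

```yaml
# Hypothetical staging/production split: both stages reconcile from the
# same Git source but from different paths; promotion copies files from
# ./stages/staging to ./stages/production via a reviewed merge request.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: staging
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: runtime-repo
  path: ./stages/staging
  prune: true
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: production
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: runtime-repo
  path: ./stages/production
  prune: true
```

Keeping both stages in one repository gives a single change history across the whole rollout, while separate repositories per stage remain an option where stricter separation of concerns is needed.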

At present, imperative pipelines are often used to automate either all or parts of delivery and deployment processes. As a first step in transitioning to a declarative approach, we recommend the modularization of existing E2E pipelines into reusable automation jobs in order to gain the flexibility needed to introduce a new deployment mechanism. The second step we recommend is to replace the existing deployment stage with a declarative approach by introducing a GitOps solution. Modularization in combination with the GitOps-based deployment enables a loose coupling between onboarding, deployment and validation automation and allows for the reuse of automation jobs in the different pipeline fragments.

Using the modularization, the onboarding pipeline stages in E2E pipelines are split into two main fragments: the ingestion of artifacts into artifact repositories and the triggering of system-specific onboarding actions. While the former is often specific to the vendor deliverables and packaging formats, the latter is specific to the target management systems. The modularization of imperative E2E pipelines and the triggering of jobs based on events and explicit actions increase the reusability of pipeline fragments and simplify the adaptation to different sets of components of the management system.

Continuous service improvements and innovation are essential to help communication service providers (CSPs) stay competitive in an evolving landscape. The ever-expanding Cloud Native Computing Foundation ecosystem, characterized by cloud-native applications and evolving life-cycle management (LCM) practices and tooling, demands a change in the management approach for network services and functions, both in terms of tools and operational flows. CSPs must, therefore, move toward continuous deployment, where each updated microservice is validated in each staging step. A split of the management space into functional and realization domains makes it possible to separate the life cycles of the two domains, as well as their paces.

The introduction of a declarative runtime repository creates a clear demarcation between the domains. The separation enables CSPs to adopt new LCM practices, such as GitOps, and tap into the wide ecosystem of generic software life-cycle automation capabilities in the cloud-native realization domain. CSPs can independently build fit-for-purpose automation in line with their business processes and needs in the functional domain, and adopt a life cycle that matches the needs of their businesses, focusing on the key values and characteristics that their networks provide to customers.

The modular pipeline concept, with deployment automation jobs triggered by events from both domains, complements the declarative management approach and allows CSPs to continue to benefit from imperative pipelines matched to their process needs. This approach allows them to evolve their internal business processes and automation independently from the underlying software LCM.


Peter Wörndle is a senior expert in deployment architectures whose work focuses on the use of cloud and infrastructure technologies in different types of management ecosystems. He joined Ericsson in 2007 while still at university. Since then, he has held several positions in R&D within the area of virtualization and cloud. Wörndle holds an M.Sc. (Dipl.-Ing.) in electrical engineering and information technology from RWTH Aachen University in Germany.

Stephen Terrill is a senior expert and chief architect in automation and management at Ericsson. Since joining the company in 1994, he has worked primarily in telecommunications architecture, implementation and industry engagement. In recent years, his work has focused on the automation and evolution of OSS. Terrill holds an M.Eng.Sc. from the University of Melbourne, Australia.

Torsten Dinsing joined Ericsson in 2000 and currently serves as a senior expert in service architecture at Group Function Technology and Strategy. In recent years, his work has focused on the interface between R&D and CSP organizations and the application of new practices such as DevOps, CI/CD and GitOps for LCM and orchestration. Dinsing holds an M.Sc. (Dipl.-Ing.) in electrical engineering from RWTH Aachen University.
