ControllerMesh — Empowering Effortless Operator Management

KusionStack
5 min read · Oct 24, 2023

As Kubernetes developers, we use Operators to extend Kubernetes capabilities and automate the deployment and management of applications. However, as clusters grow in scale, Operators face numerous stability and operational challenges.

Problems and Challenges

An Operator instance is stateful, so a leader-election mechanism is used to ensure that only one Pod handles the reconciliation tasks triggered by K8s resource change events at runtime. This election model introduces a single point of failure, which is simply unacceptable for core Operators that demand high availability. Specifically, it leads to the following issues:

Stability issues:

  • The full load falls on a single instance, which therefore demands large compute, storage, and network resources all at once.
  • Startup time grows because the instance must list resources across the entire cluster, and in some cases it may even fail to start.
  • A burst of change events from a subset of resources can degrade reconciliation efficiency for all resources, making fairness hard to guarantee.

Operation issues:

  • Inability to perform canary upgrades.
  • Inability to horizontally scale.

What we want

To address these issues, Operators must possess the following capabilities:

Canary Upgrade. During a rollout, the new and old versions coexist and must be isolated from each other. The new version of the Operator can then be validated within a small scope and gradually expanded, enabling effective risk control.

Canary Upgrade

Horizontal Scaling. Operators should be able to partition the cluster horizontally into multiple shards that are isolated from one another, and to adjust the sharding freely as needed.

Horizontal Scaling

ControllerMesh

To achieve these capabilities, we took inspiration from the Istio service mesh architecture and developed ControllerMesh for Operators. Like Istio, ControllerMesh is divided into a control plane and a data plane.

The control plane handles configuration management and distribution. The data plane is a group of proxy sidecars deployed alongside each Operator instance, responsible for communication with the kube-apiserver.

This architecture is transparent and non-intrusive to Operators: any Operator built on the community controller-runtime can integrate with it easily, and cross-language use is also supported. Just like Istio, container injection and request proxying are enabled by a simple labeling step, as sketched below.
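For illustration, opting an Operator workload into the mesh might look like the following sketch. The label key here is a placeholder rather than a verified part of the ControllerMesh API; consult the project documentation for the exact key.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-operator
  namespace: operators
spec:
  replicas: 3
  serviceName: my-operator
  selector:
    matchLabels:
      app: my-operator
  template:
    metadata:
      labels:
        app: my-operator
        # Hypothetical opt-in label: tells the mesh to inject the proxy
        # sidecar and route this Pod's apiserver traffic through it.
        ctrlmesh.kusionstack.io/enable-proxy: "true"
    spec:
      containers:
        - name: manager
          image: example.com/my-operator:v1.0.0
```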

ControllerMesh Architecture

With this architecture, the sidecar proxies intercept all requests, including those to the kube-apiserver. This makes it easy to apply protective measures such as circuit-breaking and rate-limiting strategies, and to manipulate apiserver-bound requests effortlessly.

The proxy containers inject LabelSelector parameters into List & Watch requests, restricting which K8s resources each Operator instance caches and watches. The ControllerMesh manager automatically maintains a set of predefined labels and sends a different rule configuration to each Operator instance, enabling the sharded deployment of Operators.
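Conceptually, the rewrite performed by the proxy looks like this; the shard label key and value are illustrative placeholders, not the mesh's actual label names:

```
# List & Watch request issued by the controller's informer:
GET /api/v1/pods?watch=true

# The same request after the sidecar injects the shard's label selector
# (label key/value stand in for the predefined labels the manager maintains):
GET /api/v1/pods?watch=true&labelSelector=ctrlmesh.kusionstack.io/shard%3Dshard-1
```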

Configuration Example

All Operator-related configurations are implemented as Custom Resource Definitions (CRDs); the manager watches for configuration change events and dynamically distributes them to the proxy containers.

ShardingConfig

With a ShardingConfig, users can configure a basic sharding strategy via spec.root, which automatically partitions the cluster's K8s resources into several shards by namespace and allows a subset of namespaces to be designated for canary validation. In addition, spec.limits lets users define more fine-grained strategies for each shard.

ShardingConfig demo
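A minimal ShardingConfig sketch follows. Only spec.root and spec.limits come from the description above; the API group and every nested field name are assumptions, so treat this as a shape rather than an exact schema:

```yaml
apiVersion: ctrlmesh.kusionstack.io/v1alpha1  # API group is an assumption
kind: ShardingConfig
metadata:
  name: sharding-root
  namespace: operators
spec:
  # spec.root: automatic namespace-based sharding with a canary subset.
  root:
    prefix: my-operator
    targetStatefulSet: my-operator
    # Reserve one replica to reconcile only the canary namespaces.
    canary:
      replicas: 1
      inNamespaces: ["demo-canary"]
    # Split the remaining namespaces across shards automatically.
    auto:
      everyShardReplicas: 2
      shardingSize: 2
  # spec.limits: optional fine-grained per-shard rules (omitted here).
```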

CircuitBreaker

In addition to basic REST-matching restriction policies, the circuit-breaking and rate-limiting configuration can impose specific limits on an Operator's resource operations: for example, it can restrict the rate at which Pods are deleted, or circuit-break Pod deletion altogether.

CircuitBreaker demo
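A sketch of a CircuitBreaker resource that rate-limits Pod deletion. The field names below are illustrative assumptions, not the verified schema:

```yaml
apiVersion: ctrlmesh.kusionstack.io/v1alpha1  # API group is an assumption
kind: CircuitBreaker
metadata:
  name: pod-delete-breaker
  namespace: operators
spec:
  rateLimitings:
    - name: limit-pod-deletion
      # Match DELETE requests on Pods issued by this Operator.
      resourceRules:
        - apiGroups: [""]
          resources: ["pods"]
          verbs: ["delete"]
      # Allow at most 10 deletions per minute; beyond that, the proxy
      # rejects the request before it reaches the apiserver.
      bucket:
        burst: 10
        limit: 10
        interval: 1m
      triggerPolicy: LimiterOnly
```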

Rollout

When a K8s workload is rolled out, the rollout does not consider whether a Pod is the leader; it simply upgrades Pods according to the predefined strategy. In the worst case, the number of primary-backup switches equals the number of replicas, which increases instability during the deployment process.

After integrating with ControllerMesh, users gain control over the Operator's deployment process. They can specify which shards to deploy, and even automate the deployment in shard order. Within each shard, non-leader Pods are upgraded first, ensuring that only one leader-backup switch occurs during the deployment.

This shard-based progressive deployment strategy effectively controls the risk of releasing a new version and reduces the blast radius of anomalies.

Contributions Welcomed

Through transparent request proxying, we can easily build many more capabilities, such as fault injection and observability, which are also part of our future plans. As the project is still in its early stages, we warmly welcome fellow enthusiasts to join the community development and contribute their valuable suggestions!

Welcome to star ⭐️ and join us:

ControllerMesh: https://github.com/KusionStack/controller-mesh

Operating: https://github.com/KusionStack/operating

Website: https://kusionstack.io

