Build a Kubernetes Controller to Watch CRD Changes

In the rapidly evolving landscape of cloud-native computing, Kubernetes has firmly established itself as the de facto operating system for the data center, orchestrating containers and managing distributed applications with unparalleled efficiency. Yet, even with its extensive set of built-in resources—such as Deployments, Services, and Pods—Kubernetes cannot possibly anticipate every unique application requirement or operational pattern across the myriad industries it serves. This inherent limitation necessitates a powerful extensibility mechanism, allowing users to tailor and augment Kubernetes to their specific needs, transforming it from a general-purpose orchestrator into a bespoke platform for their workloads.

This quest for extensibility is precisely where Custom Resource Definitions (CRDs) and the associated Kubernetes Controllers come into play. CRDs provide a method to extend the Kubernetes API with custom resources, allowing users to define new types of objects that represent their domain-specific concepts, much like built-in resources. However, defining a new resource is only half the battle; these new resources, once created, need active management. This is the domain of Kubernetes Controllers, which continuously observe the state of these custom resources, comparing them against a desired state, and taking corrective actions to reconcile any discrepancies.

This article embarks on an exhaustive journey to explore the intricate process of building a Kubernetes Controller specifically designed to watch for changes in Custom Resource Definitions. We will delve deep into the foundational concepts, dissect the architecture of a robust controller, guide you through the practical implementation using popular tools and libraries, and arm you with advanced strategies and best practices. By the end of this comprehensive guide, you will possess a profound understanding of how to extend Kubernetes with your own application-specific logic, enabling powerful automation, enhancing operational efficiency, and unlocking a new dimension of cloud-native development. This mastery will empower you to create self-managing systems that not only deploy but also intelligently operate your applications, adapting to changes and maintaining their desired state autonomously, thereby significantly simplifying the management of complex, distributed architectures.

Part 1: Understanding the Cloud-Native Landscape and Kubernetes Extensibility

Kubernetes, often lauded as the "operating system for the cloud," fundamentally changed how developers and operations teams deploy, scale, and manage applications. Its declarative API, robust scheduling capabilities, and self-healing properties provide an incredibly powerful foundation. However, the true strength of Kubernetes lies not just in its default capabilities but in its profound extensibility.

Kubernetes as an Operating System for the Cloud

At its core, Kubernetes provides a robust framework for managing containerized workloads and services. It abstracts away the underlying infrastructure, allowing developers to focus on application logic rather than the complexities of distributed systems. Its declarative model means users describe the desired state of their applications, and Kubernetes tirelessly works to achieve and maintain that state. This involves intricate coordination of pods, deployments, services, volumes, and network policies across a cluster of machines. The Kubernetes API server acts as the central control plane, receiving and processing all requests, making it the single source of truth for the cluster's state.

The beauty of Kubernetes is that it doesn't try to be a monolithic solution for everything. Instead, it offers powerful primitives and a well-defined API that can be extended and customized. This modularity is crucial for handling the vast and diverse requirements of modern applications, which often span multiple domains and technologies. Without extensibility, Kubernetes would quickly become a bottleneck, forcing users to fit their unique problems into its predefined molds, which is rarely optimal.

The Need for Extensibility: Why Built-in Resources Aren't Enough

While Kubernetes provides a rich set of built-in resources for common patterns—like Deployment for managing stateless applications, StatefulSet for stateful ones, and Service for network access—these resources are designed for general-purpose use cases. They are incredibly effective for the vast majority of standard microservices architectures. However, real-world applications often involve highly specialized components, complex operational logic, or domain-specific concepts that don't map cleanly to these generic primitives.

Consider an application that requires a specific type of database cluster, perhaps with unique backup, recovery, or scaling procedures. Or imagine an AI/ML workload that needs specialized hardware, data pipelines, and model deployment strategies. While you could orchestrate these components using a combination of existing Deployments and Services, the operational burden of managing their lifecycle, health, and upgrades manually would be significant. You'd be responsible for writing imperative scripts, monitoring disparate resources, and reacting to changes, effectively circumventing Kubernetes' declarative and self-healing principles.

This is precisely where the need for extensibility arises. When the operational logic for a component becomes complex and integral to the application's lifecycle, it becomes advantageous to embed this logic directly into Kubernetes itself. This means treating your application's unique components as first-class citizens within the Kubernetes ecosystem, managed by the same declarative API and control loop paradigms that govern native Kubernetes resources.

Introduction to Custom Resources (CRs) and Custom Resource Definitions (CRDs)

Custom Resources (CRs) are extensions of the Kubernetes API that are not necessarily available in a default Kubernetes installation. They represent instances of your custom object types. Think of a CR as an instance of a database cluster, a machine learning model, or a custom messaging queue – a unique entity that your application needs to function, but which Kubernetes doesn't natively understand.

To make Kubernetes understand a new custom resource, you first need to define its schema and capabilities. This is done through a Custom Resource Definition (CRD). A CRD is itself a Kubernetes resource that defines a new kind of resource in the Kubernetes API. When you create a CRD, you're essentially telling the Kubernetes API server: "Hey, I'm introducing a new type of object with this name, these fields, and this validation schema. Please make it available."

CRD Structure and Schema Validation

A CRD manifest typically specifies:

  • apiVersion and kind: Standard Kubernetes API object identifiers. For CRDs, apiVersion is usually apiextensions.k8s.io/v1 and kind is CustomResourceDefinition.
  • metadata: Standard object metadata (name).
  • spec: This is the most crucial part, defining the behavior and schema of your new custom resource.
    • group: The API group your custom resource belongs to (e.g., stable.example.com). This helps organize and avoid naming conflicts.
    • version: The version of your custom resource (e.g., v1alpha1, v1). In apiextensions.k8s.io/v1 this is expressed through the versions array described below, and a CRD can support multiple versions.
    • scope: Whether your custom resource instances are namespace-scoped (Namespaced) or cluster-scoped (Cluster).
    • names: Defines the singular, plural, short names, and kind for your custom resource, used for API endpoints and kubectl commands.
    • versions: An array allowing you to define multiple versions of your CRD, each with its own schema. Multiple versions can be served at the same time, but exactly one must be marked as the storage version.
    • schema (openAPIV3Schema): This is the backbone for validating your custom resources. It uses an OpenAPI v3 schema to define the structure, data types, required fields, and constraints for the .spec and .status fields of your custom resources. This ensures that any custom resource created conforms to the expected format, preventing malformed objects and improving data integrity. For example, you can specify that a field must be an integer within a certain range or a string matching a particular regular expression.
    • subresources: Allows defining /status and /scale subresources, which are important for efficiency and standardized scaling practices. The /status subresource allows controllers to update the status of a CR without incrementing the object's generation, preventing unnecessary reconciliation loops.
    • conversion: Defines how objects of one version of a CRD are converted to another version, especially important when deprecating or introducing new API versions.

When you create a CRD, Kubernetes automatically generates API endpoints for it. For instance, if your CRD is named myapps.stable.example.com with kind MyApp, you can then create instances of MyApp using kubectl create -f myapp-instance.yaml or kubectl get myapps. The API server will store these custom resources in its etcd database, just like any other Kubernetes object.

How CRDs Extend the Kubernetes API

CRDs fundamentally extend the Kubernetes API in several ways:

  1. New Object Types: They introduce new, first-class object types into the API. This means you can interact with your custom resources using standard kubectl commands (e.g., kubectl get myapp, kubectl describe myapp) and client libraries, just as you would with native resources.
  2. Declarative Management: Your custom resources become part of the declarative desired state of your cluster. You define what you want, and Kubernetes works to make it happen.
  3. Schema Enforcement: The OpenAPI v3 schema specified in the CRD ensures that custom resources adhere to a predefined structure, providing strong data validation and consistency.
  4. Integration with Kubernetes Ecosystem: Custom resources can be referenced by other Kubernetes resources (e.g., a Deployment might reference a MyApp resource), and they can trigger events that other Kubernetes components (like controllers) can react to.
  5. Extensible Client Libraries: Client libraries (like client-go in Go) can work with your custom resources through generated typed clients or through the dynamic client, making it easy to interact with them programmatically (see the sketch after this list).
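
To illustrate the last point, here is a hedged sketch that uses client-go's dynamic client to list instances of a custom resource without compiled-in Go types. The stable.example.com/v1 group and the myapps resource name are assumptions matching the MyApp example designed later in this article, and the kubeconfig is loaded from the default location.

package main

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/client-go/dynamic"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    // Load kubeconfig from the default location; adjust for your environment.
    config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }

    dynClient, err := dynamic.NewForConfig(config)
    if err != nil {
        panic(err)
    }

    // GroupVersionResource for the MyApp CRD used as an example in this article.
    gvr := schema.GroupVersionResource{Group: "stable.example.com", Version: "v1", Resource: "myapps"}

    // List all MyApp custom resources in the "default" namespace as unstructured objects.
    list, err := dynClient.Resource(gvr).Namespace("default").List(context.TODO(), metav1.ListOptions{})
    if err != nil {
        panic(err)
    }
    for _, item := range list.Items {
        fmt.Println(item.GetName())
    }
}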

The Role of Controllers in Kubernetes

While CRDs provide the blueprint for new custom resources, they are passive definitions. They tell Kubernetes what a new resource looks like, but not how to manage it. This is where Kubernetes Controllers come in. A controller is a control loop that watches the shared state of the cluster through the API server and makes changes attempting to move the current state towards the desired state.

The Control Loop / Reconciliation Loop Concept

The core principle behind all Kubernetes controllers is the "control loop," often referred to as the "reconciliation loop." This loop continuously performs three fundamental steps:

  1. Observe: The controller monitors a specific set of resources in the Kubernetes API server for changes. This includes watching for new creations, updates, or deletions of resources it is responsible for.
  2. Analyze: When a change is detected, or at regular intervals, the controller retrieves the current state of the relevant resources from the API server. It then compares this "current state" with the "desired state" defined in the resource's specification.
  3. Act: If a discrepancy is found between the current and desired states, the controller takes corrective actions. These actions might involve creating new resources (e.g., a Deployment), updating existing ones (e.g., scaling replicas), deleting obsolete resources, or interacting with external systems. The goal is always to bring the current state into alignment with the desired state.

This loop runs continuously, providing the self-healing and automation capabilities that are hallmarks of Kubernetes. It's an eventually consistent system: given enough time and no new changes, the current state will converge to the desired state.
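
As a purely illustrative, in-memory sketch of the observe/analyze/act cycle (no real Kubernetes API involved), the toy loop below nudges a current replica count toward a desired one until they converge; a real controller performs the same cycle against the API server and reacts to events rather than iterating on a timer.

package main

import (
    "fmt"
    "time"
)

// appState is a toy, in-memory stand-in for cluster state; a real controller reads and
// writes this state through the Kubernetes API server.
type appState struct {
    desiredReplicas int
    currentReplicas int
}

func main() {
    state := &appState{desiredReplicas: 3, currentReplicas: 1}

    for i := 0; i < 5; i++ {
        // Observe: read the latest desired and current state.
        fmt.Printf("observed: desired=%d current=%d\n", state.desiredReplicas, state.currentReplicas)

        // Analyze and Act: if there is a discrepancy, take one corrective step.
        switch {
        case state.currentReplicas < state.desiredReplicas:
            state.currentReplicas++ // e.g. create one more replica
        case state.currentReplicas > state.desiredReplicas:
            state.currentReplicas-- // e.g. remove a surplus replica
        }

        time.Sleep(100 * time.Millisecond) // real controllers are event-driven, not timer-driven
    }
}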

Desired State vs. Current State

Understanding the distinction between desired state and current state is critical:

  • Desired State: This is what the user wants the system to look like. In Kubernetes, this is explicitly defined in the spec field of a resource (e.g., spec.replicas: 3 for a Deployment, or spec.image: "my-app:v1.0" for a custom MyApp resource). It's a declarative statement of intent.
  • Current State: This is the actual, observable state of the system at any given moment. This includes the number of running pods, their IPs, the status of a deployment, or the current version of an external database managed by a controller. Controllers often report the current state back into the status field of a resource (e.g., status.availableReplicas: 2).

The controller's job is to bridge the gap between these two states. If the desired state says "3 replicas" but the current state shows "2 replicas," the controller acts to create one more pod. If the desired state says "delete this resource," the controller ensures all associated child resources are also removed.

Native Kubernetes Controllers (e.g., Deployment Controller)

To illustrate, consider the Deployment controller, one of the most fundamental native controllers in Kubernetes.

  1. Observe: It watches Deployment objects. When a user creates or updates a Deployment manifest, the API server saves it, and the Deployment controller is notified.
  2. Analyze: If a Deployment requests 3 replicas, but the current state (as reflected by ReplicaSet objects and their associated Pods) shows only 2 running pods, a discrepancy exists.
  3. Act: The Deployment controller then creates or updates a ReplicaSet to ensure 3 pods are running. The ReplicaSet controller, in turn, watches for ReplicaSet changes and creates/deletes Pods to match its desired replica count.

This chain of controllers, each managing a specific resource type and working towards a desired state, is what makes Kubernetes so powerful and resilient. When you build a controller to watch your CRD changes, you are essentially creating your own custom control loop, adding another link to this powerful chain, allowing Kubernetes to manage your unique application concepts with the same robustness as its native resources.

Part 2: Deconstructing the Kubernetes Controller Pattern

Building a Kubernetes controller might seem daunting at first, but it follows a well-established pattern built upon a few core components and concepts. Understanding these building blocks is crucial for developing robust and efficient controllers that seamlessly integrate with the Kubernetes ecosystem.

Core Components of a Controller

Every Kubernetes controller, whether native or custom, relies on a set of fundamental interactions and mechanisms to fulfill its reconciliation duties. These include communicating with the API server, efficiently observing resource changes, caching data for performance, and processing events in an orderly fashion.

API Server Interaction: How Controllers Communicate with the Kubernetes API

The Kubernetes API server is the brain of the cluster, serving as the central hub for all communication and state management. Controllers interact with the API server primarily in two ways:

  1. Listing and Getting Resources: Controllers need to read the current state of resources. This involves performing GET requests for individual resources (e.g., a specific Pod) or LIST requests to fetch collections of resources (e.g., all Pods in a namespace). While direct listing is possible, it's generally inefficient for continuous observation, as it would barrage the API server with requests.
  2. Watching Resources: This is the primary and most efficient method for controllers to observe changes. The Kubernetes API server supports a WATCH endpoint for every resource type. When a controller establishes a watch, the API server streams events (Add, Update, Delete) to the controller whenever a change occurs to the watched resource type. This push-based mechanism is far more efficient than polling.
  3. Creating, Updating, and Deleting Resources: To reconcile the desired state with the current state, controllers must be able to modify resources. This includes creating new child resources (e.g., a Deployment controller creating a ReplicaSet), updating existing ones (e.g., scaling a Deployment), or deleting obsolete resources. These operations are performed via POST, PUT, or DELETE requests to the API server.

All these interactions are typically handled through client libraries, abstracting away the underlying HTTP calls and JSON serialization. For Go-based controllers, client-go is the foundational library.
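
The sketch below shows both read patterns against the client-go typed clientset: a one-off LIST of Pods followed by a WATCH stream. It assumes a kubeconfig at the default path; a controller running in-cluster would typically build its configuration with rest.InClusterConfig() instead.

package main

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }
    clientset, err := kubernetes.NewForConfig(config)
    if err != nil {
        panic(err)
    }

    // LIST: fetch the current state once.
    pods, err := clientset.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{})
    if err != nil {
        panic(err)
    }
    fmt.Printf("found %d pods\n", len(pods.Items))

    // WATCH: receive a stream of Add/Update/Delete events instead of polling.
    w, err := clientset.CoreV1().Pods("default").Watch(context.TODO(), metav1.ListOptions{})
    if err != nil {
        panic(err)
    }
    defer w.Stop()
    for event := range w.ResultChan() {
        fmt.Println("event:", event.Type)
    }
}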

Informers: Efficiently Watching Resource Changes

Directly watching resources via the API server's WATCH endpoint is efficient, but building a production-grade controller requires more sophistication. Controllers often need a local cache of resources to perform fast lookups without repeatedly querying the API server. This is where Informers come in.

An Informer is a pattern and a component within client-go that combines watching and caching to provide an efficient and event-driven way for controllers to receive updates about resources.

  • Watch Mechanism vs. List/Get: Instead of continuous List/Get calls, Informers start by performing an initial LIST request to populate a local cache. After this initial synchronization, they establish a WATCH connection to the API server.
  • Shared Informers and Event Handlers (Add, Update, Delete): A SharedInformer listens to the stream of WATCH events. When an event arrives (a resource is Added, Updated, or Deleted), the Informer first updates its local cache and then invokes registered event handlers.
    • AddFunc: Called when a new resource is created.
    • UpdateFunc: Called when an existing resource is modified.
    • DeleteFunc: Called when a resource is removed. These handlers are where your controller's logic typically starts processing an event.
  • Delta FIFO Queue: Internally, Informers use a DeltaFIFO queue to manage events. This queue ensures that events are processed in order and that, for any given object, the most recent state is what gets processed, preventing stale data from causing issues. If the watch connection drops (for example during a network partition or API server restart), the underlying reflector re-lists and re-establishes the watch, and the resulting deltas flow through the same queue, so the controller still converges on the latest state. The DeltaFIFO also compresses and deduplicates events, presenting a cleaner stream to the controller's logic.

Informers are critical for performance and robustness. They reduce the load on the API server by minimizing repeated LIST requests, and they provide a consistent, eventually consistent view of the cluster state through their local cache.
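
A minimal sketch of this wiring with client-go's shared informer factory follows. The handlers here merely print; in a real controller they would only enqueue object keys onto a workqueue, as discussed below.

package main

import (
    "fmt"
    "time"

    corev1 "k8s.io/api/core/v1"
    "k8s.io/client-go/informers"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/cache"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }
    clientset, err := kubernetes.NewForConfig(config)
    if err != nil {
        panic(err)
    }

    // One shared informer factory: a single LIST + WATCH per resource type, shared by all consumers.
    factory := informers.NewSharedInformerFactory(clientset, 30*time.Second)
    podInformer := factory.Core().V1().Pods().Informer()

    // Lightweight event handlers; keep them fast and non-blocking.
    podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        AddFunc: func(obj interface{}) {
            fmt.Println("added:", obj.(*corev1.Pod).Name)
        },
        UpdateFunc: func(oldObj, newObj interface{}) {
            fmt.Println("updated:", newObj.(*corev1.Pod).Name)
        },
        DeleteFunc: func(obj interface{}) {
            fmt.Println("deleted:", obj)
        },
    })

    stopCh := make(chan struct{})
    factory.Start(stopCh)            // begins the initial LIST and the WATCH
    factory.WaitForCacheSync(stopCh) // blocks until the local cache is populated

    select {} // block forever; a real controller ties this to signal handling
}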

Listers: Caching Mechanism for Fast Read Access

Complementing Informers are Listers. A Lister is an interface provided by client-go that allows controllers to perform fast, read-only lookups against the local cache maintained by an Informer.

Instead of making an API call every time the controller needs to check the properties of a resource (e.g., "how many replicas does this Deployment have?"), it can query its Lister. This is incredibly fast because it's reading from local memory. Listers provide methods like Get() to retrieve a single object by name and List() to retrieve all objects of a certain type, often with label selectors.

The combination of Informers (for receiving events and updating the cache) and Listers (for reading from the cache) forms the foundation for efficient, event-driven interaction with the Kubernetes API.
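
Continuing the informer factory sketch above, the snippet below reads from the Lister instead of calling the API server; the pod name and label set are hypothetical placeholders, and the fragment additionally needs the k8s.io/apimachinery/pkg/labels import.

// Listers answer reads from the informer's local cache.
podLister := factory.Core().V1().Pods().Lister()

// Get a single object by namespace and name (the pod name is a placeholder).
if pod, err := podLister.Pods("default").Get("my-web-app-0"); err == nil {
    fmt.Println("cached pod:", pod.Name)
}

// List all cached pods matching a label selector.
selector := labels.SelectorFromSet(labels.Set{"app": "my-web-app"})
if pods, err := podLister.List(selector); err == nil {
    fmt.Printf("%d cached pods match the selector\n", len(pods))
}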

Workqueue: Decoupling Event Handling from Reconciliation

While Informers and their event handlers detect changes, it's generally a bad practice to put complex, blocking reconciliation logic directly inside these handlers. Event handlers should be lightweight and fast, primarily responsible for enqueueing work. The actual reconciliation logic needs to be robust, handle errors, and potentially retry. This decoupling is achieved using a Workqueue.

A Workqueue (specifically, a RateLimitingQueue in client-go) is a thread-safe queue that deduplicates items and guarantees that a given item is processed by only one worker at a time, so multiple reconciliations for the same object never run concurrently while different objects can still be reconciled in parallel.

  • Decoupling Event Handling from Reconciliation: When an Informer's event handler is triggered, instead of directly executing reconciliation logic, it simply adds the key of the affected object (e.g., namespace/name for a namespaced resource) to the Workqueue.
  • Worker Goroutines: A controller typically runs several worker goroutines. Each worker continuously pulls items from the Workqueue, processes them (runs the reconciliation logic), and then marks the item as done.
  • Rate Limiting and Backoff Strategies: The Workqueue is "rate-limiting" because it can be configured to delay retries for failed items. If a reconciliation attempt fails (e.g., due to a temporary network issue or an API server error), the item is put back into the queue with an exponential backoff. This prevents overwhelming the API server or constantly retrying a transient error too aggressively. This also helps in absorbing bursts of events by spreading the reconciliation over time.

This architecture ensures that reconciliation is robust, scalable, and doesn't block the event stream from the API server.
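
Here is a compact sketch of that flow using client-go's rate-limiting workqueue; the reconcile function is a hypothetical stand-in for real reconciliation logic, and newer client-go releases also offer typed queue variants.

package main

import (
    "fmt"

    "k8s.io/client-go/util/workqueue"
)

// reconcile is a hypothetical stand-in for the real reconciliation logic.
func reconcile(key string) error {
    fmt.Println("reconciling", key)
    return nil
}

func main() {
    // A rate-limiting queue: items that fail are re-added with exponential backoff.
    queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
    defer queue.ShutDown()

    // Event handlers only enqueue object keys ("namespace/name"); they never block on real work.
    queue.Add("default/my-web-app")

    // A worker loop: pull a key, reconcile it, and report the outcome back to the queue.
    for {
        key, shutdown := queue.Get()
        if shutdown {
            return
        }
        if err := reconcile(key.(string)); err != nil {
            queue.AddRateLimited(key) // retry later with backoff
        } else {
            queue.Forget(key) // clear any accumulated failure history
        }
        queue.Done(key) // mark this key as finished so new events can re-enqueue it

        // In this toy example the loop then blocks in Get() waiting for more keys.
    }
}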

Reconciliation Loop: The Heart of the Controller

The actual reconciliation logic, which gets executed by a worker goroutine pulling an item from the Workqueue, is the core of your controller. It embodies the "observe, analyze, act" pattern.

A typical reconciliation loop for a specific object (identified by its key from the Workqueue) involves:

  1. Fetching the Resource: Using the Lister (or a direct Get from the API client if not cached) to retrieve the latest version of the custom resource specified by the key.
  2. Handling Deletion: Checking if the resource has a deletion timestamp. If so, and if finalizers are present, perform cleanup logic before removing the finalizer.
  3. Comparing Desired vs. Current State: Analyzing the spec of the custom resource and comparing it with the actual state of any child resources (e.g., Deployments, Services) or external systems it manages.
  4. Taking Action:
    • Creating: If a child resource is missing, create it.
    • Updating: If a child resource exists but its state doesn't match the desired state (e.g., wrong image, wrong replica count), update it.
    • Deleting: If a child resource is no longer desired (e.g., scaled down, removed from the CR's spec), delete it.
  5. Updating Status: After performing actions, update the status field of the custom resource to reflect the current state back to the user. This is crucial for observability.
  6. Error Handling and Retries: If any step fails, return an error from the reconciliation function. The Workqueue will then automatically re-enqueue the item with a delay, allowing the controller to retry.
  7. Idempotency: All actions taken by the controller must be idempotent. This means applying an action multiple times should have the same effect as applying it once. For example, when creating a Deployment, first check if it already exists. If it does, update it; otherwise, create it. This is crucial because controllers might process the same event multiple times due to retries or network issues. A sketch of this check-then-create-or-update pattern follows this list.
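
As referenced in the last step, a common way to get idempotency with controller-runtime is the controllerutil.CreateOrUpdate helper. The sketch below applies it to a ConfigMap; the package name, object name, and data are illustrative placeholders.

package reconcilehelpers

import (
    "context"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// ensureConfig is idempotent: CreateOrUpdate fetches the object if it exists, applies the
// mutate function, and only issues a Create or Update when something actually changed,
// so running it repeatedly converges on the same result.
func ensureConfig(ctx context.Context, c client.Client, namespace string) error {
    cm := &corev1.ConfigMap{
        ObjectMeta: metav1.ObjectMeta{Name: "my-app-config", Namespace: namespace},
    }
    _, err := controllerutil.CreateOrUpdate(ctx, c, cm, func() error {
        if cm.Data == nil {
            cm.Data = map[string]string{}
        }
        cm.Data["greeting"] = "hello" // the desired state, expressed declaratively
        return nil
    })
    return err
}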

The Operator Pattern: Controllers for Complex Applications

While controllers are great for managing individual resources, the Operator pattern extends this concept to manage entire applications. An Operator is a method of packaging, deploying, and managing a Kubernetes application, meaning an application that is both deployed on Kubernetes and managed using the Kubernetes API and kubectl tooling.

  • Beyond Basic Resource Management: Operators go beyond simply managing a few child resources. They encapsulate human operational knowledge about managing a specific application (e.g., a database, a message queue, or even an AI model serving system). This includes complex tasks like:
    • Lifecycle Management: Deploying, upgrading, scaling, and deleting applications.
    • Backup and Restore: Implementing application-specific backup and restore procedures.
    • Failure Recovery: Handling application-specific failures and automatically recovering.
    • Observability: Exposing application-specific metrics and logs.
    • Schema Migration: Managing database schema changes during upgrades.
    • Security Configuration: Managing user accounts, permissions, and network policies for the application.

Essentially, an Operator takes the intelligence of a human operator for a specific piece of software and encodes it into a controller that runs on Kubernetes. It bridges the gap between Kubernetes and application-specific knowledge, making complex stateful applications manageable and self-operating within a cloud-native environment. Many vendors provide Operators for their software (e.g., Elasticsearch Operator, Prometheus Operator, PostgreSQL Operator).

Choosing Your Toolkit

Building a controller from scratch involves interacting with the Kubernetes API, managing caches, and orchestrating work queues. Fortunately, several toolkits simplify this process, abstracting away much of the boilerplate.

client-go: The Fundamental Go Client Library

client-go is the official Go client library for interacting with the Kubernetes API. It provides the low-level primitives for:

  • REST Client: For making direct HTTP requests to the API server.
  • Typed Clients: Generated clients for native Kubernetes resources (e.g., core/v1, apps/v1) that provide Go structs for resources and methods like Get, List, Create, Update, Delete.
  • Dynamic Client: For interacting with arbitrary (including custom) resources without compile-time knowledge of their Go types.
  • Informers and Listers: The building blocks for watching and caching, as discussed earlier.
  • Workqueues: Rate-limiting queues for processing events.

While powerful, building a controller solely with client-go requires significant boilerplate code to wire up all these components correctly. It gives you maximum control but demands a deep understanding of the underlying mechanisms.

controller-runtime: Building Blocks for Controllers

controller-runtime is a library built on top of client-go that provides higher-level abstractions and utilities for building Kubernetes controllers. It significantly reduces the boilerplate and complexity. Its key features include:

  • Manager: A central component that coordinates all controllers, webhooks, and client interactions. It manages client connections, caches, and shared informers.
  • Client: A unified client interface that can perform Get, List, Create, Update, Delete operations on both native and custom resources, leveraging the manager's cache for read operations when possible.
  • Reconciler Interface: A simple interface (Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error)) that encapsulates the core reconciliation logic for a specific resource type.
  • Controller Builder: A fluent API for setting up controllers, specifying what resources to watch (Watches), what resources they own (Owns), and how to handle events. This automatically wires up informers, caches, and workqueues.
  • Webhook Builders: Tools for creating validating and mutating admission webhooks.

controller-runtime strikes a good balance between abstraction and flexibility, making it the preferred choice for many controller developers.
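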

kubebuilder / operator-sdk: Scaffolding, Code Generation, Opinionated Frameworks

For an even more streamlined experience, kubebuilder (developed by the Kubernetes SIG API Machinery) and operator-sdk (from the Operator Framework, heavily leveraging kubebuilder) provide powerful command-line tools for scaffolding, code generation, and opinionated development workflows. They are designed to accelerate controller development significantly.

  • Scaffolding: They generate the basic project structure, main.go, Dockerfile, Makefile, and boilerplate for your CRD and controller.
  • Code Generation: They can automatically generate Go types for your CRDs from their OpenAPI schemas, making it easy to work with custom resources in Go. They also generate manifests for CRDs, RBAC roles, and deployments.
  • Opinionated Frameworks: They enforce best practices and a standardized project layout, which helps in maintaining consistency and makes it easier for others to understand and contribute to your controller.
  • CLI Tools: Provide commands for creating new APIs, adding webhooks, running tests, and deploying the controller to a cluster.

For most new controller projects, starting with kubebuilder or operator-sdk is highly recommended. They leverage controller-runtime under the hood, providing all its benefits while adding a powerful layer of automation and structure. This article will focus on using kubebuilder for its clarity and widespread adoption within the Kubernetes community.

| Feature / Tool | client-go | controller-runtime | kubebuilder / operator-sdk |
|---|---|---|---|
| Abstraction Level | Low-level primitives | Medium-level building blocks | High-level framework with code generation |
| Primary Focus | Direct API interaction, Informers, Workqueues | Controller lifecycle management, reconciliation | Scaffolding, CRD/controller code generation, best practices |
| Boilerplate | High, requires manual wiring | Moderate, significantly reduced | Low, much is generated automatically |
| Learning Curve | Steep, deep understanding of Kubernetes APIs | Moderate, understanding core patterns | Moderate, learn the tools and generated structure |
| Use Cases | Advanced custom logic, specific API interactions | Most general-purpose controllers | Rapid controller development, Operator pattern |
| Recommended For | Experts, specific library needs | Experienced Go/K8s developers | Most controller projects, especially new ones |

Choosing the right toolkit depends on your familiarity with Go and Kubernetes, and the specific needs of your project. For building a controller to watch CRD changes, kubebuilder provides the most efficient and structured path forward.

Part 3: Designing Your Custom Resource Definition (CRD)

The success of your Kubernetes controller hinges significantly on the design of its Custom Resource Definition (CRD). A well-designed CRD is intuitive, robust, and correctly represents the domain concept you intend to manage. It acts as the contract between the user and your controller, defining what configurable parameters are available and what status information can be observed.

Use Case Identification: What Problem Does Your CRD Solve?

Before writing any code, the absolute first step is to clearly define the problem your CRD and controller are intended to solve. Without a clear use case, your CRD risks becoming overly complex, poorly structured, or simply redundant. Ask yourself:

  • What real-world application or system component am I trying to manage? Is it a database, a message queue, a data processing pipeline, a custom microservice, or a complex AI model deployment?
  • What are the key configurable parameters for this component? For example, a database might need version, storageSize, userCredentials. An AI model might need modelName, modelVersion, resourceRequests, inferenceEndpoint.
  • What is the desired state I want to declare? How would a user express their intent for this component?
  • What status information is important for users to see? How can they tell if the component is healthy, deployed correctly, or in a degraded state?
  • Are there existing Kubernetes resources that already solve this? If a Deployment and Service suffice, you might not need a CRD. The value of a CRD comes from encapsulating complex, domain-specific logic that native resources cannot handle.

Let's assume our use case is to manage a simple web application that consists of a Deployment and a Service. We want to provide a higher-level abstraction called MyApp that encapsulates these two resources, making it easier for application developers to deploy their apps without needing to understand the intricacies of Deployments and Services.

CRD Spec Design

With a clear use case in mind, we can now design the spec of our MyApp CRD. The spec field of a CRD is where you define the schema for your custom resource's desired state.

apiVersion, kind, metadata

These are standard Kubernetes object fields. For our custom resource, they would look something like:

apiVersion: stable.example.com/v1
kind: MyApp
metadata:
  name: my-web-app
  namespace: default

Here, stable.example.com/v1 is our custom API group and version, and MyApp is the kind of our custom resource.

spec Field: Desired State, Carefully Defined Schema

The spec of our MyApp CR will hold the declarative configuration for our web application. For our simple MyApp example, we might want to configure the following:

  • image: The Docker image for the web application (e.g., nginx:latest).
  • replicas: The desired number of pods for the application.
  • port: The port the application listens on.

So, a custom resource instance might look like this:

apiVersion: stable.example.com/v1
kind: MyApp
metadata:
  name: my-web-app
spec:
  image: "nginx:1.21.6"
  replicas: 3
  port: 80

Now, we need to define the schema for this spec within our CRD.

status Field: Current State Reported by Controller

The status field of a custom resource is where your controller reports the current state of the managed application. Users should not manually modify this field. For our MyApp, important status information could include:

  • availableReplicas: The actual number of ready pods.
  • conditions: A list of conditions indicating the health and progress of the application (e.g., Ready, Deployed, Degraded).
  • deploymentName: The name of the underlying Deployment managed by the controller.
  • serviceName: The name of the underlying Service.

A custom resource with status might look like:

apiVersion: stable.example.com/v1
kind: MyApp
metadata:
  name: my-web-app
spec:
  image: "nginx:1.21.6"
  replicas: 3
  port: 80
status:
  availableReplicas: 3
  deploymentName: my-web-app-deployment
  serviceName: my-web-app-service
  conditions:
  - type: Ready
    status: "True"
    lastTransitionTime: "2023-10-27T10:00:00Z"
    reason: DeploymentReady
    message: Application deployment is ready

Schema Validation (openAPIV3Schema)

This is arguably the most critical part of a CRD. The openAPIV3Schema validates the structure and types of fields within your custom resource, ensuring data integrity. It prevents users from submitting malformed or invalid configurations.

For our MyApp example, the schema would look something like this (simplified, kubebuilder generates much of this automatically):

# ... other CRD fields ...
spec:
  group: stable.example.com
  names:
    kind: MyApp
    plural: myapps
    singular: myapp
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                image:
                  type: string
                  description: The Docker image for the web application.
                  minLength: 1
                replicas:
                  type: integer
                  description: The desired number of replicas.
                  minimum: 1
                  default: 1
                port:
                  type: integer
                  description: The port the application listens on.
                  minimum: 1
                  maximum: 65535
              required:
                - image
                - replicas
                - port
            status:
              type: object
              properties:
                availableReplicas:
                  type: integer
                  description: The number of available replicas.
                deploymentName:
                  type: string
                  description: The name of the underlying Deployment.
                serviceName:
                  type: string
                  description: The name of the underlying Service.
                conditions:
                  type: array
                  items:
                    type: object
                    properties:
                      lastTransitionTime:
                        type: string
                        format: date-time
                      message:
                        type: string
                      reason:
                        type: string
                      status:
                        type: string
                      type:
                        type: string

This schema ensures:

  • image, replicas, and port are required.
  • image is a non-empty string.
  • replicas is an integer, minimum 1, defaulting to 1 if not specified.
  • port is an integer between 1 and 65535.
  • The status field has availableReplicas, deploymentName, serviceName, and conditions with their respective types.
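
With kubebuilder you rarely write this schema by hand: controller-gen derives it from marker comments on the Go types (shown in full in Part 4). A sketch of the markers that would produce the constraints above follows; exact marker spellings and defaulting behavior can vary slightly between kubebuilder versions.

// Sketch: marker comments on the spec type from which controller-gen produces the
// openAPIV3Schema constraints shown above.
type MyAppSpec struct {
    // The Docker image for the web application.
    // +kubebuilder:validation:MinLength=1
    Image string `json:"image"`

    // The desired number of replicas.
    // +kubebuilder:validation:Minimum=1
    // +kubebuilder:default=1
    Replicas *int32 `json:"replicas,omitempty"`

    // The port the application listens on.
    // +kubebuilder:validation:Minimum=1
    // +kubebuilder:validation:Maximum=65535
    Port int32 `json:"port"`
}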

Subresources (/status, /scale)

  • /status subresource: By enabling the /status subresource, your controller can update just the status field of a custom resource through a dedicated endpoint, without bumping metadata.generation (which only increments on spec changes). This is important because status updates are frequent, and they should not look like spec changes that trigger unnecessary reconciliation cycles.
  • /scale subresource: If your custom resource represents a workload that can be scaled (like our MyApp), enabling the /scale subresource allows users to scale it using kubectl scale commands and enables integration with Horizontal Pod Autoscalers (HPAs). You'd map spec.replicas and status.availableReplicas to the scale subresource's fields (a marker sketch follows this list).
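
As mentioned above, both subresources can be enabled from the Go types via kubebuilder markers. The sketch below shows one plausible form; treat the exact scale marker syntax (specpath, statuspath, and the optional selectorpath) as an assumption to verify against your kubebuilder version.

// Sketch: markers that enable the /status and /scale subresources on the generated CRD.
// The /scale paths point at the replica fields described above.
//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
//+kubebuilder:subresource:scale:specpath=.spec.replicas,statuspath=.status.availableReplicas
type MyApp struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   MyAppSpec   `json:"spec,omitempty"`
    Status MyAppStatus `json:"status,omitempty"`
}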

Version Management (versions, storage)

As your application evolves, your CRD's schema might need to change. CRDs support multiple versions (e.g., v1alpha1, v1beta1, v1).

  • served: true: Indicates that this version is exposed via the API.
  • storage: true: Specifies which version is used for storing the resource in etcd. Only one version can be marked as storage: true. When a client interacts with a different version, the API server converts the object to the storage version before saving it and converts it back to the requested version when retrieved.
  • conversion: For complex version changes, you'll need to define a conversion webhook to handle the translation between different API versions of your custom resource. This ensures compatibility and data integrity during upgrades.

Example CRD: A Simple MyApp Resource

Let's put it all together. Using kubebuilder, after running kubebuilder create api --group stable --version v1 --kind MyApp, it generates the Go types for your CRD and a CRD manifest similar to this (simplified for brevity):

# config/crd/bases/stable.example.com_myapps.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: myapps.stable.example.com
spec:
  group: stable.example.com
  names:
    kind: MyApp
    listKind: MyAppList
    plural: myapps
    singular: myapp
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          description: MyApp is the Schema for the myapps API
          type: object
          properties:
            apiVersion:
              type: string
            kind:
              type: string
            metadata:
              type: object
            spec:
              description: MyAppSpec defines the desired state of MyApp
              type: object
              properties:
                image:
                  description: The Docker image for the web application.
                  type: string
                  minLength: 1
                replicas:
                  description: The desired number of replicas.
                  type: integer
                  minimum: 1
                  default: 1
                port:
                  description: The port the application listens on.
                  type: integer
                  minimum: 1
                  maximum: 65535
              required:
                - image
                - replicas
                - port
            status:
              description: MyAppStatus defines the observed state of MyApp
              type: object
              properties:
                availableReplicas:
                  description: The number of available replicas.
                  type: integer
                deploymentName:
                  description: The name of the underlying Deployment.
                  type: string
                serviceName:
                  description: The name of the underlying Service.
                  type: string
                conditions:
                  description: Conditions list the conditions of the MyApp.
                  type: array
                  items:
                    description: "Condition contains details about an API condition"
                    properties:
                      lastTransitionTime:
                        description: lastTransitionTime is the last time the condition
                          transitioned from one status to another.
                        format: date-time
                        type: string
                      message:
                        description: message is a human readable message indicating
                          details about the transition.
                        type: string
                      reason:
                        description: reason is the reason for the condition's last
                          transition.
                        type: string
                      status:
                        description: status of the condition, one of True, False, Unknown.
                        type: string
                      type:
                        description: type of condition in CamelCase or in `foo.example.com/CamelCase`.
                        type: string
                    required:
                      - lastTransitionTime
                      - message
                      - reason
                      - status
                      - type
                    type: object
      subresources:
        status: {} # Enable /status subresource
      # You can also enable /scale if needed:
      # scale:
      #   specReplicasPath: .spec.replicas
      #   statusReplicasPath: .status.availableReplicas

This CRD manifest defines the contract for our MyApp resource. When this CRD is applied to a Kubernetes cluster, the API server gains knowledge of MyApp objects, and their lifecycle can then be managed by our custom controller. The design of this CRD is the blueprint for everything our controller will do, as it dictates what inputs it will receive and what outputs it is expected to produce.


Part 4: Building the Controller: A Step-by-Step Guide

With a solid understanding of Kubernetes extensibility, the controller pattern, and our CRD designed, we can now proceed to build the controller itself. We'll leverage kubebuilder to streamline the development process, focusing on the core logic of watching CRD changes and reconciling the desired state.

Prerequisites

Before diving into code, ensure your development environment is set up:

  • Go Language: Version 1.20 or newer. Install it from golang.org/doc/install.
  • Docker: For building and pushing controller images. Install from docker.com/products/docker-desktop.
  • Kubectl: The Kubernetes command-line tool. Ensure it's configured to access a cluster.
  • Kubebuilder CLI: The tool we'll use for scaffolding. Install instructions are on kubebuilder.io/quick-start.html.
  • KinD (Kubernetes in Docker) or Minikube: A local Kubernetes cluster for development and testing. KinD is often preferred for controller development due to its speed and simplicity.
# Install Kubebuilder (example for Linux/macOS)
os=$(go env GOOS)
arch=$(go env GOARCH)
curl -L -o kubebuilder "https://go.kubebuilder.io/dl/latest/${os}/${arch}"
chmod +x kubebuilder && mv kubebuilder /usr/local/bin/

# Install KinD (if not already present)
go install sigs.k8s.io/kind@v0.20.0 # Or latest version

With these tools installed, kubebuilder makes starting a new controller project incredibly easy by generating a well-structured project.

  1. Initialize the Project: Create a new directory for your project and initialize it:

mkdir myapp-controller
cd myapp-controller
go mod init myapp-controller # Replace myapp-controller with your desired module name
kubebuilder init --domain example.com --repo myapp-controller

This command sets up the basic project structure, main.go, Dockerfile, Makefile, and configuration files. --domain example.com sets the base domain for your API group (e.g., stable.example.com), and --repo specifies your Go module path.

  2. Create API for Your Custom Resource: Now, create the API (CRD and Go types) for our MyApp resource:

kubebuilder create api --group stable --version v1 --kind MyApp

This command does several important things:

  • Creates api/v1/myapp_types.go: defines the MyApp struct with Spec and Status fields and registers it with the Kubernetes API machinery.
  • Creates controllers/myapp_controller.go: this is where the core reconciliation logic for your controller will live.
  • Generates the CRD YAML manifest under config/crd/bases.
  • Updates main.go to register the new API and controller.

After this, open api/v1/myapp_types.go and fill in the MyAppSpec and MyAppStatus structs according to your CRD design from Part 3.

// api/v1/myapp_types.go
package v1

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// MyAppSpec defines the desired state of MyApp
type MyAppSpec struct {
    // INSERT ADDITIONAL SPEC FIELDS - desired state of the application
    // Important: Run "make generate" to regenerate code after modifying this file

    Image    string `json:"image,omitempty"`
    Replicas *int32 `json:"replicas,omitempty"` // Use *int32 for optional/nullable
    Port     int32  `json:"port,omitempty"`
}

// MyAppStatus defines the observed state of MyApp
type MyAppStatus struct {
    // INSERT ADDITIONAL STATUS FIELDS - actual state of the application
    // Important: Run "make generate" to regenerate code after modifying this file

    AvailableReplicas int32              `json:"availableReplicas,omitempty"`
    DeploymentName    string             `json:"deploymentName,omitempty"`
    ServiceName       string             `json:"serviceName,omitempty"`
    Conditions        []metav1.Condition `json:"conditions,omitempty"`
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
//+kubebuilder:printcolumn:name="Image",type="string",JSONPath=".spec.image"
//+kubebuilder:printcolumn:name="Replicas",type="integer",JSONPath=".spec.replicas"
//+kubebuilder:printcolumn:name="Available",type="integer",JSONPath=".status.availableReplicas"
//+kubebuilder:printcolumn:name="Age",type="date",JSONPath=".metadata.creationTimestamp"

// MyApp is the Schema for the myapps API
type MyApp struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   MyAppSpec   `json:"spec,omitempty"`
    Status MyAppStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// MyAppList contains a list of MyApp
type MyAppList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []MyApp `json:"items"`
}

func init() {
    SchemeBuilder.Register(&MyApp{}, &MyAppList{})
}

Notice the +kubebuilder markers; these are consumed by the code generator. After modifying myapp_types.go, always run:

make generate
make manifests

make generate updates the generated code based on the Go annotations (e.g., DeepCopy methods), and make manifests regenerates the CRD YAML from the Go types and kubebuilder markers.

Understanding the Generated Code Structure

The controllers/myapp_controller.go file contains the MyAppReconciler struct and its Reconcile method, along with the SetupWithManager method (a sketch of that wiring follows the list below).

  • MyAppReconciler: This struct holds the necessary clients and logger for your controller.

// MyAppReconciler reconciles a MyApp object
type MyAppReconciler struct {
    client.Client
    Log    logr.Logger
    Scheme *runtime.Scheme
}
    • client.Client: This is the controller-runtime client, which provides methods like Get, List, Create, Update, Delete for interacting with Kubernetes API objects. It uses the manager's cache for read operations efficiently.
    • Log: A structured logger.
    • Scheme: The Kubernetes API scheme, used for tasks like creating owned objects with owner references.
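
The SetupWithManager method mentioned above is where this reconciler is registered with the manager. A sketch of that wiring, assuming the same imports used elsewhere in the controller file (ctrl, appsv1, corev1, stablev1), looks like this: For() watches the primary MyApp resource, while Owns() watches the Deployments and Services the controller creates, so changes to any of them enqueue the owning MyApp for reconciliation.

// SetupWithManager registers the reconciler with the manager and declares what to watch.
func (r *MyAppReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&stablev1.MyApp{}).     // primary resource: reconcile on MyApp changes
        Owns(&appsv1.Deployment{}). // owned child resources also trigger reconciliation of the parent
        Owns(&corev1.Service{}).
        Complete(r)
}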

Defining the Reconciler Interface

The heart of controller-runtime is the Reconcile method, which implements the Reconciler interface. This method is called by the controller-runtime manager whenever a watched resource changes.

// +kubebuilder:rbac:groups=stable.example.com,resources=myapps,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=stable.example.com,resources=myapps/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=core,resources=services,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=core,resources=events,verbs=create;patch

// Reconcile is part of the main kubernetes reconciliation loop which aims to
// move the current state of the cluster closer to the desired state.
// TODO(user): Modify the Reconcile function to compare the state specified by
// the MyApp object against the actual cluster state, and then
// perform operations to make the cluster state reflect the state specified by
// the user.
//
// For more details, check Reconcile and its Result here:
// - https://pkg.go.dev/sigs.k8s.io/controller-runtime@v0.16.0/pkg/reconcile
func (r *MyAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    log := r.Log.WithValues("myapp", req.NamespacedName)

    // Fetch the MyApp instance
    myapp := &stablev1.MyApp{}
    if err := r.Get(ctx, req.NamespacedName, myapp); err != nil {
        if apierrors.IsNotFound(err) {
            // Request object not found, could have been deleted after reconcile request.
            // Owned objects are automatically garbage collected. For additional cleanup logic,
            // use finalizers.
            log.Info("MyApp resource not found. Ignoring since object must be deleted.")
            return ctrl.Result{}, nil
        }
        // Error reading the object - requeue the request.
        log.Error(err, "Failed to get MyApp")
        return ctrl.Result{}, err
    }

    // Your reconciliation logic goes here
    // ... (details below) ...

    return ctrl.Result{}, nil
}

The +kubebuilder:rbac comments are important; kubebuilder uses these to generate the necessary Role-Based Access Control (RBAC) rules for your controller's ServiceAccount, granting it permissions to interact with myapps, deployments, services, and events.

Implementing the Reconciliation Logic for CRD Changes

Now, let's fill in the Reconcile method with our core logic. This method will be invoked whenever our MyApp custom resource is created, updated, or deleted.

Fetching the CR

The first part, fetching the MyApp instance, is already done by kubebuilder's template:

    myapp := &stablev1.MyApp{}
    if err := r.Get(ctx, req.NamespacedName, myapp); err != nil {
        if apierrors.IsNotFound(err) {
            log.Info("MyApp resource not found. Ignoring since object must be deleted.")
            return ctrl.Result{}, nil
        }
        log.Error(err, "Failed to get MyApp")
        return ctrl.Result{}, err
    }

If the resource is not found (meaning it was deleted), we return nil to stop further reconciliation for this item.

Handling Deletion (Finalizers)

When a resource is deleted, Kubernetes sets a deletion timestamp. If your controller needs to perform any cleanup before the resource is fully removed (e.g., releasing external resources, undeploying components that aren't garbage-collected by Kubernetes), you use finalizers.

    myappFinalizer := "finalizers.stable.example.com/myapp"

    // Check if the MyApp instance is marked for deletion, which is indicated by the deletion timestamp being set.
    if myapp.ObjectMeta.DeletionTimestamp.IsZero() {
        // The object is not being deleted, so if it does not have our finalizer,
        // then lets add it. This is equivalent to registering our interest in the
        // object's deletion.
        if !controllerutil.ContainsFinalizer(myapp, myappFinalizer) {
            controllerutil.AddFinalizer(myapp, myappFinalizer)
            if err := r.Update(ctx, myapp); err != nil {
                return ctrl.Result{}, err
            }
        }
    } else {
        // The object is being deleted
        if controllerutil.ContainsFinalizer(myapp, myappFinalizer) {
            // our finalizer is present, so lets handle any external dependency
            log.Info("Performing finalizer cleanup for MyApp %s/%s", myapp.Namespace, myapp.Name)

            // TODO(user): Add actual cleanup logic here
            // For example, delete associated Deployments and Services if not owned.
            // In our case, `Owns` ensures garbage collection, so this might be for external resources.

            // Remove the finalizer. Once all finalizers have been removed, the object will be deleted.
            controllerutil.RemoveFinalizer(myapp, myappFinalizer)
            if err := r.Update(ctx, myapp); err != nil {
                return ctrl.Result{}, err
            }
        }

        // Stop reconciliation as the object is being deleted
        return ctrl.Result{}, nil
    }

In our MyApp example, since we will ensure the Deployment and Service are "owned" by the MyApp custom resource, Kubernetes' garbage collector will automatically delete them when MyApp is deleted. Thus, a finalizer might not be strictly necessary for these child resources, but it's a critical pattern for cleaning up external resources (e.g., cloud storage, external database entries) that Kubernetes doesn't manage.

Reconciling Child Resources: Deployment and Service

This is the core logic where we ensure the desired Deployment and Service exist and match the MyApp spec.

Reconcile Service: The Service follows the same get/create/update pattern (the Deployment version and the helpers that build the desired child objects are shown below).

// Reconcile Service
foundService := &corev1.Service{}
desiredService := r.serviceForMyApp(myapp)

err := r.Get(ctx, types.NamespacedName{Name: desiredService.Name, Namespace: desiredService.Namespace}, foundService)
if err != nil && apierrors.IsNotFound(err) {
    log.Info("Creating a new Service", "Service.Namespace", desiredService.Namespace, "Service.Name", desiredService.Name)
    err = r.Create(ctx, desiredService)
    if err != nil {
        log.Error(err, "Failed to create new Service", "Service.Namespace", desiredService.Namespace, "Service.Name", desiredService.Name)
        return ctrl.Result{}, err
    }
    // Service created successfully - return and requeue
    return ctrl.Result{Requeue: true}, nil
} else if err != nil {
    log.Error(err, "Failed to get Service")
    return ctrl.Result{}, err
}

// Check if the service spec is different from the desired spec.
// For services, we usually only care about selector and ports.
if !equality.Semantic.DeepDerivative(desiredService.Spec, foundService.Spec) {
    log.Info("Updating Service", "Service.Namespace", foundService.Namespace, "Service.Name", foundService.Name)
    foundService.Spec.Selector = desiredService.Spec.Selector
    foundService.Spec.Ports = desiredService.Spec.Ports
    err = r.Update(ctx, foundService)
    if err != nil {
        log.Error(err, "Failed to update Service", "Service.Namespace", foundService.Namespace, "Service.Name", foundService.Name)
        return ctrl.Result{}, err
    }
    // Service updated successfully - return and requeue
    return ctrl.Result{Requeue: true}, nil
}


Reconcile Deployment:

// Reconcile Deployment
foundDeployment := &appsv1.Deployment{}
desiredDeployment := r.deploymentForMyApp(myapp)

err := r.Get(ctx, types.NamespacedName{Name: desiredDeployment.Name, Namespace: desiredDeployment.Namespace}, foundDeployment)
if err != nil && apierrors.IsNotFound(err) {
    log.Info("Creating a new Deployment", "Deployment.Namespace", desiredDeployment.Namespace, "Deployment.Name", desiredDeployment.Name)
    err = r.Create(ctx, desiredDeployment)
    if err != nil {
        log.Error(err, "Failed to create new Deployment", "Deployment.Namespace", desiredDeployment.Namespace, "Deployment.Name", desiredDeployment.Name)
        return ctrl.Result{}, err
    }
    // Deployment created successfully - return and requeue
    return ctrl.Result{Requeue: true}, nil // Requeue to ensure status update after creation
} else if err != nil {
    log.Error(err, "Failed to get Deployment")
    return ctrl.Result{}, err
}

// Check if the deployment spec is different from the desired spec.
// We only compare "relevant" fields, not metadata or status.
// A more robust comparison might involve deep equality checks or a hash of the spec.
if !equality.Semantic.DeepDerivative(desiredDeployment.Spec, foundDeployment.Spec) {
    log.Info("Updating Deployment", "Deployment.Namespace", foundDeployment.Namespace, "Deployment.Name", foundDeployment.Name)
    foundDeployment.Spec = desiredDeployment.Spec // Copy the desired spec
    err = r.Update(ctx, foundDeployment)
    if err != nil {
        log.Error(err, "Failed to update Deployment", "Deployment.Namespace", foundDeployment.Namespace, "Deployment.Name", foundDeployment.Name)
        return ctrl.Result{}, err
    }
    // Deployment updated successfully - return and requeue
    return ctrl.Result{Requeue: true}, nil
}

```

  • We try to Get the Deployment.
  • If it is not found, we Create it.
  • If it is found, we compare its Spec with our desiredDeployment.Spec; if they differ, we Update the existing Deployment. equality.Semantic.DeepDerivative is a helper from apimachinery (k8s.io/apimachinery/pkg/api/equality) for comparing Go structs, ignoring fields that are unset in the desired object.
  • We return Requeue: true after a create or update so the controller re-evaluates the state quickly, which is often necessary to pick up status updates or immediately reconcile the next child resource.

Define the Desired Child Resources: Create functions to generate the desired Deployment and Service based on the MyApp's spec.

```go
import (
    appsv1 "k8s.io/api/apps/v1"
    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/equality"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/intstr"

    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// deploymentForMyApp returns a MyApp Deployment object
func (r *MyAppReconciler) deploymentForMyApp(myapp *stablev1.MyApp) *appsv1.Deployment {
labels := labelsForMyApp(myapp.Name)
replicas := myapp.Spec.Replicas
if replicas == nil {
    defaultReplicas := int32(1)
    replicas = &defaultReplicas
}

dep := &appsv1.Deployment{
    ObjectMeta: metav1.ObjectMeta{
        Name:      myapp.Name + "-deployment",
        Namespace: myapp.Namespace,
        Labels:    labels,
    },
    Spec: appsv1.DeploymentSpec{
        Replicas: replicas,
        Selector: &metav1.LabelSelector{
            MatchLabels: labels,
        },
        Template: corev1.PodTemplateSpec{
            ObjectMeta: metav1.ObjectMeta{
                Labels: labels,
            },
            Spec: corev1.PodSpec{
                Containers: []corev1.Container{{
                    Name:  "web",
                    Image: myapp.Spec.Image,
                    Ports: []corev1.ContainerPort{{
                        ContainerPort: myapp.Spec.Port,
                    }},
                }},
            },
        },
    },
}
// Set MyApp instance as the owner and controller
// This ensures garbage collection and proper ownership
ctrl.SetControllerReference(myapp, dep, r.Scheme)
return dep

}

// serviceForMyApp returns a MyApp Service object
func (r *MyAppReconciler) serviceForMyApp(myapp *stablev1.MyApp) *corev1.Service {
labels := labelsForMyApp(myapp.Name)

svc := &corev1.Service{
    ObjectMeta: metav1.ObjectMeta{
        Name:      myapp.Name + "-service",
        Namespace: myapp.Namespace,
        Labels:    labels,
    },
    Spec: corev1.ServiceSpec{
        Selector: labels,
        Ports: []corev1.ServicePort{{
            Protocol:   corev1.ProtocolTCP,
            Port:       myapp.Spec.Port,
            TargetPort: intstr.FromInt(int(myapp.Spec.Port)),
        }},
        Type: corev1.ServiceTypeClusterIP,
    },
}
ctrl.SetControllerReference(myapp, svc, r.Scheme)
return svc

}

func labelsForMyApp(name string) map[string]string {
    return map[string]string{"app": "myapp", "myapp_cr": name}
}
```

ctrl.SetControllerReference is crucial here. It establishes an owner reference from the child resource (Deployment/Service) back to the parent MyApp resource. This enables Kubernetes' garbage collection: when MyApp is deleted, its owned children are automatically deleted too.

Updating CR Status

After reconciling child resources, it's crucial to update the MyApp's status field to reflect the actual state.

    // Update the MyApp status with the latest information
    statusChanged := false

    if myapp.Status.AvailableReplicas != foundDeployment.Status.AvailableReplicas {
        myapp.Status.AvailableReplicas = foundDeployment.Status.AvailableReplicas
        statusChanged = true
    }
    if myapp.Status.DeploymentName != foundDeployment.Name {
        myapp.Status.DeploymentName = foundDeployment.Name
        statusChanged = true
    }
    if myapp.Status.ServiceName != foundService.Name {
        myapp.Status.ServiceName = foundService.Name
        statusChanged = true
    }

    // Update conditions
    // Simplistic condition management. For production, use the helpers in
    // k8s.io/apimachinery/pkg/api/meta (e.g., meta.SetStatusCondition).
    // Guard against a nil Replicas pointer before dereferencing; deploymentForMyApp defaults it to 1.
    desiredReplicas := int32(1)
    if myapp.Spec.Replicas != nil {
        desiredReplicas = *myapp.Spec.Replicas
    }
    isReady := foundDeployment.Status.AvailableReplicas == desiredReplicas && desiredReplicas > 0
    if isReady {
        // Ensure Ready condition is set to True
        newCondition := metav1.Condition{
            Type:               "Ready",
            Status:             metav1.ConditionTrue,
            LastTransitionTime: metav1.Now(),
            Reason:             "DeploymentReady",
            Message:            "Application deployment is ready",
        }
        if !containsCondition(myapp.Status.Conditions, newCondition) {
            myapp.Status.Conditions = setCondition(myapp.Status.Conditions, newCondition)
            statusChanged = true
        }
    } else {
        // Ensure Ready condition is set to False or Unknown
        newCondition := metav1.Condition{
            Type:               "Ready",
            Status:             metav1.ConditionFalse,
            LastTransitionTime: metav1.Now(),
            Reason:             "DeploymentNotReady",
            Message:            "Application deployment is not yet ready or scaling",
        }
        if !containsCondition(myapp.Status.Conditions, newCondition) {
            myapp.Status.Conditions = setCondition(myapp.Status.Conditions, newCondition)
            statusChanged = true
        }
    }

    if statusChanged {
        log.Info("Updating MyApp status")
        err = r.Status().Update(ctx, myapp) // Use r.Status().Update() for status subresource
        if err != nil {
            log.Error(err, "Failed to update MyApp status")
            return ctrl.Result{}, err
        }
        return ctrl.Result{}, nil // Status update might trigger another reconcile
    }

    // Reconciled successfully; no changes needed
    return ctrl.Result{}, nil

r.Status().Update() is used specifically for the /status subresource, so status writes do not conflict with spec changes. Helper functions containsCondition and setCondition (which you would implement yourself, or replace with meta.SetStatusCondition from k8s.io/apimachinery/pkg/api/meta) manage the list of metav1.Condition structs robustly; a sketch of one possible implementation follows.
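
Here is one possible (hedged) implementation of those helpers, shown only as a reference; the apimachinery meta helpers provide equivalent, battle-tested behaviour if you prefer not to roll your own.

```go
import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// setCondition inserts or replaces the condition with the same Type,
// preserving LastTransitionTime when the status has not actually changed.
func setCondition(conditions []metav1.Condition, newCond metav1.Condition) []metav1.Condition {
    for i, c := range conditions {
        if c.Type == newCond.Type {
            if c.Status == newCond.Status {
                newCond.LastTransitionTime = c.LastTransitionTime
            }
            conditions[i] = newCond
            return conditions
        }
    }
    return append(conditions, newCond)
}

// containsCondition reports whether an equivalent condition is already present,
// ignoring timestamps so repeated reconciles do not thrash the status.
func containsCondition(conditions []metav1.Condition, cond metav1.Condition) bool {
    for _, c := range conditions {
        if c.Type == cond.Type && c.Status == cond.Status && c.Reason == cond.Reason {
            return true
        }
    }
    return false
}
```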

Error Handling and Retries

The Reconcile method signature returns (ctrl.Result, error).

  • If error is non-nil, controller-runtime automatically re-enqueues the item with exponential backoff.
  • ctrl.Result{Requeue: true} tells controller-runtime to re-enqueue the item promptly, subject to the workqueue's rate limiter (e.g., after creating a resource, we might want to check its status again soon).
  • ctrl.Result{RequeueAfter: time.Second * 5} tells controller-runtime to re-enqueue after a specific duration. This is useful for polling external systems or performing periodic checks.
  • If ctrl.Result is empty and error is nil, the item is considered successfully reconciled and is removed from the workqueue.
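
To make these outcomes concrete, here is a minimal, hedged sketch; requeuePolicy and waitingOnExternalSystem are illustrative names, not part of the kubebuilder scaffold.

```go
import (
    "time"

    ctrl "sigs.k8s.io/controller-runtime"
)

// requeuePolicy illustrates the return values a Reconcile implementation can use.
func requeuePolicy(err error, waitingOnExternalSystem bool) (ctrl.Result, error) {
    if err != nil {
        // A non-nil error re-enqueues the key with exponential backoff.
        return ctrl.Result{}, err
    }
    if waitingOnExternalSystem {
        // Poll again after a fixed delay, e.g. while an external resource provisions.
        return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
    }
    // Empty Result and nil error: the item is considered successfully reconciled.
    return ctrl.Result{}, nil
}
```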

Setting up the Controller Manager

The SetupWithManager method in myapp_controller.go configures how your controller interacts with the controller-runtime Manager.

// SetupWithManager sets up the controller with the Manager.
func (r *MyAppReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&stablev1.MyApp{}).             // Watch for MyApp objects
        Owns(&appsv1.Deployment{}).        // Watch Deployments that MyApp owns
        Owns(&corev1.Service{}).           // Watch Services that MyApp owns
        Complete(r)
}
  • For(&stablev1.MyApp{}): This tells the controller to watch MyApp objects and trigger reconciliation whenever a MyApp is created, updated, or deleted.
  • Owns(&appsv1.Deployment{}) and Owns(&corev1.Service{}): This is crucial. It configures the controller to also watch Deployment and Service objects that are owned by a MyApp object (via SetControllerReference). If an owned Deployment or Service changes, the owning MyApp's Reconcile method will be invoked. This ensures that if someone manually modifies an owned Deployment, our controller will detect it and correct it back to the desired state specified by MyApp.

Predicates for Filtering Events

For performance, you might want to filter which events trigger a reconciliation. controller-runtime provides Predicates (e.g., predicate.GenerationChangedPredicate, predicate.AnnotationChangedPredicate). For example, you might only want to reconcile if the spec (generation) of your CRD changes, not just its status or metadata.

// Example with predicate
return ctrl.NewControllerManagedBy(mgr).
    For(&stablev1.MyApp{}, builder.WithPredicates(predicate.GenerationChangedPredicate{})).
    Owns(&appsv1.Deployment{}).
    Owns(&corev1.Service{}).
    Complete(r)

This would reduce the number of reconciliations triggered solely by status updates.

Debugging and Testing

  • Local Development with KinD/Minikube: You can run your controller locally against a KinD or Minikube cluster:

```bash
# Ensure your kubeconfig points at your local cluster.
# Recent kind releases write to the default kubeconfig automatically; you can also run
# `kind export kubeconfig` (older releases used `kind get kubeconfig-path`).
kind export kubeconfig

# Install your CRD into the cluster
make install

# Run the controller locally
make run
```

    This allows you to set breakpoints, use a debugger, and observe logs directly.
  • Unit Tests and Integration Tests: kubebuilder generates basic test files (controllers/myapp_controller_test.go).
    • Unit tests should test individual functions or small logical units without interacting with a live cluster.
    • Integration tests (using envtest from controller-runtime/pkg/envtest) set up a minimal API server and etcd instance locally, allowing you to test your controller's reconciliation loop against a "real" (but in-memory) Kubernetes environment without needing a full cluster (a hedged envtest sketch follows this list).
  • Logging Strategies: Use the structured logger for logging, adding key-value pairs (e.g., log.Info("Creating deployment", "Deployment.Name", name)) for easy filtering and analysis. Events (corev1.Event) are also useful for user-facing status updates (e.g., kubectl describe myapp).
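
As a rough illustration of what the envtest-based integration setup can look like (the CRD path is an assumption about the standard kubebuilder layout, and the envtest control-plane binaries must be available locally):

```go
import (
    "path/filepath"
    "testing"

    "k8s.io/client-go/kubernetes/scheme"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/envtest"
)

// TestReconcileWithEnvtest starts a local API server and etcd, installs the CRDs,
// and builds a client that tests can use to create MyApp objects and assert on results.
func TestReconcileWithEnvtest(t *testing.T) {
    testEnv := &envtest.Environment{
        // Assumed location of the CRDs generated by `make manifests`.
        CRDDirectoryPaths: []string{filepath.Join("..", "config", "crd", "bases")},
    }
    cfg, err := testEnv.Start()
    if err != nil {
        t.Fatalf("failed to start envtest: %v", err)
    }
    defer func() { _ = testEnv.Stop() }()

    // Note: register your API types (e.g. stablev1.AddToScheme) on the scheme before use.
    k8sClient, err := client.New(cfg, client.Options{Scheme: scheme.Scheme})
    if err != nil {
        t.Fatalf("failed to create client: %v", err)
    }
    _ = k8sClient // create MyApp objects here and assert on the resulting child resources
}
```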

This detailed guide provides the blueprint for building a robust Kubernetes controller. The combination of kubebuilder's scaffolding, controller-runtime's abstractions, and client-go's powerful primitives allows you to extend Kubernetes with your own custom logic, transforming it into an even more powerful platform for your specific applications.

Part 5: Advanced Controller Concepts and Best Practices

Building a functional controller is one thing; building a production-ready, resilient, and observable controller is another. This section delves into advanced concepts and best practices that elevate your controller from a basic implementation to a robust, enterprise-grade solution.

Idempotency: Crucial for Robust Controllers

We've touched upon idempotency earlier, but its importance cannot be overstated. A controller's reconciliation logic must be idempotent, meaning applying the same operation multiple times should yield the same result as applying it once. This is critical because:

  • Retries: Due to transient errors, network issues, or API server unavailability, your reconciliation loop might be re-executed for the same object multiple times.
  • Event Duplication: Kubernetes' event delivery model is "at least once," meaning events can sometimes be delivered more than once.
  • Controller Restarts: If your controller crashes and restarts, it will reprocess all items in its workqueue.

To ensure idempotency:

  • Always check for existence before creating: Instead of blindly calling client.Create(), first attempt client.Get(). If the resource exists, decide whether to Update() it or do nothing (the controllerutil.CreateOrUpdate sketch after this list wraps exactly this pattern).
  • Compare and update only when necessary: Don't update a child resource if its current state already matches the desired state. This reduces unnecessary API server calls and metadata.resourceVersion increments, which can trigger further reconciliations.
  • Use owner references: As demonstrated, ctrl.SetControllerReference ensures that child resources are automatically garbage collected when the parent is deleted, reducing manual cleanup logic and potential resource leaks.
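
controller-runtime also bundles the get-then-create-or-update dance into a single helper. Below is one hedged way to use it for the MyApp Deployment; ensureDeployment is an illustrative name, and stablev1 refers to the generated API package used throughout this article.

```go
import (
    "context"

    appsv1 "k8s.io/api/apps/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// ensureDeployment creates the Deployment if it is missing or updates it in place,
// issuing a write only when the mutate function actually changed something.
func (r *MyAppReconciler) ensureDeployment(ctx context.Context, myapp *stablev1.MyApp) error {
    dep := &appsv1.Deployment{
        ObjectMeta: metav1.ObjectMeta{
            Name:      myapp.Name + "-deployment",
            Namespace: myapp.Namespace,
        },
    }
    _, err := controllerutil.CreateOrUpdate(ctx, r.Client, dep, func() error {
        // Re-derive the desired spec on every call so the operation stays idempotent.
        dep.Spec = r.deploymentForMyApp(myapp).Spec
        return ctrl.SetControllerReference(myapp, dep, r.Scheme)
    })
    return err
}
```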

Event Handling and Race Conditions: Ensuring Consistency

While controller-runtime and client-go abstract away much of the complexity, understanding potential race conditions and how to mitigate them is essential.

  • Stale Cache: Informers provide an eventually consistent view of the cluster. There's a tiny window where your controller's cache might be slightly behind the API server's true state. For sensitive operations, perform a direct Get from the API server (bypassing the cache) before making critical decisions, though this should be used sparingly due to performance implications.
  • Concurrent Reconciliations: The workqueue ensures that for a single object, only one reconciliation loop runs at a time. However, multiple reconciliation loops for different objects run concurrently. Ensure your controller's shared state (if any) is properly synchronized (e.g., with mutexes or channels) to avoid race conditions, though generally, controllers should be stateless per reconciliation.
  • Optimistic Concurrency Control: When updating resources, Kubernetes uses resourceVersion for optimistic concurrency control. If you Get a resource, modify it, and then Update it, but another process modified it in between, your Update will fail with a "conflict" error. controller-runtime automatically handles retries for these conflict errors, but your reconciliation logic must be ready for retries.
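
For that conflict case, client-go ships a small retry helper that re-reads the object on each attempt. Here is a hedged sketch; the replica-update scenario and helper name are illustrative.

```go
import (
    "context"

    "k8s.io/apimachinery/pkg/types"
    "k8s.io/client-go/util/retry"
)

// updateReplicasWithRetry re-fetches MyApp on every attempt so each retry operates on
// the latest resourceVersion, and retries the write only on conflict errors.
func (r *MyAppReconciler) updateReplicasWithRetry(ctx context.Context, key types.NamespacedName, replicas int32) error {
    return retry.RetryOnConflict(retry.DefaultRetry, func() error {
        myapp := &stablev1.MyApp{}
        if err := r.Get(ctx, key, myapp); err != nil {
            return err
        }
        myapp.Spec.Replicas = &replicas
        return r.Update(ctx, myapp)
    })
}
```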

Admission Webhooks: Validating and Mutating CRs Before Persistence

Admission webhooks provide a powerful mechanism to intercept requests to the Kubernetes API server before an object is persisted. They allow you to define custom logic for validating or mutating resources.

  • Mutating Webhooks: Intercepts requests and can modify (mutate) the resource. Common uses include:
    • Setting default values: Automatically populate fields if they are missing (e.g., set default replicas if not specified in MyApp.spec.replicas). This simplifies user manifests.
    • Injecting sidecars: Automatically inject sidecar containers (e.g., a logging agent or a service mesh proxy) into pods based on certain criteria.
  • Validating Webhooks: Intercepts requests and can reject them if the resource violates custom validation rules that cannot be expressed by OpenAPI v3 schema. Common uses include:
    • Complex business logic validation: For example, ensure that MyApp.spec.port is not already in use by another MyApp instance, or that MyApp.spec.image comes from an approved registry.
    • Immutable fields: Prevent certain fields from being changed after creation.
    • Cross-resource validation: Validate a resource based on the state of other resources in the cluster.

kubebuilder provides excellent support for generating and managing webhooks. They are deployed as Kubernetes services and expose an HTTPS endpoint that the API server calls.
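
As a rough sketch of the defaulting case, in the older "methods on the API type" style that kubebuilder scaffolds (the exact interfaces and wiring vary between kubebuilder/controller-runtime versions, so treat this as illustrative rather than definitive):

```go
import (
    ctrl "sigs.k8s.io/controller-runtime"
)

// In the api/v1 package, alongside the generated MyApp types.

// Default fills in values the user omitted; here, a single replica by default.
func (r *MyApp) Default() {
    if r.Spec.Replicas == nil {
        one := int32(1)
        r.Spec.Replicas = &one
    }
}

// SetupWebhookWithManager registers the defaulting webhook with the manager.
func (r *MyApp) SetupWebhookWithManager(mgr ctrl.Manager) error {
    return ctrl.NewWebhookManagedBy(mgr).
        For(r).
        Complete()
}
```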

External Dependencies: Interacting with Services Outside Kubernetes

Many controllers need to interact with systems outside the Kubernetes cluster (e.g., cloud provider APIs, external databases, configuration management tools).

  • Security Considerations:
    • Secrets: Store credentials for external systems in Kubernetes Secrets and access them securely from your controller. Avoid hardcoding sensitive information.
    • Service Accounts and RBAC: Ensure your controller's Service Account has only the necessary RBAC permissions within Kubernetes. For external systems, use appropriate authentication mechanisms (e.g., IAM roles, OAuth tokens).
  • Rate Limiting External Calls: External APIs often have rate limits. Your controller should implement client-side rate limiting and exponential backoff for external calls to avoid being throttled or blacklisted.
  • Error Handling: Treat external calls as potentially unreliable. Implement robust error handling, retries, and circuit breakers. If an external system is down, your controller should degrade gracefully, perhaps by updating the CR's status to reflect the degraded state and retrying later.
  • Idempotency for External Calls: Ensure external calls are also idempotent. If your controller repeatedly tries to create an external resource (e.g., an S3 bucket), the external API should handle repeated creation requests gracefully.

Metrics and Observability

A production-grade controller must be observable. You need to know if it's healthy, performing well, and doing its job correctly.

  • Prometheus Metrics: controller-runtime automatically exposes some basic metrics (e.g., reconciliation duration, workqueue depth). You can add custom metrics using the Prometheus Go client library (e.g., gauge for number of active MyApps, counter for successful/failed reconciliations). This allows you to monitor controller performance and identify bottlenecks.
  • Structured Logging: Use the logr.Logger provided by controller-runtime. Log at appropriate levels (debug, info, warn, error) and include key-value pairs (log.Info("message", "key", "value")) to make logs easily parsable and queryable by tools like Fluentd, Loki, or Splunk. Avoid excessive logging, especially in tight loops, as it can impact performance.
  • Events API: The Kubernetes Events API is a great way to communicate important, user-facing information about your custom resources. Your controller can create corev1.Event objects associated with your MyApp resource (e.g., "MyApp deployment started", "MyApp scaled up", "Failed to create Service"). Users can then see these events using kubectl describe myapp.
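
A hedged sketch of emitting such an event; Recorder is an assumed record.EventRecorder field on MyAppReconciler, typically populated in main.go via mgr.GetEventRecorderFor("myapp-controller"), and recordReady is an illustrative helper name.

```go
import (
    corev1 "k8s.io/api/core/v1"
)

// recordReady attaches a user-visible event to the MyApp object, which then shows up
// in `kubectl describe myapp`.
func (r *MyAppReconciler) recordReady(myapp *stablev1.MyApp) {
    r.Recorder.Event(myapp, corev1.EventTypeNormal, "Reconciled",
        "Deployment and Service are up to date")
}
```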

Security

Security is paramount in any Kubernetes component.

  • RBAC for Controller Service Account: Adhere to the principle of least privilege. Your controller's Service Account should only have the minimum necessary permissions (defined via Role / ClusterRole and RoleBinding / ClusterRoleBinding) to watch, get, create, update, and delete the specific resources it manages. kubebuilder helps generate these initially.
  • Secure Communication: Ensure all communication, especially to external systems or webhooks, uses TLS. Kubernetes internal communication is typically secure by default.
  • Container Image Security: Build your controller's container image using a minimal base image (e.g., distroless) to reduce the attack surface. Scan your images for vulnerabilities.

Graceful Shutdowns

Controllers should handle termination signals gracefully. When Kubernetes sends a SIGTERM signal, your controller should:

  • Stop accepting new work from the workqueue.
  • Finish processing any items currently being reconciled.
  • Flush logs and metrics.
  • Close any open connections to external systems.

controller-runtime's Manager handles much of this by default, but if you have custom long-running tasks or external connections, you need to implement your own cleanup logic using context.Context cancellation or Go's sync.WaitGroup.
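
One hedged way to hook such cleanup into the manager's shutdown sequence is to register a Runnable whose context is cancelled on termination; addCleanupRunnable and closeConnections are illustrative names.

```go
import (
    "context"

    "sigs.k8s.io/controller-runtime/pkg/manager"
)

// addCleanupRunnable registers a Runnable whose context is cancelled when the manager
// shuts down (e.g. on SIGTERM via ctrl.SetupSignalHandler()); cleanup runs before exit.
func addCleanupRunnable(mgr manager.Manager, closeConnections func()) error {
    return mgr.Add(manager.RunnableFunc(func(ctx context.Context) error {
        <-ctx.Done()
        closeConnections()
        return nil
    }))
}
```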

Performance Considerations

Optimizing your controller for performance ensures it can scale to manage a large number of custom resources efficiently.

  • Efficient Informer Usage: Leverage SharedInformers and Listers for fast, cached reads. Avoid direct Get calls to the API server within the reconciliation loop unless absolutely necessary for strong consistency.
  • Minimize API Server Calls: Each API call has an overhead. Group related updates (e.g., patch instead of multiple updates) where possible. Only update a resource if its desired state truly differs from its current state.
  • Predicate Filtering: Use predicates to filter events and reduce the number of times Reconcile is called unnecessarily.
  • Appropriate Workqueue Rate Limiting: Configure the RateLimiter on your workqueue to prevent overwhelming downstream systems (including the Kubernetes API server) during error conditions (a tuning sketch follows this list).
  • Resource Requests and Limits: Configure appropriate CPU and memory requests and limits for your controller's Pod to ensure it has enough resources without consuming too much.
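
A hedged sketch of tuning these knobs through controller.Options; the field names and rate-limiter constructor have shifted slightly across controller-runtime/client-go versions, and the 1s base delay / 1m cap are illustrative values.

```go
import (
    "time"

    "k8s.io/client-go/util/workqueue"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/controller"
)

// setupWithTunedOptions is a variant of SetupWithManager showing concurrency and
// retry rate-limiter tuning for the MyApp controller.
func (r *MyAppReconciler) setupWithTunedOptions(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&stablev1.MyApp{}).
        WithOptions(controller.Options{
            MaxConcurrentReconciles: 2,
            RateLimiter:             workqueue.NewItemExponentialFailureRateLimiter(time.Second, time.Minute),
        }).
        Complete(r)
}
```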

The Bigger Picture: API Management and Beyond

As you build controllers and extend Kubernetes with custom resources, you're essentially creating a custom API plane for your applications. This allows developers to interact with your specific application components using the familiar Kubernetes API and tooling. However, as the number and complexity of these custom APIs grow, or when these custom resources drive external services, the need for a comprehensive API management strategy becomes apparent.

This is where solutions like APIPark come into play. While your Kubernetes controller excels at managing the lifecycle of your custom resources within the cluster, APIPark, as an open-source AI Gateway and API Management Platform, offers capabilities that complement this effort by providing end-to-end API lifecycle management, quick integration of various AI models (which could be exposed via custom resources and controllers), unified API formats, and enterprise-grade security and observability. If your custom resources represent services or AI models that need to be exposed, managed, and consumed by external applications or other teams, APIPark can act as the unifying layer, offering features like authentication, traffic management, versioning, and a developer portal for all your APIs, whether they are backed by native Kubernetes services or your custom controller-managed resources. It provides a centralized control point for API discovery, access control, and performance monitoring, extending the governance of your custom resources to the broader enterprise ecosystem.

Conclusion

The ability to extend Kubernetes with Custom Resource Definitions and manage them with robust controllers is a cornerstone of modern cloud-native development. It liberates users from the constraints of predefined resource types, empowering them to embed domain-specific operational knowledge directly into the heart of their orchestration platform. By building a Kubernetes controller to watch CRD changes, you are not merely automating tasks; you are fundamentally enhancing Kubernetes' intelligence and capability to manage your unique application landscapes with the same declarative elegance and resilience as its native components.

Throughout this extensive guide, we have systematically deconstructed the core principles of Kubernetes extensibility, from the foundational concepts of CRDs and the control loop pattern to the practicalities of implementation using kubebuilder and controller-runtime. We delved into the intricacies of designing effective CRDs, crafting idempotent reconciliation logic, and embracing advanced techniques such as webhooks, external dependency management, and comprehensive observability. The journey culminates in the understanding that a well-architected controller is a powerful operator that reduces toil, increases reliability, and accelerates the development of complex distributed systems.

The future of cloud-native computing undoubtedly lies in increasingly specialized and intelligent automation. As applications grow in complexity, encompassing diverse technologies, from traditional microservices to sophisticated AI workloads, the ability to tailor your orchestration layer becomes indispensable. With the knowledge gained here, you are now equipped to build controllers that not only react to changes but proactively steer your systems toward their desired state, ensuring operational consistency and unlocking new frontiers of innovation within your Kubernetes environments. This mastery will enable you to contribute to a more autonomous, efficient, and scalable future for your infrastructure and applications.


Frequently Asked Questions (FAQs)

  1. What is the primary difference between a Custom Resource (CR) and a Custom Resource Definition (CRD)? A Custom Resource Definition (CRD) is a schema definition that tells Kubernetes about a new custom resource type, including its name, scope, and validation rules. It defines what the new resource looks like. A Custom Resource (CR) is an actual instance of that custom resource type. It's an object created by a user that adheres to the schema defined in the CRD, representing the desired state of a specific application component or concept. Think of a CRD as a class definition and a CR as an object instantiated from that class.
  2. Why do I need a Kubernetes Controller if I can define a CRD? A CRD only defines the data structure and schema for your custom resources; it does not imbue Kubernetes with any logic on how to manage instances of that resource. This management logic is the role of a Kubernetes Controller. The controller continuously watches CR instances, compares their desired state (defined in the CR's spec) with the actual cluster state, and takes actions to reconcile any differences. Without a controller, your custom resources would be static data objects in the API server, incapable of influencing the cluster's operational state.
  3. What are Informers and Workqueues, and why are they important for a controller? Informers are a client-go component that efficiently watches for changes (Add, Update, Delete) to specific Kubernetes resources and maintains a local, read-only cache of these resources. This reduces load on the API server and provides fast lookup capabilities. Workqueues (specifically RateLimitingQueue) are used to decouple event handling from the actual reconciliation logic. When an Informer detects a change, it adds the affected object's key to the Workqueue. Worker goroutines then pull items from the Workqueue, process them, and handle retries with exponential backoff for failed operations. Together, Informers and Workqueues ensure efficient, resilient, and ordered processing of resource events, making controllers scalable and robust.
  4. When should I use kubebuilder or operator-sdk versus directly using client-go or controller-runtime? For most new controller projects, especially those following the Operator pattern, kubebuilder or operator-sdk are highly recommended. They provide scaffolding, code generation, and an opinionated framework that significantly reduces boilerplate and enforces best practices, accelerating development. controller-runtime is a powerful library used by kubebuilder and operator-sdk, offering a good balance between abstraction and flexibility if you prefer more manual control. client-go is the foundational, lowest-level Go client library for Kubernetes; it provides maximum control but requires significant boilerplate, making it suitable mainly for highly specialized use cases or for building other Kubernetes tooling, rather than full-fledged controllers.
  5. What are Finalizers, and why are they necessary in a Kubernetes Controller? Finalizers are special keys added to a Kubernetes object's metadata.finalizers list. When an object with finalizers is marked for deletion (i.e., its metadata.deletionTimestamp is set), Kubernetes will not physically delete the object from the etcd database until all its finalizers have been removed. This mechanism is crucial for controllers to perform necessary cleanup operations for resources that exist outside of Kubernetes (e.g., deleting an AWS S3 bucket, unregistering an external service) or for complex cleanup within Kubernetes that isn't handled by standard garbage collection. Your controller adds its finalizer when it starts managing an object and removes it only after all required cleanup is complete, ensuring a graceful and complete termination of managed resources.

🚀 You can securely and efficiently call the OpenAI API on APIPark in just two steps:

Step 1: Deploy the APIPark AI gateway in 5 minutes.

APIPark is developed based on Golang, offering strong product performance and low development and maintenance costs. You can deploy APIPark with a single command line.

```bash
curl -sSO https://download.apipark.com/install/quick-start.sh; bash quick-start.sh
```

(Image: APIPark command-line installation process)

In my experience, you can see the successful deployment interface within 5 to 10 minutes. Then, you can log in to APIPark using your account.

(Image: APIPark System Interface 01)

Step 2: Call the OpenAI API.

(Image: APIPark System Interface 02)