Build Your Controller to Watch for Changes to CRDs
Kubernetes has fundamentally reshaped how we deploy and manage applications, ushering in an era of declarative infrastructure and cloud-native practices. At its core, Kubernetes operates on a continuous reconciliation loop, constantly striving to bring the actual state of your cluster in line with your desired state. This powerful paradigm is driven by controllers—the unsung heroes that watch for changes, react to events, and enforce policies across your cluster. While Kubernetes offers a rich set of built-in resources like Deployments, Services, and Pods, real-world applications often demand custom behaviors and domain-specific abstractions. This is where Custom Resource Definitions (CRDs) come into play, allowing you to extend the Kubernetes API with your own resource types.
This comprehensive guide will take you on an in-depth journey into the world of Kubernetes controllers, focusing specifically on how to build one that diligently watches for changes to your custom resources defined by CRDs. We will dissect the architectural components, explore the critical role of the Kubernetes API, walk through the development process with illustrative examples, and discuss advanced considerations to ensure your controller is robust, scalable, and production-ready. By the end of this article, you will possess a profound understanding of how to craft controllers that not only observe but also intelligently act upon the evolving state of your custom resources, unlocking a new level of automation and flexibility within your Kubernetes environment.
Deconstructing the Kubernetes Control Plane and the Role of the API
To truly grasp the essence of building a Kubernetes controller, it's imperative to first understand the foundational architecture of Kubernetes itself, particularly the central role played by the Kubernetes API Server. Often described as the "brain" of the cluster, the API Server is the primary interface through which all internal and external components interact with Kubernetes. It's an HTTP RESTful API that exposes the cluster's state, enabling clients to create, update, delete, and retrieve API objects. Every single operation within a Kubernetes cluster—from scheduling a Pod to scaling a Deployment—is initiated by an interaction with the API Server.
The Kubernetes API Server serves as the single source of truth for the entire cluster. All cluster components, including the scheduler, controller manager, and kubelet, communicate exclusively through this API. This design choice ensures consistency, provides a unified interface for various operations, and allows for robust authentication and authorization mechanisms to be applied centrally. When a user or an automated process wants to declare a desired state, they submit an API object (typically in YAML or JSON format) to the API Server. This object then becomes part of the cluster's persistent state in etcd, the distributed key-value store that backs Kubernetes.
The concept of an "API object" is fundamental here. In Kubernetes, everything is an API object. A Pod, a Service, a ConfigMap, a Secret—these are all structured data representations that the API Server understands and stores. Each API object has a defined schema, specifying its apiVersion, kind, metadata (name, namespace, labels, annotations), and a spec field that describes its desired state. Crucially, many API objects also have a status field, which is populated by controllers to report the actual, current state of the resource. This separation of spec and status is a cornerstone of the declarative model, allowing users to declare what they want without needing to worry about how it's achieved, leaving the "how" to the controllers.
Controllers are clients of the Kubernetes API. They don't interact directly with etcd; instead, they continuously watch the API Server for changes to specific API objects. When a change occurs—a new object is created, an existing one is updated, or one is deleted—the controller responsible for that object type is notified. It then fetches the current state of the object, compares it with the desired state (as specified in the spec), and takes actions to reconcile any discrepancies. This reconciliation loop is the heart of Kubernetes' self-healing and automation capabilities. If a desired state is declared but not met, the controller will keep trying until it is. This robust, continuous feedback loop ensures that the cluster maintains its health and desired configuration, reacting dynamically to both internal and external changes. This reliance on the API as the sole communication channel ensures a highly decoupled and extensible system, paving the way for custom resources and controllers to seamlessly integrate and extend Kubernetes' capabilities.
Understanding Custom Resource Definitions (CRDs)
While Kubernetes provides a powerful set of built-in resources that cover a wide range of use cases, real-world applications often have unique, domain-specific requirements that go beyond these standard types. Imagine needing to represent a database instance, a message queue, a machine learning model, or a complex application stack as a first-class citizen within Kubernetes. This is precisely the problem that Custom Resource Definitions (CRDs) solve. CRDs allow you to extend the Kubernetes API by defining your own custom resource types, making them just as integrated and manageable as native resources like Pods or Deployments.
The motivation behind CRDs is profound. Before their introduction, extending Kubernetes often involved using ThirdPartyResources (TPRs), which were less robust and had several limitations. CRDs, introduced in Kubernetes 1.7 and stabilized in 1.16, provide a powerful and native mechanism for API extension. By defining a CRD, you tell the Kubernetes API Server about a new kind of object it should recognize. Once registered, you can create instances of your custom resource (CRs) using standard Kubernetes commands (kubectl create, kubectl get, kubectl apply), and these instances will be stored in etcd and exposed through the Kubernetes API like any other native resource.
The structure of a CRD itself is an API object, defined by its apiVersion, kind, and metadata. The most important part of a CRD definition is its spec, which outlines the schema for your custom resources and how they should behave. Key fields within the CRD spec include:
- group: A logical grouping for your custom resources, typically a reverse domain name (e.g., stable.example.com). This helps avoid naming collisions and organizes resources.
- versions: Defines one or more API versions for your custom resource, each with its own schema. This allows you to evolve your custom resources over time without breaking backward compatibility. Each version has a name (e.g., v1alpha1, v1), served (a boolean indicating whether the version is enabled), and storage (a boolean indicating which single version is persisted in etcd).
- names: Specifies the various names for your custom resource, including plural (used in URLs, e.g., machinemodels), singular (e.g., machinemodel), kind (the object type, e.g., MachineModel), and shortNames (optional kubectl shortcuts, e.g., mm).
- scope: Determines whether the custom resource is Namespaced or Cluster scoped. Namespaced resources exist within a specific namespace, while Cluster resources are unique across the entire cluster.
- validation: This crucial section, usually expressed as an OpenAPI v3 schema, defines the structure, data types, and constraints for the spec and status fields of your custom resources. Robust validation prevents malformed resources from being accepted by the API Server, ensuring data integrity and predictable behavior for your controllers.
- subresources: CRDs can expose status and scale subresources, similar to native Kubernetes resources. The status subresource allows controllers to update the status of a custom resource independently from its spec, enabling efficient updates without triggering unnecessary reconciliation cycles for other spec changes. The scale subresource allows custom resources to integrate with kubectl scale and Horizontal Pod Autoscalers.
Consider a hypothetical MachineLearningModel CRD designed to manage the lifecycle of an AI model within Kubernetes. Its spec might include fields for the model's container image, required computational resources, API endpoint configuration, and training data source. Its status would then report the model's deployment state, health, and perhaps inference API endpoint URL once active.
Example MachineLearningModel CRD:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: machinemodels.stable.example.com
spec:
group: stable.example.com
versions:
- name: v1
served: true
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
image:
type: string
description: "The container image for the machine learning model."
modelPath:
type: string
description: "Path to the model artifact within the container."
resources:
type: object
properties:
cpu: { type: string }
memory: { type: string }
gpu: { type: integer, minimum: 0 }
required: ["cpu", "memory"]
minReplicas:
type: integer
minimum: 1
default: 1
maxReplicas:
type: integer
minimum: 1
default: 1
endpointType:
type: string
enum: ["REST", "gRPC"]
default: "REST"
required: ["image", "modelPath", "resources"]
status:
type: object
properties:
phase:
type: string
enum: ["Pending", "Deploying", "Ready", "Failed"]
default: "Pending"
inferenceURL:
type: string
description: "The URL where the model's inference API is exposed."
observedGeneration:
type: integer
format: int64
description: "The most recent generation observed by the controller."
replicas:
type: integer
description: "The number of active replicas of the model."
conditions:
type: array
items:
type: object
properties:
type: { type: string }
status: { type: string, enum: ["True", "False", "Unknown"] }
reason: { type: string }
message: { type: string }
lastTransitionTime: { type: string, format: date-time }
required: ["type", "status"]
scope: Namespaced
names:
plural: machinemodels
singular: machinemodel
kind: MachineModel
shortNames:
- mm
subresources:
status: {}
Once this CRD is applied to your cluster, you can create MachineModel instances like this:
apiVersion: stable.example.com/v1
kind: MachineModel
metadata:
name: image-classifier-v1
namespace: default
spec:
image: "my-registry/image-classifier:v1.2.0"
modelPath: "/app/models/image_classifier.pb"
resources:
cpu: "500m"
memory: "1Gi"
gpu: 1
minReplicas: 2
maxReplicas: 5
endpointType: "REST"
The benefits of using CRDs are substantial. They provide a powerful abstraction layer, allowing platform teams to expose complex infrastructure or application components as simple, self-service Kubernetes objects. This improves consistency, reduces operational burden, and empowers developers to consume services declaratively. Moreover, by integrating deeply with the Kubernetes API, CRDs inherit all the benefits of Kubernetes RBAC, auditing, and client tools. However, a CRD alone is a static definition; it doesn't do anything. To make your custom resources come alive, you need a controller that watches for their changes and acts upon them.
The Architecture of a Kubernetes Controller
Building a Kubernetes controller that watches for changes to CRDs involves orchestrating several key components from the client-go library, the official Go client for the Kubernetes API. These components work together to ensure that your controller can efficiently and reliably observe cluster state, process events, and reconcile differences. Understanding each part is crucial for developing robust and performant controllers.
Client-go: The Gateway to the Kubernetes API
At the foundation of any Go-based Kubernetes controller is client-go. This library provides the necessary tools to interact with the Kubernetes API Server. It offers different types of clients to suit various needs:
- Clientset: The most common client, generated for known Kubernetes types (and custom types, if you generate client code for them). It provides typed access to resources (e.g., corev1.Pods("default").Get(...)). When you define a CRD and generate client code for it, a clientset for your custom resource will be created, offering strong typing and compile-time checks for your specific CRD.
- Dynamic Client: Provides untyped access to any Kubernetes resource, including CRDs, without needing generated client code. This is useful for building generic tools but requires careful handling of unstructured.Unstructured objects and lacks compile-time safety.
- RESTClient: The lowest-level client, interacting directly with the Kubernetes API as an HTTP client. It's used internally by Clientset and the dynamic client but is rarely used directly by controller developers.
When you're building a controller for a specific CRD, you'll typically use a clientset generated for your custom resource. This provides a familiar and type-safe way to Get, List, Create, Update, and Delete your custom resources.
Scheme and Codec: client-go also relies heavily on a Scheme which maps Go types to Kubernetes apiVersion and kind strings. This allows the client to know how to serialize and deserialize objects to and from the wire format (JSON/YAML). When you define your CRD, you'll need to add your custom resource's type to a scheme so client-go can properly handle it.
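To make the Scheme concept concrete, here is a minimal, hedged sketch of registering a custom type, assuming the MachineModel and MachineModelList Go types introduced later in this article (with generated DeepCopy methods so they satisfy runtime.Object). Generated code normally ships an equivalent AddToScheme helper for you.

```go
package v1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// GroupVersion is the API group/version that MachineModel objects belong to.
var GroupVersion = schema.GroupVersion{Group: "stable.example.com", Version: "v1"}

// AddToScheme registers MachineModel and MachineModelList with the scheme so that
// client-go can map these Go types to their apiVersion/kind and back when
// serializing and deserializing objects.
func AddToScheme(s *runtime.Scheme) error {
	s.AddKnownTypes(GroupVersion, &MachineModel{}, &MachineModelList{})
	metav1.AddToGroupVersion(s, GroupVersion)
	return nil
}
```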
Informers: Efficiently Watching for Changes
Directly Listing and Watching the Kubernetes API Server for every controller would be inefficient and place undue burden on the API Server. This is where informers come in. An informer is a pattern provided by client-go that abstracts away the complexities of List and Watch calls, providing a local, eventually consistent cache of Kubernetes objects.
- SharedInformer: This is the most critical component for controllers. A SharedInformer performs an initial List call to populate its cache and then maintains that cache by continuously Watching the API Server for subsequent changes. When new events (Add, Update, Delete) occur, it updates its local cache and notifies registered event handlers. The "shared" aspect is important: multiple controllers within the same process can share a single informer instance for a given resource type, reducing the number of List and Watch calls to the API Server.
- Lister: A Lister provides a read-only interface to the informer's local cache. Controllers use listers to quickly retrieve objects from memory without making network calls to the API Server. This is essential for the reconciliation loop, as it allows controllers to efficiently get the current state of a resource and its related objects.
- DeltaFIFO: Internally, informers use a DeltaFIFO queue to store events. This queue buffers events and ensures that they are processed in order and that none are missed, even during network partitions or temporary API Server unavailability. It also helps in handling duplicate events by only storing the most recent state.
Event Handlers: Informers allow you to register ResourceEventHandler functions that are invoked when an event occurs:
- AddFunc(obj interface{}): Called when a new object is added to the cache.
- UpdateFunc(oldObj, newObj interface{}): Called when an existing object is modified.
- DeleteFunc(obj interface{}): Called when an object is removed from the cache.
These handlers are responsible for taking the received object (or objects in the case of an update) and adding its identifying key (e.g., namespace/name) to a workqueue for asynchronous processing.
Workqueue: Reliable Event Processing
The workqueue is a crucial component that decouples event reception from event processing, ensuring reliable and rate-limited execution of reconciliation logic. When an informer's event handler detects a change, it doesn't immediately execute the reconciliation logic. Instead, it adds the key of the affected resource (e.g., "default/my-machinemodel") to a workqueue.
Key characteristics of the workqueue:
- Deduplication: If multiple events occur for the same resource in quick succession (e.g., two updates), the workqueue ensures that the key is only present once. When the controller processes that key, it will fetch the latest state from the informer's cache, effectively handling all intermediate updates.
- Rate-limiting and Retries: Workqueues are often rate-limited, preventing a single problematic resource from flooding the controller with events. More importantly, if a reconciliation attempt fails (e.g., due to a temporary network error or dependency unavailability), the workqueue allows the controller to re-enqueue the item with an exponential backoff, ensuring that transient errors don't lead to permanent failures.
- Order Guarantees: While workqueues typically don't guarantee strict global ordering of all items, they process items for a given key serially. This prevents race conditions where an older state might overwrite a newer state for the same resource.
The controller typically has one or more worker goroutines that continuously pull items from the workqueue, process them, and then mark them as done.
Reconciler: The Core Logic
The reconciler is the heart of your controller. It's a function or method that implements the core business logic of your controller. Its primary responsibility is to take a given resource's key, fetch its current state, compare it with the desired state, and make any necessary changes to bring the actual state closer to the desired state.
The typical Reconcile method contract looks something like Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error). reconcile.Request typically contains the NamespacedName (namespace and name) of the resource to be reconciled.
Inside the Reconcile function:
- Fetch the Custom Resource: The first step is to retrieve the target CRD instance from the informer's cache using the Lister. If the resource is not found (e.g., it was deleted while the event was in the queue), the reconciler usually stops, assuming it has been cleaned up.
- Determine Desired State: Based on the spec of the custom resource, the controller determines what the desired state of the cluster should be. This might involve creating Deployments, Services, ConfigMaps, or even interacting with external APIs.
- Observe Actual State: The controller then observes the current state of the related Kubernetes resources (e.g., the Pods created by a Deployment managed by the controller) or external systems.
- Reconcile Differences: If there's a discrepancy between the desired and actual state, the controller takes action to bridge the gap. This could involve creating missing resources, updating existing ones, deleting stale ones, or reporting errors.
- Update Status: Crucially, after making changes, the controller should update the status subresource of the custom resource to reflect the actual state of the world, including any errors or progress. This is how users can monitor the controller's work.
- Error Handling and Requeue: If an error occurs during reconciliation, the Reconcile function should return an error, which signals the workqueue to re-enqueue the item for a retry. If a reconciliation requires waiting for an asynchronous operation or a resource to become ready, the reconciler might return reconcile.Result{RequeueAfter: someDuration}, which tells the workqueue to re-enqueue the item after a specific delay.
A critical principle for reconcilers is idempotency. Running the Reconcile function multiple times with the same input should produce the same outcome and should be safe. This means that if a resource already exists and is in the desired state, the reconciler should ideally do nothing or perform minimal operations.
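As a hedged illustration of that principle, the sketch below shows an idempotent helper for a child Deployment, using only client-go calls that appear elsewhere in this article. The ensureDeployment name is chosen here for illustration, and real controllers usually compare only the spec fields they own rather than the whole spec.

```go
package controller

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/api/equality"
	"k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// ensureDeployment is idempotent: running it repeatedly with the same desired
// Deployment converges to the same result and does no work once the actual
// state already matches.
func ensureDeployment(ctx context.Context, kube kubernetes.Interface, desired *appsv1.Deployment) error {
	existing, err := kube.AppsV1().Deployments(desired.Namespace).Get(ctx, desired.Name, metav1.GetOptions{})
	if errors.IsNotFound(err) {
		// Missing: create it. A later run will fall through to the comparison below.
		_, err = kube.AppsV1().Deployments(desired.Namespace).Create(ctx, desired, metav1.CreateOptions{})
		return err
	}
	if err != nil {
		return err // transient error: surface it so the workqueue retries with backoff
	}
	if equality.Semantic.DeepEqual(existing.Spec, desired.Spec) {
		return nil // already in the desired state: nothing to do
	}
	existing.Spec = desired.Spec
	_, err = kube.AppsV1().Deployments(desired.Namespace).Update(ctx, existing, metav1.UpdateOptions{})
	return err
}
```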
Manager (kubebuilder/operator-sdk concept): Simplifying Setup
While you can wire all these client-go components together manually, tools like kubebuilder and operator-sdk provide a Manager component that significantly simplifies the setup. A Manager is an opinionated framework that encapsulates the common boilerplate for building controllers:
- It sets up a shared informer factory.
- It configures a client with a scheme.
- It handles leader election for high availability.
- It allows you to register multiple reconcilers, each watching different resource types, to be run within a single controller process.
The Manager pattern greatly accelerates controller development by providing a structured way to build, test, and deploy operators.
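For comparison with the hand-wired client-go controller built in the next section, here is a hedged sketch of the same wiring expressed through controller-runtime's Manager, roughly as kubebuilder would scaffold it. The MachineModelReconciler stub and the your.repo/api/v1 module path are assumptions for illustration.

```go
package main

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	machinemodelv1 "your.repo/api/v1" // hypothetical module path, as elsewhere in this article
)

// MachineModelReconciler is a minimal reconciler stub; kubebuilder scaffolds an
// equivalent struct and you fill Reconcile with the business logic.
type MachineModelReconciler struct {
	client.Client
	Scheme *runtime.Scheme
}

func (r *MachineModelReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Fetch the MachineModel, compare spec with actual state, converge, update status.
	return ctrl.Result{}, nil
}

func main() {
	scheme := runtime.NewScheme()
	_ = machinemodelv1.AddToScheme(scheme) // register the custom types
	_ = appsv1.AddToScheme(scheme)         // and the native types the controller manages

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{Scheme: scheme})
	if err != nil {
		panic(err)
	}

	if err := ctrl.NewControllerManagedBy(mgr).
		For(&machinemodelv1.MachineModel{}). // reconcile on changes to the custom resource
		Owns(&appsv1.Deployment{}).          // and on changes to Deployments it owns
		Complete(&MachineModelReconciler{Client: mgr.GetClient(), Scheme: scheme}); err != nil {
		panic(err)
	}

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```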
By understanding how these components—client-go, informers, workqueues, and reconcilers—interact, you gain the power to build sophisticated Kubernetes controllers that can manage any custom resource with precision and resilience, ultimately extending Kubernetes to fit your unique operational needs.
Building a CRD-Watching Controller: A Conceptual Walkthrough
Let's embark on a conceptual walkthrough of building a Kubernetes controller that watches for changes to our MachineLearningModel CRD. This will involve defining the CRD, generating client code, and implementing the core reconciliation logic. While we won't write full runnable code for every aspect, we'll outline the key steps and provide essential conceptual snippets to illustrate the process.
Step 1: Define Your CRD
The journey begins with the Custom Resource Definition itself. As shown earlier, our MachineLearningModel CRD will define the schema for our custom resources, outlining fields like image, modelPath, resources, minReplicas, maxReplicas in its spec, and phase, inferenceURL, replicas in its status.
Applying this YAML to your cluster registers the MachineModel kind with the Kubernetes API Server. Now, the API Server knows how to store and serve objects of type MachineModel.
# ... (as shown in the CRD section above) ...
Step 2: Generate Boilerplate Code
With the CRD defined, the next crucial step is to generate the necessary Go types and clientset for your custom resource. While you could manually write the Go structs that mirror your CRD's schema, and then implement the runtime.Object and other interfaces, this is tedious and error-prone. Tools like controller-gen (often used via kubebuilder) automate this process.
You would typically annotate your Go struct definitions with controller-gen markers, and then run a command to generate:
- Go types: MachineModel, MachineModelList, MachineModelSpec, and MachineModelStatus structs, which directly correspond to your CRD's schema. These types implement the Kubernetes runtime.Object and metav1.Object interfaces, making them recognizable by client-go.
- Clientset: A client-go compatible clientset for your custom resource, allowing you to interact with machinemodels.stable.example.com/v1 in a type-safe manner.
- Informers and Listers: Factory code for creating shared informers and listers specifically for your MachineModel type.
For instance, you might define your core Go types like this (simplified):
package v1
import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// +genclient
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:resource:path=machinemodels,scope=Namespaced,shortName=mm
// +kubebuilder:printcolumn:name="Phase",type="string",JSONPath=".status.phase",description="Current phase of the MachineModel"
// +kubebuilder:printcolumn:name="URL",type="string",JSONPath=".status.inferenceURL",description="Inference API URL"
// +kubebuilder:printcolumn:name="Age",type="date",JSONPath=".metadata.creationTimestamp"
// MachineModel is the Schema for the machinemodels API
type MachineModel struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec MachineModelSpec `json:"spec,omitempty"`
Status MachineModelStatus `json:"status,omitempty"`
}
// MachineModelSpec defines the desired state of MachineModel
type MachineModelSpec struct {
Image string `json:"image"`
ModelPath string `json:"modelPath"`
Resources ResourceRequirements `json:"resources"`
MinReplicas *int32 `json:"minReplicas,omitempty"`
MaxReplicas *int32 `json:"maxReplicas,omitempty"`
EndpointType string `json:"endpointType,omitempty"`
}
// ResourceRequirements defines the compute resources required.
type ResourceRequirements struct {
CPU string `json:"cpu"`
Memory string `json:"memory"`
GPU int32 `json:"gpu,omitempty"`
}
// MachineModelStatus defines the observed state of MachineModel
type MachineModelStatus struct {
Phase string `json:"phase,omitempty"`
InferenceURL string `json:"inferenceURL,omitempty"`
ObservedGeneration int64 `json:"observedGeneration,omitempty"`
Replicas int32 `json:"replicas,omitempty"`
Conditions []metav1.Condition `json:"conditions,omitempty"`
}
// +kubebuilder:object:root=true
// MachineModelList contains a list of MachineModel
type MachineModelList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []MachineModel `json:"items"`
}
func init() {
SchemeBuilder.Register(&MachineModel{}, &MachineModelList{})
}
Running controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./api/..." (or a similar invocation, depending on your setup), together with the client-gen, informer-gen, and lister-gen tools from k8s.io/code-generator, would then generate all the boilerplate files, including zz_generated.deepcopy.go, clientsets, informers, and listers. This generation step is crucial as it provides the type-safe foundation for your controller.
Step 3: Implement the Controller Logic
Now, we move to the core logic of the controller. This involves setting up the client-go components, wiring the informers to the workqueue, and running the reconciliation loop.
package main
import (
	"context"
	"fmt"
	"reflect"
	"time"

	// Custom resource type imports (hypothetical generated package paths)
	machinemodelv1 "your.repo/api/v1"
	machinemodelclientset "your.repo/pkg/client/clientset/versioned"
	machinemodelinformers "your.repo/pkg/client/informers/externalversions"
	machinemodelinformersv1 "your.repo/pkg/client/informers/externalversions/stable/v1"
	machinemodellisters "your.repo/pkg/client/listers/stable/v1"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	"k8s.io/apimachinery/pkg/util/runtime"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/scheme"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/workqueue"
	"k8s.io/klog/v2"
)
const controllerAgentName = "machinemodel-controller"
// Controller is the controller for MachineModel resources
type Controller struct {
kubeclientset kubernetes.Interface
machinemodelClientset machinemodelclientset.Interface
	machinemodelsLister machinemodellisters.MachineModelLister
machinemodelsSynced cache.InformerSynced
workqueue workqueue.RateLimitingInterface
}
// NewController returns a new MachineModel controller
func NewController(
kubeclientset kubernetes.Interface,
machinemodelClientset machinemodelclientset.Interface,
	machinemodelInformer machinemodelinformersv1.MachineModelInformer) *Controller {
// Add our custom scheme to the default scheme so client-go knows our types
// This is important for being able to convert objects
machinemodelv1.AddToScheme(scheme.Scheme)
klog.V(4).Info("Adding MachineModel custom resource to Scheme")
controller := &Controller{
kubeclientset: kubeclientset,
machinemodelClientset: machinemodelClientset,
machinemodelsLister: machinemodelInformer.Lister(),
machinemodelsSynced: machinemodelInformer.Informer().HasSynced,
workqueue: workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "MachineModels"),
}
klog.Info("Setting up event handlers for MachineModel")
machinemodelInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
AddFunc: controller.enqueueMachineModel,
		UpdateFunc: func(old, new interface{}) {
			controller.enqueueMachineModel(new) // enqueue the latest object on update
		},
		DeleteFunc: controller.enqueueMachineModel, // also enqueue on delete so cleanup can run
})
return controller
}
// Run will set up the event handlers for types we are interested in, as well
// as start workers. It will block until ctx.Done() is closed.
func (c *Controller) Run(workers int, stopCh <-chan struct{}) error {
defer runtime.HandleCrash()
defer c.workqueue.ShutDown()
klog.Info("Starting MachineModel controller")
klog.Info("Waiting for informer caches to sync")
if ok := cache.WaitForCacheSync(stopCh, c.machinemodelsSynced); !ok {
return fmt.Errorf("failed to wait for caches to sync")
}
klog.Info("Starting workers")
for i := 0; i < workers; i++ {
go wait.Until(c.runWorker, time.Second, stopCh)
}
klog.Info("Started workers")
<-stopCh
klog.Info("Shutting down workers")
return nil
}
// runWorker is a long-running function that will continually call the
// processNextItem function in order to read and process a message off the workqueue.
func (c *Controller) runWorker() {
for c.processNextItem() {
}
}
// processNextItem will read a single item from the workqueue and
// attempt to process it, by calling the reconcile function.
func (c *Controller) processNextItem() bool {
obj, shutdown := c.workqueue.Get()
if shutdown {
return false
}
// We wrap this block in a func so we can defer c.workqueue.Done.
err := func(obj interface{}) error {
defer c.workqueue.Done(obj)
var key string
var ok bool
if key, ok = obj.(string); !ok {
// As the item in the workqueue is actually of type string, we are expecting
// a string, not an actual object.
c.workqueue.Forget(obj)
runtime.HandleError(fmt.Errorf("expected string in workqueue but got %#v", obj))
return nil
}
// Run the syncHandler, passing it the namespace/name string of the
	// MachineModel resource to be synced.
if err := c.syncHandler(key); err != nil {
// Put the item back on the workqueue to handle any transient errors.
c.workqueue.AddRateLimited(key)
return fmt.Errorf("error syncing '%s': %s, requeuing", key, err.Error())
}
// If no error occurs we Forget this item so it doesn't get queued again until
// another change happens.
c.workqueue.Forget(obj)
klog.Infof("Successfully synced '%s'", key)
return nil
}(obj)
if err != nil {
runtime.HandleError(err)
return true
}
return true
}
// enqueueMachineModel takes a MachineModel resource and converts it into a namespace/name
// string which is then put onto the work queue. This method should *not* be
// passed objects that are not MachineModels.
func (c *Controller) enqueueMachineModel(obj interface{}) {
	var key string
	var err error
	// DeletionHandlingMetaNamespaceKeyFunc also copes with the DeletedFinalStateUnknown
	// tombstones that DeleteFunc can receive after a missed watch event.
	if key, err = cache.DeletionHandlingMetaNamespaceKeyFunc(obj); err != nil {
		runtime.HandleError(err)
		return
	}
	c.workqueue.Add(key)
}
// Main function to run the controller
func main() {
// ... (Load kubeconfig, create clientsets for K8s core and MachineModel) ...
// Example:
	cfg, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
if err != nil {
klog.Fatalf("Error building kubeconfig: %s", err.Error())
}
kubeClient, err := kubernetes.NewForConfig(cfg)
if err != nil {
klog.Fatalf("Error building kubernetes clientset: %s", err.Error())
}
machinemodelClient, err := machinemodelclientset.NewForConfig(cfg)
if err != nil {
klog.Fatalf("Error building example clientset: %s", err.Error())
}
machinemodelInformerFactory := machinemodelinformers.NewSharedInformerFactory(machinemodelClient, time.Second*30)
controller := NewController(kubeClient, machinemodelClient, machinemodelInformerFactory.Stable().V1().MachineModels())
stopCh := make(chan struct{})
defer close(stopCh)
// Start informers
machinemodelInformerFactory.Start(stopCh)
if err = controller.Run(2, stopCh); err != nil {
klog.Fatalf("Error running controller: %s", err.Error())
}
}
Step 4: The Reconcile Function's Heartbeat (syncHandler)
The syncHandler (our Reconcile function) is where the core logic resides. It takes the namespace/name key from the workqueue and performs the actual reconciliation.
// syncHandler compares the actual state with the desired, and attempts to
// converge the two. It returns an error if the reconciliation fails.
func (c *Controller) syncHandler(key string) error {
namespace, name, err := cache.SplitMetaNamespaceKey(key)
if err != nil {
runtime.HandleError(fmt.Errorf("invalid resource key: %s", key))
return nil
}
// Get the MachineModel resource with the given name
machinemodel, err := c.machinemodelsLister.MachineModels(namespace).Get(name)
if err != nil {
// The MachineModel resource may no longer exist, in which case we stop processing.
if errors.IsNotFound(err) {
klog.V(4).Infof("MachineModel '%s' in work queue no longer exists", key)
// Handle deletion logic here if necessary (e.g., garbage collect associated resources)
return nil
}
return err
}
// Deep copy the MachineModel to avoid modifying the object in the cache
machinemodelCopy := machinemodel.DeepCopy()
// --- Core Reconciliation Logic Starts Here ---
// 1. Determine Desired State based on machinemodelCopy.Spec
// For example, create a Deployment and a Service for the ML model.
desiredDeployment := c.newDeploymentForMachineModel(machinemodelCopy)
desiredService := c.newServiceForMachineModel(machinemodelCopy)
// 2. Observe Actual State & Reconcile Deployment
deployment, err := c.kubeclientset.AppsV1().Deployments(namespace).Get(context.TODO(), desiredDeployment.Name, metav1.GetOptions{})
if errors.IsNotFound(err) {
klog.Infof("Creating Deployment for MachineModel '%s/%s'", namespace, name)
deployment, err = c.kubeclientset.AppsV1().Deployments(namespace).Create(context.TODO(), desiredDeployment, metav1.CreateOptions{})
if err != nil {
machinemodelCopy.Status.Phase = "Failed"
c.updateMachineModelStatus(machinemodelCopy) // Update status on failure
return fmt.Errorf("failed to create Deployment: %w", err)
}
machinemodelCopy.Status.Phase = "Deploying"
} else if err != nil {
return err
} else {
// Check if deployment needs update (e.g., image, replicas changed)
if deployment.Spec.Replicas == nil || *deployment.Spec.Replicas != *machinemodelCopy.Spec.MinReplicas ||
deployment.Spec.Template.Spec.Containers[0].Image != machinemodelCopy.Spec.Image {
klog.Infof("Updating Deployment for MachineModel '%s/%s'", namespace, name)
deployment.Spec = desiredDeployment.Spec // Update to desired spec
deployment, err = c.kubeclientset.AppsV1().Deployments(namespace).Update(context.TODO(), deployment, metav1.UpdateOptions{})
if err != nil {
machinemodelCopy.Status.Phase = "Failed"
c.updateMachineModelStatus(machinemodelCopy)
return fmt.Errorf("failed to update Deployment: %w", err)
}
machinemodelCopy.Status.Phase = "Deploying"
}
}
// Add owner reference to the deployment
if !metav1.IsControlledBy(deployment, machinemodelCopy) {
ownerRefs := deployment.GetOwnerReferences()
ownerRefs = append(ownerRefs, *metav1.NewControllerRef(machinemodelCopy, machinemodelv1.GroupVersion.WithKind("MachineModel")))
deployment.SetOwnerReferences(ownerRefs)
_, err = c.kubeclientset.AppsV1().Deployments(namespace).Update(context.TODO(), deployment, metav1.UpdateOptions{})
if err != nil {
return fmt.Errorf("failed to update Deployment owner reference: %w", err)
}
}
// 3. Observe Actual State & Reconcile Service
service, err := c.kubeclientset.CoreV1().Services(namespace).Get(context.TODO(), desiredService.Name, metav1.GetOptions{})
if errors.IsNotFound(err) {
klog.Infof("Creating Service for MachineModel '%s/%s'", namespace, name)
service, err = c.kubeclientset.CoreV1().Services(namespace).Create(context.TODO(), desiredService, metav1.CreateOptions{})
if err != nil {
machinemodelCopy.Status.Phase = "Failed"
c.updateMachineModelStatus(machinemodelCopy)
return fmt.Errorf("failed to create Service: %w", err)
}
} else if err != nil {
return err
}
// Add owner reference to the service
if !metav1.IsControlledBy(service, machinemodelCopy) {
ownerRefs := service.GetOwnerReferences()
ownerRefs = append(ownerRefs, *metav1.NewControllerRef(machinemodelCopy, machinemodelv1.GroupVersion.WithKind("MachineModel")))
service.SetOwnerReferences(ownerRefs)
_, err = c.kubeclientset.CoreV1().Services(namespace).Update(context.TODO(), service, metav1.UpdateOptions{})
if err != nil {
return fmt.Errorf("failed to update Service owner reference: %w", err)
}
}
// 4. Update MachineModel Status based on actual state and desired outcomes
// Check Deployment status, Service IP, etc.
// For simplicity, let's assume if Deployment is ready and Service has an IP, model is ready.
if deployment.Status.Replicas == deployment.Status.AvailableReplicas && deployment.Status.AvailableReplicas >= *machinemodelCopy.Spec.MinReplicas {
machinemodelCopy.Status.Phase = "Ready"
if service.Spec.ClusterIP != "" {
machinemodelCopy.Status.InferenceURL = fmt.Sprintf("http://%s:%d", service.Spec.ClusterIP, service.Spec.Ports[0].Port)
// In a real scenario, you might get an external IP/DNS from an Ingress or LoadBalancer
}
machinemodelCopy.Status.Replicas = deployment.Status.AvailableReplicas
machinemodelCopy.Status.ObservedGeneration = machinemodelCopy.Generation
}
// 5. Update the status subresource of the MachineModel
// This is critical. Only update status if it has actually changed.
if !reflect.DeepEqual(machinemodel.Status, machinemodelCopy.Status) {
klog.Infof("Updating status for MachineModel '%s/%s'. Old: %v, New: %v", namespace, name, machinemodel.Status, machinemodelCopy.Status)
if _, err := c.updateMachineModelStatus(machinemodelCopy); err != nil {
return err
}
} else {
klog.V(4).Infof("No status update needed for MachineModel '%s/%s'", namespace, name)
}
// --- Natural Integration of APIPark here ---
// If the CRD defines a new service or AI model, the controller could provision it,
// and then potentially register/manage its exposure via an API management system.
if machinemodelCopy.Status.Phase == "Ready" && machinemodelCopy.Status.InferenceURL != "" {
// A sophisticated controller might, upon a custom resource reaching a 'Ready' state,
// need to ensure its exposure as a managed API. This is especially true for services
// like AI models that provide inference capabilities through an API.
//
// In such a scenario, the controller could programmatically interact with an
// API management platform to publish a new API endpoint corresponding to the
// newly deployed model. This would involve registering the model's inference URL,
// configuring security policies (like API keys or OAuth), setting up rate limits,
// and documenting the API for consumption by other applications or teams.
//
// For instance, if your `MachineModel` represents a new AI model deployment,
// after the model is provisioned and its inference endpoint is available
// (e.g., `machinemodelCopy.Status.InferenceURL`), the controller could
		// call out to an API management solution like APIPark (https://apipark.com/)
		// to publish a new endpoint for that model. APIPark, as an open-source AI gateway
		// and API management platform, provides features like quick integration of
		// AI models, a unified API format, and end-to-end API lifecycle management.
		// The controller would essentially act as an automated API publisher, ensuring
		// that the newly provisioned resource is not only operational within Kubernetes
		// but also discoverable, secure, and manageable as a robust API service,
		// seamlessly integrating into your organization's broader API ecosystem.
		// This approach enhances overall API governance and ensures that critical services,
		// whether AI-driven or traditional REST services, are consistently managed throughout
		// their lifecycle, benefiting from APIPark's performance and detailed logging capabilities.
klog.V(4).Infof("MachineModel '%s/%s' is ready. Consider integrating with API management.", namespace, name)
// Hypothetical call to APIPark client (not implemented here):
// apiparkClient.PublishAPI(machinemodelCopy.Name, machinemodelCopy.Status.InferenceURL, machinemodelCopy.Spec.EndpointType)
}
return nil
}
// updateMachineModelStatus updates the Status of the MachineModel
func (c *Controller) updateMachineModelStatus(machinemodel *machinemodelv1.MachineModel) (*machinemodelv1.MachineModel, error) {
return c.machinemodelClientset.StableV1().MachineModels(machinemodel.Namespace).UpdateStatus(context.TODO(), machinemodel, metav1.UpdateOptions{})
}
// newDeploymentForMachineModel creates a new Deployment for a MachineModel resource.
func (c *Controller) newDeploymentForMachineModel(mm *machinemodelv1.MachineModel) *appsv1.Deployment {
labels := map[string]string{
"app": "machinemodel",
"controller": mm.Name,
}
return &appsv1.Deployment{
ObjectMeta: metav1.ObjectMeta{
Name: mm.Name + "-deployment",
Namespace: mm.Namespace,
OwnerReferences: []metav1.OwnerReference{
*metav1.NewControllerRef(mm, machinemodelv1.GroupVersion.WithKind("MachineModel")),
},
},
Spec: appsv1.DeploymentSpec{
Replicas: mm.Spec.MinReplicas,
Selector: &metav1.LabelSelector{
MatchLabels: labels,
},
Template: metav1.PodTemplateSpec{
ObjectMeta: metav1.ObjectMeta{
Labels: labels,
},
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{
Name: "model-server",
Image: mm.Spec.Image,
Ports: []corev1.ContainerPort{
{
ContainerPort: 8080, // Example port
},
},
Resources: corev1.ResourceRequirements{
Requests: corev1.ResourceList{
corev1.ResourceCPU: resource.MustParse(mm.Spec.Resources.CPU),
corev1.ResourceMemory: resource.MustParse(mm.Spec.Resources.Memory),
},
Limits: corev1.ResourceList{
corev1.ResourceCPU: resource.MustParse(mm.Spec.Resources.CPU),
corev1.ResourceMemory: resource.MustParse(mm.Spec.Resources.Memory),
},
},
							Command: []string{"/bin/model-server", "--model-path", mm.Spec.ModelPath}, // Example command
},
},
},
},
},
}
}
// newServiceForMachineModel creates a new Service for a MachineModel resource.
func (c *Controller) newServiceForMachineModel(mm *machinemodelv1.MachineModel) *corev1.Service {
labels := map[string]string{
"app": "machinemodel",
"controller": mm.Name,
}
return &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: mm.Name + "-service",
Namespace: mm.Namespace,
OwnerReferences: []metav1.OwnerReference{
*metav1.NewControllerRef(mm, machinemodelv1.GroupVersion.WithKind("MachineModel")),
},
},
Spec: corev1.ServiceSpec{
Selector: labels,
Ports: []corev1.ServicePort{
{
Protocol: corev1.ProtocolTCP,
Port: 80,
TargetPort: intstr.FromInt(8080),
},
},
Type: corev1.ServiceTypeClusterIP, // Or LoadBalancer if external access is needed
},
}
}
This syncHandler demonstrates the full reconciliation loop: fetching the custom resource, deriving desired child resources (Deployment, Service), checking their actual state, creating/updating them as needed, and finally updating the MachineModel's status. The owner references (metav1.NewControllerRef) are vital for Kubernetes' garbage collector to automatically clean up child resources when the parent MachineModel is deleted.
Flow of an Event Through the Controller
To solidify understanding, let's visualize the journey of a single event (e.g., a MachineModel is created) through the controller's architecture:
| Step | Component | Action | Description |
|---|---|---|---|
| 1. Event Origin | API Server | Event Occurs | A user creates, updates, or deletes a MachineModel custom resource via kubectl. The API Server stores this change in etcd. |
| 2. Watch & Cache | SharedInformer | Receives Event | The SharedInformer for MachineModel types, which maintains a Watch connection to the API Server, receives the event notification. |
| 3. Local Cache Update | SharedInformer | Updates Cache | The SharedInformer updates its in-memory cache with the new or modified MachineModel object. |
| 4. Event Handler Trigger | ResourceEventHandlerFuncs | AddFunc/UpdateFunc/DeleteFunc Invoked | The registered event handler (e.g., controller.enqueueMachineModel) is called with the affected MachineModel object. |
| 5. Enqueue Key | Workqueue | Adds Item | The event handler extracts the namespace/name key of the MachineModel and adds it to the workqueue.RateLimitingInterface. If the key is already present, it's deduplicated. |
| 6. Dequeue Item | Worker Goroutine | Pulls Key | One of the controller's worker goroutines (running c.runWorker -> c.processNextItem) retrieves the namespace/name key from the workqueue. |
| 7. Reconcile Call | syncHandler | Executes Logic | The worker calls the c.syncHandler(key) function. |
| 8. Fetch Resource | MachineModelLister | Retrieves Object | Inside syncHandler, the MachineModelLister is used to fetch the latest version of the MachineModel from the informer's local cache. This is a fast, in-memory lookup. |
| 9. Desired State Logic | syncHandler | Determines Actions | The syncHandler compares the machinemodel.Spec (desired state) with the actual state of related Kubernetes resources (e.g., Deployment, Service). |
| 10. Reconcile & Update K8s | kubeclientset | Modifies K8s | If discrepancies are found, the syncHandler uses the kubeclientset (for native resources) or machinemodelClientset (for custom resources) to Create, Update, or Delete Kubernetes objects via the API Server. |
| 11. Update Status | machinemodelClientset | Updates Status | After actions are taken, the syncHandler updates the status subresource of the MachineModel object, reflecting the current operational state, using machinemodelClientset.UpdateStatus(). |
| 12. Mark Done/Requeue | Workqueue | Manages Item | If syncHandler succeeds, the item is Forget()ten from the workqueue. If it fails, it's AddRateLimited() back, scheduling a retry with backoff. |
This systematic flow ensures that every change to your custom resources is observed, processed reliably, and acted upon to maintain the desired state within your Kubernetes cluster.
Advanced Considerations and Best Practices
Building a functional controller is a great start, but creating a production-grade, resilient, and maintainable one requires attention to several advanced considerations and adherence to best practices. These aspects ensure your controller is not only effective but also robust in the face of various operational challenges.
Predicates: Filtering Events
Not every change to a custom resource or its related objects warrants a full reconciliation. For instance, you might only care about changes to the spec field of your CRD, and want to ignore updates to its metadata (like annotations that don't affect your controller's logic) or status (which your controller itself updates). Predicates allow you to filter events before they are enqueued into the workqueue.
kubebuilder and operator-sdk provide predicate helpers that can be used to define custom filtering logic. For example, predicate.GenerationChangedPredicate will only trigger reconciliation when the object's metadata.generation field changes, which typically indicates a change in the spec. This significantly reduces unnecessary reconciliation cycles, improving controller performance and reducing load on the API Server and your controller's logic.
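A hedged sketch of this in controller-runtime/kubebuilder style, assuming a MachineModelReconciler like the one sketched earlier:

```go
package controller

import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/builder"
	"sigs.k8s.io/controller-runtime/pkg/predicate"

	machinemodelv1 "your.repo/api/v1" // hypothetical module path
)

// SetupWithManager wires the reconciler so it only reacts to events where
// metadata.generation changed, i.e. the spec was modified; status-only and most
// metadata-only updates are filtered out before they ever reach the workqueue.
func (r *MachineModelReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&machinemodelv1.MachineModel{},
			builder.WithPredicates(predicate.GenerationChangedPredicate{})).
		Complete(r)
}
```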
Owner References and Garbage Collection
A fundamental principle in Kubernetes is that resources should be cleaned up automatically when their "owner" resource is deleted. This is achieved through Owner References. When your controller creates dependent resources (like a Deployment and Service for a MachineModel), it should set the MachineModel as an owner reference on these child objects.
ownerRefs := deployment.GetOwnerReferences()
ownerRefs = append(ownerRefs, *metav1.NewControllerRef(machinemodelCopy, machinemodelv1.GroupVersion.WithKind("MachineModel")))
deployment.SetOwnerReferences(ownerRefs)
// ... update deployment ...
By doing this, when the MachineModel instance is deleted, Kubernetes' garbage collector will automatically delete all resources that list the MachineModel as an owner reference and have it set as a controller owner (meaning the owner is responsible for managing the lifecycle of the dependent). This simplifies cleanup logic dramatically, preventing orphaned resources in your cluster.
Finalizers: Graceful Cleanup
While owner references handle cascading deletions, there are scenarios where you need to perform custom cleanup logic before a resource is fully deleted by Kubernetes. This is where Finalizers come in. A finalizer is a list of strings stored in an object's metadata.finalizers field. When a resource is marked for deletion (i.e., its metadata.deletionTimestamp is set), Kubernetes will not actually delete the object until its finalizers list is empty.
Your controller can add a finalizer to the custom resource when it's first created:
// Inside reconcile, if machinemodel is new and needs finalizer:
if machinemodel.ObjectMeta.DeletionTimestamp.IsZero() && !containsString(machinemodel.ObjectMeta.Finalizers, myFinalizerName) {
machinemodel.ObjectMeta.Finalizers = append(machinemodel.ObjectMeta.Finalizers, myFinalizerName)
// ... update machinemodel ...
}
When the machinemodel is deleted, your controller will observe the deletionTimestamp. At this point, it can execute custom cleanup (e.g., de-provisioning external cloud resources, unregistering an API from an API gateway like APIPark, or performing data archival). After the cleanup is complete, the controller removes its finalizer from the machinemodel's finalizers list. Once all finalizers are removed, Kubernetes proceeds with the final deletion of the object. This pattern ensures controlled and graceful termination of resources, especially those with external dependencies.
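To complement the snippet above, here is a hedged sketch of the deletion path in the same plain client-go style as the walkthrough controller; syncHandler would call it when it sees a non-zero deletionTimestamp. The cleanupExternalResources, containsString, and removeString helpers are hypothetical and not shown.

```go
// handleDeletion runs the finalizer-driven cleanup path for a MachineModel
// that has been marked for deletion.
func (c *Controller) handleDeletion(ctx context.Context, mm *machinemodelv1.MachineModel) error {
	if !containsString(mm.ObjectMeta.Finalizers, myFinalizerName) {
		return nil // our finalizer is already gone; nothing left for us to do
	}
	// Perform external cleanup first (e.g., unregister the inference API from a gateway).
	if err := c.cleanupExternalResources(ctx, mm); err != nil {
		return err // returning an error re-enqueues the key, so cleanup is retried
	}
	// Cleanup succeeded: remove our finalizer so the API Server can finish the deletion.
	mm.ObjectMeta.Finalizers = removeString(mm.ObjectMeta.Finalizers, myFinalizerName)
	_, err := c.machinemodelClientset.StableV1().MachineModels(mm.Namespace).
		Update(ctx, mm, metav1.UpdateOptions{})
	return err
}
```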
RBAC: Securing Your Controller
Your controller needs permissions to interact with the Kubernetes API Server. These permissions are granted through Role-Based Access Control (RBAC). You will need to create:
- A ServiceAccount for your controller's Pod.
- A Role (for namespaced resources) or ClusterRole (for cluster-scoped resources) that specifies the apiGroups, resources, and verbs (get, list, watch, create, update, patch, delete) your controller needs. This must include permissions for your custom resource (machinemodels.stable.example.com) and any native Kubernetes resources it manages (Deployments, Services, Pods, etc.).
- A RoleBinding or ClusterRoleBinding to bind the Role/ClusterRole to your ServiceAccount.
Carefully define the minimal necessary permissions to adhere to the principle of least privilege. Overly broad permissions are a security risk.
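If you scaffold with kubebuilder, these permissions can be declared as RBAC markers next to the reconciler and generated into Role/ClusterRole manifests by controller-gen. A hedged sketch covering the resources this article's controller touches (adjust the verbs to what your reconciler actually does):

```go
package controller

// In kubebuilder projects these markers conventionally sit directly above the
// Reconcile method; controller-gen turns them into RBAC manifests.

// +kubebuilder:rbac:groups=stable.example.com,resources=machinemodels,verbs=get;list;watch
// +kubebuilder:rbac:groups=stable.example.com,resources=machinemodels/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups="",resources=services,verbs=get;list;watch;create;update;patch;delete
```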
Leader Election: High Availability
For production environments, you'll often want to run multiple replicas of your controller for high availability. However, only one instance of a controller should be actively reconciling at any given time to avoid race conditions and conflicting actions. Leader Election solves this problem.
Kubernetes provides a leader election mechanism (using a Lease object in a designated namespace) that allows multiple controller instances to contend for leadership. Only the elected leader will execute the reconciliation logic, while the others remain in a standby mode, ready to take over if the leader fails. kubebuilder and operator-sdk integrate leader election directly into their Manager component, simplifying its implementation.
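With controller-runtime, leader election amounts to a couple of Manager options; a hedged sketch follows, where the LeaderElectionID is an arbitrary name chosen here for the underlying Lease object.

```go
package main

import (
	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Only the instance that wins the Lease-based election runs reconciliation;
	// the others stay on hot standby and take over if the leader dies.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:          true,
		LeaderElectionID:        "machinemodel-controller-leader-lock", // arbitrary, unique per controller
		LeaderElectionNamespace: "default",                             // where the Lease object lives
	})
	if err != nil {
		panic(err)
	}
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```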
Observability: Logging, Metrics, Tracing
A controller operating silently is a black box. To understand its behavior, diagnose issues, and monitor its performance, robust observability is essential:
- Logging: Use structured logging (e.g., klog or zap) to output informative messages about controller actions, errors, and key events. Log context (resource namespace/name, action being taken) to make logs searchable and useful.
- Metrics: Expose Prometheus-compatible metrics (e.g., using client-go/util/workqueue metrics, or custom metrics for reconciliation duration, errors, and processed events). This allows you to monitor controller health, throughput, and latency. A minimal example follows this list.
- Tracing: For complex controllers interacting with multiple internal or external systems, distributed tracing can help visualize the flow of an operation and pinpoint performance bottlenecks.
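As a minimal, hedged example of the metrics point above (the metric name and the port are illustrative choices), a plain client-go controller can register a custom Prometheus counter and expose it over HTTP:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// reconcileErrorsTotal counts failed reconciliations, labeled by the resource.
var reconcileErrorsTotal = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "machinemodel_reconcile_errors_total",
		Help: "Total number of failed MachineModel reconciliations.",
	},
	[]string{"namespace", "name"},
)

func main() {
	prometheus.MustRegister(reconcileErrorsTotal)
	// Expose the default registry so Prometheus can scrape it.
	http.Handle("/metrics", promhttp.Handler())
	_ = http.ListenAndServe(":8080", nil)
}
```

In the controller itself, syncHandler would call reconcileErrorsTotal.WithLabelValues(namespace, name).Inc() whenever a reconciliation returns an error, and the HTTP server would run in a goroutine alongside the workers.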
Testing Strategies: Unit, Integration, E2E
Thorough testing is paramount for controller reliability:
- Unit Tests: Test individual functions and methods in isolation, mocking Kubernetes client-go interfaces.
- Integration Tests: Test the interaction between your controller and a real (or simulated) Kubernetes API Server. This often involves using envtest (a local etcd and kube-apiserver) to run your controller against a blank-slate Kubernetes environment, ensuring it can create, update, and delete resources correctly. A sketch follows this list.
- End-to-End (E2E) Tests: Deploy your controller into a live Kubernetes cluster (or a dedicated test cluster) and verify its behavior from an external perspective, ensuring it achieves the desired state for your custom resources in a realistic environment.
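A hedged sketch of the integration-test flavor using controller-runtime's envtest package; the CRD directory path is an assumption about your project layout.

```go
package controller_test

import (
	"path/filepath"
	"testing"

	"sigs.k8s.io/controller-runtime/pkg/envtest"
)

// TestReconcileAgainstEnvtest spins up a real kube-apiserver and etcd locally,
// installs the CRD, and hands back a rest.Config to drive the controller against.
func TestReconcileAgainstEnvtest(t *testing.T) {
	testEnv := &envtest.Environment{
		CRDDirectoryPaths: []string{filepath.Join("..", "config", "crd", "bases")},
	}
	cfg, err := testEnv.Start()
	if err != nil {
		t.Fatalf("failed to start test environment: %v", err)
	}
	defer testEnv.Stop()

	// cfg points at the temporary API Server; build your clientsets and informers
	// against it and exercise the controller exactly as in production.
	_ = cfg
}
```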
Deployment: Packaging as an Operator
While you can deploy your controller as a standalone Deployment, packaging it as an Operator (using kubebuilder or operator-sdk) provides a standardized and powerful way to manage its lifecycle. An Operator is essentially a controller bundled with its CRD, RBAC definitions, and deployment manifests (e.g., Helm charts or Kustomize configurations). Operator Lifecycle Manager (OLM) can then manage the installation, upgrades, and lifecycle of your operator within the cluster, offering a robust platform for extending Kubernetes with your custom capabilities.
By incorporating these advanced considerations and best practices, your Kubernetes controller will evolve from a basic event watcher into a resilient, efficient, and well-managed component of your cloud-native infrastructure, truly harnessing the power of the Kubernetes control plane.
Conclusion
The journey of building a Kubernetes controller to watch for changes to Custom Resource Definitions is a deep dive into the heart of Kubernetes' extensibility model. We've explored how CRDs empower you to define your own API objects, transforming bespoke application components or infrastructure services into first-class citizens of your cluster. We've meticulously dissected the architecture of a controller, from the client-go library's interaction with the Kubernetes API Server, through the efficient caching mechanism of informers, the reliable processing of events by workqueues, to the intelligent reconciliation logic that is the very heartbeat of a controller.
This process, while intricate, unveils the true power of Kubernetes: its ability to be a highly adaptable, self-healing platform capable of managing virtually any workload, as long as you can define its desired state. By crafting a controller that observes your custom resources, you're not just automating tasks; you're essentially programming the Kubernetes control plane itself. You're teaching Kubernetes how to understand and manage your unique domain-specific concepts, bringing a new level of intelligence and declarative automation to your operational practices.
The concepts discussed—from owner references for robust garbage collection to finalizers for graceful external cleanup, from RBAC for secure operation to leader election for high availability—are not merely theoretical constructs. They are battle-tested patterns that empower you to build production-ready systems. Whether you're orchestrating complex machine learning pipelines, managing database instances, or integrating with external API services like those managed by APIPark, the ability to extend Kubernetes with custom controllers for CRDs provides an unparalleled foundation for building sophisticated, cloud-native applications.
Embrace the challenge of building your own controller. It’s an investment in understanding the core mechanisms that make Kubernetes so powerful, and a gateway to unlocking truly custom, intelligent automation for your specific needs. The flexibility and control you gain will be invaluable in shaping your cloud infrastructure to perfectly align with your business logic.
Frequently Asked Questions (FAQ)
1. What is the fundamental difference between a Kubernetes built-in resource (like a Deployment) and a Custom Resource (CRD)?
The fundamental difference lies in their origin and scope. Built-in resources (e.g., Pods, Deployments, Services, ConfigMaps) are core components of Kubernetes, defined and maintained by the Kubernetes project itself. They come with pre-built controllers in the Kubernetes Control Plane (like kube-controller-manager) that understand how to manage their lifecycle and reconcile their desired state.
Custom Resources, defined by Custom Resource Definitions (CRDs), are extensions that allow users to define their own API objects. They are not natively understood by Kubernetes initially; you create the CRD schema, and then you typically need to build your own controller to watch for and manage instances of that custom resource. In essence, CRDs let you teach Kubernetes about new types of objects, and custom controllers teach Kubernetes how to act upon them. This extensibility is key to building domain-specific operators and extending Kubernetes' capabilities beyond its default offerings.
2. Why do I need a separate controller for my CRD? Couldn't Kubernetes just manage it?
A CRD defines the schema of a new API object; it tells the Kubernetes API Server how to store and validate instances of your custom resource. However, a CRD itself doesn't provide any operational logic. Kubernetes doesn't inherently know what to do when an instance of your custom resource is created, updated, or deleted. That operational logic is the responsibility of a controller.
Your controller observes changes to your custom resource, interprets its spec (desired state), and then takes concrete actions, such as creating native Kubernetes resources (e.g., Deployments, Services, PersistentVolumes), interacting with external APIs, or orchestrating other complex workflows, to bring the cluster to the desired state. Without a controller, a custom resource instance would simply be a piece of data stored in etcd, incapable of influencing the cluster's behavior.
3. What is the role of client-go in building a controller?
client-go is the official Go client library for interacting with the Kubernetes API. It's the essential toolkit for any Go-based Kubernetes controller. Its role is multifaceted:
- API Interaction: It provides type-safe clients (Clientsets) to Get, List, Create, Update, and Delete any Kubernetes API object, including your custom resources after client code generation.
- Informers: It offers efficient mechanisms (SharedInformers) to Watch the API Server for changes to resources and maintain a local, in-memory cache, reducing the load on the API Server.
- Listers: Provides read-only access to the informer's cache, enabling controllers to quickly retrieve object states without making network calls.
- Scheme Management: Helps in serializing and deserializing Kubernetes objects by mapping Go types to their apiVersion and kind.
In short, client-go abstracts away the complexities of HTTP requests, JSON parsing, and API versioning, allowing you to focus on the controller's core logic.
4. How does a controller reliably process events and handle errors?
Controllers achieve reliable event processing primarily through a rate-limiting workqueue. When an event (like a CRD update) occurs, the controller doesn't immediately process it. Instead, it adds the unique key (namespace/name) of the affected resource to a workqueue.
The workqueue ensures reliability through:
- Deduplication: If multiple rapid updates occur for the same resource, the key is only added once, ensuring the controller always processes the latest state.
- Serial Processing: Items for a given key are processed one at a time, preventing race conditions.
- Retries with Exponential Backoff: If the controller's reconciliation logic fails (e.g., due to a transient API error or an unavailable external dependency), the workqueue re-enqueues the item, typically with exponential backoff, waiting increasing amounts of time between retries to avoid overwhelming systems during outages and allow for recovery.
- Graceful Shutdown: The workqueue handles graceful shutdown, ensuring all outstanding items are processed or returned when the controller is stopped.
This robust mechanism means that even if a controller temporarily fails or experiences network issues, it will eventually reprocess events and attempt to reach the desired state.
5. What are Owner References and Finalizers, and why are they important for custom controllers?
Both Owner References and Finalizers are crucial for managing the lifecycle of resources within Kubernetes, particularly when dealing with custom resources and their dependencies:
- Owner References: These are used to establish a parent-child relationship between Kubernetes objects. When your controller creates subordinate resources (e.g., a Deployment for a Custom Resource), it sets the Custom Resource as the owner of these Deployment objects. If the parent Custom Resource is deleted, Kubernetes' garbage collector automatically deletes all its owner-referenced children. This mechanism ensures proper cascading deletion and prevents orphaned resources from cluttering your cluster, simplifying cleanup logic significantly.
- Finalizers: Finalizers allow your controller to execute custom cleanup logic before a resource is completely removed from Kubernetes. When a resource has finalizers and is marked for deletion (i.e., its deletionTimestamp is set), Kubernetes will not finalize its deletion until all specified finalizers have been removed from its metadata.finalizers list. Your controller can add a finalizer to a Custom Resource upon creation. When it detects a deletionTimestamp, it performs any necessary external cleanup (e.g., removing entries from a database, de-provisioning cloud resources, unregistering an API from an API gateway like APIPark). Once the custom cleanup is complete, your controller removes the finalizer, allowing Kubernetes to proceed with the final object deletion. This ensures that complex resources with external dependencies are cleanly and gracefully removed.