Operators & CRDs

The Operator Pattern

An operator is a custom controller paired with one or more Custom Resources (CRs). You declare intent in YAML (spec); the operator continuously reconciles cluster state toward that intent—installing, upgrading, backing up, failing over, and cleaning up—just as a human SRE would.

CRD + controller = operator

Built-in controllers (Deployment, StatefulSet) already follow this model: you set spec.replicas: 3; the Deployment controller creates ReplicaSets and Pods until reality matches. Operators extend the same reconciliation loop to application domains Kubernetes does not natively understand.
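
The reconciliation idea can be sketched without any Kubernetes libraries. The following is a minimal illustration only: reconcileStep and converge are hypothetical names, and a real controller would create or delete Pods through the API server instead of adjusting an integer.

```go
package main

import "fmt"

// reconcileStep performs one idempotent step toward the desired replica
// count and reports whether actual state already matched.
func reconcileStep(desired, actual int) (next int, done bool) {
    switch {
    case actual < desired:
        return actual + 1, false // stand-in for "create one missing Pod"
    case actual > desired:
        return actual - 1, false // stand-in for "delete one surplus Pod"
    default:
        return actual, true // reality matches intent
    }
}

// converge repeats reconcileStep until actual equals desired and returns
// the final count plus the number of steps taken.
func converge(desired, actual int) (final, steps int) {
    for {
        next, done := reconcileStep(desired, actual)
        if done {
            return actual, steps
        }
        actual = next
        steps++
    }
}

func main() {
    final, steps := converge(3, 0)
    fmt.Printf("replicas=%d after %d steps\n", final, steps)
}
```

Calling reconcileStep again once the counts match changes nothing, which is the idempotency property every reconciler needs.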

Piece	Role	Analogy
CRD	Registers a new API type with the API server	Schema definition — "what fields does a PostgresCluster have?"
Custom Resource (CR)	User's desired state instance	kubectl apply of your app intent
Controller	Watches CRs; creates/updates/deletes operands	The SRE who never sleeps
Operand	Native K8s objects the operator manages	StatefulSets, Services, Secrets, PVCs, Jobs

Production operator examples

Operator	CR example	What it automates
PostgreSQL (CloudNativePG / Crunchy)	Cluster (CloudNativePG), PostgresCluster (Crunchy)	Replication, failover, backup/restore, rolling upgrades, connection pooling
Kafka (Strimzi / AMQ Streams)	Kafka, KafkaTopic, KafkaUser	Broker clusters, topic ACLs, TLS certs, rolling restarts, rack awareness
cert-manager	Certificate, ClusterIssuer	ACME/Let's Encrypt, private CA, cert renewal, Secret injection into Ingress/Route
Prometheus (Prometheus Operator)	Prometheus, ServiceMonitor, Alertmanager	Scrape config generation, rule management, HA Prometheus pairs, Alertmanager clustering

$ kubectl get crd | grep -E 'postgres|kafka|cert-manager|monitoring'
$ kubectl api-resources --api-group=postgresql.cnpg.io
$ kubectl get postgrescluster -A
$ kubectl get certificate -A
$ kubectl get servicemonitor -n monitoring

$ oc get crd | grep -E 'postgres|kafka|cert-manager|monitoring'
$ oc get csv -A | grep -i postgres
$ oc get kafkas.kafka.strimzi.io -A
$ oc get clusterissuer
$ oc get packagemanifest -n openshift-marketplace | grep -i cert

📦 Real World

Platform teams rarely run raw Postgres StatefulSets in production. They install CloudNativePG or Crunchy Postgres Operator via OLM, then hand developers a PostgresCluster CR. Backup schedules, replication, and version upgrades become declarative—reviewed in Git, applied by ArgoCD, reconciled by the operator.

⚖️ Trade-off

Operators add operational power but also dependency risk: CRD schema changes, abandoned projects, and upgrade ordering (operator before CR, or vice versa) can block cluster upgrades. Prefer CNCF-graduated or Red Hat-certified operators for production stateful tiers; keep escape hatches (Velero backups, managed DB fallback).

🎯 Interview Tip

"What is an operator?" — A controller + CRD that encodes domain-specific ops knowledge. It watches Custom Resources and reconciles native K8s objects (and external systems) to match desired state. Contrast with Helm: Helm renders manifests once; operators run continuously and handle day-2 operations (backup, failover, cert renewal).

CustomResourceDefinitions (CRDs)

CRDs register new API types with kube-apiserver via the apiextensions.k8s.io API group. Once established, users create instances with kubectl apply—the same workflow as built-in resources.

Extending the Kubernetes API

Every API resource has a GroupVersionKind (GVK). A CRD declares the group (e.g. cache.example.com), version(s), scope (Namespaced or Cluster), and schema. The API server stores CR instances in etcd alongside Pods and Deployments.

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: caches.cache.example.com   # <plural>.<group>
spec:
  group: cache.example.com
  scope: Namespaced
  names:
    plural: caches
    singular: cache
    kind: Cache
    shortNames:
      - ch
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              required: [replicas, memory]
              properties:
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 10
                memory:
                  type: string
                  pattern: '^[0-9]+(Mi|Gi)$'
            status:
              type: object
              properties:
                readyReplicas:
                  type: integer
                observedGeneration:   # declared so the API server does not prune it
                  type: integer
                  format: int64
                conditions:
                  type: array
                  items:
                    type: object
                    properties:
                      type:
                        type: string
                      status:
                        type: string
                        enum: ["True", "False", "Unknown"]
                      reason:
                        type: string
                      message:
                        type: string
                      lastTransitionTime:   # declared so the API server does not prune it
                        type: string
                        format: date-time
      subresources:
        status: {}
      additionalPrinterColumns:
        - name: Replicas
          type: integer
          jsonPath: .spec.replicas
        - name: Ready
          type: integer
          jsonPath: .status.readyReplicas
        - name: Age
          type: date
          jsonPath: .metadata.creationTimestamp

OpenAPI schema validation

Structural schemas are required for apiextensions.k8s.io/v1 CRDs (GA since Kubernetes 1.16). The API server rejects CRs that violate the schema at request time, before they are persisted to etcd, and prunes fields the schema does not declare. Use kubectl explain cache.spec to introspect fields, just as with built-in resources.
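
To make the schema rules concrete, here is a rough Go sketch that applies the same constraints the CRD above declares for replicas and memory. It is an illustration only: the API server performs this check itself, CacheSpec and validateSpec are hypothetical names, and the required-field rule is not modeled because Go zero values cannot distinguish "unset" from "0".

```go
package main

import (
    "fmt"
    "regexp"
)

// CacheSpec mirrors the spec fields declared in the example CRD schema.
type CacheSpec struct {
    Replicas int
    Memory   string
}

// memoryPattern is the same regular expression used in the CRD.
var memoryPattern = regexp.MustCompile(`^[0-9]+(Mi|Gi)$`)

// validateSpec returns one message per violated constraint.
func validateSpec(s CacheSpec) []string {
    var errs []string
    if s.Replicas < 1 || s.Replicas > 10 {
        errs = append(errs, "spec.replicas must be between 1 and 10")
    }
    if !memoryPattern.MatchString(s.Memory) {
        errs = append(errs, "spec.memory must match ^[0-9]+(Mi|Gi)$")
    }
    return errs
}

func main() {
    fmt.Println(validateSpec(CacheSpec{Replicas: 3, Memory: "512Mi"}))
    fmt.Println(validateSpec(CacheSpec{Replicas: 0, Memory: "2GB"}))
}
```

The second call fails both rules: 0 is below the minimum, and "2GB" does not end in Mi or Gi.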

Version lifecycle: v1alpha1 → v1

CRD versions follow the same maturity path as core APIs. Start with v1alpha1 (experimental, may break), graduate to v1beta1 (more stable), then v1 (GA). Multiple versions can be served: true simultaneously; exactly one is storage: true (the version persisted in etcd).

Version	Stability	Typical use
v1alpha1	Experimental; breaking changes allowed	Early operator development, internal clusters only
v1beta1	Beta; field deprecation with notice	Community operators, pre-GA releases
v1	GA; backward-compatible guarantees	Production CRDs, certified operators
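
The "exactly one storage version" rule is easy to check mechanically. The sketch below is a hypothetical lint helper, not part of any Kubernetes library; CRDVersion and storageVersion are names invented for illustration.

```go
package main

import (
    "errors"
    "fmt"
)

// CRDVersion holds the fields of spec.versions[] relevant to this check.
type CRDVersion struct {
    Name    string
    Served  bool
    Storage bool
}

// storageVersion returns the single version marked storage: true and
// fails when zero or several versions carry the flag.
func storageVersion(versions []CRDVersion) (string, error) {
    found := ""
    for _, v := range versions {
        if !v.Storage {
            continue
        }
        if found != "" {
            return "", errors.New("more than one storage version")
        }
        found = v.Name
    }
    if found == "" {
        return "", errors.New("no storage version")
    }
    return found, nil
}

func main() {
    versions := []CRDVersion{
        {Name: "v1beta1", Served: true},
        {Name: "v1", Served: true, Storage: true},
    }
    fmt.Println(storageVersion(versions))
}
```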

Conversion webhooks

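When multiple versions are served, the API server may need to convert between the storage version and the requested version. A conversion webhook (spec.conversion.strategy: Webhook) calls your service to translate v1beta1 ↔ v1 fields. With the default strategy (None), the API server only rewrites apiVersion and leaves every other field untouched, so multiple versions can still be served but their schemas must be compatible.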
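
The heart of a conversion webhook is a pair of functions that translate between versions without losing data. The sketch below assumes a hypothetical history in which v1beta1 called the field size and v1 renamed it to replicas; neither the rename nor the type names come from a real operator, and the webhook's HTTP handling is omitted.

```go
package main

import "fmt"

// CacheV1beta1Spec is a hypothetical older shape that used "size".
type CacheV1beta1Spec struct {
    Size   int
    Memory string
}

// CacheV1Spec is the current shape, matching the example CRD.
type CacheV1Spec struct {
    Replicas int
    Memory   string
}

// toV1 upgrades an old spec to the current version.
func toV1(in CacheV1beta1Spec) CacheV1Spec {
    return CacheV1Spec{Replicas: in.Size, Memory: in.Memory}
}

// toV1beta1 downgrades a current spec for clients that still request v1beta1.
func toV1beta1(in CacheV1Spec) CacheV1beta1Spec {
    return CacheV1beta1Spec{Size: in.Replicas, Memory: in.Memory}
}

func main() {
    old := CacheV1beta1Spec{Size: 3, Memory: "1Gi"}
    fmt.Printf("%+v\n", toV1(old))
    fmt.Println(toV1beta1(toV1(old)) == old) // conversions should round-trip
}
```

Round-trip tests like the last line are worth keeping in a real project, since a lossy conversion silently corrupts objects when the storage version changes.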

Status subresource

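Splitting spec (user intent) from status (observed state) is a Kubernetes convention. Enabling subresources.status: {} adds a /status endpoint: writes to /status change only status and do not bump metadata.generation, while writes to the main resource ignore status. Controllers can therefore report state without clobbering user spec edits, and users cannot overwrite status by accident.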

Finalizers

metadata.finalizers block CR deletion until the controller removes them—ensuring cleanup (delete cloud volumes, revoke certs, drain Kafka partitions) completes before the object disappears. A stuck finalizer leaves the CR in Terminating state indefinitely.
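
The deletion gate can be modeled in a few lines. This is a simplified sketch with hypothetical types: Deleting stands in for a non-zero metadata.deletionTimestamp, and the finalizer name is illustrative.

```go
package main

import "fmt"

// Object models only the metadata fields relevant to finalizers.
type Object struct {
    Finalizers []string
    Deleting   bool // stand-in for a non-zero metadata.deletionTimestamp
}

// removeFinalizer drops name from the list, leaving other finalizers intact.
func removeFinalizer(o *Object, name string) {
    kept := o.Finalizers[:0]
    for _, f := range o.Finalizers {
        if f != name {
            kept = append(kept, f)
        }
    }
    o.Finalizers = kept
}

// canBeRemoved reports whether the API server may delete the object from
// storage: deletion was requested and no finalizer remains.
func canBeRemoved(o Object) bool {
    return o.Deleting && len(o.Finalizers) == 0
}

func main() {
    obj := Object{Finalizers: []string{"cache.example.com/cleanup"}, Deleting: true}
    fmt.Println(canBeRemoved(obj)) // still blocked while cleanup is pending
    removeFinalizer(&obj, "cache.example.com/cleanup")
    fmt.Println(canBeRemoved(obj)) // cleanup finished, object can go
}
```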

$ kubectl apply -f cache-crd.yaml
$ kubectl get crd caches.cache.example.com
$ kubectl explain cache.spec
$ kubectl get caches -o wide
$ kubectl describe cache my-cache
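→ check Events and status.conditions for reconcile errors (schema validation failures are returned by kubectl apply itself, not recorded as Events)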
$ kubectl patch cache my-cache --type=merge -p '{"metadata":{"finalizers":[]}}'
→ last resort: remove stuck finalizer (data loss risk)

$ oc apply -f cache-crd.yaml
$ oc get crd caches.cache.example.com -o yaml
$ oc explain cache.spec
$ oc get caches -o wide
$ oc adm inspect crd/caches.cache.example.com --dest-dir=/tmp/inspect

⚠️ Pitfall

CRD schemas can be edited in place, but many changes are unsafe once CRs exist. Adding optional fields is usually safe; changing field types, removing fields, or tightening validation can cause existing CRs to fail validation on their next update. Plan version bumps with conversion webhooks instead of in-place schema surgery on live production CRDs.

🔧 Under the Hood

When you kubectl apply a CR, the API server runs mutating admission webhooks, validates the object against the CRD's OpenAPI schema, runs validating admission webhooks, stores it in etcd at /registry/<group>/<resource>/..., and notifies watchers. The operator's informer receives the event and enqueues a reconcile request, using the same machinery as the built-in Deployment controller.

⚙️ Config

Use additionalPrinterColumns so kubectl get shows useful columns without -o yaml. Map status conditions to printer columns for ops dashboards. Keep shortNames short (2–4 chars) for interactive CLI use.

Writing an Operator

You can write controllers from scratch, but Operator SDK and controller-runtime provide informers, work queues, leader election, and client abstractions. Both build on client-go, the same library that powers the built-in controllers.

Operator SDK workflows

Workflow	Language / runtime	Best for
Go (Kubebuilder)	Go + controller-runtime	Performance, complex logic, most production operators (cert-manager, Prometheus Operator, and CloudNativePG are written in Go; Strimzi is a notable Java exception)
Ansible	Ansible playbooks in a container	Teams with Ansible expertise; simpler reconcile via idempotent tasks
Helm	Helm chart rendered per reconcile	Operators that mostly deploy Helm charts; less suited to complex state machines

controller-runtime reconcile loop

The core pattern: watch resources → enqueue reconcile key → reconcile until desired state matches actual state → requeue on error or periodic resync. Idempotency is mandatory—the same reconcile may run many times for one spec change.

flowchart TB
  W["Informer watches\nCR + owned resources"] --> Q["Work queue"]
  Q --> R["Reconcile(req)"]
  R --> G{"Get CR\nfrom API"}
  G -->|NotFound| D["Done — object deleted"]
  G -->|Found| C{"Finalizer\non delete?"}
  C -->|Deleting| CL["Run cleanup\nremove finalizer"]
  C -->|Active| S["Read spec\ncompare to actual"]
  S --> A["Create/Update/Patch\noperands (STS, SVC, CM)"]
  A --> U["Update status\nconditions + generation"]
  U --> OK{"Error?"}
  OK -->|No| Q2["Requeue after interval\nor done"]
  OK -->|Yes| E["Requeue with backoff"]
  CL --> D
  E --> Q
  Q2 --> Q

Owner references

Set ownerReferences on child objects (StatefulSet, Service, ConfigMap) pointing to the parent CR. Benefits: garbage collection deletes children when the CR is deleted; secondary watches re-enqueue the parent when a child changes. Use controllerutil.SetControllerReference() in Go.
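
Both benefits follow from the same lookup: given a child, find its controlling owner. The following sketch uses simplified, hypothetical types in place of the real metav1.OwnerReference to show how a child event maps back to a parent and how garbage collection selects children.

```go
package main

import "fmt"

// OwnerReference is a trimmed-down stand-in for metav1.OwnerReference.
type OwnerReference struct {
    Kind       string
    Name       string
    UID        string
    Controller bool
}

// Object is a child resource carrying owner references.
type Object struct {
    Kind   string
    Name   string
    Owners []OwnerReference
}

// controllerOf returns the owner marked as controller, which is the parent
// a secondary watch would enqueue when this child changes.
func controllerOf(o Object) (OwnerReference, bool) {
    for _, ref := range o.Owners {
        if ref.Controller {
            return ref, true
        }
    }
    return OwnerReference{}, false
}

// ownedBy lists the objects that reference ownerUID, i.e. the children
// garbage collection would remove once that owner is deleted.
func ownedBy(objs []Object, ownerUID string) []Object {
    var out []Object
    for _, o := range objs {
        for _, ref := range o.Owners {
            if ref.UID == ownerUID {
                out = append(out, o)
                break
            }
        }
    }
    return out
}

func main() {
    parent := OwnerReference{Kind: "Cache", Name: "my-cache", UID: "uid-1", Controller: true}
    objs := []Object{
        {Kind: "StatefulSet", Name: "my-cache", Owners: []OwnerReference{parent}},
        {Kind: "Service", Name: "my-cache", Owners: []OwnerReference{parent}},
        {Kind: "ConfigMap", Name: "unrelated"},
    }
    owner, _ := controllerOf(objs[0])
    fmt.Println(owner.Kind, owner.Name, len(ownedBy(objs, "uid-1")))
}
```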

Conditions

Report observable state in status.conditions—mirroring built-in resources like Node and Deployment. Standard pattern: Ready, Progressing, Degraded with type, status (True/False/Unknown), reason, message, and lastTransitionTime.

Generation tracking

metadata.generation increments on every spec change. Store status.observedGeneration in your CR status. When observedGeneration < generation, reconciliation is stale—useful for "RollingUpgradeInProgress" conditions and avoiding false Ready signals during spec updates.
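
A client or dashboard can combine the two counters with the Ready condition to avoid trusting a stale signal. This is a minimal sketch with hypothetical names (CacheView, isStale, effectivelyReady), not an API from any library.

```go
package main

import "fmt"

// CacheView holds the three values a client needs to judge readiness.
type CacheView struct {
    Generation         int64 // metadata.generation
    ObservedGeneration int64 // status.observedGeneration
    Ready              bool  // Ready condition is "True"
}

// isStale reports whether the controller has not yet processed the latest spec.
func isStale(c CacheView) bool {
    return c.ObservedGeneration < c.Generation
}

// effectivelyReady trusts Ready only when it describes the current spec.
func effectivelyReady(c CacheView) bool {
    return c.Ready && !isStale(c)
}

func main() {
    // The user just edited spec: generation moved to 4, but status still
    // reflects generation 3, so the old Ready=True must not be trusted.
    fmt.Println(effectivelyReady(CacheView{Generation: 4, ObservedGeneration: 3, Ready: true}))
    fmt.Println(effectivelyReady(CacheView{Generation: 4, ObservedGeneration: 4, Ready: true}))
}
```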

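package controllers

import (
    "context"
    "time"

    "k8s.io/apimachinery/pkg/api/meta"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

    cachev1 "github.com/you/cache-operator/api/v1"
)

// finalizerName is an illustrative value; any domain-qualified string works.
const finalizerName = "cache.example.com/finalizer"

// Reconcile assumes the scaffolded CacheReconciler (embedding client.Client, with a
// Scheme field) and the helpers cleanup, statefulSetFor, and createOrUpdate defined elsewhere.
func (r *CacheReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var cache cachev1.Cache
    if err := r.Get(ctx, req.NamespacedName, &cache); err != nil {
        // NotFound means the CR is already gone; nothing left to reconcile.
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // Handle deletion + finalizer
    if !cache.DeletionTimestamp.IsZero() {
        if controllerutil.ContainsFinalizer(&cache, finalizerName) {
            if err := r.cleanup(ctx, &cache); err != nil {
                return ctrl.Result{}, err // retried with backoff
            }
            controllerutil.RemoveFinalizer(&cache, finalizerName)
            return ctrl.Result{}, r.Update(ctx, &cache)
        }
        return ctrl.Result{}, nil
    }

    // Ensure finalizer on create; the Update triggers a fresh reconcile.
    if !controllerutil.ContainsFinalizer(&cache, finalizerName) {
        controllerutil.AddFinalizer(&cache, finalizerName)
        return ctrl.Result{}, r.Update(ctx, &cache)
    }

    // Reconcile operands
    sts := r.statefulSetFor(&cache)
    if err := controllerutil.SetControllerReference(&cache, sts, r.Scheme); err != nil {
        return ctrl.Result{}, err
    }
    if err := r.createOrUpdate(ctx, sts); err != nil {
        return ctrl.Result{}, err
    }

    // Update status from observed state, not from the desired replica count.
    // This assumes createOrUpdate leaves sts populated with the server's copy.
    desired := int32(1) // StatefulSet default when spec.replicas is unset
    if sts.Spec.Replicas != nil {
        desired = *sts.Spec.Replicas
    }
    cache.Status.ReadyReplicas = sts.Status.ReadyReplicas
    cache.Status.ObservedGeneration = cache.Generation
    ready := metav1.Condition{
        Type:    "Ready",
        Status:  metav1.ConditionTrue,
        Reason:  "Reconciled",
        Message: "all replicas are ready",
    }
    if sts.Status.ReadyReplicas < desired {
        ready.Status = metav1.ConditionFalse
        ready.Reason = "ReplicasNotReady"
        ready.Message = "waiting for StatefulSet replicas to become ready"
    }
    meta.SetStatusCondition(&cache.Status.Conditions, ready)
    return ctrl.Result{RequeueAfter: 5 * time.Minute}, r.Status().Update(ctx, &cache)
}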

$ operator-sdk init --domain example.com --repo github.com/you/cache-operator
$ operator-sdk create api --group cache --version v1 --kind Cache --resource --controller
$ make install   # apply CRDs to cluster
$ make run       # run controller locally against cluster
$ make docker-build docker-push IMG=quay.io/you/cache-operator:v0.1.0
$ make deploy IMG=quay.io/you/cache-operator:v0.1.0
$ kubectl logs -n cache-operator-system deployment/cache-operator-controller-manager -f

$ operator-sdk init --domain example.com --repo github.com/you/cache-operator
$ make install && make run
$ oc new-project cache-operator
$ make deploy IMG=image-registry.openshift-image-registry.svc:5000/cache-operator/controller:v0.1.0
$ oc logs -n cache-operator deployment/cache-operator-controller-manager -f
$ oc adm policy add-scc-to-user anyuid -z cache-operator-controller-manager -n cache-operator
→ only if operator image requires non-default UID

💡 Pro Tip

Run controllers locally with make run during development: it uses your kubeconfig and speeds iteration. Use envtest to test reconcilers against a local kube-apiserver and etcd without a full cluster. Enable leader election (LeaderElection: true) before deploying multiple replicas.

🔒 Security

Operators run with cluster-wide or namespace-scoped RBAC, and those permissions are often powerful. Apply least privilege: grant only the verbs the operator needs on the resources it owns. Protect metrics/debug endpoints: older Operator SDK scaffolds add a kube-rbac-proxy sidecar, while newer scaffolds rely on controller-runtime's built-in metrics authentication and authorization. On OpenShift, verify SCC compatibility before production deploy.

🔧 Under the Hood

controller-runtime's Manager coordinates shared informer caches, metrics, health probes (/healthz, /readyz), and graceful shutdown. The work queue deduplicates bursts of events—ten Pod updates enqueue one reconcile for the parent CR.
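
The deduplication property can be illustrated with a toy queue. This sketch is not the client-go workqueue (which also tracks in-flight items and rate limits); Queue and its methods are hypothetical names that show only the "one pending entry per key" behavior.

```go
package main

import "fmt"

// Queue is a FIFO of reconcile keys that holds each key at most once.
type Queue struct {
    order   []string
    pending map[string]bool
}

// NewQueue returns an empty queue.
func NewQueue() *Queue {
    return &Queue{pending: map[string]bool{}}
}

// Add enqueues key unless it is already waiting.
func (q *Queue) Add(key string) {
    if q.pending[key] {
        return
    }
    q.pending[key] = true
    q.order = append(q.order, key)
}

// Get pops the oldest key; ok is false when the queue is empty.
func (q *Queue) Get() (key string, ok bool) {
    if len(q.order) == 0 {
        return "", false
    }
    key = q.order[0]
    q.order = q.order[1:]
    delete(q.pending, key)
    return key, true
}

// Len returns the number of waiting keys.
func (q *Queue) Len() int { return len(q.order) }

func main() {
    q := NewQueue()
    for i := 0; i < 10; i++ {
        q.Add("default/my-cache") // ten Pod events for the same parent CR
    }
    fmt.Println(q.Len()) // a single reconcile is pending
}
```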

Operator Maturity Model

The Operator Capability Levels framework (popularized by Operator Framework) grades operators on a 1–5 scale—from "installs the app" to "fully autonomous day-2 operations." Use it to set expectations with vendors and internal teams.

Level	Name	Capabilities	Example
1	Basic Install	Deploy application; minimal config via CR spec	Helm-based operator that creates Deployment + Service
2	Seamless Upgrades	Level 1 + patch/minor version upgrades, rolling updates	App operator with version field; triggers rolling image bump
3	Full Lifecycle	Level 2 + backup/restore, scaling, credential rotation	Postgres operator with Backup CR, failover, restore jobs
4	Deep Insights	Level 3 + metrics, alerts, dashboards, status conditions	Prometheus Operator exposing ServiceMonitors; Kafka exporter metrics
5	Auto Pilot	Level 4 + horizontal/vertical tuning, anomaly response, auto-remediation	Experimental: auto-scale based on query latency; self-heal corruption

Evaluating operators for procurement

Ask the vendor which level they claim—and which CRs prove it (backup CR? upgrade CR?)
Level 3+ is typical minimum for production stateful services (databases, messaging)
Level 5 is rare; treat marketing claims skeptically until demonstrated in your failure scenarios
Red Hat Certified Operators must document supported upgrade paths—often Level 2–3 minimum

⚖️ Trade-off

Higher maturity levels mean more CRD surface area and more controller complexity—harder to debug when things go wrong. A Level 1 operator you understand may beat a Level 4 black box for simple stateless apps. Match maturity to operational requirements, not ambition.

🎯 Interview Tip

"How do you evaluate a third-party operator?" — Check maturity level, CRD stability (v1 vs v1alpha1), OLM upgrade history, backup/restore story, multi-AZ failover, resource footprint, and whether status conditions expose actionable errors. Run a game-day: kill the leader pod, fill the disk, revoke a cert—does the operator recover without manual steps?

📦 Real World

Strimzi (Kafka) and CloudNativePG are widely considered Level 3–4: upgrades, backup, monitoring integration, and rich status. cert-manager reaches Level 3+ with automated renewal. Many internal "operators" are Level 1 Helm wrappers—fine for dev, insufficient for production databases.

OLM (Operator Lifecycle Manager)

OLM installs, upgrades, and manages operators—and their CRDs, RBAC, and webhooks—through a pipeline of Custom Resources. It ships built-in on OpenShift; installable on vanilla K8s via Operator Framework. OperatorHub is the discovery UI and catalog index.

ClusterServiceVersion (CSV)

A ClusterServiceVersion is the operator's install manifest bundle: Deployment spec, owned CRD definitions, required RBAC, webhook configurations, dependency constraints, and maturity metadata. OLM transitions CSV phases: Pending → InstallReady → Installing → Succeeded (or Failed).

OperatorHub

OperatorHub aggregates PackageManifest entries from CatalogSource objects. On OpenShift, default catalogs include redhat-operators, certified-operators, redhat-marketplace, and community-operators. Platform teams can publish private catalogs via index images.

Subscription channels

Each operator package exposes channels (e.g. stable, fast, candidate) mapping to CSV version lines. A Subscription pins the desired channel; OLM upgrades the installed CSV when a newer version appears in that channel. startingCSV selects the first version to install rather than capping later upgrades, so only installPlanApproval: Manual holds an upgrade until someone approves it.

flowchart LR
  OH["OperatorHub\n(console browse)"] --> CS["CatalogSource\n(gRPC index)"]
  CS --> PM["PackageManifest"]
  SUB["Subscription\nchannel + source"] --> IP["InstallPlan"]
  IP -->|Manual approval| AP["Admin approves"]
  AP --> CSV["CSV applied\nCRDs + RBAC + Deployment"]
  IP -->|Automatic| CSV
  CSV --> OP["Operator pod running"]
  OP --> CR["User creates CRs"]

InstallPlan manual approval

Set installPlanApproval: Manual on the Subscription for change-controlled environments. OLM creates an InstallPlan in the RequiresApproval phase with spec.approved: false; a cluster admin reviews the CSV and dependency list, then sets spec.approved: true. Automatic approval applies immediately, which is faster but riskier during platform upgrades.
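
For pipelines that approve InstallPlans programmatically, the approval is a small JSON merge patch. The sketch below builds that patch and models the "waiting for a human" check; the InstallPlan struct is a hypothetical two-field stand-in for the real OLM type, and sending the patch to the API server is left out.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// InstallPlan models the two spec fields that gate an install.
type InstallPlan struct {
    Approval string // "Manual" or "Automatic"
    Approved bool
}

// awaitingApproval reports whether the plan is blocked until someone approves it.
func awaitingApproval(ip InstallPlan) bool {
    return ip.Approval == "Manual" && !ip.Approved
}

// approvalPatch returns the merge patch body that sets spec.approved to true.
func approvalPatch() ([]byte, error) {
    return json.Marshal(map[string]any{
        "spec": map[string]any{"approved": true},
    })
}

func main() {
    ip := InstallPlan{Approval: "Manual"}
    if awaitingApproval(ip) {
        patch, err := approvalPatch()
        if err != nil {
            panic(err)
        }
        fmt.Println(string(patch))
    }
}
```

The printed body is what you would pass to a merge patch against the InstallPlan, for example with kubectl patch --type merge -p.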

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: postgres-operators
  namespace: operators
spec:
  targetNamespaces:
    - databases
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: crunchy-postgres-operator
  namespace: operators
spec:
  channel: v5
  name: crunchy-postgres-operator
  source: certified-operators   # Crunchy ships in the certified catalog; confirm with oc get packagemanifest
  sourceNamespace: openshift-marketplace
  installPlanApproval: Manual
  startingCSV: postgresoperator.v5.5.0

OLM CR	Purpose
CatalogSource	Points to operator bundle index image; serves PackageManifest over gRPC
OperatorGroup	Defines namespace scope for operator watch; creates required RBAC
Subscription	Declares desired operator, channel, catalog; triggers InstallPlan
InstallPlan	Lists CSVs to install/upgrade; gated by manual or automatic approval
ClusterServiceVersion	Operator deployment + owned CRDs + permissions + maturity metadata

$ # OLM on vanilla K8s — install operator-framework first
$ kubectl get csv -A
$ kubectl get subscription,installplan -A
$ kubectl describe installplan <name> -n operators
→ approve: kubectl patch installplan <name> -n operators --type merge -p '{"spec":{"approved":true}}'
$ operator-sdk run bundle quay.io/operator/cache-operator-bundle:v0.1.0

$ oc get packagemanifest -n openshift-marketplace | grep postgres
$ oc apply -f operatorgroup.yaml -f subscription.yaml
$ oc get installplan -n operators
$ oc patch installplan <name> -n operators --type merge -p '{"spec":{"approved":true}}'
$ oc get csv -n operators
→ PHASE: Succeeded required before creating application CRs
$ oc describe csv postgresoperator.v5.5.0 -n operators
$ oc get subscription crunchy-postgres-operator -n operators -o yaml
$ oc get catalogsource -n openshift-marketplace

🔴 OpenShift

OpenShift 4 installs OLM by default in openshift-operator-lifecycle-manager. The console OperatorHub UI creates Subscriptions with one click, which is equivalent to applying YAML. Platform upgrades may bump certified operator channels; review oc get subscription -A before approving cluster upgrades. Core cluster operators are managed by the Cluster Version Operator (CVO) rather than OLM, although they follow the same controller pattern.

⚠️ Pitfall

InstallPlan stuck in RequiresApproval: check installPlanApproval: Manual and approve explicitly. CSV Failed: inspect status.message for missing CRDs, RBAC conflicts, or webhook cert issues. Subscription not upgrading: verify the channel name matches the catalog and look for an unapproved InstallPlan; startingCSV only sets the initial version and does not block upgrades.

⚙️ Config

Production pattern: installPlanApproval: Manual + pin startingCSV in Git; ArgoCD or a pipeline approves InstallPlans after review. Document the approved CSV version per OCP/K8s minor version in your platform runbook.

💡 Pro Tip

Before uninstalling an operator, delete all instance CRs first; otherwise no controller is left to process their finalizers, and CRs and operands can be stranded. Deleting the CSV removes the operator Deployment but leaves CRDs and CRs in place, and application-level finalizers run in the operator controller, not in OLM itself.