K8s Core

Learn Kubernetes & OpenShift from control plane to production ops

Internals-first guides for developers, platform engineers, and architects—what the API server does with every kubectl apply, what etcd stores, what the kubelet executes on each node, and how OpenShift extends and hardens Kubernetes for enterprise environments.

What problem does Kubernetes solve?

Containers solved packaging. They did not solve running hundreds of them across dozens of servers: scheduling, self-healing when a node dies, rolling updates without downtime, service discovery without hardcoded IPs, and autoscaling when traffic spikes.

The pre-K8s world: SSH into servers, start containers manually, write brittle shell scripts for restarts, load-balance with nginx configs full of IP addresses, and pray nothing fails during a deploy. One server crash meant manual intervention at 3 AM.

Kubernetes is a control plane that continuously reconciles desired state (YAML in git) with actual state (running pods on nodes). Think of it as an airport control tower: you declare where planes should go; the tower schedules, routes, and reroutes when runways close—without you micromanaging each aircraft.
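
The reconciliation idea above can be sketched in a few lines. This is a toy model, not Kubernetes controller code: the names `Action`, `reconcile`, and `apply_actions` and the `web-N` pod naming are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    verb: str  # "create" or "delete"
    pod: str


def reconcile(desired_replicas: int, running: list, prefix: str = "web") -> list:
    """Return the actions that move actual state toward desired state."""
    existing = set(running)
    actions = []
    if len(existing) < desired_replicas:
        # Too few pods: create missing ones, reusing the lowest free index.
        i = 0
        while len(existing) < desired_replicas:
            name = f"{prefix}-{i}"
            if name not in existing:
                actions.append(Action("create", name))
                existing.add(name)
            i += 1
    elif len(existing) > desired_replicas:
        # Too many pods: delete the surplus (here, the last in sorted order).
        for name in sorted(existing)[desired_replicas:]:
            actions.append(Action("delete", name))
    return actions


def apply_actions(running: list, actions: list) -> list:
    """Pretend to execute the actions and return the new actual state."""
    state = set(running)
    for action in actions:
        if action.verb == "create":
            state.add(action.pod)
        else:
            state.discard(action.pod)
    return sorted(state)


if __name__ == "__main__":
    running = ["web-0", "web-2"]           # web-1 was lost with its node
    actions = reconcile(3, running)        # desired state: 3 replicas
    running = apply_actions(running, actions)
    print(actions, running)
    print(reconcile(3, running))           # converged: nothing left to do
```

The point of the sketch is that the loop never stores "what to do next"; it recomputes the difference between desired and actual state on every pass, so it recovers from any drift the same way.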

The pre-Kubernetes world

Before orchestration, every team reinvented the same fragile patterns. Kubernetes replaced ad-hoc scripts with declarative APIs and reconciliation loops.

📦 Real World

Google ran Borg internally for a decade before open-sourcing Kubernetes in 2014. Spotify, Airbnb, and major banks migrated from bespoke schedulers and Mesos to K8s because the ecosystem, hiring pool, and CNCF governance made it the default platform layer.

Kubernetes vs Docker Swarm vs Nomad

All three schedule containers. Kubernetes won the market through extensibility (CRDs, operators), ecosystem depth, and cloud provider investment—not because it was the simplest option.

| Dimension | Kubernetes | Docker Swarm | HashiCorp Nomad |
| --- | --- | --- | --- |
| Complexity | High: control plane, many moving parts | Low: built into Docker Engine | Medium: single binary scheduler |
| Extensibility | CRDs, operators, admission webhooks, 1000+ integrations | Limited: no CRD equivalent | Task drivers, limited K8s overlap |
| Ecosystem | Helm, ArgoCD, Istio, Prometheus, cert-manager, Cluster API | Declining: Docker Inc. focus shifted | Strong in HashiCorp stacks (Consul, Vault) |
| Enterprise adoption | Default on every cloud; OpenShift, EKS, GKE, AKS | Legacy deployments only | Niche: batch + mixed workloads |
| Best fit | Platform teams, microservices, GitOps, multi-tenant clusters | Simple multi-node Docker Compose upgrade path | VM + container + batch unified scheduler |

⚖️ Trade-off

Swarm and Nomad are valid for small, simple deployments. Kubernetes complexity pays off when you need RBAC, network policies, operators, multi-cluster GitOps, or a hiring pool of engineers who already know K8s.

🎯 Interview Tip

If asked "why Kubernetes over alternatives?", cite declarative reconciliation, extensible API (CRDs), portable workload definitions, and CNCF ecosystem—not "Google uses it."

OpenShift vs vanilla Kubernetes

OpenShift Container Platform (OCP) is Red Hat's enterprise Kubernetes distribution. Same core API—different defaults, added operators, stricter security, and integrated developer tooling.

Vanilla Kubernetes

  • Minimal core — API server, etcd, scheduler, controller-manager, plus kubelet and kube-proxy on each node
  • Bring your own — ingress controller, registry, monitoring, CI/CD, identity provider
  • Flexible defaults — Pod Security Admission optional; root containers allowed unless restricted
  • Ingress only — no built-in Route resource; choose nginx, Traefik, ALB, etc.
  • Community support — CNCF, vendor support via cloud managed services (EKS, GKE, AKS)

OpenShift Container Platform

  • Web console — developer and admin UIs for deploy, logs, metrics, operators
  • Built-in CI/CD — OpenShift Pipelines (Tekton), OpenShift GitOps (ArgoCD operator)
  • ImageStreams — track image tags, trigger deploys on image changes
  • Routes — HAProxy-based ingress native to OCP; wildcard *.apps.<cluster>
  • SCCs — Security Context Constraints (stricter than PSA by default)
  • OperatorHub + OLM — curated operator catalog with lifecycle management
  • RHCOS — immutable CoreOS nodes updated by Machine Config Operator, not SSH + yum
  • Enterprise support — Red Hat SLA, FIPS, compliance certifications

🔴 OpenShift

What OCP restricts: no root containers by default (restricted-v2 SCC), stricter network policy defaults, no arbitrary hostPath mounts, and SCC admission that rejects pods that don't match a granted constraint. Editions: OCP (self-managed), ROSA (AWS), ARO (Azure), OpenShift Dedicated (AWS/GCP), MicroShift (edge/IoT).

⚠️ Pitfall

Manifests that work on vanilla K8s often fail on OpenShift with the error "unable to validate against any security context constraint". Fix: run as a non-root, arbitrary UID, drop capabilities, or grant a specific SCC to the ServiceAccount. Do not grant the privileged SCC in production.
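
A pre-flight lint can catch the most common causes of that rejection before a manifest reaches the cluster. The sketch below is a hypothetical helper, loosely modelled on what the restricted-v2 SCC and the PSA restricted profile expect. It is not how SCC admission is implemented, it checks only four fields, and real admission may default some of them for you.

```python
def restricted_violations(container: dict) -> list:
    """List settings in a container spec that a restricted profile would likely reject.

    Simplified assumption: only securityContext fields on the container are checked.
    """
    sc = container.get("securityContext") or {}
    problems = []
    if sc.get("privileged") is True:
        problems.append("privileged must not be true")
    if sc.get("runAsUser") == 0:
        problems.append("runAsUser must not be 0 (root)")
    if sc.get("allowPrivilegeEscalation") is not False:
        problems.append("allowPrivilegeEscalation must be false")
    drops = (sc.get("capabilities") or {}).get("drop") or []
    if "ALL" not in drops:
        problems.append("capabilities.drop must include ALL")
    return problems


if __name__ == "__main__":
    container = {
        "name": "web",
        "image": "nginx:1.25",
        "securityContext": {"runAsUser": 0},
    }
    for problem in restricted_violations(container):
        print(problem)
```

Note that the check deliberately does not require a fixed runAsUser: on OpenShift the UID is normally assigned from the project's range, so images should tolerate an arbitrary non-root UID rather than pin one.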

Kubernetes ecosystem map

Core K8s is the foundation. Production clusters layer packaging, GitOps, service mesh, observability, certificates, DNS, and cluster lifecycle tools on top.

Core platform

  • kube-apiserver + etcd — state store and API gateway
  • scheduler + controllers — placement and reconciliation loops
  • CNI + CSI — pod networking and persistent storage

Packaging

  • Helm — templated charts, release management
  • Kustomize — overlay-based config, native kubectl apply -k

GitOps

  • ArgoCD — declarative sync, UI, ApplicationSets
  • Flux — modular GitOps toolkit controllers

Service mesh

  • Istio / Linkerd — mTLS, traffic management, observability
  • OpenShift Service Mesh — supported Istio distribution on OCP

Observability

  • Prometheus + Grafana — metrics and dashboards
  • Fluent Bit / Vector → Loki — log aggregation
  • OpenTelemetry + Tempo/Jaeger — distributed tracing

Platform glue

  • cert-manager — TLS certificate automation
  • External-DNS — sync Ingress/Route to DNS
  • Cluster API — declarative cluster lifecycle
💡 Pro Tip

Don't install everything day one. Start with metrics-server, an ingress controller, and cert-manager. Add GitOps when manual kubectl apply becomes a bottleneck.

Kubernetes timeline (K8s 1.28+ · CKA/CKAD/CKS)

From Google's internal Borg lessons to the CNCF-graduated standard that every cloud runs.

  1. 2014

    Google open-sources Kubernetes

    Announced at DockerCon. Built on Borg/Omega experience—declarative APIs, reconciliation loops, label selectors.

  2. 2015

    K8s v1.0 · CNCF founding

    Production-ready API. Kubernetes donated to newly formed Cloud Native Computing Foundation.

  3. 2016

    DaemonSets · StatefulSets

    Per-node workloads and stable network identity for stateful apps—foundation for operators.

  4. 2017

    RBAC GA · CRDs introduced

    RBAC reaches GA in 1.8 and supersedes ABAC as the recommended authorizer. CustomResourceDefinitions arrive in 1.7 as beta (GA in 1.16, 2019) and enable the operator ecosystem.

  5. 2019

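    Ingress moves to networking.k8s.io

    Ingress, in beta since 1.1, moves from extensions/v1beta1 to networking.k8s.io/v1beta1 in 1.14 (GA as v1 in 1.19, 2020). It is a standard HTTP routing resource that still requires an external ingress controller.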

  6. 2020

    Server-side apply (beta)

    kubectl apply --server-side: field ownership tracking and safer multi-controller merges. Beta since 1.16; GA in 1.22 (2021).

  7. 2021

    PodSecurityPolicy deprecated

    PSP is deprecated in 1.21 and removed in 1.25. Pod Security Admission (PSA) replaces it as the built-in admission controller.

  8. 2022

    PSA · ephemeral containers GA

    Namespace-level security profiles (privileged/baseline/restricted). Debug running pods without restart.

  9. 2023

    Native sidecar containers

    Init containers with restartPolicy: Always give sidecars a proper lifecycle. Alpha in 1.28; enabled by default from 1.29.

  10. 2024

    In-place pod resize (alpha)

    Adjust CPU/memory requests without pod restart—reduces churn for VPA workflows.

OpenShift timeline (OCP 4.16+)

From PaaS platform to enterprise Kubernetes distribution with operators and immutable infrastructure.

  1. 2011

    OpenShift v1 (Origin)

    Red Hat PaaS with custom cartridge model—predates wide Kubernetes adoption.

  2. 2015

    OCP 3 — Kubernetes-based

    Rebuilt on upstream K8s. DeploymentConfig, Routes, S2I builds become OCP differentiators.

  3. 2019

    OCP 4 — CoreOS · Operators

    Follows Red Hat's 2018 acquisition of CoreOS. RHCOS immutable nodes. Cluster Version Operator manages upgrades. OperatorHub launches.

  4. 2020

    ROSA on AWS

    Red Hat OpenShift Service on AWS—managed OCP with Red Hat SRE operations.

  5. 2022

    OCP 4.10 · ACM integration

    Advanced Cluster Management for multi-cluster policy and application delivery at scale.

  6. 2024

    OCP 4.16 · Hosted control planes

    HyperShift-based hosted control planes—faster, cheaper multi-cluster provisioning on ROSA/AWS.

kubectl vs oc

oc is a superset of kubectl—every kubectl command works in oc, plus OpenShift-specific commands for Routes, Builds, ImageStreams, and projects.

CLI
terminal — deploy & expose

kubectl
$ kubectl create deployment web --image=nginx:1.25 --replicas=3
$ kubectl expose deployment web --port=80 --type=ClusterIP
$ kubectl get pods -l app=web -w

oc
$ oc new-app nginx:1.25 --name=web
→ Deployment and Service created automatically
$ oc expose svc/web
→ Route created
$ oc get route web
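
The -l app=web flag in the kubectl commands above is an equality-based label selector. A minimal sketch of how such a selector picks pods follows. It assumes only key=value and key==value terms; the `!=` and set-based forms (`in`, `notin`) are out of scope, and the function names are hypothetical.

```python
def parse_selector(expr: str) -> dict:
    """Parse 'app=web,tier=frontend' into {'app': 'web', 'tier': 'frontend'}."""
    selector = {}
    for term in filter(None, (t.strip() for t in expr.split(","))):
        key, sep, value = term.partition("=")
        if not sep or key.endswith("!"):
            raise ValueError(f"unsupported selector term: {term!r}")
        # Accept both 'key=value' and 'key==value'.
        selector[key.strip()] = value.lstrip("=").strip()
    return selector


def matches(selector: dict, labels: dict) -> bool:
    """A pod matches when every selector term equals the pod's label value."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(pods: list, expr: str) -> list:
    """Return the names of the pods whose labels satisfy the selector."""
    selector = parse_selector(expr)
    return [pod["name"] for pod in pods if matches(selector, pod.get("labels", {}))]


if __name__ == "__main__":
    pods = [
        {"name": "web-1", "labels": {"app": "web"}},
        {"name": "db-1", "labels": {"app": "db"}},
    ]
    print(select(pods, "app=web"))
```

Terms are ANDed together, and an empty selector matches every pod, which is why a Service or Deployment with a too-broad selector can adopt pods it was never meant to own.
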
🔬 Under the Hood

Both CLI tools are thin clients. Every command becomes an HTTPS request to the API server, for example GET /api/v1/namespaces/default/pods for kubectl get pods. Authentication uses kubeconfig client certificates or a token; authorization via RBAC happens before any change persists to etcd.
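
To make the request mapping concrete, here is a small sketch of how a resource reference turns into a REST path. The helper `resource_path` is hypothetical. The path layout it produces follows the Kubernetes API convention: the core group lives under /api/v1 and named groups under /apis/<group>/<version>.

```python
from typing import Optional


def resource_path(
    resource: str,
    namespace: Optional[str] = None,
    group: str = "",
    version: str = "v1",
    name: Optional[str] = None,
) -> str:
    """Build the API server path for a resource collection or a single object."""
    # Core ("legacy") group uses /api; every named group uses /apis/<group>.
    root = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    parts = [root]
    if namespace is not None:
        parts += ["namespaces", namespace]
    parts.append(resource)
    if name is not None:
        parts.append(name)
    return "/".join(parts)


if __name__ == "__main__":
    print(resource_path("pods", namespace="default"))
    print(resource_path("deployments", namespace="default", group="apps", name="web"))
    print(resource_path("nodes"))  # cluster-scoped: no namespace segment
```

Running kubectl with -v=8 prints the actual request URLs, which is a quick way to compare them with what this sketch produces.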

💡 Pro Tip

Use kubectl explain pod.spec.containers.resources (or oc explain) to explore API fields without leaving the terminal—essential for CKA/CKAD prep.

Control plane at a glance

Every cluster change flows through the API server to etcd. The scheduler assigns pods; controllers reconcile; kubelets on worker nodes make containers actually run.

kubectl / oc

Declarative YAML → HTTPS API request

API Server

Auth → RBAC → admission → validate → etcd

etcd

Single source of truth — Raft consensus

Scheduler + Controllers

Assign nodes · reconcile desired state

kubelet → CRI

Pull image · start container · report status
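
The API server stage above (auth, RBAC, admission, validation, etcd) can be modelled as a chain in which any step may short-circuit the request. This is an illustrative toy, not the real handler chain: the token table, the grant set, the defaulted field, and the dict standing in for etcd are all assumptions.

```python
TOKENS = {"secret-token": "alice"}        # hypothetical bearer token -> user
GRANTS = {("alice", "create", "pods")}    # hypothetical RBAC-like (user, verb, resource)


def handle(request: dict, store: dict) -> int:
    """Process a create request and return an HTTP-style status code."""
    # 1. Authentication: who is calling?
    user = TOKENS.get(request.get("token"))
    if user is None:
        return 401
    # 2. Authorization: may this user perform this verb on this resource?
    if (user, request["verb"], request["resource"]) not in GRANTS:
        return 403
    # 3. Mutating admission: fill in defaults before validation.
    obj = dict(request.get("object") or {})
    obj.setdefault("restartPolicy", "Always")
    # 4. Validation: reject objects that are structurally invalid.
    if not obj.get("name"):
        return 422
    # 5. Persist: only now does the object reach the store (etcd in a real cluster).
    store[f"/{request['resource']}/{obj['name']}"] = obj
    return 201


if __name__ == "__main__":
    store = {}
    request = {
        "token": "secret-token",
        "verb": "create",
        "resource": "pods",
        "object": {"name": "web"},
    }
    print(handle(request, store), store)
```

The ordering is the useful part: nothing is written until every earlier stage has passed, which is why a rejected request leaves no partial state behind.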

flowchart TB
  subgraph cp["Control plane"]
    API["kube-apiserver"]
    ETCD["etcd\nRaft quorum"]
    SCH["kube-scheduler"]
    CM["kube-controller-manager"]
  end
  subgraph worker["Worker node"]
    KL["kubelet"]
    KP["kube-proxy"]
    CRI["containerd / CRI-O"]
    P["Pod containers"]
  end
  CLI["kubectl / oc"] --> API
  API --> ETCD
  API --> SCH
  API --> CM
  SCH --> API
  CM --> API
  KL --> API
  KL --> CRI
  CRI --> P
  KP --> P
⚠️ Pitfall

Exceeding the etcd space quota (default 2 GB) raises a NOSPACE alarm and causes API write failures cluster-wide. Monitor etcd_mvcc_db_total_size_in_bytes against the quota, along with etcd_server_has_leader and disk usage. Regular etcdctl snapshot save backups are non-negotiable for production.

⚙️ Config

Production etcd: 3 or 5 members (odd for quorum), dedicated SSD, <10 ms disk write latency. Never run workloads on control plane nodes in production clusters.
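
The "odd for quorum" advice follows from simple majority arithmetic, sketched below. The function names are illustrative; the formula is the standard Raft majority rule.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"members={n} quorum={quorum(n)} tolerates={fault_tolerance(n)}")
```

A 4-member cluster tolerates one failure, the same as a 3-member cluster, while adding one more machine that must acknowledge writes. Growing from 3 to 5 is the first step that actually raises the tolerated failures from one to two.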

Explore the guide — all sections

Fifteen deep-dive chapters plus cheat sheets. Recommended path: Architecture → Workloads → Networking → RBAC, then GitOps and production ops as your role requires.

Learning path: Architecture · Workloads · Networking · Storage · RBAC

Developer

Deployments, Services, ConfigMaps, debugging pods, resource requests/limits, logs, events, kubectl/oc daily workflow.

DevOps / Platform Engineer

Cluster admin, RBAC, networking policies, storage classes, ingress, Helm, operators, upgrades, observability stack.

Architect / Tech Lead

Multi-cluster strategy, PSA/SCC posture, service mesh, GitOps architecture, OpenShift vs K8s trade-offs, DR and capacity planning.