RBAC, Security & Identity

Kubernetes has no built-in user database—identity comes from certificates, bearer tokens, or external IdPs. RBAC decides who can do what; admission (PSA, SCC, OPA, Kyverno) decides what workloads may run. On OpenShift, OAuth, LDAP group sync, and Security Context Constraints layer on top of vanilla K8s. This chapter covers authentication through audit logging—the full security perimeter from API request to pod admission.

devops architect K8s 1.29+ OCP 4.16+

Authentication

The API server must answer "who is this caller?" before RBAC answers "may they?" Kubernetes does not ship a user registry; admins integrate external identity or issue certificates and tokens.

No built-in users

User objects do not exist in the Kubernetes API. Human identities are strings (e.g. jane@corp.com) produced by an authenticator and referenced in RBAC bindings. Group is likewise a string list—system:authenticated, system:masters, or LDAP/OIDC-mapped groups.

Mechanism | Typical use | Notes
Client certificates | Admin kubeconfig, component mTLS | CN becomes username; O= becomes groups. Rotated via CSR API (certificates.k8s.io)
Bearer tokens | ServiceAccount tokens, static secrets | Presented in Authorization: Bearer header
OIDC | Enterprise SSO (Azure AD, Okta, Keycloak) | Configured on API server: --oidc-issuer-url, --oidc-client-id, claim mappings
Webhook token auth | Custom token validation | API server POSTs TokenReview to external service; returns user + groups
ServiceAccount | In-cluster workload identity | Namespace-scoped; bound to Roles via RoleBinding
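
As a rough illustration of the client-certificate row, the sketch below maps a certificate subject to the username and groups derived from it. The function name identity_from_subject and the sample subject are hypothetical, and the parser assumes a simple comma-separated subject with no escaped characters; real tooling would read the subject with an X.509 library.

```python
def identity_from_subject(subject: str) -> tuple[str, list[str]]:
    """Map an X.509 subject string to (username, groups).

    Assumes a plain form such as "CN=jane,O=payments-devs,O=platform"
    with no escaped commas (an assumption made for this sketch).
    """
    username = ""
    groups: list[str] = []
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        if key.upper() == "CN":
            username = value          # CN becomes the username
        elif key.upper() == "O":
            groups.append(value)      # each O becomes one group
    return username, groups


user, groups = identity_from_subject("CN=jane,O=payments-devs,O=platform")
print(user, groups)
```

The returned strings are what RBAC bindings would reference as User and Group subjects.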

ServiceAccount token auto-mount

By default, every pod gets a projected or legacy Secret volume mounting the pod's ServiceAccount token at /var/run/secrets/kubernetes.io/serviceaccount/token. The token authenticates as that ServiceAccount when calling the API—powerful if the SA has broad RBAC.

TokenRequest API

Prefer bound tokens via TokenRequest (subresource of ServiceAccount): audience-scoped, time-limited, and revocable when the pod is deleted. Projected volumes with expirationSeconds rotate automatically—no long-lived Secret objects.
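
To see what "audience-scoped" and "time-limited" mean in practice, the sketch below decodes the payload of a JWT and prints its sub, aud, and exp claims. It does not verify the signature and is for inspection only. The sample token is built locally with hypothetical claim values; a real one would come from kubectl create token or the projected volume.

```python
import base64
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def decode_claims(token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))


# Hypothetical claims, for illustration only.
sample_claims = {
    "iss": "https://kubernetes.default.svc",
    "sub": "system:serviceaccount:app:default",
    "aud": ["api"],
    "exp": int(time.time()) + 3600,
}
sample_token = ".".join([
    b64url(b'{"alg":"none"}'),
    b64url(json.dumps(sample_claims).encode()),
    "signature-placeholder",
])

claims = decode_claims(sample_token)
remaining = claims["exp"] - int(time.time())
print(claims["sub"], claims["aud"], f"expires in about {remaining}s")
```

A token whose aud does not include the audience the receiving service expects should be rejected by that service, which is the point of requesting audience-scoped tokens.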

Disable automount

For workloads that never call the API, disable token injection at two levels:

  • automountServiceAccountToken: false on the ServiceAccount
  • automountServiceAccountToken: false on the Pod spec (overrides SA default)
yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: batch-worker
  namespace: etl
automountServiceAccountToken: false
---
apiVersion: v1
kind: Pod
metadata:
  name: worker
  namespace: etl
spec:
  serviceAccountName: batch-worker
  automountServiceAccountToken: false
  containers:
    - name: worker
      image: my-registry/etl:2.1
terminal — who am I?
$ kubectl auth whoami
$ kubectl get serviceaccount default -o yaml
$ kubectl create token default -n app --duration=1h --audience=api
$ echo '{"apiVersion":"authentication.k8s.io/v1","kind":"TokenReview","spec":{"token":"TOKEN"}}' \
  | kubectl create --raw /apis/authentication.k8s.io/v1/tokenreviews -f -
$ oc whoami
$ oc whoami --show-token=false
$ oc create token default -n app --duration=1h
$ oc get oauth cluster -o yaml
🔒 Security

Legacy ServiceAccount token Secrets (type kubernetes.io/service-account-token) are long-lived and not audience-bound. Audit clusters for these Secrets, migrate workloads to projected tokens, and check how the LegacyServiceAccountTokenTracking and LegacyServiceAccountTokenCleanUp features behave in your target version before upgrading.

🔬 Under the Hood

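Several authenticators can be enabled at once; the first one to authenticate a request successfully short-circuits the rest, and the API server does not guarantee the order in which they run. On OpenShift, oc login obtains an access token from the OAuth server, and oc then presents that token to the kube API as a bearer token.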

⚠️ Pitfall

Granting cluster-admin to a default ServiceAccount in a namespace effectively gives every pod in that namespace full cluster control. Use dedicated SAs per deployment with minimal RoleBindings.

RBAC

Role-Based Access Control maps subjects (users, groups, ServiceAccounts) to roles via bindings. Rules are tuples of apiGroups, resources, verbs, and optional resourceNames.
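
The rule tuple can be modelled in a few lines. The sketch below is a simplified illustration, not the API server's authorizer: it assumes the rules bound to the subject have already been collected, and it ignores subresource wildcards and nonResourceURLs. The names Rule, rule_allows, and allowed are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    api_groups: list[str]
    resources: list[str]
    verbs: list[str]
    resource_names: list[str] = field(default_factory=list)


def _has(values: list[str], wanted: str) -> bool:
    """True if the list contains the wanted value or the "*" wildcard."""
    return "*" in values or wanted in values


def rule_allows(rule: Rule, group: str, resource: str, verb: str, name: str = "") -> bool:
    if not (_has(rule.api_groups, group)
            and _has(rule.resources, resource)
            and _has(rule.verbs, verb)):
        return False
    # An empty resourceNames list means "any object name".
    return not rule.resource_names or name in rule.resource_names


def allowed(rules: list[Rule], group: str, resource: str, verb: str, name: str = "") -> bool:
    """RBAC is additive: allow if any bound rule matches; there are no deny rules."""
    return any(rule_allows(r, group, resource, verb, name) for r in rules)


rules = [
    Rule(["apps"], ["deployments"], ["get", "list", "watch"]),
    Rule([""], ["configmaps"], ["get"], resource_names=["app-config"]),
]
print(allowed(rules, "apps", "deployments", "list"))
print(allowed(rules, "apps", "deployments", "delete"))
print(allowed(rules, "", "configmaps", "get", "app-config"))
print(allowed(rules, "", "configmaps", "get", "other"))
```

Because matching is a union over rules, removing access always means removing or narrowing a rule or binding; nothing can be subtracted by adding another rule.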

Core objects

Kind | Scope | Purpose
Role | Namespace | Permissions within one namespace
ClusterRole | Cluster | Cluster-wide or reusable template bound per-namespace
RoleBinding | Namespace | Links Role or ClusterRole to subjects in that namespace
ClusterRoleBinding | Cluster | Links ClusterRole to subjects cluster-wide

Subjects, verbs, resources, names

  • Subjects: User, Group, ServiceAccount (kind + name + namespace for SA)
  • Verbs: get, list, watch, create, update, patch, delete, deletecollection, plus special verbs such as escalate and bind (on roles and clusterroles) and impersonate
  • Resources: plural API resource names (pods, deployments); use pods/log for subresources
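  • resourceNames: restrict a rule to specific object names. This cannot restrict create or deletecollection, and list/watch requests are authorized only when the client sends a matching metadata.name field selector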

Aggregated ClusterRoles

aggregationRule on a ClusterRole merges rules from other ClusterRoles matching label selectors. Built-in roles like admin, edit, view aggregate CRDs and controller-specific permissions as operators install.
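
A minimal model of that merge, assuming each selector is a plain matchLabels-style dict: every ClusterRole whose labels satisfy any selector contributes its rules. The aggregate function and the widgets roles are hypothetical; the label key rbac.authorization.k8s.io/aggregate-to-view is the one the built-in view role selects on.

```python
VIEW_LABEL = "rbac.authorization.k8s.io/aggregate-to-view"


def aggregate(selectors: list[dict[str, str]], cluster_roles: list[dict]) -> list[dict]:
    """Collect rules from every ClusterRole whose labels satisfy any selector."""
    merged: list[dict] = []
    for role in cluster_roles:
        labels = role.get("labels", {})
        if any(all(labels.get(k) == v for k, v in sel.items()) for sel in selectors):
            merged.extend(role["rules"])
    return merged


cluster_roles = [
    {   # hypothetical role shipped by an operator alongside its CRD
        "name": "widgets-viewer",
        "labels": {VIEW_LABEL: "true"},
        "rules": [{"apiGroups": ["example.com"], "resources": ["widgets"],
                   "verbs": ["get", "list", "watch"]}],
    },
    {   # unlabeled, so it is not aggregated into view
        "name": "widgets-admin",
        "labels": {},
        "rules": [{"apiGroups": ["example.com"], "resources": ["widgets"],
                   "verbs": ["*"]}],
    },
]

view_rules = aggregate([{VIEW_LABEL: "true"}], cluster_roles)
print(view_rules)
```

In a real cluster a controller performs this merge and rewrites the aggregated role's rules, so manual edits to those rules are overwritten.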

Least privilege

Start from view or a custom read-only Role; add verbs incrementally. Namespace-scoped Roles beat ClusterRoleBindings for app teams. Avoid wildcard rules (resources: ["*"], verbs: ["*"]) except in tightly controlled automation SAs.

flowchart TD
  REQ["API request\n(verb + resource + namespace)"] --> AUTH{"Authenticated?"}
  AUTH -->|No| E401["401 Unauthorized"]
  AUTH -->|Yes| SUB["Resolve user, groups, SA"]
  SUB --> BIND["Collect RoleBindings +\nClusterRoleBindings for subject"]
  BIND --> RULES["Union rules from bound Roles"]
  RULES --> MATCH{"Any rule matches?\napiGroup + resource + verb\n+ namespace scope"}
  MATCH -->|Yes| ALLOW["RBAC: allow → admission chain"]
  MATCH -->|No| E403["403 Forbidden"]
  ALLOW --> ADM["Mutating / Validating admission"]
yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deploy-reader
  namespace: payments
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deploy-reader-binding
  namespace: payments
subjects:
  - kind: ServiceAccount
    name: ci-deploy
    namespace: payments
  - kind: Group
    name: payments-devs
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deploy-reader
terminal — RBAC debugging
$ kubectl auth can-i create deployments -n payments
$ kubectl auth can-i delete pods --as=system:serviceaccount:payments:ci-deploy -n payments
$ kubectl auth can-i --list -n payments
$ kubectl get role,rolebinding,clusterrole,clusterrolebinding -A | grep payments
$ kubectl describe clusterrole edit
$ oc auth can-i create deployments -n payments
$ oc auth can-i --list -n payments
$ oc adm policy who-can delete pods -n payments
$ oc describe clusterrole admin
🎯 Interview Tip

Explain the difference between Role + RoleBinding (namespace-local) vs ClusterRole + RoleBinding (cluster role, namespace-scoped binding)—the latter is how you grant CRD permissions per team namespace without duplicating rule sets.

⚙️ Config

OpenShift default ClusterRoleBindings map OAuth groups to cluster-admin, cluster-reader, etc. After LDAP sync, verify oc get clusterrolebinding—stale bindings are a common post-migration gap.

⚖️ Trade-off

Many small Roles are harder to audit but safer. One "namespace-admin" ClusterRole per team is simpler but tends to accumulate * verbs over time. Prefer generated Roles from CI (Terraform/Helm) with code review.

Pod Security

PodSecurityPolicy (PSP) was removed in Kubernetes 1.25. Pod Security Admission (PSA) is the built-in replacement—namespace labels select a profile and mode. Policy engines (OPA Gatekeeper, Kyverno) add custom rules on top.

PSP removed

PSP was an admission plugin requiring cluster-wide PSP objects and RBAC to use them—operationally brittle. Migrate to PSA labels plus SCC on OpenShift; recreate PSP constraints as Kyverno/Gatekeeper policies if you need exceptions.

PSA profiles

Profile | Posture | What it blocks or requires
privileged | Unrestricted (system namespaces) | Nothing; equivalent to the old unrestricted PSP
baseline | Minimally restrictive; blocks known privilege escalations | Privileged containers, hostPath, hostNetwork, CAP_SYS_ADMIN
restricted | Hardened; current pod hardening best practice | Must run as non-root, drop ALL caps, seccomp RuntimeDefault or Localhost, allowPrivilegeEscalation: false (a read-only root FS is not required)

Modes: enforce, audit, warn

  • enforce — reject violating pods (hard gate)
  • audit — allow but emit audit log event
  • warn — allow but return warning to client (kubectl stderr)

Rollout pattern: warn → audit → enforce per namespace tier.
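
The three modes can be sketched as a function from namespace labels and a container spec to an admission outcome. This is a deliberately small model: it handles only the restricted level and three container-level checks, whereas the real admission controller evaluates many more fields, including pod-level securityContext. The function names and sample containers are hypothetical.

```python
PREFIX = "pod-security.kubernetes.io/"


def restricted_violations(container: dict) -> list[str]:
    """A small subset of the restricted profile checks (sketch only)."""
    sc = container.get("securityContext", {})
    problems = []
    if sc.get("runAsNonRoot") is not True:
        problems.append("runAsNonRoot != true")
    if sc.get("allowPrivilegeEscalation") is not False:
        problems.append("allowPrivilegeEscalation != false")
    if "ALL" not in sc.get("capabilities", {}).get("drop", []):
        problems.append("capabilities.drop must include ALL")
    return problems


def evaluate(ns_labels: dict, container: dict) -> dict:
    """Apply enforce/audit/warn semantics for namespaces labelled restricted."""
    violations = restricted_violations(container)
    result = {"admitted": True, "warnings": [], "audit_annotations": []}
    for mode in ("enforce", "audit", "warn"):
        if ns_labels.get(PREFIX + mode) != "restricted" or not violations:
            continue
        if mode == "enforce":
            result["admitted"] = False               # hard gate
        elif mode == "audit":
            result["audit_annotations"] = violations  # recorded, pod still allowed
        else:
            result["warnings"] = violations           # returned to the client
    return result


bad = {"name": "app", "securityContext": {}}
good = {"name": "app", "securityContext": {
    "runAsNonRoot": True,
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"]},
}}

print(evaluate({PREFIX + "warn": "restricted", PREFIX + "audit": "restricted"}, bad))
print(evaluate({PREFIX + "enforce": "restricted"}, bad))
print(evaluate({PREFIX + "enforce": "restricted"}, good))
```

The first call shows why warn and audit are safe first steps: the pod is still admitted while the violations surface to developers and to the audit log.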

Namespace labels

yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

OPA Gatekeeper & Kyverno

PSA covers pod security standards only. Gatekeeper (OPA) and Kyverno enforce org policy: required labels, banned image registries, ingress TLS, resource limits, and mutated label defaults. They run as admission webhooks, which are synchronous, so keep policies fast and scope them with namespace selectors.

terminal — PSA labels
$ kubectl label namespace payments \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/warn=restricted --overwrite
$ kubectl get ns payments --show-labels
$ kubectl auth can-i use podsecuritypolicies -A 2>/dev/null || echo "PSP removed"
$ # Dry-run violating pod
kubectl apply -f bad-pod.yaml --dry-run=server
$ oc label namespace payments \
  pod-security.kubernetes.io/enforce=restricted --overwrite
$ oc get ns payments --show-labels
$ oc get constraints 2>/dev/null || oc get validatingwebhookconfigurations | grep -i gatekeeper
📦 Real World

Platform teams label kube-system and openshift-* as privileged, staging namespaces as baseline, and production app namespaces as restricted. Kyverno mutates Deployments to add runAsNonRoot when developers forget.

⚠️ Pitfall

restricted blocks common patterns: legacy images running as root, init containers needing chmod on root FS, and sidecars without seccomp profiles. Test with --dry-run=server before enforcing.

💡 Pro Tip

PSA exemptions exist for critical workloads via the API server's PodSecurity configuration—use sparingly for node-critical DaemonSets, not as a developer escape hatch.

OpenShift Security Context Constraints (SCC)

SCC is OpenShift's pod admission gate for security context—user ID ranges, capabilities, volumes, and host access. It predates PSA and remains authoritative on OCP; pods must be admitted by an SCC and satisfy PSA namespace labels.

Common SCC profiles

SCC | Allows | Typical workload
restricted-v2 | Non-root UID, dropped caps, no host access | Default for most OCP 4.x namespaces (replaces restricted)
nonroot-v2 | Any non-root UID in allowed range | Images with arbitrary numeric users
anyuid | Run as root (UID 0) | Legacy vendor images; avoid in new apps
privileged | Full host-like privileges | Node exporters, CNI, storage daemons
hostnetwork-v2 | hostNetwork: true with constraints | DNS, monitoring agents binding host ports

Admission

SCC admission runs after authentication/RBAC. The pod's ServiceAccount must be able to use an SCC whose constraints match the pod's securityContext. OpenShift orders the SCCs available to the pod by priority (highest first), then from most to least restrictive, and admits the pod under the first one that validates it.

Common SCC errors

  • unable to validate against any security context constraint — no SCC matches; SA lacks use on a fitting SCC
  • runAsUser: Invalid value 0 — pod requests root but only restricted-v2 applies
  • host volumes are not allowed — hostPath blocked unless privileged or custom SCC
  • capabilities — adding NET_BIND_SERVICE may require nonroot-v2 or custom SCC

Grant SCC to ServiceAccount

terminal — oc adm policy
$ # SCC is OpenShift-only — no kubectl equivalent
$ kubectl get pods -n legacy -o yaml | grep -A3 securityContext
$ oc get scc
$ oc describe scc restricted-v2
$ oc adm policy add-scc-to-user anyuid -z legacy-sa -n legacy
$ oc adm policy remove-scc-from-user anyuid -z legacy-sa -n legacy
$ oc get pod failing-pod -n legacy -o yaml | grep -i scc
$ oc auth can-i use scc/anyuid --as=system:serviceaccount:legacy:legacy-sa

SCC vs PSA

PSA evaluates pod spec against Kubernetes Pod Security Standards via namespace labels. SCC additionally enforces OpenShift UID ranges (runAsUser strategy), SELinux contexts, volume types, and seccomp defaults. A pod can pass SCC but fail PSA restricted, or vice versa—test both.

🔴 OpenShift

Default ServiceAccounts in new projects get restricted-v2. The anyuid grant is cluster-admin territory—document every grant in change tickets; prefer fixing images to run non-root.

🔒 Security

privileged SCC is effectively root on the node. Restrict who can bind it via RBAC on the securitycontextconstraints resource and audit oc adm policy changes.

🎯 Interview Tip

When asked "pod won't start on OpenShift," walk through: Events → SCC admission message → oc describe scc → SA SCC grants → PSA namespace labels → securityContext in pod spec.

LDAP / Active Directory Integration

OpenShift ships an integrated OAuth server—the console and oc login authenticate against LDAP/AD (or OIDC, HTPasswd). Kubernetes RBAC consumes groups synced from LDAP; users are not stored in etcd.

OAuth server & oauth-config

Cluster OAuth config lives in the OAuth cluster resource (config.openshift.io). Identity providers are defined under spec.identityProviders—type LDAP with bind DN, URL, and attribute mappings.

yaml
# Fragment — OAuth cluster spec.identityProviders (LDAP)
- name: corp-ldap
  mappingMethod: claim
  type: LDAP
  ldap:
    url: ldaps://ldap.corp.example:636
    bindDN: cn=ocp-bind,ou=svc,dc=corp,dc=example
    bindPassword:
      name: ldap-bind-password
    insecure: false
    ca:
      name: ldap-ca
    attributes:
      id: ["dn"]
      email: ["mail"]
      name: ["cn"]
      preferredUsername: ["uid"]

LDAP attributes

Attribute key | Purpose
id | Unique identity for the OAuth token (often DN)
preferredUsername | Shown in oc whoami and audit logs
email / name | Console display and contact

Group sync — oc adm groups sync

LDAP groups map to OpenShift Group objects via a sync config file. Cron or an operator runs oc adm groups sync to reconcile membership—RBAC bindings reference OpenShift group names, not LDAP DNs directly.

yaml
apiVersion: v1
kind: LDAPSyncConfig
url: ldaps://ldap.corp.example:636
bindDN: cn=ocp-bind,ou=svc,dc=corp,dc=example
bindPassword:
  file: "/etc/secrets/ldap/bindPassword"
ca: /etc/secrets/ldap/ca.crt
insecure: false
rfc2307:
  groupsQuery:
    baseDN: ou=groups,dc=corp,dc=example
    scope: sub
    derefAliases: never
    filter: (objectClass=groupOfNames)
  groupUIDAttribute: dn
  groupNameAttributes: [cn]
  groupMembershipAttributes: [member]
  usersQuery:
    baseDN: ou=users,dc=corp,dc=example
    scope: sub
    derefAliases: never
  userUIDAttribute: dn
  userNameAttributes: [uid]
  tolerateMembersNotFound: false
  tolerateMembersOutOfScope: false
terminal — LDAP / groups
$ # Vanilla K8s: configure OIDC on API server flags or use webhook token auth
$ kubectl config set-credentials oidc-user \
  --auth-provider=oidc \
  --auth-provider-arg=idp-issuer-url=https://login.corp.example \
  --auth-provider-arg=client-id=kubernetes
$ oc get oauth cluster -o yaml
$ oc login -u jane --server=https://api.ocp.corp:6443
$ oc adm groups sync --sync-config=ldap-sync.yaml --confirm
$ oc get groups | grep payments
$ oc adm policy add-cluster-role-to-group edit payments-devs
⚙️ Config

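mappingMethod controls how a login maps to an OpenShift User, not how groups arrive. claim (the default) provisions the user from the preferred username and fails if that name is already mapped to another identity; lookup requires User, Identity, and UserIdentityMapping objects to be provisioned in advance. Group membership for LDAP comes from oc adm groups sync, so large AD forests typically pair a scheduled sync with stable OpenShift group names for RBAC.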

⚠️ Pitfall

Renaming LDAP groups orphans ClusterRoleBindings. Automate sync and treat OpenShift Group names, not raw LDAP CNs, as the RBAC contract.

📦 Real World

Banks often sync AD security groups to OCP groups nightly, map ocp-prod-admin to cluster-admin (break-glass only), and map ocp-payments-edit to a custom Role on the payments namespace.

Audit Logging

API audit logs record who did what, when, and whether it succeeded—essential for compliance, forensics, and debugging RBAC denials. Configure an audit policy on the API server and ship events to durable backends.

Audit policy levels

Level | Logged
None | Skip (not logged)
Metadata | Request metadata: user, verb, resource, response code; no body
Request | Metadata + request body, no response body (keep Secrets at Metadata so their contents are never logged)
RequestResponse | Metadata + request and response bodies (verbose, storage-heavy)

Backends

  • Log file: --audit-log-path with rotation (--audit-log-maxage, --audit-log-maxbackup)
  • Webhook: stream to SIEM (Splunk, Elastic, Loki ingest)
  • Batching: a mode of either backend (--audit-log-mode=batch, --audit-webhook-mode=batch) that buffers and flushes events, trading latency for throughput
yaml
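apiVersion: audit.k8s.io/v1
kind: Policy
# Rules are evaluated top to bottom; the first match sets the level.
rules:
  # Skip high-volume read traffic from node components.
  - level: None
    users: ["system:kube-proxy", "system:kubelet"]
    verbs: ["get", "watch", "list"]
  # Never log Secret or ConfigMap bodies.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Full bodies for RBAC mutations.
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "rbac.authorization.k8s.io"
  # Request bodies for sensitive namespaces; must precede the catch-all.
  - level: Request
    namespaces: ["payments", "cards"]
  # Catch-all: metadata only for everything else.
  - level: Metadata
    omitStages: ["RequestReceived"]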
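
Audit policy rules are evaluated in order and the first match decides the level, so a catch-all rule placed above a narrower rule hides it. The sketch below models that behaviour with plain dicts. The audit_level function and the event shape are hypothetical simplifications, and treating "no rule matched" as None is an assumption about the default.

```python
def audit_level(rules: list[dict], event: dict) -> str:
    """Return the level of the first rule whose set fields all match the event."""
    for rule in rules:
        if "users" in rule and event["user"] not in rule["users"]:
            continue
        if "verbs" in rule and event["verb"] not in rule["verbs"]:
            continue
        if "namespaces" in rule and event.get("namespace") not in rule["namespaces"]:
            continue
        if "resources" in rule and not any(
            r["group"] == event["group"]
            and (not r.get("resources") or event["resource"] in r["resources"])
            for r in rule["resources"]
        ):
            continue
        return rule["level"]
    return "None"  # assumed default when nothing matches: not logged


catch_all_first = [
    {"level": "Metadata"},                             # matches everything
    {"level": "Request", "namespaces": ["payments"]},  # never reached
]
specific_first = list(reversed(catch_all_first))

event = {"user": "jane", "verb": "create", "group": "apps",
         "resource": "deployments", "namespace": "payments"}

print(audit_level(catch_all_first, event))  # Metadata
print(audit_level(specific_first, event))   # Request
```

Ordering rules from most specific to most general keeps the narrower levels reachable.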

OpenShift audit log paths

On OCP, API server audit logs are written on control plane nodes under /var/log/kube-apiserver/audit.log (path may vary by release). Use oc adm node-logs to fetch without SSH.

terminal — audit logs
$ # On self-managed clusters — tail API server audit log (path varies)
sudo tail -f /var/log/kubernetes/audit/audit.log | jq -c '[.user.username, .verb, .objectRef]'
$ # Filter RBAC changes
grep 'rbac.authorization.k8s.io' /var/log/kubernetes/audit/audit.log | tail -20
$ oc adm node-logs --role=master --path=kube-apiserver/audit.log | tail -50
$ oc adm node-logs master-0 --path=kube-apiserver/audit.log | grep payments-devs
$ oc get apiserver cluster -o yaml | grep -A5 audit
🔬 Under the Hood

Audit events are emitted per request stage: RequestReceived, ResponseStarted, ResponseComplete. Use omitStages to reduce noise. PSA audit mode writes policy violations to the audit stream.

⚖️ Trade-off

RequestResponse on all resources fills disks fast and records Secret and token contents, because audit logging has no built-in redaction. Default to Metadata cluster-wide; raise the level only for sensitive namespaces or RBAC mutations.

💡 Pro Tip

Correlate audit auditID with API server tracing and ingress logs for incident timelines. For denied requests, look for responseStatus.code: 403 and the user.username field.
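
A small sketch of that filter, assuming the audit log is JSON lines with the standard auditID, user.username, verb, objectRef, and responseStatus.code fields. The denied_requests helper and the two sample events are hypothetical; in practice the lines would come from the audit log file or from oc adm node-logs output.

```python
import json


def denied_requests(lines):
    """Yield (auditID, username, verb, resource) for events with a 403 response."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("responseStatus", {}).get("code") != 403:
            continue
        ref = event.get("objectRef", {})
        yield (
            event.get("auditID"),
            event.get("user", {}).get("username"),
            event.get("verb"),
            ref.get("resource"),
        )


# Hypothetical sample events, for illustration only.
sample_log = [
    json.dumps({"auditID": "a1", "verb": "get",
                "user": {"username": "jane@corp.com"},
                "objectRef": {"resource": "deployments", "namespace": "payments"},
                "responseStatus": {"code": 200}}),
    json.dumps({"auditID": "a2", "verb": "delete",
                "user": {"username": "system:serviceaccount:payments:ci-deploy"},
                "objectRef": {"resource": "pods", "namespace": "payments"},
                "responseStatus": {"code": 403}}),
]

for denial in denied_requests(sample_log):
    print(denial)
```

The auditID in each tuple is the key to join against other log sources when building an incident timeline.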