Observability
Know what your systems are doing in production: metrics for aggregates, logs for events, traces for request paths, and SLOs so that alerts signal customer pain, not noise. Observability ties into Kubernetes day-2 operations and CI/CD gates.
Guides
Five guides from observability concepts through Prometheus, logs, tracing, and SLO-based on-call.
-
Guide
Observability explained
Metrics, logs, traces, golden signals, SLIs and SLOs, error budgets, and how observability fits CI/CD and Kubernetes.
-
Guide
Metrics & Prometheus
App instrumentation, scrape configs, PromQL, recording rules, and Grafana dashboards for RED metrics.
-
Guide
Logs & centralized logging
Structured logging, aggregation, retention, correlation with traces, and Kubernetes log collection.
-
Guide
Distributed tracing
OpenTelemetry SDK, context propagation, Jaeger or Tempo, and tracing HTTP services in Kubernetes.
-
Guide
SLOs, alerting & on-call
Alert design, burn-rate alerts, runbooks, paging policy, and tying SLOs to deployment gates.