Observability

Know what your systems are doing in production: metrics for aggregates, logs for events, traces for request paths, and SLOs so that alerts signal customer pain, not noise. This section connects to Kubernetes day-2 operations and CI/CD gates.

Guides

Five guides, from observability concepts through Prometheus metrics, centralized logging, and distributed tracing to SLO-based on-call.

  • Observability explained

    Metrics, logs, traces, golden signals, SLIs and SLOs, error budgets, and how observability fits CI/CD and Kubernetes.

  • Metrics & Prometheus

    Instrument apps, scrape configs, PromQL, recording rules, and Grafana dashboards for RED metrics.

  • Logs & centralized logging

    Structured logging, aggregation, retention, correlation with traces, and Kubernetes log collection.

  • Distributed tracing

    OpenTelemetry SDK, context propagation, Jaeger or Tempo, and tracing HTTP services in K8s.

  • SLOs, alerting & on-call

    Alert design, burn-rate alerts, runbooks, paging policy, and tying SLOs to deployment gates.