Infrastructure & IaC explained

Applications need more than containers—they need networks, DNS, databases, IAM roles, and clusters. Infrastructure as Code (IaC) describes that footprint in version-controlled files so every environment is reproducible, reviewable, and aligned with how you already ship apps via DevSec Core and Kubernetes.

Helpful background: basic cloud vocabulary (VPC, subnet, security group) and comfort with git PR reviews.

After reading, you should be able to:

Step 1 — What “infrastructure” means on a cloud team

Layer | Examples | Often owned by
Foundation | VPC, subnets, routing, NAT, VPN | Platform / infra engineers
Compute | EKS/GKE/AKS cluster, node groups, autoscaling | Platform + SRE
Data | RDS, S3, Redis, message queues | Infra + app teams
Identity | IAM roles, OIDC for CI, service accounts | Security + platform
App runtime | Kubernetes Deployments, Helm releases | App teams (GitOps / CI)

IaC usually owns the top four layers; Kubernetes manifests or Helm often live in adjacent repos but follow the same review-and-apply discipline.

Step 2 — Manual changes vs declarative IaC

Click-ops in a cloud console is fast for experiments and dangerous for production: no diff review, no audit trail tied to git, and “what is actually running?” becomes a mystery after staff turnover.

Declarative IaC says: “Here is the desired end state.” The tool figures out create/update/delete steps.

[Figure: IaC flow from HCL desired state through plan and apply to cloud resources and the state file.]
Terraform-style workflow: plan shows the delta; apply reconciles; state remembers cloud IDs for the next run.

For beginners: Think of IaC like a recipe checked into git—anyone can reproduce the same cake (VPC + cluster) in staging and prod with different ingredient sizes (variables).

For experienced readers: Declarative reconciliation differs from imperative scripts (AWS CLI loops); tools compute dependency graphs and parallelize safe creates.
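
To make the dependency graph concrete, here is a minimal sketch, assuming the AWS provider used in this track. The resource names, CIDR ranges, and bucket name are illustrative only. Nothing in the file states an order of operations; the reference from the subnet to the VPC is what tells the tool which resource comes first.

```hcl
# Desired state only: there is no "create the VPC, then the subnet" script.
resource "aws_vpc" "main" {
  cidr_block = "10.1.0.0/16"
}

# Referencing aws_vpc.main.id creates an implicit dependency edge,
# so the subnet is created after the VPC exists.
resource "aws_subnet" "app" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.1.1.0/24"
}

# No reference to the VPC or subnet, so this resource is independent
# and can be created in parallel with them.
resource "aws_s3_bucket" "artifacts" {
  bucket = "myapp-example-artifacts" # hypothetical name
}
```

An imperative script would have to encode that ordering, and the "already exists" checks, by hand.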

Step 3 — Terraform vocabulary (used throughout this track)

We use Terraform syntax in examples; OpenTofu is an open-source fork with compatible workflows. Pulumi and CDK use real programming languages but share the same ideas (desired state, diff, deploy).

Concept | Role
Provider | Plugin that talks to AWS, Azure, GCP, GitHub, etc.
Resource | One infrastructure object (aws_vpc, aws_eks_cluster)
Data source | Read-only lookup (existing VPC ID, AMI ID)
Variable | Input per environment (CIDR, instance size)
Output | Values for other stacks or humans (cluster endpoint)
Module | Reusable package of resources (vpc module, eks module)

Step 4 — Minimal configuration example

# versions.tf
terraform {
  required_version = ">= 1.6.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# variables.tf
variable "environment" {
  type        = string
  description = "staging or production"
}

# main.tf
resource "aws_s3_bucket" "logs" {
  bucket = "myapp-${var.environment}-logs"
  tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

# outputs.tf
output "logs_bucket_name" {
  value = aws_s3_bucket.logs.id
}
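
The example above covers resource, variable, and output. The remaining vocabulary (provider, data source, module) could look like the sketch below. The region, the ./modules/logging path, and that module's environment input are assumptions for illustration, not part of the original configuration.

```hcl
# Input reused from the minimal example.
variable "environment" {
  type        = string
  description = "staging or production"
}

# Provider: configures the plugin declared under required_providers.
provider "aws" {
  region = "eu-west-1" # assumed region
}

# Data source: read-only lookup of something this configuration does not manage.
data "aws_vpc" "default" {
  default = true
}

# Module: a reusable package of resources.
# "./modules/logging" is a hypothetical local module with one input.
module "logging" {
  source      = "./modules/logging"
  environment = var.environment
}

# Output: exposes the looked-up VPC ID to humans or other stacks.
output "default_vpc_id" {
  value = data.aws_vpc.default.id
}
```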

Step 5 — plan and apply (the daily loop)

terraform init      # download providers, configure backend
terraform fmt -check  # style in CI
terraform validate    # static checks
terraform plan -var="environment=staging"
# review: + create, ~ update, - destroy
terraform apply -var="environment=staging"

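The plan output is your PR review artifact: paste it in the ticket or attach it as a CI comment. Never apply unreviewed changes to production. To guarantee that what was reviewed is what gets applied, save the plan with terraform plan -out=tfplan and then run terraform apply tfplan.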

  • + create something new
  • ~ update in place (if a changed attribute forces replacement, the plan shows -/+ instead)
  • - destroy (watch for data loss)
  • -/+ destroy then create (new physical ID); +/- means create then destroy
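
One guardrail for the destroy cases is Terraform's lifecycle block. The sketch below reuses the logs bucket from the minimal example; whether a given resource deserves this protection is a judgment call for your team.

```hcl
variable "environment" {
  type        = string
  description = "staging or production"
}

resource "aws_s3_bucket" "logs" {
  bucket = "myapp-${var.environment}-logs"

  lifecycle {
    # Any plan that would destroy this bucket, including a -/+ replacement,
    # fails with an error instead of proceeding.
    prevent_destroy = true
  }
}
```

This only protects the resource while the block stays in the configuration. Deleting the whole resource block removes the guard along with it, so the plan still needs a human reading the - lines.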

Step 6 — State: the tool’s memory

Terraform stores a mapping from each resource address in code to its real cloud ID (e.g. aws_s3_bucket.logs → myapp-staging-logs). That mapping, stored as JSON, is the state file.

State can contain sensitive values, so treat the remote state bucket as confidential: enable encryption and grant least-privilege IAM access.
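
A remote backend for this state might be configured as in the sketch below. The bucket name, region, and lock table name are hypothetical, and the bucket and table are assumed to exist already, created outside this configuration.

```hcl
terraform {
  backend "s3" {
    bucket  = "myorg-terraform-state"     # hypothetical state bucket
    key     = "staging/terraform.tfstate" # one key per environment
    region  = "eu-west-1"                 # assumed region
    encrypt = true                        # server-side encryption of the state object

    # Hypothetical DynamoDB table used for state locking, so two applies
    # cannot write the same state at once.
    dynamodb_table = "terraform-locks"
  }
}
```

Recent Terraform releases also offer S3-native locking through the use_lockfile argument; check the S3 backend documentation for the version you run before choosing between the two.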

Step 7 — Drift: when reality diverges

Drift happens when someone changes resources in the console or an outage replaces hardware. The next plan shows unexpected diffs.

terraform plan   # shows drift vs code
# fix options:
# 1) Update code to match intentional console change
# 2) apply to revert cloud to code
# 3) terraform import for resources adopted into management
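
For option 3, Terraform 1.5 and later also accept an import block in configuration, so the adoption goes through plan and PR review like any other change. A minimal sketch, assuming the staging logs bucket already exists in the account under this name:

```hcl
# Tell Terraform that an existing bucket belongs to this resource address.
import {
  to = aws_s3_bucket.logs
  id = "myapp-staging-logs" # for S3 buckets the import ID is the bucket name
}

# The resource block must describe the bucket as it should be managed.
resource "aws_s3_bucket" "logs" {
  bucket = "myapp-staging-logs"
}
```

The next plan reports the resource as imported rather than created. Once the apply succeeds, the import block can be removed.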

Step 8 — How IaC connects to the rest of DevOps

  1. IaC creates VPC + EKS + RDS + IAM for GitHub OIDC.
  2. CI/CD builds images and deploys to that cluster (CI/CD track).
  3. Kubernetes runs workloads; the platform team may install ingress and metrics-server via Helm or an add-ons module.
  4. Observability (next track) scrapes metrics and ships logs from those resources.

Splitting repos is common: infra-live for Terraform, app-api for service code, app-deploy for manifests—linked by outputs (cluster name, subnet IDs).
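
One way to link stacks by outputs is the terraform_remote_state data source. The sketch below assumes the infra stack stores its state in S3 and declares an output named cluster_name. The bucket, key, and region are hypothetical.

```hcl
# In the consuming stack (for example app-deploy): read the infra stack's outputs.
data "terraform_remote_state" "infra" {
  backend = "s3"

  config = {
    bucket = "myorg-terraform-state"     # hypothetical state bucket
    key    = "staging/terraform.tfstate" # state written by the infra stack
    region = "eu-west-1"                 # assumed region
  }
}

locals {
  # Only values the infra stack exposes with an output block are visible here.
  cluster_name = data.terraform_remote_state.infra.outputs.cluster_name
}
```

The consumer needs read access to the whole state object, so some teams publish the few values they need to a parameter store instead.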

Step 9 — Environments without copy-paste

Same modules, different variables:

infra/
  modules/
    vpc/
    eks/
  environments/
    staging/
      main.tf      # module "vpc" { cidr = "10.1.0.0/16" ... }
    production/
      main.tf      # module "vpc" { cidr = "10.2.0.0/16" ... }

Each environment has its own remote state key (staging/terraform.tfstate) so blast radius stays isolated.
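
A fuller version of environments/staging/main.tf from the tree above might look like this sketch. The module inputs (cidr, cluster_name, subnet_ids) and the private_subnet_ids output are assumptions about how the modules are written.

```hcl
# environments/staging/main.tf
module "vpc" {
  source = "../../modules/vpc"
  cidr   = "10.1.0.0/16" # production passes 10.2.0.0/16 to the same module
}

module "eks" {
  source       = "../../modules/eks"
  cluster_name = "myapp-staging" # hypothetical input

  # Hypothetical output of the vpc module; the reference also makes
  # the cluster depend on the network being created first.
  subnet_ids = module.vpc.private_subnet_ids
}
```

The production file differs only in its argument values, which keeps a diff between the two environments short and easy to review.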

Step 10 — Anti-patterns

Anti-pattern | Why it hurts
State file in git | Merge conflicts, leaked secrets, no locking
terraform.tfvars with secrets committed | Credentials in history; use a vault or CI secrets
One giant root module | Slow plans, scary blast radius; split into modules with separate state
Apply from laptops to prod | No audit trail; apply only from CI with approval gates
Ignoring - destroys in plan | Accidental data loss on databases and buckets
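
For the committed-secrets row, one common alternative is to declare the secret as a sensitive variable and let CI inject the value. The variable name db_password is hypothetical.

```hcl
variable "db_password" {
  type        = string
  description = "Database password, injected by CI rather than committed"

  # Redacts the value in plan and apply output.
  # It is still written to state, so the state must stay protected.
  sensitive = true
}

# No default and no entry in a committed terraform.tfvars.
# In CI, set the environment variable TF_VAR_db_password from the secret store;
# Terraform reads TF_VAR_<name> as the value of variable <name>.
```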

Step 11 — What to learn next on this track

Interview phrase: “We treat infrastructure as declarative code in git; terraform plan is the PR review surface; remote state with locking is the source of truth for IDs; CI applies to staging automatically and production after approval—drift is detected on every plan.”

The one line to remember

IaC is version-controlled desired state for your cloud—plan before you touch prod, store state where the team can collaborate, and let CI apply what reviewers approved.