Image Registry & Distribution
A registry is not a filesystem—it is a content-addressed blob store with a tag index on top. Every docker push uploads immutable layers and manifests; every docker pull resolves a tag (or digest) to those blobs. Master registry architecture and you control supply chain, multi-arch deploys, and production image governance.
Registry architecture
Every OCI-compliant registry speaks the same HTTP API. Think of it as git for binaries: blobs are the stored objects, manifests are like commits that pin an exact set of blobs, and tags are like branch names, human-readable pointers that can move.
OCI Distribution Spec
The Open Container Initiative Distribution Specification defines how registries store and serve container images. It was standardized from the Docker Registry HTTP API V2, so the two are wire-compatible in practice. containerd, Podman, BuildKit, and every major cloud registry (ECR, GCR, ACR, GHCR) implement the same contract, so your tooling works everywhere once you understand the primitives.
v2 API endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| /v2/ | GET | API version check; returns {} if v2 supported |
| /v2/<name>/blobs/<digest> | GET / HEAD | Fetch layer or config blob by SHA256 digest |
| /v2/<name>/blobs/uploads/ | POST / PATCH / PUT | Initiate, stream, and finalize blob upload |
| /v2/<name>/manifests/<reference> | GET / PUT / DELETE | Read or publish manifest by tag or digest |
| /v2/<name>/tags/list | GET | List tags for a repository (paginated) |
Content-addressable blobs
Blobs are stored by SHA256 digest—the hash of the bytes, not the filename. Layer tarballs, image config JSON, and manifest JSON are all blobs. Identical content across repositories deduplicates at storage level. A push uploads only blobs the registry does not already have.
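To make content addressing concrete, here is a minimal Python sketch of a digest-keyed store. `BlobStore` and `blob_digest` are hypothetical names invented for illustration; a real registry persists blobs to disk or object storage, but the keying idea is the same.

```python
import hashlib


def blob_digest(data: bytes) -> str:
    """Return the digest string for a blob's raw bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class BlobStore:
    """Toy in-memory content-addressed store (illustration only)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> tuple[str, bool]:
        """Store data; return (digest, True if new bytes were written)."""
        digest = blob_digest(data)
        if digest in self._blobs:
            return digest, False  # identical content is already present
        self._blobs[digest] = data
        return digest, True

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # Re-hash on read to detect corruption or tampering.
        if blob_digest(data) != digest:
            raise ValueError("digest mismatch")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


store = BlobStore()
layer = b"pretend this is a gzipped layer tarball"
first_digest, first_written = store.put(layer)
second_digest, second_written = store.put(layer)  # same bytes pushed again
print(first_digest)
print(first_written, second_written, len(store))
```

The second `put` writes nothing because the key is derived from the bytes themselves, which is why re-pushing an unchanged layer is cheap.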
Manifest types
| Media type | Contents | Use case |
|---|---|---|
| application/vnd.docker.distribution.manifest.v2+json | Schema 2 manifest — layer digests + config digest | Single-arch images (legacy default) |
| application/vnd.oci.image.manifest.v1+json | OCI image manifest — same structure, OCI media types | Modern single-arch builds |
| application/vnd.docker.distribution.manifest.list.v2+json | Manifest list — per-platform child manifests | Multi-arch images (linux/amd64, linux/arm64) |
| application/vnd.oci.image.index.v1+json | OCI image index — equivalent to manifest list | Multi-arch with OCI media types |
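The relationship between an index and its child manifests can be sketched as a lookup. The snippet below is a simplified illustration, not how any particular client implements it: `select_manifest` is a hypothetical helper, the digests are shortened placeholders, and real clients also consider fields such as `variant`.

```python
OCI_INDEX = "application/vnd.oci.image.index.v1+json"
DOCKER_LIST = "application/vnd.docker.distribution.manifest.list.v2+json"
OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"


def select_manifest(index: dict, os_name: str, arch: str) -> str:
    """Return the digest of the child manifest matching os/arch."""
    if index.get("mediaType") not in (OCI_INDEX, DOCKER_LIST):
        raise ValueError("not a manifest list or image index")
    for entry in index["manifests"]:
        platform = entry.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == arch:
            return entry["digest"]
    raise LookupError(f"no manifest for {os_name}/{arch}")


# Hypothetical index; digests are placeholders, not real hashes.
sample_index = {
    "schemaVersion": 2,
    "mediaType": OCI_INDEX,
    "manifests": [
        {
            "mediaType": OCI_MANIFEST,
            "digest": "sha256:aaa",
            "size": 1234,
            "platform": {"os": "linux", "architecture": "amd64"},
        },
        {
            "mediaType": OCI_MANIFEST,
            "digest": "sha256:bbb",
            "size": 1234,
            "platform": {"os": "linux", "architecture": "arm64"},
        },
    ],
}

arm_digest = select_manifest(sample_index, "linux", "arm64")
print(arm_digest)
```

After selecting a child digest, the client fetches that single-arch manifest by digest and proceeds exactly as it would for a single-arch image.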
Tags vs digests
| Aspect | Tag (:latest) | Digest (@sha256:…) |
|---|---|---|
| Mutability | Moves on every push to same tag | Immutable—always same bytes |
| Human-readable | Yes (v1.4.2, main) | No—64-char hex hash |
| Production deploys | Risky—tag can change under you | Reproducible, auditable |
| Kubernetes | Default in many examples | Pin in manifests for supply-chain safety |
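The mutability row is easy to demonstrate with a toy tag index. This is a sketch under simplifying assumptions: `resolve` is a hypothetical helper, the digests are placeholders, and references are assumed to be either `name:tag` or `name@digest`.

```python
def resolve(reference: str, tag_index: dict[str, str]) -> str:
    """Return the manifest digest a reference points to right now."""
    if "@" in reference:
        # Digest references never consult the tag index.
        return reference.split("@", 1)[1]
    _, _, tag = reference.rpartition(":")
    return tag_index[tag]


tag_index = {"1.4.2": "sha256:aaa"}           # tag -> manifest digest
before = resolve("myapp:1.4.2", tag_index)
pinned = f"myapp@{before}"                     # what a deploy should record

tag_index["1.4.2"] = "sha256:bbb"             # someone re-pushes the same tag
after = resolve("myapp:1.4.2", tag_index)
still_pinned = resolve(pinned, tag_index)

print(before, after, still_pinned)
```

The tag reference silently changes meaning after the re-push, while the pinned reference keeps resolving to the original manifest.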
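```bash
# Inspect manifest and resolve digest
docker pull nginx:alpine
docker inspect --format '{{index .RepoDigests 0}}' nginx:alpine

# Pull by immutable digest (not tag)
docker pull nginx@sha256:abc123...

# Query registry API directly (Docker Hub requires a bearer token, even for public images; needs jq)
TOKEN=$(curl -s 'https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/nginx:pull' | jq -r .token)
curl -sI -H "Authorization: Bearer $TOKEN" \
  -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
  -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
  -H 'Accept: application/vnd.oci.image.index.v1+json' \
  https://registry-1.docker.io/v2/library/nginx/manifests/alpine
```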
When you docker push, the client uploads blobs first (POST → PATCH chunks → PUT digest), then PUTs the manifest referencing those digests. The registry never assembles layers—it only stores and serves blobs. containerd on the pull side reconstructs the image locally.
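The ordering described above (blobs first, manifest last) can be sketched with an in-memory stand-in for one repository. `ToyRegistry` and `push` are hypothetical names, the manifest is trimmed to the fields needed here, and the comments map each method to the HTTP call it stands in for.

```python
import hashlib
import json


def digest_of(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


class ToyRegistry:
    """Hypothetical in-memory stand-in for a single repository."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}
        self.manifests: dict[str, bytes] = {}
        self.tags: dict[str, str] = {}

    def has_blob(self, digest: str) -> bool:
        # Stands in for HEAD /v2/<name>/blobs/<digest>
        return digest in self.blobs

    def upload_blob(self, data: bytes) -> str:
        # Stands in for POST (start), PATCH (chunks), PUT ?digest=... (finalize)
        digest = digest_of(data)
        self.blobs[digest] = data
        return digest

    def put_manifest(self, tag: str, manifest: dict) -> str:
        # Stands in for PUT /v2/<name>/manifests/<tag>
        wanted = [manifest["config"]["digest"]]
        wanted += [layer["digest"] for layer in manifest["layers"]]
        missing = [d for d in wanted if d not in self.blobs]
        if missing:
            raise ValueError(f"unknown blobs: {missing}")
        body = json.dumps(manifest, sort_keys=True).encode()
        digest = digest_of(body)
        self.manifests[digest] = body
        self.tags[tag] = digest
        return digest


def push(registry: ToyRegistry, tag: str, config: bytes,
         layers: list[bytes]) -> tuple[str, int]:
    """Upload missing blobs, then the manifest. Returns (digest, uploads)."""
    uploaded = 0
    for blob in [config, *layers]:
        if not registry.has_blob(digest_of(blob)):
            registry.upload_blob(blob)
            uploaded += 1
    manifest = {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": {"digest": digest_of(config), "size": len(config)},
        "layers": [{"digest": digest_of(l), "size": len(l)} for l in layers],
    }
    return registry.put_manifest(tag, manifest), uploaded


registry = ToyRegistry()
base = b"base layer bytes"
digest_v1, uploaded_v1 = push(registry, "1.0.0", b'{"cmd": "v1"}', [base, b"app v1"])
digest_v2, uploaded_v2 = push(registry, "1.0.1", b'{"cmd": "v2"}', [base, b"app v2"])
print(uploaded_v1, uploaded_v2)
```

The second push skips the shared base layer, and a manifest that references blobs the repository does not hold is rejected, which is why the manifest must go last.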
"What's the difference between a tag and a digest?" — A tag is a mutable label pointing to a manifest. A digest is the SHA256 of the manifest itself—immutable. Production should deploy by digest; tags are for human workflow only.
Docker Hub
Docker Hub is the default public registry—where nginx resolves without a hostname. Official images are curated and scanned; free-tier rate limits shape how teams authenticate in CI.
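How a bare name like nginx expands to a full reference can be sketched in a few lines. This is a simplified approximation of the familiar-name rules, not the Docker CLI's actual parser; `normalize` is a hypothetical helper and it does not validate characters.

```python
def normalize(ref: str) -> str:
    """Expand a short image reference to registry/path[:tag][@digest]."""
    name, digest = ref, ""
    if "@" in ref:
        name, digest = ref.split("@", 1)

    # A tag is a colon after the last slash; an earlier colon is a port.
    tag = ""
    colon = name.rfind(":")
    if colon > name.rfind("/"):
        name, tag = name[:colon], name[colon + 1:]

    # The first path component is a registry host only if it looks like one.
    first, sep, rest = name.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", name
        if "/" not in path:
            path = "library/" + path  # official images live under library/

    result = f"{registry}/{path}"
    if tag:
        result += ":" + tag
    if digest:
        result += "@" + digest
    if not tag and not digest:
        result += ":latest"
    return result


for example in ["nginx", "nginx:alpine", "myuser/myapp:1.0.0",
                "localhost:5000/myapp", "ghcr.io/org/app@sha256:abc"]:
    print(example, "->", normalize(example))
```

The implicit `docker.io`, `library/`, and `:latest` defaults are what make a one-word image name work, and also why the same word can mean different bytes over time.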
Official images
Images under the library/ namespace (e.g. nginx, postgres, node) are maintained by Docker or upstream vendors. They follow best practices: minimal layers, regular CVE patches, documented Dockerfiles. Prefer official images over random user repos for base layers.
Rate limits (anonymous vs authenticated)
| Account type | Pull limit | Mitigation |
|---|---|---|
| Anonymous (by IP) | 100 pulls / 6 hours | docker login in CI; use mirror or private registry |
| Free authenticated | 200 pulls / 6 hours | Cache base images in ECR/GHCR pull-through cache |
| Pro / Team / Business | Higher or unlimited | Org-level credentials for shared CI runners |
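Docker Hub reports the current allowance in `ratelimit-limit` and `ratelimit-remaining` response headers, formatted like `100;w=21600` (a count and a window in seconds). A small sketch for budgeting CI jobs against those values follows; `parse_rate_limit` and `jobs_until_throttled` are hypothetical helpers and the header values are sample inputs, not live data.

```python
def parse_rate_limit(value: str) -> tuple[int, int]:
    """Parse a header value such as '100;w=21600' into (count, window_seconds)."""
    count, _, params = value.partition(";")
    window = 0
    for param in params.split(";"):
        key, _, val = param.strip().partition("=")
        if key == "w":
            window = int(val)
    return int(count), window


def jobs_until_throttled(remaining: int, pulls_per_job: int) -> int:
    """How many more CI jobs fit in the current window."""
    return remaining // pulls_per_job


limit, window = parse_rate_limit("100;w=21600")      # sample header value
remaining, _ = parse_rate_limit("76;w=21600")        # sample header value
print(limit, window // 3600, remaining)
print(jobs_until_throttled(remaining, pulls_per_job=3))
```

With three uncached base-image pulls per job, the sample budget covers 25 more jobs in the window, which is why shared runners benefit from authentication or a mirror.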
Login, tag, push, pull workflow
```bash
# Authenticate (stores creds in ~/.docker/config.json)
docker login

# Build and tag with your Docker Hub username
docker build -t myuser/myapp:1.0.0 .
docker tag myuser/myapp:1.0.0 myuser/myapp:latest

# Push both tags
docker push myuser/myapp:1.0.0
docker push myuser/myapp:latest

# Pull on another machine
docker pull myuser/myapp:1.0.0
```
Multi-arch builds with buildx
A single tag can point to a manifest list containing amd64 and arm64 variants. docker buildx builds every requested platform (through QEMU emulation, native builder nodes, or cross-compilation stages in the Dockerfile) and pushes them all in one command. Kubernetes and Docker automatically select the correct arch at pull time.
```bash
# Create a buildx builder (once)
docker buildx create --name multiarch --use
docker buildx inspect --bootstrap

# Build and push multi-platform image
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t myuser/myapp:1.0.0 \
  --push .

# Verify manifest list
docker buildx imagetools inspect myuser/myapp:1.0.0
```
Using :latest in production is risky: the tag is a moving target. A base image update can change behavior or introduce CVEs without a deploy event. Pin to digests or explicit version tags in Kubernetes manifests and Compose files.
In CI, use docker/login-action with a PAT or org token before any pull. Unauthenticated GitHub Actions runners hit Hub rate limits quickly when every job pulls node:20-alpine fresh.
Private registry options
Production teams rarely rely on Docker Hub alone. Private registries control access, retention, scanning, and geo-proximity. The OCI API is the same—choice comes down to ops model, cloud alignment, and features.
Comparison table
| Registry | Type | Strengths | Trade-offs |
|---|---|---|---|
| Distribution | Self-hosted OSS | Reference implementation; minimal deps; S3/GCS backend | No UI, scanning, or RBAC out of the box |
| Harbor | Self-hosted OSS | RBAC, replication, Trivy scanning, OIDC, project quotas | Kubernetes install complexity; ops overhead |
| Amazon ECR | AWS managed | IAM auth, lifecycle policies, Inspector scanning, pull-through cache | AWS-only; per-GB storage + transfer costs |
| GCR / Artifact Registry | GCP managed | GKE integration, vulnerability scanning, regional repos | GCP-only; migration from legacy GCR ongoing |
| Azure ACR | Azure managed | AKS attach, geo-replication, content trust, tasks | Azure-only; premium tier for advanced features |
| GHCR | GitHub managed | Free for public repos; tight Actions integration; OIDC push | GitHub-centric; org policy limits on free tier |
| Sonatype Nexus | Self-hosted / enterprise | Multi-format (npm, Maven, Docker); proxy upstream registries | Heavy JVM footprint; license for Pro features |
Self-hosted Distribution (minimal)
```bash
# Run local registry (dev only: no TLS/auth)
docker run -d -p 5000:5000 --name registry registry:2

# Push to the local registry. Docker trusts localhost without TLS by default;
# any other plain-HTTP host must be listed under insecure-registries in daemon.json.
docker tag myapp:latest localhost:5000/myapp:latest
docker push localhost:5000/myapp:latest
```
Architect decision framework
- Already on AWS/GCP/Azure? — Use the native registry; IAM integration eliminates credential sprawl.
- Multi-cloud or on-prem? — Harbor as a central hub with replication to cloud registries.
- GitHub-centric CI? — GHCR with OIDC eliminates long-lived registry passwords in secrets.
- Artifact diversity? — Nexus if you need Docker + Maven + npm in one proxy/cache layer.
Managed vs self-hosted: Cloud registries offload patching, HA, and storage scaling. Harbor wins when you need air-gapped deploys, cross-cloud replication, or unified policy across environments. Architects often use managed registries per cloud and Harbor as an on-prem mirror.
Many enterprises run pull-through caches (ECR, Harbor proxy, Nexus) in front of Docker Hub. CI and nodes pull from the internal mirror—faster, rate-limit-free, and auditable. Upstream Hub outages do not block deploys.
ECR deep dive
Amazon Elastic Container Registry is the default image store for EKS, ECS, and Lambda container workloads. Authentication is IAM-native—no long-lived passwords if you use OIDC or instance roles correctly.
Authentication with get-login-password
```bash
# One-time login (token valid 12 hours)
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  123456789012.dkr.ecr.us-east-1.amazonaws.com

# Create repository
aws ecr create-repository --repository-name myapp --region us-east-1

# Tag and push
docker tag myapp:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:latest
```
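Under the hood, the ECR GetAuthorizationToken API returns a base64 string that decodes to `AWS:<password>`; `get-login-password` prints just the password half. The sketch below decodes a locally fabricated token to show the shape. `decode_ecr_token` is a hypothetical helper, and a real token would come from the API (for example via boto3's `get_authorization_token`), not from this code.

```python
import base64


def decode_ecr_token(authorization_token: str) -> tuple[str, str]:
    """Split a base64 ECR authorization token into (username, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


# Fabricated token for the demo; never hard-code a real one.
fake_token = base64.b64encode(b"AWS:not-a-real-password").decode("ascii")
user, password = decode_ecr_token(fake_token)
print(user, len(password))
```

Because the credential is short-lived and derived from IAM, CI only needs permission to request it rather than a stored registry password.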
Lifecycle policies
ECR lifecycle rules automatically expire old images—critical for cost control when every CI run pushes a new tag. Rules filter by tag prefix, age, or count and delete matching manifests.
```json
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Keep last 30 sha-tagged images",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["sha-"],
        "countType": "imageCountMoreThan",
        "countNumber": 30
      },
      "action": { "type": "expire" }
    },
    {
      "rulePriority": 2,
      "description": "Expire untagged images after 7 days",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 7
      },
      "action": { "type": "expire" }
    }
  ]
}
```
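Before applying a policy like the one above, it can help to dry-run its intent against an image list. The following is a rough simulation of those two rules only, not ECR's actual evaluation engine; `Image` and `expired_by_policy` are hypothetical names and the image data is invented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Image:
    digest: str
    pushed_at: datetime
    tags: list[str] = field(default_factory=list)


def expired_by_policy(images: list[Image], now: datetime,
                      keep_sha: int = 30, untagged_days: int = 7) -> set[str]:
    """Return digests the two example rules would expire."""
    expired: set[str] = set()

    # Rule 1: among images with a "sha-" tag, keep only the newest keep_sha.
    sha_tagged = sorted(
        (i for i in images if any(t.startswith("sha-") for t in i.tags)),
        key=lambda i: i.pushed_at,
        reverse=True,
    )
    expired.update(i.digest for i in sha_tagged[keep_sha:])

    # Rule 2: untagged images pushed more than untagged_days ago.
    cutoff = now - timedelta(days=untagged_days)
    expired.update(i.digest for i in images if not i.tags and i.pushed_at < cutoff)
    return expired


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
images = [
    Image(f"sha256:{n:02d}", now - timedelta(hours=n), [f"sha-{n:02d}"])
    for n in range(32)  # n=0 is the newest push
]
images.append(Image("sha256:old-untagged", now - timedelta(days=10)))
images.append(Image("sha256:new-untagged", now - timedelta(days=2)))

to_expire = expired_by_policy(images, now)
print(sorted(to_expire))
```

ECR also offers a lifecycle policy preview, which is the authoritative way to check a rule set before it deletes anything.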
Image scanning
ECR basic scanning runs on push when enabled; it was originally built on the open-source Clair project, and AWS has since introduced its own native scanner for the basic tier. ECR enhanced scanning uses Amazon Inspector for continuous CVE monitoring across OS and language packages. Findings surface in the console and via EventBridge for pipeline gates.
| Feature | Basic scanning | Enhanced scanning (Inspector) |
|---|---|---|
| Trigger | On push | On push + continuous rescan |
| Coverage | OS packages | OS + app deps (npm, pip, etc.) |
| Cost | Free | Per-image scan charges |
Cross-account access
Share images across AWS accounts with a repository policy granting ecr:BatchGetImage and ecr:GetDownloadUrlForLayer to a trusted account root or IAM role. EKS nodes in Account B pull from Account A's ECR without duplicating images.
```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "AllowCrossAccountPull",
    "Effect": "Allow",
    "Principal": { "AWS": "arn:aws:iam::987654321098:root" },
    "Action": [
      "ecr:GetDownloadUrlForLayer",
      "ecr:BatchGetImage",
      "ecr:BatchCheckLayerAvailability"
    ]
  }]
}
```
Pull-through cache
ECR pull-through cache rules proxy upstream registries (Docker Hub, Quay, Kubernetes registry, another ECR). The first pull fetches from upstream and caches the image; subsequent pulls are served from ECR, which reduces exposure to Hub rate limits and cuts pull latency.
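```bash
# Create pull-through rule for Docker Hub (console or CLI).
# Docker Hub upstreams need credentials stored in a Secrets Manager secret
# whose name starts with ecr-pullthroughcache/; <secret-arn> is a placeholder.
aws ecr create-pull-through-cache-rule \
  --ecr-repository-prefix docker-hub \
  --upstream-registry-url registry-1.docker.io \
  --credential-arn <secret-arn>

# Pull via cache prefix
docker pull 123456789012.dkr.ecr.us-east-1.amazonaws.com/docker-hub/library/nginx:alpine
```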
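Teams often rewrite image names in bulk when moving to a cache prefix. Here is a minimal sketch of that mapping for Docker Hub names, assuming the `docker-hub` prefix from the rule above; `via_pull_through` is a hypothetical helper and it only handles Docker Hub references.

```python
def via_pull_through(image: str, ecr_registry: str, prefix: str = "docker-hub") -> str:
    """Rewrite a Docker Hub image name to its ECR pull-through cache path."""
    # Drop an explicit Docker Hub host if present.
    if image.startswith("docker.io/"):
        image = image[len("docker.io/"):]
    # Official images need the implicit library/ namespace spelled out.
    repo = image if "/" in image else f"library/{image}"
    return f"{ecr_registry}/{prefix}/{repo}"


ECR = "123456789012.dkr.ecr.us-east-1.amazonaws.com"
official = via_pull_through("nginx:alpine", ECR)
namespaced = via_pull_through("docker.io/myuser/myapp:1.0.0", ECR)
print(official)
print(namespaced)
```

Note that official images must include `library/` in the cached path, which is the detail most often missed when a short name is prefixed by hand.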
VPC endpoints
Interface VPC endpoints for ecr.api and ecr.dkr keep image pulls off the public internet. Pair with an S3 gateway endpoint—layer blobs are stored in S3 behind ECR. Private EKS nodes pull images without NAT gateway egress costs.
Required endpoints for private-subnet EKS nodes: com.amazonaws.<region>.ecr.api, com.amazonaws.<region>.ecr.dkr, com.amazonaws.<region>.s3 (gateway), and com.amazonaws.<region>.sts for IAM role assumption.
In GitHub Actions, use aws-actions/configure-aws-credentials with OIDC instead of static AWS_ACCESS_KEY_ID. The role needs ecr:GetAuthorizationToken plus push/pull permissions on the target repository.
Image signing & provenance
Signing proves who built an image and that it wasn't tampered with after publish. SBOMs and attestations extend that to what's inside and how it was built—the foundation of supply-chain security.
Cosign (Sigstore)
Cosign signs OCI images with keyless OIDC (GitHub Actions, GitLab) or static keys. Signatures attach as separate artifacts in the registry—no manifest modification. Verification policies gate deploys in Kubernetes (Kyverno, policy-controller) and CI.
```bash
# Generate key pair (or use keyless in CI with OIDC)
cosign generate-key-pair

# Sign image after push
cosign sign --key cosign.key myregistry/myapp@sha256:abc123...

# Verify before deploy
cosign verify --key cosign.pub myregistry/myapp@sha256:abc123...

# Keyless signing in GitHub Actions (Fulcio + Rekor).
# Cosign 2.x signs keyless by default; COSIGN_EXPERIMENTAL=1 was only needed on 1.x.
cosign sign --yes myregistry/myapp@${{ steps.meta.outputs.digest }}
```
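To see what "separate artifacts in the registry" means, it helps to know the tag-based naming scheme Cosign has traditionally used: the signature for an image digest is stored in the same repository under a tag derived from that digest. The sketch below reproduces that convention; `cosign_signature_tag` is a hypothetical helper, the digest is a placeholder, and newer Cosign releases may store signatures through the OCI referrers mechanism instead.

```python
def cosign_signature_tag(digest: str) -> str:
    """Map 'sha256:<hex>' to the tag where a signature is conventionally stored."""
    algo, _, hex_part = digest.partition(":")
    if algo != "sha256" or not hex_part:
        raise ValueError(f"unexpected digest format: {digest!r}")
    return f"{algo}-{hex_part}.sig"


image_digest = "sha256:abc123"  # placeholder, not a real digest
sig_tag = cosign_signature_tag(image_digest)
print(f"myregistry/myapp:{sig_tag}")
```

Because the signature lives under its own tag, the signed image's manifest and digest stay untouched, and lifecycle rules need to account for these extra tags.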
SBOM generation (Syft) and scanning (Grype)
A Software Bill of Materials (SBOM) lists every package in an image. Syft generates SBOMs in SPDX or CycloneDX format; Grype scans SBOMs or images directly for CVEs. Attach SBOMs as OCI artifacts alongside the image for audit and policy enforcement.
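```bash
# Generate SBOM from image
syft myapp:latest -o spdx-json > sbom.spdx.json

# Scan image for CVEs
grype myapp:latest --fail-on high

# Attach SBOM to registry (Cosign OCI artifact); reference the image by digest.
# Recent Cosign releases deprecate `attach sbom` in favor of signed attestations (`cosign attest`).
cosign attach sbom --sbom sbom.spdx.json myregistry/myapp@sha256:abc123...
```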
Docker Content Trust (Notary v1)
Docker Content Trust (DCT) signs tags via Notary. Enable with export DOCKER_CONTENT_TRUST=1—unsigned images are rejected on pull. Largely superseded by Cosign in new projects, but still present in Docker Enterprise workflows and ACR content trust.
| Aspect | Docker Content Trust | Cosign |
|---|---|---|
| Signing target | Tags only | Digests and tags |
| Key management | Notary TUF keys (offline root) | Keyless OIDC or KMS |
| Ecosystem | Docker CLI native | Kubernetes policy, GitHub Actions, Sigstore |
| New projects | Legacy | Recommended default |
BuildKit attestations
BuildKit can embed SLSA provenance and SBOM attestations at build time. Attestations are stored as OCI artifacts linked to the image digest—verifiable proof of builder identity, source repo, and build parameters.
```bash
# Build with provenance and SBOM attestations
docker buildx build \
  --provenance=true \
  --sbom=true \
  -t myregistry/myapp:1.0.0 \
  --push .

# Inspect attestations
docker buildx imagetools inspect myregistry/myapp:1.0.0 --format '{{json .Provenance}}'
docker buildx imagetools inspect myregistry/myapp:1.0.0 --format '{{json .SBOM}}'
```
Signing without verification is theater. Enforce admission policies (Kyverno verifyImages, Sigstore policy-controller) so unsigned or unverified images cannot run in production clusters.
"How do you secure the container supply chain?" — Pin base images by digest, scan in CI (fail on critical CVEs), generate SBOMs, sign with Cosign, verify at admission, and store artifacts in a private registry with lifecycle policies.
Image scanning
Scanning finds known CVEs in OS packages and application dependencies before images reach production. Shift-left means failing the pipeline on critical findings—not discovering them in a running cluster.
Scanner comparison
| Tool | Type | Strengths | Best for |
|---|---|---|---|
| Trivy | Open source | Fast, broad coverage (OS, lang, IaC, secrets); easy CI integration | GitHub Actions, Harbor, general-purpose gates |
| Grype | Open source (Anchore) | Pairs with Syft SBOMs; consistent results from SBOM or image | SBOM-first pipelines, Cosign attach workflows |
| Docker Scout | Docker SaaS | Base image recommendations, delta analysis, Hub integration | Developer feedback loops, Docker Desktop users |
| Snyk Container | Commercial | Deep dependency graphs, fix PRs, policy dashboards | Enterprise policy, dev-friendly remediation |
Trivy — CLI and CI
```bash
# Scan local image
trivy image myapp:latest

# Fail CI on HIGH/CRITICAL CVEs
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:latest

# Scan filesystem (Dockerfile context before build); --scanners replaces the older --security-checks flag
trivy fs --scanners vuln,secret .

# JSON output for dashboards
trivy image -f json -o results.json myapp:latest
```
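The JSON report can feed a dashboard or a custom gate. Below is a small sketch that tallies findings by severity, assuming the report's top-level `Results` list with per-target `Vulnerabilities` entries; the sample data and CVE IDs are invented, and `count_by_severity` is a hypothetical helper.

```python
from collections import Counter


def count_by_severity(report: dict) -> Counter:
    """Tally vulnerabilities by severity across all targets in a report."""
    counts: Counter = Counter()
    for result in report.get("Results", []):
        # The key may be absent or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


# Invented sample shaped like a scan report; load results.json in real use.
sample_report = {
    "Results": [
        {"Target": "myapp:latest (alpine)", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0001", "Severity": "CRITICAL"},
            {"VulnerabilityID": "CVE-0000-0002", "Severity": "HIGH"},
        ]},
        {"Target": "app/package-lock.json", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0003", "Severity": "HIGH"},
        ]},
        {"Target": "clean-target"},
    ]
}

counts = count_by_severity(sample_report)
should_fail = counts["CRITICAL"] > 0
print(dict(counts), should_fail)
```

In a pipeline, replace the sample with `json.load(open("results.json"))` and exit non-zero when `should_fail` is true.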
Grype — SBOM-aware scanning
```bash
# Scan directly
grype myapp:latest

# Scan from Syft SBOM (reproducible: same SBOM, same results)
syft myapp:latest -o cyclonedx-json | grype

# Gate on critical only
grype myapp:latest --fail-on critical
```
Docker Scout
docker scout quickview and docker scout compare show CVE counts and recommend smaller or less vulnerable base images. Scout integrates with Docker Hub and Desktop, which makes it useful for developer education but less suited to hard CI gates.
Shift-left CI scanning pipeline
A mature pipeline scans at multiple stages:
- Pre-build — trivy fs on Dockerfile + context (secrets, misconfigs)
- Post-build — trivy image or grype on the built image
- Post-push — Registry-native scan (ECR Inspector, Harbor Trivy, Docker Scout) for continuous monitoring
- Pre-deploy — Admission policy rejects images over CVE threshold or without signatures
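```yaml
# GitHub Actions excerpt: build, scan, push, sign
- name: Build image
  run: docker build -t ${{ env.IMAGE }}:${{ github.sha }} .
- name: Scan with Trivy
  uses: aquasecurity/trivy-action@master  # pin to a release tag or commit SHA in real pipelines
  with:
    image-ref: ${{ env.IMAGE }}:${{ github.sha }}
    severity: CRITICAL,HIGH
    exit-code: '1'
# Cosign signs the image in the registry, so push first and capture the digest
- name: Push image and record its digest
  id: digest
  run: |
    docker push ${{ env.IMAGE }}:${{ github.sha }}
    echo "digest=$(docker inspect --format '{{index .RepoDigests 0}}' ${{ env.IMAGE }}:${{ github.sha }} | cut -d@ -f2)" >> "$GITHUB_OUTPUT"
- name: Sign with Cosign
  run: cosign sign --yes ${{ env.IMAGE }}@${{ steps.digest.outputs.digest }}
```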
Scanning only in production is too late: registry scans run after push, so fixing a finding means rebuilding and redeploying. Scan in CI before push to catch CVEs while the developer still has context. Accept that some base-image CVEs have no fix; track exceptions with expiry dates.
Zero-CVE policy vs velocity: Blocking on every MEDIUM CVE stalls teams. Tier policies: fail on CRITICAL, warn on HIGH, track MEDIUM with SLA. Distroless and minimal bases reduce noise but increase debug friction.
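A tiered policy with expiring exceptions can be sketched as a small decision function. This is one possible shape, not a standard: `Finding` and `gate` are hypothetical names, the CVE IDs are invented, and the tiers mirror the fail/warn/track split described above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Finding:
    vuln_id: str
    severity: str


def gate(findings: list[Finding], exceptions: dict[str, date],
         today: date) -> dict[str, list[str]]:
    """Sort findings into fail / warn / track buckets."""
    decision: dict[str, list[str]] = {"fail": [], "warn": [], "track": []}
    for f in findings:
        expiry = exceptions.get(f.vuln_id)
        if expiry is not None and today <= expiry:
            # Accepted risk: keep it visible, but do not block.
            decision["track"].append(f.vuln_id)
        elif f.severity == "CRITICAL":
            decision["fail"].append(f.vuln_id)
        elif f.severity == "HIGH":
            decision["warn"].append(f.vuln_id)
        elif f.severity == "MEDIUM":
            decision["track"].append(f.vuln_id)
    return decision


findings = [
    Finding("CVE-0000-0001", "CRITICAL"),  # exception still valid
    Finding("CVE-0000-0002", "CRITICAL"),  # exception has expired
    Finding("CVE-0000-0003", "HIGH"),
    Finding("CVE-0000-0004", "MEDIUM"),
]
exceptions = {
    "CVE-0000-0001": date(2030, 1, 1),
    "CVE-0000-0002": date(2020, 1, 1),
}
decision = gate(findings, exceptions, today=date(2025, 1, 1))
print(decision)
```

The expiry date is what keeps an exception from becoming permanent: once it lapses, the same finding blocks the pipeline again.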