Cloud & Infrastructure

Container Orchestration

Advanced

Once you run containers at scale, something has to schedule them, restart failed ones, scale them with load, roll out new versions, and set up networking and secrets. That is orchestration (Kubernetes, Azure Container Apps, and similar tools). It is powerful and complex. The goal is to use its safety features (health probes, resource limits, secret injection, least privilege) without letting its flexibility weaken your security.

Orchestrators turn a single container into a resilient, scalable service. They keep the desired number of healthy instances running, replace unhealthy ones, roll out updates gradually, and manage configuration, secrets, and networking. The trade-off is a large surface area with many insecure defaults. So the discipline is to configure the safety nets on purpose and apply least privilege to workloads.

This builds on Containers & Images (the artifact), Infrastructure as Code (define it in code), and Designing for Failure and Cost & Scale (resilience and bounded scaling). Prefer the simplest platform that meets the need. Choose managed options over self-run clusters where you can.

Run workloads resiliently

Do: Define health, readiness, and liveness probes so the orchestrator only routes to healthy instances and restarts broken ones (see Designing for Failure).
Do: Set resource requests and limits (CPU and memory) and autoscaling with bounds, so workloads get what they need and a runaway one cannot starve others or blow up cost (see Cost & Scale Planning).
Do: Use rolling or canary deployments with health gates and a fast rollback, and run enough replicas across zones for availability (see CI/CD & Deployment).
Do: Define everything as code (manifests or IaC), versioned and deployed through the pipeline. Do not run kubectl by hand against production (see Infrastructure as Code).
Never: Make production cluster changes by hand outside IaC and CI/CD. It causes drift and bypasses review (see Infrastructure as Code).
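
The probe, resource, and replica items above can be checked mechanically before a manifest reaches the cluster. The following is a minimal sketch, assuming a Kubernetes Deployment manifest already loaded into a Python dict. The function name `resilience_findings`, the example image, and the two-replica threshold are illustrative assumptions, not part of any standard tool.

```python
from typing import Any


def resilience_findings(deployment: dict[str, Any]) -> list[str]:
    """Return readable gaps in a Deployment manifest (empty list = none found)."""
    findings: list[str] = []
    spec = deployment.get("spec", {})

    # Kubernetes defaults to 1 replica when the field is omitted;
    # a single replica cannot survive a node or zone failure.
    if spec.get("replicas", 1) < 2:
        findings.append("fewer than 2 replicas")

    containers = spec.get("template", {}).get("spec", {}).get("containers", [])
    for container in containers:
        name = container.get("name", "<unnamed>")

        # Readiness gates traffic; liveness triggers restarts.
        for probe in ("readinessProbe", "livenessProbe"):
            if probe not in container:
                findings.append(f"{name}: missing {probe}")

        # Requests drive scheduling; limits cap a runaway workload.
        resources = container.get("resources", {})
        for kind in ("requests", "limits"):
            for resource in ("cpu", "memory"):
                if resource not in resources.get(kind, {}):
                    findings.append(f"{name}: missing {kind}.{resource}")
    return findings


# Hypothetical manifest with two deliberate gaps.
example = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,
        "template": {"spec": {"containers": [{
            "name": "web",
            "image": "registry.example.com/web:1.4.2",
            "readinessProbe": {"httpGet": {"path": "/ready", "port": 8080}},
            "resources": {
                "requests": {"cpu": "250m", "memory": "256Mi"},
                "limits": {"cpu": "500m"},
            },
        }]}},
    },
}

for finding in resilience_findings(example):
    print(finding)
# prints:
# web: missing livenessProbe
# web: missing limits.memory
```

A check like this fits naturally as a pipeline step, so a manifest with gaps fails review instead of reaching production.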

Secure the platform

Do: Run workloads with least privilege: non-root containers, minimal capabilities, read-only filesystems, and scoped service identities (workload or managed identity), not broad cluster rights (see Managed Identity & Least-Privilege).
Do: Inject secrets from the vault or platform secret store at runtime. Never bake them into images or manifests (see Secrets Management).
Do: Segment the network: default-deny between workloads (network policies), restrict ingress and egress, and keep the control plane and nodes private (see Network & Resource Isolation).
Do: Keep the platform and base images patched, and scan workloads. Clusters and images build up CVEs over time (see Vulnerability Management, Containers & Images).
Consider: The simplest orchestration that fits (managed container platforms over a self-run cluster). That means less surface to secure and operate.
Never: Run containers as root with broad cluster permissions or host access they don't need.
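
The least-privilege and secret-injection items above map onto specific fields of a Kubernetes pod spec. As a rough sketch, assuming the pod spec is available as a Python dict, the check below flags the most common gaps. The function name `privilege_findings`, the example values, and the name-based secret heuristic are assumptions made for illustration. The heuristic in particular is crude and is no substitute for a real secret scanner.

```python
from typing import Any

# Assumption: env var names containing these words probably hold secrets.
SECRET_HINTS = ("PASSWORD", "SECRET", "TOKEN", "KEY")


def privilege_findings(pod_spec: dict[str, Any]) -> list[str]:
    """Return least-privilege gaps in a pod spec (empty list = none found)."""
    findings: list[str] = []

    # Host namespaces give a container access it rarely needs.
    for flag in ("hostNetwork", "hostPID", "hostIPC"):
        if pod_spec.get(flag):
            findings.append(f"pod: {flag} is enabled")

    pod_sc = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext", {})

        # The container-level value overrides the pod-level one.
        if not sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot", False)):
            findings.append(f"{name}: runAsNonRoot is not true")
        if sc.get("privileged"):
            findings.append(f"{name}: privileged is enabled")
        # Treated as allowed when unset.
        if sc.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: allowPrivilegeEscalation is not false")
        if not sc.get("readOnlyRootFilesystem", False):
            findings.append(f"{name}: readOnlyRootFilesystem is not true")
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            findings.append(f"{name}: capabilities do not drop ALL")

        # A literal `value` is baked into the manifest; `valueFrom` is injected.
        for env in container.get("env", []):
            env_name = env.get("name", "")
            if "value" in env and any(h in env_name.upper() for h in SECRET_HINTS):
                findings.append(f"{name}: env {env_name} looks like a literal secret")
    return findings


# Hypothetical pod spec with two deliberate gaps.
example_pod_spec = {
    "securityContext": {"runAsNonRoot": True},
    "containers": [{
        "name": "api",
        "image": "registry.example.com/api:2.0.0",
        "securityContext": {
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"]},
        },
        "env": [
            {"name": "DB_PASSWORD", "value": "not-a-real-password"},
            {"name": "API_TOKEN", "valueFrom": {
                "secretKeyRef": {"name": "api-secrets", "key": "token"}}},
        ],
    }],
}

for finding in privilege_findings(example_pod_spec):
    print(finding)
# prints:
# api: readOnlyRootFilesystem is not true
# api: env DB_PASSWORD looks like a literal secret
```

Scoped service identities and cluster role bindings are not covered here. They live in separate resources and need their own review.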

Self-review checklist

Ask: Do workloads have health probes, resource limits, and bounded autoscaling?
Ask: Are they running least-privilege (non-root, scoped identity), with secrets injected not baked in?
Ask: Is the network default-deny between workloads, with a private control plane?
Ask: Is all of this defined in code and deployed through the pipeline, not by hand?
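
For the network question in the checklist, one way to make "default-deny" concrete is a Kubernetes NetworkPolicy that selects every pod in a namespace and lists no allow rules. The sketch below builds such a policy as a Python dict and checks whether a given policy has that shape. The helper names and the `payments` namespace are hypothetical. It also assumes the cluster's network plugin enforces network policies, which not all do.

```python
import json
from typing import Any


def default_deny_policy(namespace: str) -> dict[str, Any]:
    """Build a NetworkPolicy that denies all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty selector matches every pod in the namespace.
            "podSelector": {},
            # Both directions are governed, and no allow rules are listed.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def is_default_deny(policy: dict[str, Any]) -> bool:
    """True if the policy selects all pods and allows no traffic either way."""
    spec = policy.get("spec", {})
    return (
        policy.get("kind") == "NetworkPolicy"
        and spec.get("podSelector") == {}
        and {"Ingress", "Egress"} <= set(spec.get("policyTypes", []))
        and not spec.get("ingress")
        and not spec.get("egress")
    )


policy = default_deny_policy("payments")  # hypothetical namespace
print(json.dumps(policy, indent=2))       # JSON manifests can be applied like YAML ones
print(is_default_deny(policy))
```

With a policy like this in place, each workload then needs explicit allow policies for the traffic it requires. Denying all egress also blocks DNS lookups until an allow rule for DNS is added.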

Why it matters: Orchestration gives us resilience and scale, but its complexity and permissive defaults make it a large attack and failure surface. Over-privileged workloads, baked-in secrets, flat networks, and hand-made changes are common, serious mistakes. Using its safety features and applying least privilege turns the platform into a dependable, secure foundation rather than a large risk.