Phase 8 · Chapter 8.03

Blue-Green Deployment

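Blue is the environment currently serving live traffic. Green is the new version, fully deployed and ready but idle. The router moves all traffic over in a single switch, and rollback is the same switch in reverse, so it is just as fast.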

Concept

In a picture

textproduction
            ┌─── Blue (v1)  ◄─── 100% traffic
  Router ───┤
            └─── Green (v2)      0%     ← warm but idle

  Deploy v2 → smoke test on Green → flip router → done
  Rollback   → flip router back to Blue (seconds)
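
The flip in the diagram can be modelled in a few lines. This is only a toy in-memory sketch: the `Router` class and its method names are made up for illustration and do not correspond to any real load balancer API. It shows the one property that matters, namely that exactly one slot receives traffic at any moment and that rollback is the same operation in reverse.

```python
class Router:
    """Toy stand-in for the router: all traffic goes to exactly one slot."""

    SLOTS = ("blue", "green")

    def __init__(self, active="blue"):
        self.active = active      # slot currently receiving 100% of traffic
        self.previous = None      # slot to return to on rollback

    def flip(self, target):
        """Send all traffic to `target` in one step."""
        if target not in self.SLOTS:
            raise ValueError("unknown slot: %r" % (target,))
        self.previous, self.active = self.active, target

    def rollback(self):
        """Flip back to whichever slot was live before the last flip."""
        if self.previous is None:
            raise RuntimeError("nothing to roll back to")
        self.flip(self.previous)

    def route(self, request):
        """Every request goes to the active slot; there is no partial split."""
        return self.active


router = Router()
router.flip("green")      # cutover
router.rollback()         # back to blue
```
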

Canary vs Blue-Green

When to use which

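  • Canary: shift traffic to the new version gradually, validating it against real users at each step.
  • Blue-Green: an instant flip from 0% to 100%, so no user is ever served by a partially rolled-out mix of versions.
  • Blue-green is the safer choice around schema migrations and stateful services, where two versions should not serve users side by side. Canary is the better fit for stateless ML inference.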
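
The difference between the two strategies is easiest to see as the share of traffic on the new version over time. The sketch below is illustrative only: the canary step percentages and the cutover step are arbitrary assumptions, not values from any particular tool.

```python
def green_share(strategy, step, canary_steps=(5, 25, 50, 100), cutover_step=3):
    """Percent of traffic on the new version at a 0-based rollout step.

    canary_steps and cutover_step are made-up defaults for illustration.
    """
    if strategy == "canary":
        # Traffic ramps up step by step and then stays at the last value.
        return canary_steps[min(step, len(canary_steps) - 1)]
    if strategy == "blue-green":
        # Nothing until the flip, everything after it.
        return 100 if step >= cutover_step else 0
    raise ValueError("unknown strategy: %r" % (strategy,))


for step in range(4):
    print(step, green_share("canary", step), green_share("blue-green", step))
```
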

k8s Service Switch

By flipping the selector

yamlproduction
# Blue deployment (live)
apiVersion: apps/v1
kind: Deployment
metadata: { name: iris-blue }
spec:
  selector:                              # required; must match the pod template labels
    matchLabels: { app: iris, slot: blue }
  template:
    metadata:
      labels: { app: iris, slot: blue }
    spec:
      containers: [{ name: api, image: ghcr.io/me/iris:v1 }]

---
# Green deployment (new)
apiVersion: apps/v1
kind: Deployment
metadata: { name: iris-green }
spec:
  selector:                              # required; must match the pod template labels
    matchLabels: { app: iris, slot: green }
  template:
    metadata:
      labels: { app: iris, slot: green }
    spec:
      containers: [{ name: api, image: ghcr.io/me/iris:v2 }]

---
# Service points to one slot at a time
apiVersion: v1
kind: Service
metadata: { name: iris-svc }
spec:
  selector: { app: iris, slot: blue }   # ← flip to "green" to cutover
  ports: [{ port: 80, targetPort: 8000 }]
bashproduction
# cutover
kubectl patch svc iris-svc \
  -p '{"spec":{"selector":{"app":"iris","slot":"green"}}}'

# rollback in seconds
kubectl patch svc iris-svc \
  -p '{"spec":{"selector":{"app":"iris","slot":"blue"}}}'
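
If the flip is driven from a script rather than typed by hand, the same patch can be built programmatically. This is a minimal sketch that shells out to the `kubectl patch` command shown above; the helper names (`selector_patch`, `cutover_command`, `cutover`) are hypothetical, and it assumes `kubectl` is on the PATH and already pointed at the right cluster.

```python
import json
import subprocess

VALID_SLOTS = ("blue", "green")


def selector_patch(slot, app="iris"):
    """Build the JSON patch body that points the Service at one slot."""
    if slot not in VALID_SLOTS:
        raise ValueError("slot must be one of %r" % (VALID_SLOTS,))
    return json.dumps({"spec": {"selector": {"app": app, "slot": slot}}})


def cutover_command(slot, service="iris-svc"):
    """Return the kubectl argument list without running it."""
    return ["kubectl", "patch", "svc", service, "-p", selector_patch(slot)]


def cutover(slot, service="iris-svc"):
    """Run the patch; raises CalledProcessError if kubectl fails."""
    subprocess.run(cutover_command(slot, service), check=True)


print(" ".join(cutover_command("green")))
```

Validating the slot name before calling `kubectl` guards against a typo that would leave the Service selecting no pods at all.
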
Argo Rollouts BG

Built-in BlueGreen strategy

yamlproduction
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata: { name: iris-api }
spec:
  replicas: 5
  strategy:
    blueGreen:
      activeService: iris-active
      previewService: iris-preview
      autoPromotionEnabled: false      # manual gate
      scaleDownDelaySeconds: 600       # keep old for 10m rollback
      prePromotionAnalysis:
        templates: [{ templateName: smoke-tests }]
      postPromotionAnalysis:
        templates: [{ templateName: success-rate }]
  selector: { matchLabels: { app: iris } }
  template:
    metadata:
      labels: { app: iris }              # must match spec.selector.matchLabels
    spec:
      containers: [{ name: api, image: ghcr.io/me/iris:v2 }]

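The team can test v2 internally through the iris-preview service, then promote it manually, either with the Promote button in the Argo Rollouts dashboard or with kubectl argo rollouts promote iris-api.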
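
A smoke test against the preview service might look like the sketch below. It rests on assumptions the chapter does not state: that the API exposes a `/health` endpoint returning JSON with a `version` field, and that the preview service is reachable from where the script runs. The function names are hypothetical. The `fetch` parameter exists so the check can be exercised without a live cluster.

```python
import json
import urllib.request


def http_get_json(url, timeout=5.0):
    """Fetch a URL and return (status_code, parsed_json_body)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, json.loads(resp.read().decode("utf-8"))


def smoke_test(base_url, expected_version, fetch=http_get_json):
    """Return a list of failure messages; an empty list means Green passed."""
    try:
        # /health is an assumed endpoint, not something the chapter defines.
        status, body = fetch(base_url.rstrip("/") + "/health")
    except Exception as exc:  # connection refused, timeout, HTTP error, bad JSON
        return ["health request failed: %s" % exc]

    failures = []
    if status != 200:
        failures.append("unexpected status %d" % status)
    if body.get("version") != expected_version:
        failures.append(
            "expected version %r, got %r" % (expected_version, body.get("version"))
        )
    return failures
```

For example, after forwarding a local port to the preview service (say with `kubectl port-forward svc/iris-preview 8080:80`, assuming the service listens on port 80), `smoke_test("http://localhost:8080", "v2")` should return an empty list before anyone promotes.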

Pros & Cons

Trade-off

  • Pros: instant rollback, atomic cutover, easy mental model.
  • Cons: 2x infrastructure during cutover, tricky schema and data migrations, and a doubled GPU pool.
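
The "2x infra" cost can be put into numbers. The sketch below is back-of-the-envelope arithmetic under a simplifying assumption: both slots run the same replica count, Green is extra capacity while it is being previewed, and Blue is extra capacity from promotion until it is scaled down. The function name and the preview duration are hypothetical.

```python
def cutover_overhead(replicas, scale_down_delay_s, preview_s=0):
    """Return (peak_pods, extra_pod_seconds) for one blue-green rollout.

    preview_s: how long Green runs before promotion (extra capacity).
    scale_down_delay_s: how long Blue is kept after promotion (extra capacity).
    """
    peak_pods = 2 * replicas
    extra_pod_seconds = replicas * (preview_s + scale_down_delay_s)
    return peak_pods, extra_pod_seconds


# Values from the Rollout above: 5 replicas, old version kept for 600 s.
peak, extra = cutover_overhead(replicas=5, scale_down_delay_s=600)
print(peak, extra)   # 10 pods at peak, 3000 extra pod-seconds after promotion
```

With GPU-backed pods the same arithmetic applies to GPUs, which is why the doubled pool is usually the expensive part.
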

Data Compatibility

Both sides have to coexist

  • Backward-compatible schema: add the columns v2 needs first, and drop old ones only later.
  • Feature store versions: serve the v1 and v2 features simultaneously.
  • Async queue: make sure the consumer can handle both message versions.
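
For the queue case, one common way to let a consumer handle both versions is a tolerant reader that branches on a version field. The sketch below is an assumption-heavy illustration: the `schema_version` field, the two payload shapes, and the feature names are all invented for the example and are not taken from the chapter's service.

```python
# Hypothetical feature order for the iris model.
FEATURE_ORDER = ("sepal_length", "sepal_width", "petal_length", "petal_width")


def parse_features(msg):
    """Normalize a v1 or v2 queue message into a list of four floats.

    Assumed shapes (illustrative only):
      v1: {"features": [5.1, 3.5, 1.4, 0.2]}                  # no version field
      v2: {"schema_version": 2, "features": {"sepal_length": 5.1, ...}}
    """
    version = msg.get("schema_version", 1)   # missing field means v1
    if version == 1:
        values = list(msg["features"])
    elif version == 2:
        values = [msg["features"][name] for name in FEATURE_ORDER]
    else:
        raise ValueError("unsupported schema_version: %r" % (version,))
    if len(values) != len(FEATURE_ORDER):
        raise ValueError("expected %d features" % len(FEATURE_ORDER))
    return [float(v) for v in values]
```

Because a missing version field is treated as v1, messages produced by Blue before the cutover stay readable after traffic moves to Green, and the reverse holds after a rollback.
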

Pitfalls

What breaks

  • No warm-up on Green: the switch immediately triggers a cold-start latency spike.
  • Long-lived connections (WebSocket): these need drain logic.
  • Irreversible DB migration: rollback breaks.
  • Trivial smoke tests: bugs reach production.
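
The warm-up pitfall can be addressed by sending throwaway requests to Green before the flip and only proceeding once latency has settled. The following is a rough sketch: `predict` stands for whatever callable sends one request to Green, and the round count and latency threshold are arbitrary assumptions to tune per service.

```python
import time


def warm_up(predict, sample, rounds=20, max_latency_s=0.2, clock=time.perf_counter):
    """Call predict(sample) `rounds` times and report whether Green looks warm.

    Returns (is_warm, latencies). Green counts as warm when the slowest of
    the last five calls is within max_latency_s.
    """
    latencies = []
    for _ in range(rounds):
        start = clock()
        predict(sample)                    # result is discarded on purpose
        latencies.append(clock() - start)
    tail = latencies[-5:]                  # early calls absorb the cold start
    return max(tail) <= max_latency_s, latencies
```

Only the tail of the run is judged because the first few calls are expected to be slow (model loading, JIT compilation, cache fills). Gating the cutover on `is_warm` keeps that cost off real users.
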
Mini Project

Manual blue-green

  1. Apply the iris-blue and iris-green deployments.
  2. Keep the Service selector on blue and verify traffic.
  3. Port-forward to Green and run a smoke test.
  4. Cut over with kubectl patch and watch the error rate.
  5. Practice the rollback.
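
Steps 4 and 5 amount to a simple decision: watch the error rate after the flip and go back to blue if it is too high. A minimal sketch of that decision follows. The 1% threshold and the choice to count only 5xx responses as errors are assumptions for illustration, and collecting the status codes (from logs or metrics) is left out.

```python
def error_rate(status_codes):
    """Fraction of responses that were server errors (HTTP 5xx)."""
    if not status_codes:
        raise ValueError("no responses observed yet")
    errors = sum(1 for code in status_codes if code >= 500)
    return errors / len(status_codes)


def slot_after_cutover(status_codes, threshold=0.01):
    """Return the slot the Service should select after observing Green."""
    return "blue" if error_rate(status_codes) > threshold else "green"


observed = [200] * 95 + [500] * 5          # 5% errors after the flip
print(slot_after_cutover(observed))        # blue: roll back
```

The returned slot name is exactly the value to put in the Service selector patch.
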

Takeaway

Remember

Blue-green = simple + instant + expensive. It is the first choice for stateful workloads.