Phase 2 · Chapter 2.02

Dockerizing ML Models

In this chapter we package the FastAPI iris service from Phase 1 into a container, using production-grade patterns.

Hook

Model + API + Container = Deployable Unit

Until now the model, the API, and the dependencies lived separately. Now we pack all three into a single container that behaves the same on any cloud and any OS.

Concept

5 principles of an ML container

  • Slim base: python:3.11-slim or distroless.
  • Multi-stage: build dependencies stay in a separate stage, so the runtime image is small.
  • Pinned versions: exact versions in requirements.txt.
  • Non-root user: security hardening.
  • Healthcheck: a signal for the orchestrator.
Step-by-step

Project layout

iris-service/
├── app/
│   ├── main.py
│   ├── schemas.py
│   └── services/predictor.py
├── model.joblib
├── requirements.txt
├── .dockerignore
└── Dockerfile

# requirements.txt
fastapi==0.115.0
uvicorn[standard]==0.30.6
scikit-learn==1.5.2
joblib==1.4.2
pydantic==2.9.2
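
One quick way to confirm that every dependency is pinned is to scan the file for lines without an exact `==` specifier. The sketch below is a minimal illustration under stated assumptions: `find_unpinned` is a hypothetical helper (not part of pip), and it expects a simple requirements file with one requirement per line, skipping pip options such as `-r`.

```python
import re

# Matches "name==version" or "name[extra]==version", optionally followed by
# an environment marker. Hypothetical helper, not part of pip.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,\s-]+\])?==[^\s;=]+\s*(;.*)?$")


def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that do not pin an exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    sample = """\
# requirements.txt
fastapi==0.115.0
uvicorn[standard]==0.30.6
scikit-learn>=1.5
joblib
"""
    print(find_unpinned(sample))  # ['scikit-learn>=1.5', 'joblib']
```

A check like this could run in CI before `docker build`, so a loosened version range is caught before it reaches an image.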

# .dockerignore
# Patterns are matched from the build-context root; the **/ prefix also covers nested directories.
**/__pycache__
**/*.pyc
.venv
.git
tests
notebooks
**/*.ipynb
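
Before writing ignore patterns it helps to know what is actually taking up space in the build context. The sketch below lists the top-level entries of a directory, largest first. `entry_size` and `largest_entries` are hypothetical helper names, and the script only measures disk usage; it does not evaluate `.dockerignore` rules.

```python
import os


def entry_size(path: str) -> int:
    """Total size in bytes of a file, or of all files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):  # skip symlinks (they may dangle)
                total += os.path.getsize(full)
    return total


def largest_entries(root: str = ".") -> list[tuple[str, int]]:
    """Top-level entries of root as (name, bytes), largest first."""
    sizes = [(name, entry_size(os.path.join(root, name))) for name in os.listdir(root)]
    return sorted(sizes, key=lambda item: item[1], reverse=True)
```

Calling `largest_entries(".")` from the project root typically shows directories such as `.venv` or `.git` at the top, which are the first candidates for `.dockerignore`.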
Code

Production-grade multi-stage Dockerfile

# ---- Stage 1: builder ----
FROM python:3.11-slim AS builder

WORKDIR /build
RUN pip install --upgrade pip wheel
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

# ---- Stage 2: runtime ----
FROM python:3.11-slim AS runtime

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PIP_NO_CACHE_DIR=1

RUN useradd --create-home --uid 1000 mlops
WORKDIR /app

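# Install only from the prebuilt wheels; --no-index keeps pip off the network.
# Note: the COPY layer still stores the wheels, so the rm below only removes
# them from the final filesystem and does not shrink the image.
COPY --from=builder /wheels /wheels
COPY requirements.txt .
RUN pip install --no-index --find-links=/wheels -r requirements.txt \
 && rm -rf /wheels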

COPY app/ ./app/
COPY model.joblib ./model.joblib

USER mlops
EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD python -c "import urllib.request,sys; \
  sys.exit(0 if urllib.request.urlopen('http://localhost:8000/health').status==200 else 1)"

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "2"]

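Explanation: the builder stage builds the wheels and the runtime stage only installs them, so build-time tooling stays out of the final image and it remains small and clean. The non-root mlops user and the healthcheck add the basic hardening expected in production.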
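
The inline `python -c` healthcheck is compact but hard to read. Below is a minimal sketch of the same probe written as a standalone function. The name `probe` and the idea of keeping it in a separate script are assumptions for illustration and are not part of the project layout shown earlier; the URL matches the `/health` endpoint used in the Dockerfile.

```python
import urllib.error
import urllib.request


def probe(url: str = "http://localhost:8000/health", timeout: float = 3.0) -> int:
    """Return 0 when the endpoint answers HTTP 200, otherwise 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 0 if response.status == 200 else 1
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as failure.
        return 1


# To use this as a healthcheck script, end the file with:
#     raise SystemExit(probe())
```

Unlike the one-liner, this version sets an explicit timeout and turns every network error into exit code 1 without a traceback.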

Build & run

Commands

# Build
docker build -t iris-api:1.0 .

# Inspect size
docker images iris-api

# Run
docker run -d --name iris -p 8000:8000 iris-api:1.0

# Test
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"sepal_length":5.1,"sepal_width":3.5,"petal_length":1.4,"petal_width":0.2}'

# Health
docker inspect --format='{{.State.Health.Status}}' iris
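
The curl test above can also be scripted, which is handy for a repeatable smoke test. The sketch below uses only the standard library. `build_predict_request`, `predict`, and `SAMPLE` are hypothetical names, and because the chapter does not show the response schema, the body is returned as raw text rather than parsed.

```python
import json
import urllib.request


def build_predict_request(base_url: str, features: dict) -> urllib.request.Request:
    """Build a POST request for the /predict endpoint with a JSON body."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, features: dict, timeout: float = 5.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_predict_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Same payload as the curl example.
SAMPLE = {"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}
```

With the container running, `print(predict("http://localhost:8000", SAMPLE))` should show whatever the service returns for that flower.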
Intuition

Why image size matters

  • Small image = fast pull = fast autoscale.
  • Smaller attack surface = fewer CVEs.
  • Lower registry cost and bandwidth usage.
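
The link between size and pull time is simple arithmetic: transfer time is at least the image size divided by the available bandwidth. The sketch below shows this with hypothetical numbers chosen only for illustration. It ignores layer caching, decompression, and registry latency, so it gives a lower bound rather than a prediction.

```python
def pull_seconds(image_mb: float, bandwidth_mb_per_s: float) -> float:
    """Lower bound on pull time: image size divided by bandwidth."""
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return image_mb / bandwidth_mb_per_s


if __name__ == "__main__":
    # Hypothetical sizes and bandwidth, not measurements.
    for label, size_mb in [("large image", 1200), ("slim image", 300)]:
        print(f"{label}: {pull_seconds(size_mb, 50):.0f} s at 50 MB/s")
```

With these example figures the large image needs at least 24 s and the slim one 6 s, a gap that every new node pays again during a scale-out.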
Real-world

Industry pattern

  • Hugging Face: Inference Endpoints are container-based.
  • Seldon Core / BentoML: ML-specific tooling for building and serving model containers.
  • NVIDIA Triton: a widely used container for GPU inference.
Common Mistakes

Mistakes that happen most often

  • Downloading the model file at runtime instead of baking it into the image, which slows cold starts.
  • Sending a huge build context because .dockerignore was forgotten.
  • Running a single worker per container, which leaves CPU underutilized.
  • Putting every dependency into the image, including Jupyter, pandas, and matplotlib.
Practice Tasks

Exercises

  1. Build the image from the Dockerfile above and measure its size.
  2. Compare the image sizes of a single-stage and a multi-stage build.
  3. Scan the image for vulnerabilities with trivy.
  4. Test how the container behaves when the healthcheck fails.
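
For the last exercise it helps to know what to expect before testing. The sketch below is a simplified model of how Docker derives the health status from probe results, based on the documented meaning of `--start-period` and `--retries`. `health_status` is a hypothetical function and its defaults mirror the Dockerfile above. It does not model probe timeouts, and plain `docker run` only reports the status; it does not restart an unhealthy container on its own.

```python
def health_status(probes, start_period=10.0, retries=3):
    """Return the final status after a sequence of (seconds, ok) probe results.

    Simplified model: `retries` consecutive counted failures give "unhealthy".
    Failures inside the start period are not counted until a probe has
    succeeded once.
    """
    status = "starting"
    started = False  # becomes True after the first successful probe
    failures = 0     # consecutive counted failures
    for seconds, ok in probes:
        if ok:
            started = True
            failures = 0
            status = "healthy"
        elif started or seconds >= start_period:
            failures += 1
            if failures >= retries:
                status = "unhealthy"
    return status


if __name__ == "__main__":
    # Probes every 30 s against a service that never answers.
    print(health_status([(30, False), (60, False), (90, False)]))  # unhealthy
    # One successful probe resets the failure counter.
    print(health_status([(30, False), (60, True), (90, False)]))   # healthy
```

Comparing this expectation with the output of `docker inspect --format='{{.State.Health.Status}}' iris` is one way to check your understanding during the exercise.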
Mini Project

Mini Project — Production Iris Container

Build the iris service container with the multi-stage Dockerfile, push it to GHCR, and attach the image size and the vulnerability report to the README.

Summary

What we learned in this chapter

  • Multi-stage build = small, clean runtime image.
  • Non-root user + healthcheck = production-grade hygiene.
  • Pinned versions make builds reproducible; .dockerignore keeps the build context small and predictable.