Phase 1 · Chapter 1.03
FastAPI for AI Services
Type-safe, async, and fast: why FastAPI is the default choice for modern AI services.
Hook
Speed + type safety together
FastAPI is built on Starlette and Pydantic: async like Node.js, yet with the Python ecosystem. Automatic OpenAPI docs, request validation, and async I/O make it a strong foundation for an ML service.
Concept
Core FastAPI features
- Pydantic schema: request/response validation.
- async/await: non-blocking I/O.
- Dependency injection: sharing a model or DB session across endpoints is easy.
- Auto docs: Swagger UI at /docs.
- Background tasks: post-response work.
Step-by-step
Production-grade FastAPI structure
    app/
    ├── main.py              # FastAPI app
    ├── schemas.py           # Pydantic models
    ├── services/
    │   └── predictor.py     # Model service
    ├── deps.py              # Dependencies
    └── routers/
        └── predict.py       # Prediction endpoints
Code
Typed FastAPI ML service
    # app/schemas.py
    from typing import List

    from pydantic import BaseModel, Field


    class IrisInput(BaseModel):
        # Every measurement is required and must lie in the range 0 to 10.
        sepal_length: float = Field(..., ge=0, le=10)
        sepal_width: float = Field(..., ge=0, le=10)
        petal_length: float = Field(..., ge=0, le=10)
        petal_width: float = Field(..., ge=0, le=10)


    class PredictionOut(BaseModel):
        class_id: int
        label: str
        model_version: str


    class BatchInput(BaseModel):
        items: List[IrisInput]
    # app/services/predictor.py
    import joblib

    # Index i is the label for class id i.
    LABELS = ["setosa", "versicolor", "virginica"]


    class Predictor:
        def __init__(self, path: str) -> None:
            # Load the serialized model once, when the Predictor is created.
            self.model = joblib.load(path)
            self.version = "v1.0.0"

        def predict(self, x: list[float]) -> tuple[int, str]:
            # The model expects a 2D input, so wrap the single sample in a list.
            cls = int(self.model.predict([x])[0])
            return cls, LABELS[cls]
    # app/main.py
    from fastapi import Depends, FastAPI

    from app.schemas import BatchInput, IrisInput, PredictionOut
    from app.services.predictor import Predictor

    app = FastAPI(title="Iris AI Service", version="1.0.0")

    # Loaded once at import time and shared by every request in this process.
    predictor = Predictor("model.joblib")


    def get_predictor() -> Predictor:
        return predictor


    @app.get("/health")
    async def health() -> dict:
        return {"status": "ok"}


    @app.post("/predict", response_model=PredictionOut)
    async def predict(payload: IrisInput, p: Predictor = Depends(get_predictor)):
        x = [payload.sepal_length, payload.sepal_width, payload.petal_length, payload.petal_width]
        # p.predict is synchronous: negligible for this tiny model, but a heavy
        # model would block the event loop at this point.
        cls, label = p.predict(x)
        return PredictionOut(class_id=cls, label=label, model_version=p.version)


    @app.post("/predict/batch")
    async def predict_batch(payload: BatchInput, p: Predictor = Depends(get_predictor)):
        out = []
        for item in payload.items:
            x = [item.sepal_length, item.sepal_width, item.petal_length, item.petal_width]
            cls, label = p.predict(x)
            out.append({"class_id": cls, "label": label})
        return {"predictions": out, "model_version": p.version}
    # Run (each of the 4 worker processes loads its own copy of the model)
    uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4
    # Docs: http://localhost:8000/docs
Explanation: Pydantic schema validation, a predictor shared through dependency injection, and auto-generated Swagger docs make this pattern a solid base for a production service.
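To see what the Field constraints in the schema actually buy, the sketch below validates two payloads against a copy of IrisInput outside of any web server. The helper name failed_fields is hypothetical and only exists for this illustration. When the same model is used as a request body, FastAPI turns such a validation failure into a 422 response before the endpoint function runs.

```python
from pydantic import BaseModel, Field, ValidationError


class IrisInput(BaseModel):
    # Same constraints as app/schemas.py: required, and within 0 to 10.
    sepal_length: float = Field(..., ge=0, le=10)
    sepal_width: float = Field(..., ge=0, le=10)
    petal_length: float = Field(..., ge=0, le=10)
    petal_width: float = Field(..., ge=0, le=10)


def failed_fields(payload: dict) -> list[str]:
    """Return the names of the fields that failed validation (empty if valid)."""
    try:
        IrisInput(**payload)
    except ValidationError as exc:
        # Each error carries a "loc" tuple whose first entry is the field name.
        return sorted(str(err["loc"][0]) for err in exc.errors())
    return []


good = {"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}
# One value is out of range (negative) and one field is missing entirely.
bad = {"sepal_length": 5.1, "sepal_width": -1.0, "petal_length": 1.4}

print(failed_fields(good))
print(failed_fields(bad))
```

The valid payload yields an empty list, while the second one reports both the out-of-range field and the missing one, so the model never sees malformed input.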
Intuition
Why async matters for ML
Although ML inference itself is CPU-bound, the pre- and post-processing around it is often I/O-bound (DB, cache, external API). With async, a single worker can keep many requests in flight while they wait on I/O, which can raise throughput substantially for I/O-heavy workloads.
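The effect is easy to reproduce with the standard library alone. The sketch below uses asyncio.sleep as a stand-in for an I/O wait such as a DB query or cache lookup (an assumption made for illustration; no real service is contacted) and compares awaiting ten calls one after another against awaiting them concurrently.

```python
import asyncio
import time


async def fake_io_call(delay: float) -> None:
    # Stand-in for a DB query, cache lookup, or external API call.
    await asyncio.sleep(delay)


async def sequential(n: int, delay: float) -> None:
    # Each call finishes before the next one starts.
    for _ in range(n):
        await fake_io_call(delay)


async def concurrent(n: int, delay: float) -> None:
    # All calls wait at the same time on a single event loop.
    await asyncio.gather(*(fake_io_call(delay) for _ in range(n)))


def elapsed(coro) -> float:
    # Run the coroutine to completion and return the wall-clock time it took.
    start = time.perf_counter()
    asyncio.run(coro)
    return time.perf_counter() - start


sequential_s = elapsed(sequential(10, 0.05))
concurrent_s = elapsed(concurrent(10, 0.05))
print(f"sequential: {sequential_s:.2f}s, concurrent: {concurrent_s:.2f}s")
```

With ten waits of 50 ms each, the sequential version needs roughly half a second, while the concurrent one finishes in roughly the time of a single wait. The gain comes purely from overlapping waits; CPU-bound inference does not get faster this way.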
Real-world
Industry usage
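- Netflix: its open-source crisis management framework, Dispatch, is built on FastAPI.
- Uber: the Ludwig team adopted FastAPI for the REST server that serves model predictions.
- Microsoft: engineers have reported using it for their ML services.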
Common Mistakes
Mistakes that come up most often
- Running a heavy CPU-bound model inside an async def endpoint: it blocks the event loop.
- Reloading the model on every request: both RAM usage and latency go up.
- Leaving validation constraints out of the Pydantic schema.
- Running a single worker in production: uvicorn needs --workers.
Practice Tasks
Exercises
- Replicate the structure above, run it, and open /docs.
- Add a background task that appends each prediction to a log file.
- Load the model in the lifespan event.
- Run the CPU-bound prediction with run_in_executor.
Mini Project
Mini Project — Typed FastAPI Iris Service
Turn the code above into a full project, with routers, schemas, and services kept separate. Run it under uvicorn with 4 workers and load test it with locust or wrk. Report the throughput.
Summary
What we learned in this chapter
- FastAPI = type safety + async + auto docs.
- Pydantic schema validation enforces the API contract.
- Dependency injection keeps the architecture clean.
- In production: multiple workers + a lifespan event.