
Containerization with Docker

Master Docker for ML deployment: Dockerfiles, multi-stage builds, Docker Compose, GPU support, and image optimization techniques

📅 Tutorial 5 📊 Intermediate


🐳 Why Containerization Matters

Your ML API works perfectly on your laptop. But when you deploy it to a server, everything breaks. Different Python version. Missing libraries. Incompatible CUDA drivers. Environment variables not set. It's the classic "works on my machine" nightmare.

Docker solves this problem by packaging your application, dependencies, and environment into a single, portable container that runs identically everywhere - on your laptop, production servers, or in the cloud.

⚠️ Problems Docker Solves:

  • "Works on my machine" - different environments cause failures
  • Dependency conflicts between projects
  • Complex setup instructions for team members
  • Inconsistent production environments
  • Difficult to scale and replicate services
  • No isolation between applications on same server

💡 Real-World Impact: Netflix runs 100,000+ containers. Uber migrated 4,000 microservices to containers. Google starts 2 billion containers per week. Docker is the foundation of modern ML deployment.

🏗️ Docker Fundamentals

Key Concepts

📦

Image

Read-only template with application code, dependencies, and OS. Like a snapshot or class definition.

🚀

Container

Running instance of an image. Isolated process with its own filesystem, network, and resources.

📝

Dockerfile

Text file with instructions to build a Docker image. Defines base image, dependencies, and commands.

🏪

Registry

Repository for storing and distributing images. Docker Hub is the default public registry.
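
These concepts also map onto the Docker SDK for Python (the docker package), which can be handy in ML pipelines and integration tests. A minimal sketch, assuming the Docker daemon is running and the package is installed (pip install docker):

# Images are templates pulled from a registry; containers are running instances of them
import docker

client = docker.from_env()  # connect to the local Docker daemon

image = client.images.pull("python", tag="3.10-slim")  # pull an image from Docker Hub
print(image.tags)

# Run a throwaway container from that image and capture its output
output = client.containers.run("python:3.10-slim", ["python", "-c", "print(1 + 1)"], remove=True)
print(output.decode())  # -> "2"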

Installation

# macOS (using Homebrew)
brew install --cask docker

# Or download Docker Desktop from docker.com

# Verify installation
docker --version
docker run hello-world

# Check if Docker daemon is running
docker ps

Essential Docker Commands

# Build an image
docker build -t myapp:v1 .

# Run a container
docker run -p 8000:8000 myapp:v1

# List running containers
docker ps

# List all containers (including stopped)
docker ps -a

# List images
docker images

# Stop a container
docker stop container_id

# Remove container
docker rm container_id

# Remove image
docker rmi image_id

# View logs
docker logs container_id

# Execute command in running container
docker exec -it container_id bash

# View container resource usage
docker stats

📝 Your First ML Dockerfile

Simple FastAPI ML Service

# Dockerfile
# Use official Python runtime as base image
FROM python:3.10-slim

# Set working directory in container
WORKDIR /app

# Copy requirements file
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose port 8000
EXPOSE 8000

# Command to run the application
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Project Structure

ml-api/
├── Dockerfile
├── requirements.txt
├── app.py
└── models/
    └── model.joblib

requirements.txt

fastapi==0.104.1
uvicorn[standard]==0.24.0
scikit-learn==1.3.2
joblib==1.3.2
numpy==1.26.2
pydantic==2.5.0
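
app.py

The Dockerfile's CMD expects an app.py that exposes an app object. Here is a minimal sketch consistent with the project structure above; the /health endpoint and the models/model.joblib path are assumptions that match the curl test and healthchecks used later in this tutorial.

# app.py - minimal FastAPI service that loads the bundled model
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI()

# Model file is packaged into the image by COPY . .
model = joblib.load("models/model.joblib")

class PredictionRequest(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/predict")
def predict(request: PredictionRequest):
    features = np.array([[request.sepal_length, request.sepal_width,
                          request.petal_length, request.petal_width]])
    return {"prediction": int(model.predict(features)[0])}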

Building and Running

# Build the image
docker build -t ml-api:v1 .

# Run the container
docker run -d \
  --name ml-api-container \
  -p 8000:8000 \
  ml-api:v1

# Test the API
curl http://localhost:8000/health

# View logs
docker logs ml-api-container

# Stop and remove
docker stop ml-api-container
docker rm ml-api-container

✅ What Just Happened: You packaged your ML API into a Docker image. Now anyone can run your exact environment with a single command - no Python installation, no pip install, no configuration needed!

🏗️ Multi-Stage Builds for Smaller Images

Multi-stage builds allow you to create leaner production images by separating build-time dependencies from runtime dependencies.

Problem: Large Image Sizes

# ❌ Simple Dockerfile - Results in 1.5GB+ image
FROM python:3.10

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

# Includes build tools, compilers, dev headers
# All unused in production!

Solution: Multi-Stage Build

# ✅ Multi-stage Dockerfile - Results in ~500MB image

# Stage 1: Build environment
FROM python:3.10 as builder

WORKDIR /app

# Install build dependencies
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 2: Production environment
FROM python:3.10-slim

WORKDIR /app

# Non-root user for security (create it before copying so files can be chowned to it)
RUN useradd -m -u 1000 appuser

# Copy only the installed packages from the builder into the user's home
# (leaving them under /root/.local would make them unreadable once we switch users)
COPY --from=builder --chown=appuser:appuser /root/.local /home/appuser/.local

# Copy application code
COPY --chown=appuser:appuser app.py .
COPY --chown=appuser:appuser models/ models/

# Make sure scripts in .local are usable
ENV PATH=/home/appuser/.local/bin:$PATH

USER appuser

EXPOSE 8000

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Advanced: Compile NumPy/SciPy in Build Stage

# Multi-stage for ML with compiled dependencies

# Build stage
FROM python:3.10 as builder

# Install system dependencies for building
RUN apt-get update && apt-get install -y \
    gcc \
    g++ \
    gfortran \
    libopenblas-dev \
    liblapack-dev \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /build

COPY requirements.txt .

# Build wheels for expensive packages
RUN pip wheel --no-cache-dir --wheel-dir /wheels \
    numpy==1.26.2 \
    scipy==1.11.4 \
    scikit-learn==1.3.2

RUN pip wheel --no-cache-dir --wheel-dir /wheels \
    -r requirements.txt

# Runtime stage
FROM python:3.10-slim

# Install only runtime dependencies
RUN apt-get update && apt-get install -y \
    libopenblas0 \
    libgomp1 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy pre-built wheels
COPY --from=builder /wheels /wheels

# Install from wheels (much faster!)
RUN pip install --no-cache-dir /wheels/*.whl && \
    rm -rf /wheels

# Copy application
COPY app.py .
COPY models/ models/

# Security: non-root user
RUN useradd -m appuser && chown -R appuser:appuser /app
USER appuser

EXPOSE 8000

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Image Size Comparison

Approach                     | Image Size | Build Time
Simple (python:3.10)         | 1.8 GB     | 5 min
Slim base (python:3.10-slim) | 800 MB     | 6 min
✅ Multi-stage build          | 500 MB     | 7 min (first), 2 min (cached)
Alpine-based (advanced)      | 300 MB     | 15 min (compilation)

🎼 Docker Compose for Multi-Container Apps

Real ML systems often need multiple services: API server, Redis cache, PostgreSQL database, monitoring tools. Docker Compose orchestrates them all.

Complete ML Stack with Docker Compose

# docker-compose.yml
version: '3.8'

services:
  # ML API Service
  api:
    build: 
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    environment:
      - REDIS_URL=redis://redis:6379
      - POSTGRES_URL=postgresql://user:pass@postgres:5432/mlops
    depends_on:
      - redis
      - postgres
    volumes:
      - ./models:/app/models
    restart: unless-stopped
    healthcheck:
      # requires curl inside the api image (python:3.10-slim does not include it)
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Redis for caching predictions
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    restart: unless-stopped

  # PostgreSQL for logging predictions
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=mlops
    ports:
      - "5432:5432"
    volumes:
      - postgres-data:/var/lib/postgresql/data
    restart: unless-stopped

  # Prometheus for metrics
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    restart: unless-stopped

  # Grafana for dashboards
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-data:/var/lib/grafana
    depends_on:
      - prometheus
    restart: unless-stopped

volumes:
  redis-data:
  postgres-data:
  prometheus-data:
  grafana-data:

Enhanced API with Redis Caching

# app.py with Redis caching
from fastapi import FastAPI
from pydantic import BaseModel
import redis
import joblib
import json
import hashlib
import os
import numpy as np

app = FastAPI()

# Load the trained model packaged into the image
model = joblib.load("models/model.joblib")

# Connect to Redis (REDIS_URL is set in docker-compose.yml; "redis" resolves to the redis service)
redis_client = redis.from_url(
    os.environ.get("REDIS_URL", "redis://redis:6379"),
    decode_responses=True,
)

class PredictionRequest(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

@app.post("/predict")
def predict_with_cache(request: PredictionRequest):
    # Create cache key from input
    input_str = json.dumps(request.model_dump(), sort_keys=True)
    cache_key = f"pred:{hashlib.md5(input_str.encode()).hexdigest()}"
    
    # Check cache
    cached = redis_client.get(cache_key)
    if cached:
        result = json.loads(cached)
        result["cached"] = True
        return result
    
    # Make prediction
    features = np.array([[
        request.sepal_length,
        request.sepal_width,
        request.petal_length,
        request.petal_width
    ]])
    
    prediction = model.predict(features)[0]
    probabilities = model.predict_proba(features)[0]
    
    result = {
        "prediction": int(prediction),
        "confidence": float(max(probabilities)),
        "cached": False
    }
    
    # Cache for 1 hour
    redis_client.setex(cache_key, 3600, json.dumps(result))
    
    return result

Docker Compose Commands

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f api

# Scale API instances (remove the fixed host port mapping or add a load balancer first,
# since multiple replicas cannot all bind host port 8000)
docker-compose up -d --scale api=3

# Stop all services
docker-compose down

# Stop and remove volumes (data)
docker-compose down -v

# Rebuild images
docker-compose up -d --build

# Check service status
docker-compose ps

💡 Benefits: With one command (docker-compose up), you start: API server, Redis cache, PostgreSQL database, Prometheus monitoring, and Grafana dashboards. Perfect for local development matching production!

🎮 GPU Support for Deep Learning

Running deep learning models in Docker requires GPU access for acceptable performance.

Prerequisites

# Install NVIDIA Container Toolkit (legacy nvidia-docker repository shown;
# on newer systems, follow NVIDIA's current container-toolkit install guide)
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

# Test GPU access
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

Dockerfile for PyTorch with GPU

# GPU-enabled PyTorch Dockerfile
FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04

# Install Python
RUN apt-get update && apt-get install -y \
    python3.10 \
    python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Install PyTorch with CUDA support
RUN pip3 install --no-cache-dir \
    torch==2.1.0 \
    torchvision==0.16.0 \
    --index-url https://download.pytorch.org/whl/cu118

# Install other dependencies
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

COPY . .

# Verify the PyTorch install at build time (GPUs are not visible during docker build, so this prints False here)
RUN python3 -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"

EXPOSE 8000

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Running with GPU

# Run container with GPU access
docker run -d \
  --name ml-gpu \
  --gpus all \
  -p 8000:8000 \
  ml-api-gpu:v1

# Specify specific GPU
docker run -d \
  --gpus '"device=0"' \
  ml-api-gpu:v1

# Limit container memory (note: --memory caps the container's RAM, not GPU memory)
docker run -d \
  --gpus all \
  --memory=8g \
  ml-api-gpu:v1

# Check GPU usage
docker exec ml-gpu nvidia-smi

Docker Compose with GPU

# docker-compose.yml with GPU
version: '3.8'

services:
  ml-gpu-api:
    build: .
    ports:
      - "8000:8000"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

⚠️ GPU Image Sizes: CUDA images are large (5-10 GB). Use a cudnn-runtime base instead of cudnn-devel (which adds compilers and headers) to save ~3 GB. Only include CUDA if you actually need GPU inference.

⚡ Image Optimization Strategies

1. Layer Caching

# ❌ Poor caching - requirements change = rebuild everything
FROM python:3.10-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt

# ✅ Good caching - code changes don't rebuild deps
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

2. Minimize Layers

# ❌ Many layers
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y wget
RUN apt-get clean

# ✅ Single layer
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

3. .dockerignore File

# .dockerignore - exclude from image
__pycache__/
*.pyc
*.pyo
*.pyd
.git/
.gitignore
.vscode/
.idea/
*.md
tests/
docs/
.pytest_cache/
*.log
.env
venv/
.DS_Store
notebooks/
*.ipynb

4. Use Specific Base Images

Base Image               | Size   | Use Case
python:3.10              | 920 MB | Development, includes build tools
✅ python:3.10-slim       | 130 MB | Production, minimal packages
python:3.10-alpine       | 50 MB  | Smallest, but compilation issues are common
nvidia/cuda:11.8-runtime | 1.5 GB | GPU inference only
nvidia/cuda:11.8-devel   | 4.5 GB | GPU with compilation tools

5. Remove Package Managers Cache

RUN pip install --no-cache-dir -r requirements.txt

RUN apt-get update && apt-get install -y package \
    && rm -rf /var/lib/apt/lists/*

6. Optimize Model Files

# Compress models before adding to image
import joblib

# Standard save (large)
joblib.dump(model, 'model.joblib')

# Compressed save (smaller)
joblib.dump(model, 'model.joblib', compress=3)

# Or use pickle protocol 5 for large arrays
joblib.dump(model, 'model.joblib', protocol=5)
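
To see what compression actually buys you, compare the file sizes on disk; joblib.load reads compressed files transparently. A self-contained sketch using a small scikit-learn model:

# compare_model_sizes.py - measure the effect of joblib compression
import os
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100).fit(X, y)

joblib.dump(model, "model_raw.joblib")
joblib.dump(model, "model_compressed.joblib", compress=3)

for path in ("model_raw.joblib", "model_compressed.joblib"):
    print(f"{path}: {os.path.getsize(path) / 1e6:.2f} MB")

# Loading works the same way regardless of compression level
model = joblib.load("model_compressed.joblib")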

Image Size Audit

# Analyze image layers
docker history ml-api:v1

# See which layers are largest
docker history ml-api:v1 --human --no-trunc

# Use dive for interactive analysis
brew install dive
dive ml-api:v1

✅ Docker Best Practices for ML

1. Security

# Run as non-root user
RUN useradd -m -u 1000 appuser && \
    chown -R appuser:appuser /app
USER appuser

# Don't include secrets
# ❌ Never: COPY .env .
# ✅ Use: docker run -e API_KEY=$API_KEY ...

# Scan for vulnerabilities (docker scan has been replaced by Docker Scout)
# docker scout cves ml-api:v1

2. Health Checks

# Note: python:3.10-slim does not include curl; install it or use a Python-based check (see below)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1
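
If you prefer not to install curl in a slim image, the Python interpreter already present can perform the check. A sketch of a healthcheck.py that HEALTHCHECK can call with CMD ["python3", "healthcheck.py"] (the script name is an assumption):

# healthcheck.py - exit 0 if the API responds, non-zero otherwise
import sys
import urllib.request

try:
    with urllib.request.urlopen("http://localhost:8000/health", timeout=5) as resp:
        sys.exit(0 if resp.status == 200 else 1)
except Exception:
    sys.exit(1)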

3. Logging

# Log to stdout (Docker captures it)
import logging
import sys

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[logging.StreamHandler(sys.stdout)]
)

logger = logging.getLogger(__name__)

4. Environment Variables

# Set defaults
ENV MODEL_PATH=/app/models/model.joblib \
    LOG_LEVEL=INFO \
    WORKERS=4

# Override at runtime
# docker run -e LOG_LEVEL=DEBUG ml-api:v1
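
On the application side, these values are read from the environment at startup, so the ENV defaults above can be overridden per deployment. A sketch (variable names match the ENV instruction):

# config.py - read container configuration from environment variables
import os

MODEL_PATH = os.environ.get("MODEL_PATH", "/app/models/model.joblib")
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")
WORKERS = int(os.environ.get("WORKERS", "4"))

print(f"model={MODEL_PATH} log_level={LOG_LEVEL} workers={WORKERS}")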

5. Model Versioning

# Tag images with model version
docker build -t ml-api:v1.0-model-rf-2023-12 .

# Use semantic versioning
docker build -t ml-api:1.0.0 .
docker tag ml-api:1.0.0 ml-api:latest

# Include metadata
docker build \
  --label "model.version=1.0" \
  --label "model.type=random-forest" \
  --label "training.date=2023-12-10" \
  -t ml-api:v1 .

6. Volume Mounts for Development

# Mount code for development (live reload also requires uvicorn's --reload flag)
docker run -d \
  -v $(pwd):/app \
  -p 8000:8000 \
  ml-api:v1

# Mount model directory (swap models without rebuild)
docker run -d \
  -v $(pwd)/models:/app/models:ro \
  -p 8000:8000 \
  ml-api:v1

7. CI/CD Integration

# .github/workflows/docker-build.yml
name: Build and Push Docker Image

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}
      
      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: |
            myorg/ml-api:latest
            myorg/ml-api:${{ github.sha }}
          cache-from: type=registry,ref=myorg/ml-api:latest
          cache-to: type=inline

🎯 Summary

You've mastered Docker for ML deployment:

🐳

Docker Basics

Images, containers, Dockerfiles, and essential commands

🏗️

Multi-Stage Builds

Reduce image sizes from 1.8GB to 500MB

🎼

Docker Compose

Orchestrate multi-service ML stacks

🎮

GPU Support

Run deep learning models with CUDA

⚡

Optimization

Layer caching, minimal images, security

✅

Best Practices

Production-ready containerization

Key Takeaways

  1. Docker solves "works on my machine" by packaging everything
  2. Use multi-stage builds to minimize image size
  3. Docker Compose orchestrates complex ML stacks
  4. GPU support requires nvidia-container-toolkit
  5. Optimize with layer caching and .dockerignore
  6. Always run as non-root user for security
  7. Tag images with model versions for traceability

🚀 Next Steps:

Your ML service is containerized! Next, you'll learn cloud deployment - taking your Docker containers to AWS, Google Cloud, and Azure for production-scale serving.

Test Your Knowledge

Q1: What's the main benefit of Docker for ML deployment?

It makes models more accurate
It packages the application and all dependencies into a portable container that runs identically everywhere
It's faster than running code directly
It's required by law

Q2: What's the purpose of multi-stage builds?

To run multiple models
To support multiple programming languages
To separate build dependencies from runtime, creating smaller production images
To enable parallel processing

Q3: Why should you copy requirements.txt before copying application code?

To leverage Docker layer caching - dependency installation won't re-run when only code changes
It's required by Docker
To make builds faster always
To prevent security issues

Q4: What does Docker Compose help with?

Writing music
Compressing images
Composing emails
Orchestrating multi-container applications (API, database, cache, monitoring)

Q5: For GPU support in Docker, what do you need?

Just Docker installed
NVIDIA Container Toolkit and --gpus flag when running containers
A special Docker license
Only PyTorch