Capstone 03: Medical Image Segmentation

Project Goal

Build a complete medical image segmentation system:

Synthetic CT/MRI-like images with circular lesions
U-Net from scratch with skip connections
Dice + BCE combined loss
MLflow experiment tracking
ONNX export for deployment
Full evaluation: Dice, IoU, pixel accuracy
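
As a rough illustration of the first item, the sketch below generates one synthetic grayscale slice with bright circular lesions and its binary mask. The function name `make_sample`, the intensity levels, the lesion count, and the radius range are all illustrative assumptions, not values fixed by this project.

```python
import numpy as np

def make_sample(size=64, max_lesions=3, rng=None):
    """Return (image, mask), each a float32 array of shape (size, size)."""
    rng = np.random.default_rng() if rng is None else rng
    yy, xx = np.mgrid[0:size, 0:size]

    # Noisy low-intensity background standing in for healthy tissue (assumed values).
    image = rng.normal(loc=0.3, scale=0.05, size=(size, size))
    mask = np.zeros((size, size), dtype=bool)

    for _ in range(rng.integers(1, max_lesions + 1)):
        radius = rng.integers(3, size // 8 + 1)
        # Keep the centre far enough from the border that the circle fits.
        cy, cx = rng.integers(radius, size - radius, size=2)
        mask |= (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2

    # Lesion pixels are brighter than the background.
    image[mask] += 0.4
    image = np.clip(image, 0.0, 1.0).astype(np.float32)
    return image, mask.astype(np.float32)

image, mask = make_sample(size=64, rng=np.random.default_rng(0))
print(image.shape, mask.shape, int(mask.sum()))
```

Passing a seeded generator keeps the dataset reproducible across runs, which makes tracked experiments easier to compare.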

Why Medical Segmentation?

Medical image segmentation is an active applied area for computer vision engineers, with teams working on it at companies such as:

NVIDIA: Clara AI healthcare platform
GE Healthcare / Siemens Healthineers: automated diagnostic tools
PathAI / Paige.AI: pathology analysis
Google Health: diabetic retinopathy screening

Architecture: U-Net

Input (1, H, W)
    │
    ▼
Encoder:
  Conv(1→32)→BN→ReLU  → skip1 (32, H, W)
  MaxPool
  Conv(32→64)→BN→ReLU → skip2 (64, H/2, W/2)
  MaxPool
  Conv(64→128)→BN→ReLU → skip3 (128, H/4, W/4)
  MaxPool
  
Bottleneck:
  Conv(128→256)→BN→ReLU → (256, H/8, W/8)

Decoder:
  Upsample → Cat(skip3) → Conv(384→128)
  Upsample → Cat(skip2) → Conv(192→64)
  Upsample → Cat(skip1) → Conv(96→32)
  
Output:
  Conv(32→1) → Sigmoid → mask (1, H, W)
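
The diagram above could be written in PyTorch roughly as follows. This is a minimal sketch, not a reference implementation: the diagram does not specify kernel sizes, the upsampling method, or whether decoder convolutions are followed by BN and ReLU, so the 3×3 kernels with padding 1, the 2×2 max pooling, the bilinear upsampling, and the BN→ReLU in the decoder are assumptions. The names `UNet` and `conv_block` are illustrative. H and W must be divisible by 8 so that the upsampled feature maps match the skip connections.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Conv -> BatchNorm -> ReLU; padding=1 keeps H and W unchanged (assumed 3x3 kernel).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder
        self.enc1 = conv_block(1, 32)
        self.enc2 = conv_block(32, 64)
        self.enc3 = conv_block(64, 128)
        self.bottleneck = conv_block(128, 256)
        # Decoder: input channels = upsampled channels + skip channels
        self.dec3 = conv_block(256 + 128, 128)  # 384 -> 128
        self.dec2 = conv_block(128 + 64, 64)    # 192 -> 64
        self.dec1 = conv_block(64 + 32, 32)     # 96 -> 32
        self.head = nn.Conv2d(32, 1, kernel_size=1)
        self.pool = nn.MaxPool2d(2)
        # Bilinear upsampling is an assumption; a transposed convolution would also fit.
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)

    def forward(self, x):
        s1 = self.enc1(x)                    # (32, H, W)
        s2 = self.enc2(self.pool(s1))        # (64, H/2, W/2)
        s3 = self.enc3(self.pool(s2))        # (128, H/4, W/4)
        b = self.bottleneck(self.pool(s3))   # (256, H/8, W/8)
        d3 = self.dec3(torch.cat([self.up(b), s3], dim=1))
        d2 = self.dec2(torch.cat([self.up(d3), s2], dim=1))
        d1 = self.dec1(torch.cat([self.up(d2), s1], dim=1))
        return torch.sigmoid(self.head(d1))  # (1, H, W), values in [0, 1]

model = UNet().eval()
with torch.no_grad():
    out = model(torch.randn(2, 1, 64, 64))
print(out.shape)
```

The decoder input widths (384, 192, 96) come directly from concatenating each upsampled tensor with its skip tensor along the channel dimension.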

Loss Function: Dice + BCE

$$L = \alpha \cdot L_{BCE} + (1 - \alpha) \cdot L_{Dice}$$

$$L_{Dice} = 1 - \frac{2 \sum p_i g_i}{\sum p_i + \sum g_i + \epsilon}$$

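Here $p_i$ is the predicted probability for pixel $i$, $g_i$ is its ground-truth label (0 or 1), $\epsilon$ is a small constant that avoids division by zero, and $\alpha \in [0, 1]$ weights the two terms. BCE gives a stable per-pixel gradient, but on its own it is dominated by the background when lesions cover only a small fraction of the image. Dice is computed from the overlap between prediction and ground truth, so it is less sensitive to that imbalance and directly optimizes the overlap metric you care about.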
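
A minimal PyTorch sketch of the combined loss, following the two formulas above term by term. The function name `dice_bce_loss`, the default `alpha=0.5`, the default `eps=1e-6`, and the choice to compute Dice per sample and then average over the batch are assumptions; the formulas do not fix them. The function expects probabilities (the output of the sigmoid), not logits.

```python
import torch
import torch.nn.functional as F

def dice_bce_loss(pred, target, alpha=0.5, eps=1e-6):
    """pred: probabilities in [0, 1]; target: binary mask; both (N, 1, H, W)."""
    bce = F.binary_cross_entropy(pred, target)

    # Flatten everything except the batch dimension so sums run over pixels.
    p = pred.flatten(1)
    g = target.flatten(1)
    intersection = (p * g).sum(dim=1)
    dice = 1 - 2 * intersection / (p.sum(dim=1) + g.sum(dim=1) + eps)

    return alpha * bce + (1 - alpha) * dice.mean()

target = torch.zeros(2, 1, 8, 8)
target[:, :, 2:6, 2:6] = 1.0
uncertain = torch.full_like(target, 0.5)
print(dice_bce_loss(target.clone(), target).item())  # near zero for a perfect prediction
print(dice_bce_loss(uncertain, target).item())
```

Because $\epsilon$ appears only in the denominator, a sample with an empty ground-truth mask has a Dice term of 1 even when the prediction is also empty; adding $\epsilon$ to the numerator as well is a common variant if such samples occur.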

Metrics

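Dice Score: $\frac{2|P \cap G|}{|P| + |G|}$, the primary metric (closer to 1.0 = better)
IoU (Jaccard): $\frac{|P \cap G|}{|P \cup G|}$, where $P$ is the set of predicted lesion pixels and $G$ is the ground-truth set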
Pixel Accuracy: fraction of correctly classified pixels
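
The three metrics can be computed from a thresholded prediction as in the sketch below. The function name `segmentation_metrics`, the 0.5 threshold, and the small `eps` guard are assumptions made for illustration.

```python
import numpy as np

def segmentation_metrics(prob, target, threshold=0.5, eps=1e-7):
    """prob: predicted probabilities; target: binary mask of the same shape."""
    pred = prob >= threshold
    gt = target.astype(bool)

    overlap = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()

    dice = 2 * overlap / (pred.sum() + gt.sum() + eps)
    iou = overlap / (union + eps)
    pixel_acc = (pred == gt).mean()
    return {"dice": float(dice), "iou": float(iou), "pixel_acc": float(pixel_acc)}

# Worked example: a 4-pixel lesion, predicted as 6 pixels that include all 4.
target = np.zeros((4, 4))
target[1:3, 1:3] = 1
prob = np.zeros((4, 4))
prob[1:3, 1:4] = 0.9
print(segmentation_metrics(prob, target))
# dice = 8/10 = 0.8, iou = 4/6 ≈ 0.667, pixel_acc = 14/16 = 0.875 (up to eps)
```

In this example pixel accuracy is the highest of the three because the 10 correctly classified background pixels count toward it, which is why it tends to look optimistic when lesions are small.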