Capstone 03: Medical Image Segmentation

Project Goal

Build a complete medical image segmentation system:

  • Synthetic CT/MRI-like images with circular lesions
  • U-Net from scratch with skip connections
  • Dice + BCE combined loss
  • MLflow experiment tracking
  • ONNX export for deployment
  • Full evaluation: Dice, IoU, pixel accuracy
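
The first item, synthetic images with circular lesions, could be generated along the following lines. This is a minimal NumPy sketch, not the project's actual generator: the function name `make_sample`, the image size, the lesion radius range, and the intensity values are all illustrative assumptions.

```python
import numpy as np


def make_sample(size=64, max_lesions=3, rng=None):
    """Return (image, mask), each float32 with shape (1, size, size).

    All numeric choices here (radius range, intensities, noise level)
    are assumptions made for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    yy, xx = np.mgrid[0:size, 0:size]
    mask = np.zeros((size, size), dtype=bool)

    # Draw between 1 and max_lesions filled circles fully inside the image.
    n_lesions = int(rng.integers(1, max_lesions + 1))
    for _ in range(n_lesions):
        r = int(rng.integers(3, size // 8 + 1))
        cy, cx = rng.integers(r, size - r, size=2)
        mask |= (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2

    # Noisy grey "tissue" background with brighter lesions on top.
    image = rng.normal(0.3, 0.1, (size, size))
    image[mask] += 0.4
    image = np.clip(image, 0.0, 1.0)

    # Add a leading channel axis so samples match the (1, H, W) input layout.
    return image[None].astype(np.float32), mask[None].astype(np.float32)
```

Passing a seeded generator, for example `make_sample(rng=np.random.default_rng(0))`, makes the samples reproducible across runs.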

Why Medical Segmentation?

Medical image segmentation is an active hiring area for CV engineers at companies such as:

  • NVIDIA: Clara AI healthcare platform
  • GE Healthcare / Siemens Healthineers: automated diagnostic tools
  • PathAI / Paige.AI: pathology analysis
  • Google Health: diabetic retinopathy screening

Architecture: U-Net

Input (1, H, W)
    │
    ▼
Encoder:
  Conv(1→32)→BN→ReLU  → skip1 (32, H, W)
  MaxPool
  Conv(32→64)→BN→ReLU → skip2 (64, H/2, W/2)
  MaxPool
  Conv(64→128)→BN→ReLU → skip3 (128, H/4, W/4)
  MaxPool
  
Bottleneck:
  Conv(128→256)→BN→ReLU

Decoder:
  Upsample → Cat(skip3) → Conv(384→128)
  Upsample → Cat(skip2) → Conv(192→64)
  Upsample → Cat(skip1) → Conv(96→32)
  
Output:
  Conv(32→1) → Sigmoid → mask (1, H, W)
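
The diagram above maps fairly directly onto a PyTorch module. The sketch below follows the channel counts shown, but several details are assumptions the diagram does not specify: 3×3 convolutions with padding 1, bilinear upsampling, BN and ReLU after each decoder convolution, a 1×1 output convolution, and input sizes divisible by 8.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(c_in, c_out):
    # Conv -> BN -> ReLU; kernel size 3 with padding 1 keeps H and W unchanged.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


class UNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.pool = nn.MaxPool2d(2)
        self.enc1 = conv_block(1, 32)
        self.enc2 = conv_block(32, 64)
        self.enc3 = conv_block(64, 128)
        self.bottleneck = conv_block(128, 256)
        # Decoder inputs are upsampled features concatenated with a skip.
        self.dec3 = conv_block(256 + 128, 128)
        self.dec2 = conv_block(128 + 64, 64)
        self.dec1 = conv_block(64 + 32, 32)
        self.head = nn.Conv2d(32, 1, kernel_size=1)

    @staticmethod
    def _up(x):
        # Assumption: parameter-free bilinear upsampling by a factor of 2.
        return F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)

    def forward(self, x):
        s1 = self.enc1(x)                    # (32, H, W)
        s2 = self.enc2(self.pool(s1))        # (64, H/2, W/2)
        s3 = self.enc3(self.pool(s2))        # (128, H/4, W/4)
        b = self.bottleneck(self.pool(s3))   # (256, H/8, W/8)
        d3 = self.dec3(torch.cat([self._up(b), s3], dim=1))
        d2 = self.dec2(torch.cat([self._up(d3), s2], dim=1))
        d1 = self.dec1(torch.cat([self._up(d2), s1], dim=1))
        # Sigmoid is applied inside the model, so the output is a probability map.
        return torch.sigmoid(self.head(d1))
```

Because the model already applies the sigmoid, a loss that expects probabilities (such as `torch.nn.functional.binary_cross_entropy`) is the matching choice, rather than one that expects raw logits.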

Loss Function: Dice + BCE

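$$L = \alpha \cdot L_{BCE} + (1 - \alpha) \cdot L_{Dice}$$

$$L_{Dice} = 1 - \frac{2 \sum_i p_i g_i}{\sum_i p_i + \sum_i g_i + \epsilon}$$

Here $p_i$ is the predicted probability for pixel $i$, $g_i \in \{0, 1\}$ is its ground-truth label, $\alpha \in [0, 1]$ weights the two terms, and $\epsilon$ is a small constant that prevents division by zero.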

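BCE provides a stable per-pixel gradient signal, but on its own it is dominated by the background class when lesions are small. Dice depends only on the overlap between prediction and ground truth, so it is far less sensitive to that imbalance and directly optimizes the overlap metric you care about.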
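
The combined loss can be written in a few lines of PyTorch. This is a sketch under stated assumptions: the function name `dice_bce_loss` and the defaults `alpha=0.5` and `eps=1e-6` are illustrative, the input is assumed to be post-sigmoid probabilities, and the Dice term is computed per sample and then averaged over the batch (the formula does not say how to reduce over a batch).

```python
import torch
import torch.nn.functional as F


def dice_bce_loss(pred, target, alpha=0.5, eps=1e-6):
    """Weighted sum of BCE and Dice loss.

    pred:   probabilities in [0, 1], shape (N, 1, H, W)
    target: binary mask as float, same shape as pred
    """
    bce = F.binary_cross_entropy(pred, target)

    # Flatten each sample so the sums run over its pixels.
    p = pred.flatten(1)
    g = target.flatten(1)
    intersection = (p * g).sum(dim=1)
    dice_loss = 1 - 2 * intersection / (p.sum(dim=1) + g.sum(dim=1) + eps)

    return alpha * bce + (1 - alpha) * dice_loss.mean()
```

With ε only in the denominator, as in the formula, a sample with an empty mask and an empty prediction gets a Dice loss of 1. Many implementations add ε to the numerator as well so that this case scores 0 instead.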

Metrics

  • Dice Score: primary metric (closer to 1.0 = better)
  • IoU (Jaccard): $\frac{|P \cap G|}{|P \cup G|}$
  • Pixel Accuracy: fraction of correctly classified pixels (can look high when lesions are small, so read it alongside Dice and IoU)
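
All three metrics follow from the pixel-level confusion counts. The sketch below is one way to compute them with NumPy; the function name `segmentation_metrics`, the 0.5 threshold, and the `eps` guard are assumptions for illustration.

```python
import numpy as np


def segmentation_metrics(prob, target, threshold=0.5, eps=1e-7):
    """Dice, IoU and pixel accuracy for one predicted probability map."""
    pred = prob >= threshold          # binarize the prediction
    gt = target.astype(bool)

    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()

    return {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        "iou": tp / (tp + fp + fn + eps),
        "pixel_acc": (tp + tn) / pred.size,
    }


# Small worked example: a 2x2 lesion and a prediction shifted by one column.
gt = np.zeros((4, 4))
gt[0:2, 0:2] = 1
prob = np.zeros((4, 4))
prob[0:2, 1:3] = 0.9
scores = segmentation_metrics(prob, gt)
# TP=2, FP=2, FN=2, TN=10 -> dice = 0.5, iou = 1/3, pixel_acc = 0.75
```

In the same example, predicting background everywhere would also score a pixel accuracy of 0.75 while Dice and IoU drop to 0, which is why Dice is treated as the primary metric. The two overlap metrics are linked by Dice = 2·IoU / (1 + IoU).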