Capstone 03: Medical Image Segmentation
Project Goal
Build a complete medical image segmentation system:
- Synthetic CT/MRI-like images with circular lesions
- U-Net from scratch with skip connections
- Dice + BCE combined loss
- MLflow experiment tracking
- ONNX export for deployment
- Full evaluation: Dice, IoU, pixel accuracy
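As a starting point for the synthetic data, here is a minimal NumPy sketch of one image/mask pair. The function name `make_synthetic_scan`, the intensity values, the elliptical "tissue" background, and the lesion size range are all illustrative assumptions, not values specified by this project.

```python
import numpy as np


def make_synthetic_scan(size=128, max_lesions=3, noise_std=0.08, rng=None):
    """Return (image, mask), each shaped (1, size, size) as float32.

    All intensities and size ranges here are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    yy, xx = np.mgrid[0:size, 0:size]

    # Elliptical "tissue" region on a dark background.
    body = (((yy - size / 2) / (0.45 * size)) ** 2
            + ((xx - size / 2) / (0.38 * size)) ** 2) <= 1.0
    image = np.where(body, 0.4, 0.05).astype(np.float32)

    mask = np.zeros((size, size), dtype=np.float32)
    for _ in range(int(rng.integers(1, max_lesions + 1))):
        radius = int(rng.integers(size // 20, size // 8))
        cy, cx = rng.integers(size // 4, 3 * size // 4, size=2)
        lesion = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
        mask[lesion] = 1.0
        image[lesion] = 0.8  # lesions drawn brighter than surrounding tissue

    # Gaussian noise as a crude stand-in for acquisition noise.
    image += rng.normal(0.0, noise_std, image.shape).astype(np.float32)
    image = np.clip(image, 0.0, 1.0)

    # Add the channel axis so the pair matches the (1, H, W) network input.
    return image[None], mask[None]
```

Passing a seeded generator, for example `make_synthetic_scan(rng=np.random.default_rng(0))`, makes the dataset reproducible across runs.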
Why Medical Segmentation?
Medical image segmentation is an active hiring area for computer vision (CV) engineers. Companies working in this space include:
- NVIDIA: Clara AI healthcare platform
- GE Healthcare / Siemens Healthineers: automated diagnostic tools
- PathAI / Paige.AI: pathology analysis
- Google Health: diabetic retinopathy screening
Architecture: U-Net
Input (1, H, W)
  │
  ▼
Encoder:
  Conv(1→32)→BN→ReLU → skip1 (32, H, W)
  MaxPool
  Conv(32→64)→BN→ReLU → skip2 (64, H/2, W/2)
  MaxPool
  Conv(64→128)→BN→ReLU → skip3 (128, H/4, W/4)
  MaxPool
Bottleneck:
  Conv(128→256)→BN→ReLU
Decoder:
  Upsample → Cat(skip3) → Conv(384→128)
  Upsample → Cat(skip2) → Conv(192→64)
  Upsample → Cat(skip1) → Conv(96→32)
Output:
  Conv(32→1) → Sigmoid → mask (1, H, W)
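The diagram above maps fairly directly onto a PyTorch module. The sketch below is one possible reading of it, not a definitive implementation: the 3×3 kernels with padding 1, the bilinear `nn.Upsample`, the BN→ReLU after each decoder convolution, and the 1×1 output convolution are assumptions the diagram does not specify. With three pooling stages, H and W need to be divisible by 8.

```python
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    # One Conv -> BatchNorm -> ReLU stage (kernel size 3 is an assumption).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class UNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: channel widths follow the diagram (1 -> 32 -> 64 -> 128).
        self.enc1 = conv_block(1, 32)
        self.enc2 = conv_block(32, 64)
        self.enc3 = conv_block(64, 128)
        self.bottleneck = conv_block(128, 256)
        self.pool = nn.MaxPool2d(2)
        # Parameter-free upsampling (assumed); a transposed conv would also work.
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        # Decoder: input channels = upsampled channels + skip channels.
        self.dec3 = conv_block(256 + 128, 128)  # 384 -> 128
        self.dec2 = conv_block(128 + 64, 64)    # 192 -> 64
        self.dec1 = conv_block(64 + 32, 32)     # 96 -> 32
        self.head = nn.Conv2d(32, 1, kernel_size=1)

    def forward(self, x):
        s1 = self.enc1(x)                    # (32, H, W)
        s2 = self.enc2(self.pool(s1))        # (64, H/2, W/2)
        s3 = self.enc3(self.pool(s2))        # (128, H/4, W/4)
        b = self.bottleneck(self.pool(s3))   # (256, H/8, W/8)
        d3 = self.dec3(torch.cat([self.up(b), s3], dim=1))
        d2 = self.dec2(torch.cat([self.up(d3), s2], dim=1))
        d1 = self.dec1(torch.cat([self.up(d2), s1], dim=1))
        # Sigmoid turns the logits into per-pixel lesion probabilities.
        return torch.sigmoid(self.head(d1))
```

A quick shape check such as `UNet()(torch.rand(2, 1, 64, 64)).shape` should give `(2, 1, 64, 64)`, confirming that the skip connections line up at every scale.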
Loss Function: Dice + BCE
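$$L = \alpha \cdot L_{BCE} + (1 - \alpha) \cdot L_{Dice}$$

$$L_{Dice} = 1 - \frac{2 \sum p_i g_i}{\sum p_i + \sum g_i + \epsilon}$$

Here $p_i$ is the predicted probability for pixel $i$, $g_i \in \{0, 1\}$ is its ground-truth label, $\alpha \in [0, 1]$ weights the two terms, and $\epsilon$ is a small constant that avoids division by zero when both masks are empty.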
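BCE supplies a smooth, per-pixel gradient that keeps training stable. Dice is computed over the whole mask, so it is far less sensitive to the foreground/background imbalance typical of small lesions, and it directly optimizes the overlap metric you care about.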
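To make the formula concrete, here is a NumPy sketch of the forward computation only. The function names, the default `alpha=0.5`, and the `eps` values are assumptions for illustration. For actual training the same expressions would be written with the framework's differentiable tensor operations.

```python
import numpy as np


def bce_loss(pred, target, eps=1e-7):
    # Mean binary cross-entropy; clipping avoids log(0).
    p = np.clip(np.asarray(pred, dtype=np.float64), eps, 1.0 - eps)
    t = np.asarray(target, dtype=np.float64)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))


def dice_loss(pred, target, eps=1e-6):
    # L_Dice = 1 - 2*sum(p*g) / (sum(p) + sum(g) + eps), matching the formula.
    p = np.asarray(pred, dtype=np.float64)
    g = np.asarray(target, dtype=np.float64)
    return float(1.0 - 2.0 * np.sum(p * g) / (np.sum(p) + np.sum(g) + eps))


def combined_loss(pred, target, alpha=0.5):
    # L = alpha * L_BCE + (1 - alpha) * L_Dice
    return alpha * bce_loss(pred, target) + (1.0 - alpha) * dice_loss(pred, target)
```

A perfect prediction drives both terms toward zero, while an all-background prediction on an image that contains a lesion gives a Dice loss of exactly 1, no matter how small the lesion is.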
Metrics
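- Dice Score: $\frac{2|P \cap G|}{|P| + |G|}$, the primary metric (closer to 1.0 is better)
- IoU (Jaccard): $\frac{|P \cap G|}{|P \cup G|}$
- Pixel Accuracy: fraction of correctly classified pixels; it can look high even for poor masks when lesions are small, because background pixels dominate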
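The three metrics can be computed from a thresholded prediction in a few lines. This is a sketch under stated assumptions: the 0.5 threshold, the function name `segmentation_metrics`, and the convention that two empty masks score 1.0 (via `eps` in both numerator and denominator) are illustrative choices, not requirements of the project.

```python
import numpy as np


def segmentation_metrics(pred_prob, target, threshold=0.5, eps=1e-7):
    """Dice, IoU, and pixel accuracy for one binary mask pair."""
    pred = np.asarray(pred_prob) >= threshold   # binarize probabilities
    gt = np.asarray(target) >= 0.5

    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()

    # eps in numerator and denominator: empty prediction vs. empty truth -> 1.0
    dice = (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)
    iou = (intersection + eps) / (union + eps)
    accuracy = (pred == gt).mean()

    return {
        "dice": float(dice),
        "iou": float(iou),
        "pixel_accuracy": float(accuracy),
    }
```

For a 4×4 image where the true lesion and the predicted lesion are both 2×2 squares overlapping in two pixels, this gives Dice 0.5, IoU of about 0.33, and pixel accuracy 0.75.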