Phase 1 — Classical Computer Vision with OpenCV

Duration: 3 weeks | Prerequisite: Phase 0 complete


Why Classical CV Still Matters

Deep learning hasn't replaced classical computer vision — it runs alongside it. Production systems use classical algorithms for:

  • Pre/post-processing: Gaussian blur before edge detection, morphological ops to clean segmentation masks, NMS to deduplicate detection outputs
  • Real-time constraints: Harris corners and ORB typically run in milliseconds per frame on a CPU; a full neural net often cannot match that without dedicated hardware
  • Geometric reasoning: Camera calibration, stereo vision, homography estimation are inherently geometric — you can't just "throw a neural net at them"
  • Interpretability: When a classical algorithm fails, you can inspect every intermediate step
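
As an illustration of the post-processing bullet above, here is a minimal sketch of greedy non-maximum suppression in plain NumPy. The function name `nms`, the `[x1, y1, x2, y2]` box layout, and the 0.5 IoU threshold are assumptions chosen for the example, not a fixed convention; detection frameworks ship their own variants.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS sketch. boxes: (N, 4) as [x1, y1, x2, y2]. Returns kept indices."""
    boxes = np.asarray(boxes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(scores)[::-1]  # indices sorted by score, highest first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection rectangle between box i and every remaining box
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        # Drop boxes that overlap the kept box too much
        order = rest[iou <= iou_threshold]
    return keep

# Two near-duplicate detections plus one separate detection (made-up values)
boxes = [[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.82, so it is suppressed and `kept` holds the indices of the first and third boxes.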

Every CV engineer is expected to understand these primitives deeply.


OpenCV Architecture

OpenCV (Open Source Computer Vision Library) is a C++ library with Python bindings. Key architectural points:

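  • Default color order: BGR, not RGB. This is a historical convention from the period when BGR was common in camera and imaging software. It bites everyone eventually, usually when an image is passed to a library that expects RGB (Matplotlib, PIL).
  • Images are NumPy arrays in Python: cv2.imread() returns a plain np.ndarray, with no special image type. If the file cannot be read it returns None instead of raising, so check the result.
  • In-place vs copy: Many functions take an optional dst parameter. When it is None (the default), a new array is allocated and returned.
  • Data types matter: Most functions accept uint8 (0–255), but uint8 cannot hold negative or out-of-range values. Derivative filters such as Sobel need a float32 or float64 output depth to preserve negative gradients. Always check img.dtype.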

Labs

| Lab | Topic | Key APIs |
|---|---|---|
| lab-01-image-basics | Color spaces, histograms, pixel ops | cv2.imread, cvtColor, calcHist, equalizeHist |
| lab-02-filtering-morphology | Spatial filtering, edge detection | GaussianBlur, Canny, morphologyEx, findContours |
| lab-03-feature-detection | Keypoints, descriptors, matching | SIFT, ORB, BFMatcher, findHomography |
| lab-04-optical-flow-tracking | Motion estimation, object tracking | calcOpticalFlowPyrLK, calcOpticalFlowFarneback, TrackerCSRT |
| lab-05-camera-calibration | Camera geometry, calibration | calibrateCamera, undistort, solvePnP |

Learning Outcomes

  • Read, write, and display images correctly (avoiding the BGR/RGB trap)
  • Implement a full image processing pipeline: load → preprocess → detect → filter → output
  • Match keypoints between images and estimate a homography transformation
  • Track an object across video frames using both a dedicated tracker (e.g. TrackerCSRT) and optical flow methods
  • Calibrate a camera using a chessboard pattern and undistort images

Interview Relevance

  • "What is the difference between Gaussian blur and median filter? When would you use each?"
  • "Explain how SIFT achieves scale and rotation invariance."
  • "Walk me through how Canny edge detection works, step by step."
  • "What is a homography? What are its degrees of freedom?"
  • "How does camera calibration work? What is the intrinsic matrix?"
  • "What are the limitations of classical optical flow?"