
Computer Vision
Generally speaking, the phrase Computer Vision refers to tasks that involve automatically identifying the contents of an image or video.
Object detection is a common Computer Vision task in which a specific kind of object is automatically identified in a set of images or videos. Popular applications include detecting faces in images for organizational purposes and camera focusing, pedestrians for autonomous vehicles, and faulty components for automated quality control in electronics production.
Here’s how the object detection process works at a high level, using human faces as our object of interest.
First, a set of labeled training images is produced. In these images the objects of interest (in our case, faces) are marked by human labelers. The images are then used to train a machine learning algorithm to identify the desired object in future, unmarked images.
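To make the idea of labeled training data concrete, here is a minimal sketch of how marked images might be turned into labeled patches. Everything in it is an assumption made for illustration: the function name `labeled_patches`, the `face_corners` annotation format (top-left corners of window-sized face boxes), and the simplification that the image is tiled on a non-overlapping grid. Real labeling pipelines are considerably more involved.

```python
import numpy as np

def labeled_patches(image, face_corners, window):
    """Tile a grayscale image into window x window patches (hypothetical helper).

    A patch gets label +1 if its top-left corner was marked as a face by a
    labeler (face_corners is an assumed annotation format), and -1 otherwise.
    """
    patches, labels = [], []
    height, width = image.shape
    for top in range(0, height - window + 1, window):
        for left in range(0, width - window + 1, window):
            patch = image[top:top + window, left:left + window]
            patches.append(patch.ravel())  # flatten the patch into a vector
            labels.append(1 if (top, left) in face_corners else -1)
    return np.array(patches), np.array(labels)

# Toy 8x8 "image" in which a labeler marked one 4x4 face box at (4, 4).
image = np.arange(64, dtype=float).reshape(8, 8)
X, y = labeled_patches(image, face_corners={(4, 4)}, window=4)
```

Here `X` holds one flattened patch per row and `y` holds the matching face or non-face label, which is the form of data a classifier is trained on.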
Next, training consists of learning a ‘classifier’ over our set of images. A classifier is a geometric learning algorithm that, in its simplest form, learns to distinguish between patches of an image with and without a face via a linear decision boundary (as illustrated in Figure 1).
Figure 1.
Figure taken from our book, Machine Learning Refined.

The classifier learns the best parameters of this boundary so that image patches containing faces lie on one side of the boundary, and all other patches lie on the other side.
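As a rough illustration of what learning the parameters of a linear boundary can look like, the sketch below trains a perceptron on toy two-dimensional data. The perceptron is a choice made here for simplicity, not necessarily the algorithm behind the figure, and the synthetic points merely stand in for feature representations of face and non-face patches. All function and variable names are hypothetical.

```python
import numpy as np

def train_linear_classifier(X, y, epochs=100, lr=0.1):
    """Perceptron sketch: find w, b so that sign(w . x + b) matches y in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            # A point is misclassified if it lies on the wrong side of
            # (or exactly on) the current boundary.
            if y_i * (w @ x_i + b) <= 0:
                w += lr * y_i * x_i
                b += lr * y_i
                mistakes += 1
        if mistakes == 0:  # every training point is on its correct side
            break
    return w, b

def predict(X, w, b):
    """Return +1 (face side) or -1 (non-face side) for each row of X."""
    return np.where(X @ w + b > 0, 1, -1)

# Synthetic stand-ins for patch features: two well-separated clusters.
rng = np.random.default_rng(0)
faces = rng.normal(loc=[2.0, 2.0], scale=0.3, size=(20, 2))
non_faces = rng.normal(loc=[-2.0, -2.0], scale=0.3, size=(20, 2))
X = np.vstack([faces, non_faces])
y = np.array([1] * 20 + [-1] * 20)

w, b = train_linear_classifier(X, y)
```

The learned `w` and `b` define the boundary `w . x + b = 0`. Points with a positive score fall on the face side and the rest fall on the non-face side.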
To determine whether any faces are present in an input image, a small window is scanned across the entire image. The example here is a 1908 photograph of the Wright brothers, inventors of the airplane, sitting together in one of their first motorized flying machines.
At each window position, the content inside the window is classified as a face or a non-face by checking which side of the learned decision boundary its feature representation lies on. In the figurative illustration shown here, the areas above and below the learned boundary (shown in blue and red) are the face and non-face sides of the classifier, respectively.
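The scanning step can be sketched in a few lines. This is a simplified illustration under stated assumptions: the feature representation is just the raw pixels of each window, the weights `w` and offset `b` are hand-picked placeholders rather than learned values, and the names `sliding_windows` and `detect` are hypothetical. A real detector would also scan at multiple window sizes.

```python
import numpy as np

def sliding_windows(image, window, stride):
    """Yield (top, left, patch) for each window position in a grayscale image."""
    height, width = image.shape
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield top, left, image[top:top + window, left:left + window]

def detect(image, w, b, window=4, stride=2):
    """Return the positions whose features fall on the face side of the boundary."""
    hits = []
    for top, left, patch in sliding_windows(image, window, stride):
        features = patch.ravel()  # assumed feature representation: raw pixels
        if features @ w + b > 0:  # positive side of the boundary = face
            hits.append((top, left))
    return hits

# Toy 8x8 image in which a bright 4x4 block stands in for a face.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0

# Placeholder classifier: fires only when all 16 pixels in the window are bright.
w = np.ones(16)
b = -15.5

hits = detect(image, w, b)
```

With this toy setup only the window that exactly covers the bright block lands on the face side, so `hits` contains the single position `(2, 2)`.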