The COTBLEDTCID approach to object detection and pose estimation, Part II - Colour Thresholding

Introduction

Images are usually too complex for computers to process as they are. In most cases, they have to be enhanced and simplified before any algorithm can be applied to them. Fortunately for us, the rules of the contest specify that the pawns and figures we are looking to detect are yellow.

Generally speaking, the purpose of colour segmentation is to extract information from an image by grouping similar colours. In our algorithm, we implemented colour segmentation by thresholding yellow colours: the computer builds a black and white image from the original image, where white pixels mark yellow colours and black pixels mark everything else. This is called colour thresholding.

The problem, as you may have already guessed, is the very notion of a “yellow” colour. For humans, it is relatively easy to tell whether a colour belongs to a group of colours, but for computers it is a whole different story.

Colour spaces and thresholding

Computers work with a limited set of colours, usually represented by three additive components: Red, Green and Blue, or RGB for short. All these colours can be represented inside a cube called the RGB Colour Space, which is very difficult to work with in computer vision, mainly because colours tend to change too much between a given point A and the points near it. There is no “easy” way of telling whether a point B belongs to a group of colours, because such a subset is not easily described in Cartesian coordinates. In other words, in this space it is very difficult to say whether a given colour C is pink, yellow, or even orange.

For instance, the next two images represent, in the RGB colour space, all the colours present in the previous image. As you can see, it is impossible to find a simple shape (such as a smaller cube or a cuboid) that envelops all the yellow colours and nothing else.

This is one of the reasons why the HSV colour space exists. In this space, colours are represented and organized more like the human eye interprets them, in groups of colours. HSV stands for Hue, Saturation and Value, and is represented by a cylindrical coordinate system (hue is the angle, saturation the radius and value the height). In this space, many colours, such as yellow, are much more easily thresholded because colour segments or subsets can be approximated by “pies”, which are easily expressed in polar coordinates: $latex ( r_1 < \text{Radius} < r_2,\ \theta_1 < \text{Angle} < \theta_2 ) $.

For the same image as above, the HSV transformation gives:

Notice how colours can be segmented with different azimuth and radius values.
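To make the change of coordinates concrete, here is a minimal sketch of the RGB to HSV conversion for a single pixel, written with the standard library only. The `Hsv8` struct and the `rgbToHsv` function are hypothetical names introduced for this illustration. The output follows the 8-bit convention OpenCV uses (hue stored as degrees divided by two, so 0 to 179, with saturation and value from 0 to 255), but the rounding may differ slightly from what `cv::cvtColor` produces.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper type: one HSV pixel in OpenCV's 8-bit convention
// (hue in [0, 180), saturation and value in [0, 255]).
struct Hsv8 {
    std::uint8_t h;
    std::uint8_t s;
    std::uint8_t v;
};

// Convert one 8-bit RGB pixel to HSV using the standard hexcone formulas.
Hsv8 rgbToHsv(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    const double rf = r / 255.0, gf = g / 255.0, bf = b / 255.0;
    const double maxC = std::max({rf, gf, bf});
    const double minC = std::min({rf, gf, bf});
    const double delta = maxC - minC;

    double hueDeg = 0.0;  // grey pixels have no hue; use 0 by convention
    if (delta > 0.0) {
        if (maxC == rf)      hueDeg = 60.0 * ((gf - bf) / delta);
        else if (maxC == gf) hueDeg = 60.0 * ((bf - rf) / delta) + 120.0;
        else                 hueDeg = 60.0 * ((rf - gf) / delta) + 240.0;
        if (hueDeg < 0.0) hueDeg += 360.0;
    }
    const double sat = (maxC > 0.0) ? delta / maxC : 0.0;

    Hsv8 out;
    // Halve the angle so that it fits in one byte, wrapping 180 back to 0.
    out.h = static_cast<std::uint8_t>(static_cast<int>(std::lround(hueDeg / 2.0)) % 180);
    out.s = static_cast<std::uint8_t>(std::lround(sat * 255.0));
    out.v = static_cast<std::uint8_t>(std::lround(maxC * 255.0));
    return out;
}
```

With this sketch, pure yellow (255, 255, 0) and a much darker yellow (128, 128, 0) end up with the same hue (30) and saturation (255) and differ only in value, which is why an angular range captures both far more easily than a box in RGB.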

Nevertheless, as you might have noticed, a single pie is too rough an approximation of the yellow colour subset. That’s why we used up to three different “pies” at different levels to better threshold the image.
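The idea of several pies can be sketched as a union of simple range tests. In the snippet below, the `HsvPie` struct, the function names and every numeric bound are assumptions made up for illustration; they are not the thresholds used in the contest. Each pie is a hue sector combined with a saturation band and a value band, and a pixel counts as yellow as soon as it falls inside any of them.

```cpp
#include <array>

// Hypothetical description of one "pie": a hue sector, a saturation band
// and a value band, all inclusive, in OpenCV's 8-bit units.
struct HsvPie {
    int hueMin, hueMax;  // angle, 0..179
    int satMin, satMax;  // radius, 0..255
    int valMin, valMax;  // height in the cylinder, 0..255
};

// True if the pixel (h, s, v) lies inside the given pie.
bool inPie(const HsvPie& p, int h, int s, int v) {
    return h >= p.hueMin && h <= p.hueMax &&
           s >= p.satMin && s <= p.satMax &&
           v >= p.valMin && v <= p.valMax;
}

// Illustrative numbers only, one pie per brightness level.
const std::array<HsvPie, 3> kYellowPies = {{
    {20, 35, 120, 255, 180, 255},  // bright, saturated yellow
    {18, 38,  80, 255, 100, 179},  // mid-brightness yellow
    {15, 40,  60, 255,  50,  99},  // dark yellow, e.g. in shadow
}};

// A pixel is "yellow" if at least one pie contains it.
bool isYellow(int h, int s, int v) {
    for (const HsvPie& p : kYellowPies) {
        if (inPie(p, h, s, v)) return true;
    }
    return false;
}
```

Stacking pies at different value levels lets the hue and saturation limits widen as the colour gets darker, which a single pie cannot do.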

Algorithm and code

OpenCV has done most of the hard work, as it can both convert between colour spaces and threshold the result. The code is pretty straightforward:

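// Convert the image into an HSV image (OpenCV loads colour images in BGR order)
cv::Mat hsvImage;
cv::cvtColor(originalImage, hsvImage, cv::COLOR_BGR2HSV);

// Threshold the HSV image: pixels inside the range become white (255), all others black (0)
cv::Mat binaryImage;
cv::inRange(hsvImage, cv::Scalar(hueMin, satMin, valueMin),
                      cv::Scalar(hueMax, satMax, valueMax),
                      binaryImage);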

The threshold was repeated up to 3 times with different min and max values to better approximate the yellow set. The resulting masks were then merged into a single black and white image with the help of cv::bitwise_or.
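For readers who want to see what the merge amounts to, here is a standard-library sketch of a pixel-wise OR over several masks, which is the per-element operation that cv::bitwise_or applies to two images. The `Mask` alias and the `mergeMasks` function are hypothetical names for this example, and the masks are assumed to be flat buffers of equal size holding only 0 or 255.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical flat binary mask: one byte per pixel, 0 (black) or 255 (white).
using Mask = std::vector<std::uint8_t>;

// Pixel-wise OR: the result is white wherever at least one input is white.
Mask mergeMasks(const std::vector<Mask>& masks) {
    if (masks.empty()) return Mask{};
    Mask merged(masks.front().size(), 0);
    for (const Mask& m : masks) {
        if (m.size() != merged.size()) {
            throw std::invalid_argument("mask sizes differ");
        }
        for (std::size_t i = 0; i < m.size(); ++i) {
            merged[i] |= m[i];
        }
    }
    return merged;
}
```

Because the operation is a union, the order in which the thresholded images are merged does not matter.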

Results

In short, the algorithm transforms an image from RGB to HSV, thresholds it several times, and merges the resulting masks into a single black and white image:

Input:

Output:

Discussion

Noise

The output binary image has a lot of noise, and many objects that are not inside the game field still appear in the image.

We could have used a combination of dilate and erode filters to reduce the small noise (cf. Closing filter, Noise Reduction), but, to increase performance, we decided to leave the image as is and do the filtering in a later step.
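As an illustration of the filtering we skipped, here is a minimal sketch of a 3x3 erode followed by a 3x3 dilate on a binary image. Applying them in that order (an opening) removes isolated white specks, while the reverse order (a closing) fills small black holes instead. The `BinaryImage` struct and the function names are hypothetical, and treating everything outside the image as black is a simplification chosen for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal binary image: row-major, 0 = black, 255 = white.
struct BinaryImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;

    std::uint8_t at(int x, int y) const {
        // Simplification: everything outside the image counts as black.
        if (x < 0 || y < 0 || x >= width || y >= height) return 0;
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// One 3x3 pass. Erosion keeps a pixel white only if its whole 3x3
// neighbourhood is white; dilation makes it white if any neighbour is.
BinaryImage morph3x3(const BinaryImage& in, bool erode) {
    BinaryImage out{in.width, in.height,
                    std::vector<std::uint8_t>(in.pixels.size(), 0)};
    for (int y = 0; y < in.height; ++y) {
        for (int x = 0; x < in.width; ++x) {
            bool all = true, any = false;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const bool white = in.at(x + dx, y + dy) != 0;
                    all = all && white;
                    any = any || white;
                }
            }
            out.pixels[static_cast<std::size_t>(y) * in.width + x] =
                (erode ? all : any) ? 255 : 0;
        }
    }
    return out;
}

// Opening: erode, then dilate.
BinaryImage open3x3(const BinaryImage& in) {
    return morph3x3(morph3x3(in, true), false);
}
```

Each pass visits nine neighbours per pixel, which gives an idea of the extra cost we wanted to avoid at this stage of the pipeline.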

The objects that are outside the game field aren’t a problem, as they are rejected later on.

Lighting

Another problem we faced was the strong variation in illumination, which changed the tones and even the colours themselves. Even though the HSV colour space is more robust against lighting variations than the RGB colour space, it is not robust enough, and many other factors, such as the lens and the camera's sensor, also affect the colours.

Our solution was to calibrate colours to find the best thresholding values before running the COTBLEDTCID algorithm.
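One simple way such a calibration could work is sketched below. This is an assumed strategy, not the procedure used in the original project: take a handful of HSV pixels sampled from a known yellow object under the current lighting, compute their bounding box per channel, and widen it by a safety margin. The `HsvSample` and `HsvBounds` structs, the `calibrate` function and the `margin` parameter are all hypothetical.

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// Hypothetical types: one sampled pixel and the resulting thresholds,
// both in OpenCV's 8-bit HSV units (H in 0..179, S and V in 0..255).
struct HsvSample { int h, s, v; };
struct HsvBounds { int hMin, hMax, sMin, sMax, vMin, vMax; };

// Assumed calibration strategy: bounding box of the samples, widened by
// `margin` on every side and clamped to the valid range of each channel.
HsvBounds calibrate(const std::vector<HsvSample>& samples, int margin) {
    if (samples.empty()) throw std::invalid_argument("no samples");

    HsvBounds b{179, 0, 255, 0, 255, 0};
    for (const HsvSample& p : samples) {
        b.hMin = std::min(b.hMin, p.h);  b.hMax = std::max(b.hMax, p.h);
        b.sMin = std::min(b.sMin, p.s);  b.sMax = std::max(b.sMax, p.s);
        b.vMin = std::min(b.vMin, p.v);  b.vMax = std::max(b.vMax, p.v);
    }

    b.hMin = std::max(0, b.hMin - margin);  b.hMax = std::min(179, b.hMax + margin);
    b.sMin = std::max(0, b.sMin - margin);  b.sMax = std::min(255, b.sMax + margin);
    b.vMin = std::max(0, b.vMin - margin);  b.vMax = std::min(255, b.vMax + margin);
    return b;
}
```

The resulting bounds could then be passed as the min and max scalars of a range threshold. A larger margin tolerates more lighting drift at the price of more false positives.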

The following images should make the problem clearer: the yellow colours are completely different from one image to the other, even to the human eye. Looking at their HSV representations, we can see a slight shift of the yellow region from the first image to the second one. This shift, even though it is small, is too large for both images to share the same thresholding values.

NB: these two pictures were taken with different cameras, which is why the colour distributions are so different. But hey, you get the point, right? ;)

Other approaches

Colour segmentation is a very well-known problem in computer vision, and many papers have been written on the topic. We could certainly have used a more robust algorithm to detect homogeneous colours, such as the one described in this paper. This could have removed the need for constant colour calibration, but it would certainly have added a non-negligible overhead when separating yellow colours from the rest. As performance was one of our functional requirements, a static colour model was better suited to our needs.

What’s next

In this step the computer still does not know much about what we’re looking for, nor about the image it’s viewing. In the next post I’ll explain the algorithm that will help the computer understand a bit more about its environment.

About the colour space visuals

All renders were made with ColorSpace. Many thanks to Colantoni Philippe for such a useful piece of software.

Rodrigo Castro
