
Support Vector Machines in OpenCV

The Support Vector Machine algorithm is one of the most popular supervised machine learning techniques, and it comes implemented in the OpenCV library.

This tutorial will introduce the necessary skills to start using Support Vector Machines in OpenCV, using a custom dataset that we will generate. We will then apply these skills for the specific applications of image classification and detection in a subsequent tutorial. 

In this tutorial, you will learn how to apply OpenCV’s Support Vector Machine algorithm on a custom two-dimensional dataset. 

After completing this tutorial, you will know:

  • Several of the most important characteristics of Support Vector Machines.
  • How to use the Support Vector Machine algorithm on a custom dataset in OpenCV.

Let’s get started. 

Photo by Lance Asper, some rights reserved.

Tutorial Overview

This tutorial is divided into two parts; they are:

  • Reminder of How Support Vector Machines Work
  • Discovering the SVM Algorithm in OpenCV

Reminder of How Support Vector Machines Work

The Support Vector Machine (SVM) algorithm has already been explained well in a tutorial by Jason Brownlee, but let’s first brush up on some of the most important points from his tutorial:

  • For simplicity, let’s say that we have two separate classes, 0 and 1. The data points contained within these two classes can be separated by a hyperplane, which is the decision boundary that splits the input space to separate the data points by their class. The dimension of this hyperplane depends on the dimensionality of the input data points.
  • If given a newly observed data point, we may find the class to which it belongs by calculating which side of the hyperplane it falls on.
  • A margin is the distance between the decision boundary and the closest data points. It is found by considering only the closest data points belonging to the different classes and is calculated as the perpendicular distance of these closest data points to the decision boundary.
  • The optimal decision boundary is characterized by the largest margin to the closest data points. These closest data points are known as the support vectors. 
  • If the classes are not perfectly separable from one another because they may be distributed so that some of their data points intermingle in space, the constraint of maximizing the margin needs to be relaxed. The margin constraint can be relaxed by introducing a tunable parameter known as C.
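  • The value of the C parameter controls how much the margin constraint can be violated. In the formulation used in that tutorial, C acts as a budget for violations, with a value of 0 meaning that no violation is permitted at all and larger values permitting more. Note that OpenCV’s C-Support Vector Classification instead treats C as a penalty multiplier on violations, so there a smaller C relaxes the margin and a larger C tightens it. In either convention, the aim of tuning C is to reach a better compromise between maximizing the margin and reducing the number of misclassifications.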
  • Furthermore, the SVM uses a kernel to compute a similarity (or distance) measure between the input data points. In the simplest case, the kernel implements a dot product operation when the input data is linearly separable and can be separated by a linear hyperplane. 
  • If the data points are not linearly separable straight away, the kernel trick comes to the rescue, where the operation performed by the kernel seeks to transform the data to a higher-dimensional space in which it becomes linearly separable. This is analogous to the SVM finding a non-linear decision boundary in the original input space. 
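The first few points above can be made concrete with a small NumPy sketch. The hyperplane parameters w and b and the three points are hypothetical values chosen for illustration only, not something learned from data: the sign of w·x + b tells us which side of the hyperplane a point falls on, and |w·x + b| / ||w|| is its perpendicular distance to the decision boundary.

```python
import numpy as np

# Hypothetical hyperplane w.x + b = 0 in two dimensions (illustrative values)
w = np.array([1.0, -1.0])
b = 0.0

# Three made-up data points
points = np.array([[2.0, 0.0], [0.0, 3.0], [1.0, 0.5]])

# The sign of the score decides the side of the hyperplane, hence the class
scores = points @ w + b
labels = (scores > 0).astype(int)

# Perpendicular distance of each point to the decision boundary
distances = np.abs(scores) / np.linalg.norm(w)

# The margin is set by the closest point(s)
margin = distances.min()

print(labels, distances, margin)
```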

Discovering the SVM Algorithm in OpenCV

Let’s first consider applying the SVM to a simple linearly separable dataset that enables us to visualize several of the abovementioned concepts before moving on to more complex tasks. 

For this purpose, we shall be generating a dataset consisting of 100 data points (specified by n_samples), which are equally divided into 2 Gaussian clusters (specified by centers) having a standard deviation set to 1.5 (specified by cluster_std). To be able to replicate the results, let’s also define a value for random_state, which we’re going to set to 15:
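The original listing is not available here, so the following is a minimal sketch of this step. It assumes that the dataset is generated with scikit-learn’s make_blobs function, whose n_samples, centers, cluster_std, and random_state arguments match the parameters named above, and that the plot is drawn with Matplotlib:

```python
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Generate 100 two-dimensional points, equally divided into 2 Gaussian clusters
x, y_true = make_blobs(n_samples=100, centers=2, cluster_std=1.5, random_state=15)

# Plot the points, using the ground truth labels as the color values
plt.scatter(x[:, 0], x[:, 1], c=y_true)
plt.show()
```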

The code above should generate the following plot of data points. You may note that we are setting the color values to the ground truth labels to be able to distinguish between data points belonging to the two different classes:

Linearly Separable Data Points Belonging to Two Different Classes

The next step is to split the dataset into training and testing sets, where the former will be used to train the SVM, and the latter to test it:
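One way to sketch this split is with scikit-learn’s train_test_split function. The 80/20 split ratio and the random_state value below are assumptions made for illustration, since the text does not state them:

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Recreate the dataset described earlier
x, y_true = make_blobs(n_samples=100, centers=2, cluster_std=1.5, random_state=15)

# Hold out 20% of the points for testing (ratio and seed are assumed values)
x_train, x_test, y_train, y_test = train_test_split(
    x, y_true, test_size=0.2, random_state=10
)

# Plot the training and testing points side by side
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.scatter(x_train[:, 0], x_train[:, 1], c=y_train)
ax1.set_title("Training data")
ax2.scatter(x_test[:, 0], x_test[:, 1], c=y_test)
ax2.set_title("Testing data")
plt.show()
```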

Splitting the Data Points in Training and Testing Sets

We may see from the image of the training data above that the two classes are clearly distinguishable and should be easily separated by a linear hyperplane. Hence, let’s proceed to create and train an SVM in OpenCV that makes use of a linear kernel to find the optimal decision boundary between these two classes:

Here, note that the SVM’s train method in OpenCV requires the input data to be of type 32-bit float. 

We may proceed to use the trained SVM to predict labels for the testing data and subsequently calculate the classifier’s accuracy by comparing the predictions with their corresponding ground truth:

As expected, all of the testing data points have been correctly classified. Let’s also visualize the decision boundary computed by the SVM algorithm during training to understand better how it arrived at this classification result. 

In the meantime, the code listing so far is as follows:

To visualize the decision boundary, we will be creating many two-dimensional points structured into a rectangular grid, which span the space occupied by the data points used for testing:

Next, we shall organize the x- and y-coordinates of the data points that make up the rectangular grid into a two-column array and pass them on to the predict method to generate a class label for each one of them:

We may finally visualize them by a contour plot overlaid with the data points used for testing to confirm that, indeed, the decision boundary computed by the SVM algorithm is linear:

Linear Decision Boundary Computed by the SVM

We may also confirm from the figure above that, as mentioned in the first section, the testing data points have been assigned a class label depending on the side of the decision boundary they were found on. 

Furthermore, we may highlight the training data points that have been identified as the support vectors and which have played an instrumental role in determining the decision boundary:

Support Vectors Highlighted in Red

The complete code listing to generate the decision boundary and visualize the support vectors is as follows:

So far, we have considered the simplest case of having two well-distinguishable classes. But how do we distinguish between classes that are less clearly separable because they consist of data points that intermingle in space, such as the following:
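A dataset of this kind can be sketched by widening the clusters until they overlap. The cluster_std value of 8 used below is an assumption made for illustration, as are the split ratio and seed:

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# A larger standard deviation (assumed value) makes the two clusters intermingle
x, y_true = make_blobs(n_samples=100, centers=2, cluster_std=8, random_state=15)

# Split into training and testing sets (ratio and seed are assumed values)
x_train, x_test, y_train, y_test = train_test_split(
    x, y_true, test_size=0.2, random_state=10
)

# Plot the overlapping classes
plt.scatter(x[:, 0], x[:, 1], c=y_true)
plt.show()
```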

Non-Linearly Separable Data Points Belonging to Two Different Classes

Splitting the Non-Linearly Separable Data in Training and Testing Sets

In this case, we might wish to explore different options depending on how much the two classes overlap one another, such as (1) relaxing the margin constraint for the linear kernel by tuning the value of the C parameter (in OpenCV, C is a penalty multiplier, so smaller values tolerate more margin violations) to allow for a better compromise between maximizing the margin and reducing misclassifications, or (2) using a different kernel function that can produce a non-linear decision boundary, such as the Radial Basis Function (RBF).

In doing so, we need to set the values of a few properties of the SVM and the kernel function in use:

  • SVM_C_SVC: Known as C-Support Vector Classification, this SVM type allows an n-class classification (n ≥ 2) of classes with imperfect separation (i.e. not linearly separable). Set using the setType method.
  • C: Penalty multiplier for outliers when dealing with non-linearly separable classes. Set using the setC method. 
  • Gamma: Determines the radius of the RBF kernel function. A smaller gamma value results in a wider radius that can capture the similarity of data points far from each other, but may result in underfitting because the decision boundary becomes too smooth. A larger gamma results in a narrower radius that can only capture the similarity of nearby data points, which may result in overfitting. Set using the setGamma method.
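To see how gamma affects the reach of the RBF kernel, consider the kernel value K(a, b) = exp(-gamma * ||a - b||^2) computed for two points at a fixed distance. The function name rbf_kernel and the sample points below are hypothetical and serve only as an illustration:

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF similarity between two points: exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))


# Two made-up points that lie 3 units apart
a = np.array([0.0, 0.0])
b = np.array([3.0, 0.0])

# The similarity of the same pair of points shrinks as gamma grows
for gamma in (0.01, 0.1, 1.0):
    print(gamma, rbf_kernel(a, b, gamma))
```

With a small gamma, the two points are still considered similar despite their distance, whereas with a large gamma their similarity drops towards zero.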

Here, the C and gamma values are being set arbitrarily, but you may conduct further testing to investigate how different values affect the resulting prediction accuracy. Both of the aforementioned options give us a prediction accuracy of 85% using the following code, but achieve this accuracy through different decision boundaries:

  • Using a linear kernel with a relaxed margin constraint:

Decision Boundary Computed Using a Linear Kernel with Relaxed Margin Constraints

  • Using an RBF kernel function:

Decision Boundary Computed Using an RBF Kernel

The choice of values for the SVM parameters typically depends on the task and the data at hand and requires further testing to be tuned accordingly.


Summary

In this tutorial, you learned how to apply OpenCV’s Support Vector Machine algorithm on a custom two-dimensional dataset.

Specifically, you learned:

  • Several of the most important characteristics of the Support Vector Machine algorithm.
  • How to use the Support Vector Machine algorithm on a custom dataset in OpenCV.

Do you have any questions?

Ask your questions in the comments below, and I will do my best to answer.
