Logistic Regression in OpenCV

Logistic regression is a simple but popular machine learning algorithm for binary classification that uses the logistic, or sigmoid, function at its core. It also comes implemented in the OpenCV library.

In this tutorial, you will learn how to apply OpenCV’s logistic regression algorithm, starting with a custom two-class dataset that we will generate ourselves. We will then build on these skills in a subsequent tutorial, where we apply them to a specific image classification application. 

After completing this tutorial, you will know:

  • Several of the most important characteristics of the logistic regression algorithm.
  • How to use the logistic regression algorithm on a custom dataset in OpenCV.

Kick-start your project with my book Machine Learning in OpenCV. It provides self-study tutorials with working code.


Let’s get started. 

Logistic Regression in OpenCV
Photo by Fabio Santaniello Bruun. Some rights reserved.

Tutorial Overview

This tutorial is divided into two parts; they are:

  • Reminder of What Logistic Regression Is
  • Discovering Logistic Regression in OpenCV

Reminder of What Logistic Regression Is

The topic of logistic regression has already been explained well in these tutorials by Jason Brownlee [1, 2, 3], but let’s first brush up on some of the most important points:

  • Logistic regression takes its name from the function used at its core, the logistic function (also known as the sigmoid function).  
  • Despite the use of the word regression in its name, logistic regression is a method for binary classification or, in simpler terms, problems with two-class values.
  • Logistic regression can be regarded as an extension of linear regression because it maps (or squashes) the real-valued output of a linear combination of features into a probability value within the range [0, 1] through the use of the logistic function, which is given just after this list. 
  • Within a two-class scenario, the logistic regression method models the probability of the default class. As a simple example, let’s say that we are trying to distinguish between classes of flowers A and B from their petal count, and we are taking the default class to be A. Then, for an unseen input X, the logistic regression model would give the probability of X belonging to the default class A:

$$ P(X) = P(A = 1 | X) $$

  • The input X is classified as belonging to the default class A if its probability P(X) > 0.5. Otherwise, it is classified as belonging to the non-default class B. 
  • The logistic regression model is represented by a set of parameters known as coefficients (or weights) learned from the training data. These coefficients are iteratively adjusted during training to minimize the error between the model predictions and the actual class labels. 
  • The coefficient values may be estimated during training using gradient descent or maximum likelihood estimation (MLE) techniques. 
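
For reference, the logistic function that performs this mapping takes the real-valued output of the linear combination of features, $z$, and squashes it into the range [0, 1]:

$$ \sigma(z) = \frac{1}{1 + e^{-z}} $$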

Discovering Logistic Regression in OpenCV

Let’s start with a simple binary classification task before moving on to more complex problems. 

As we have already done in related tutorials through which we familiarised ourselves with other machine learning algorithms in OpenCV (such as the SVM algorithm), we shall generate a dataset that comprises 100 data points (specified by n_samples), divided equally into 2 Gaussian clusters (specified by centers) with a standard deviation of 5 (specified by cluster_std). To be able to replicate the results, we shall again make use of the random_state parameter, which we’re going to set to 15: 
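
A minimal sketch of this step, assuming scikit-learn’s make_blobs helper for the data generation and Matplotlib for the plotting:

```python
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Generate 100 data points divided equally into 2 Gaussian clusters
x, y = make_blobs(n_samples=100, centers=2, cluster_std=5, random_state=15)

# Plot the data points, coloring them by their ground truth labels
plt.scatter(x[:, 0], x[:, 1], c=y)
plt.show()
```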

The code above should generate the following plot of data points. You may note that we are setting the color values to the ground truth labels to be able to distinguish between data points belonging to the two different classes:

Data Points Belonging to Two Different Classes

The next step is to split the dataset into training and testing sets, where the former will be used to train the logistic regression model and the latter to test it:
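
One way to perform the split, assuming scikit-learn’s train_test_split (the 80/20 ratio and the seed below are assumptions):

```python
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Hold out 20% of the data points for testing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=10)

# Plot the training and testing sets side by side for inspection
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.scatter(x_train[:, 0], x_train[:, 1], c=y_train)
ax1.set_title('Training data')
ax2.scatter(x_test[:, 0], x_test[:, 1], c=y_test)
ax2.set_title('Testing data')
plt.show()
```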

Splitting the Data Points into Training and Testing Sets

The image above indicates that the two classes appear clearly distinguishable in the training and testing data. For this reason, we expect this binary classification problem to be a straightforward task for a trained logistic regression model. Let’s create and train a logistic regression model in OpenCV to see how it performs on the testing part of the dataset.

The first step is to create the logistic regression model itself:
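
In OpenCV’s ml module this is a single call:

```python
from cv2 import ml

# Create an empty (as yet untrained) logistic regression model
lr = ml.LogisticRegression_create()
```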

In the next step, we shall choose the training method by which we want the model’s coefficients to be updated during training. The OpenCV implementation lets us choose between two different methods: the Batch Gradient Descent and the Mini-Batch Gradient Descent methods. 

If the Batch Gradient Descent method is chosen, the model’s coefficients will be updated using the entire training dataset at each iteration of the gradient descent algorithm. If we are working with very large datasets, then this method of updating the model’s coefficients can become very computationally expensive. 


A more practical approach to updating the model’s coefficients, especially when working with large datasets, is to opt for the Mini-Batch Gradient Descent method, which instead divides the training data into smaller batches (called mini-batches, hence the name of the method) and updates the model’s coefficients by processing one mini-batch at a time. 

We may check what OpenCV implements as its default training method by making use of the following line of code:
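
A sketch of this check, using the model’s getTrainMethod getter:

```python
# Query the training method currently in use
print(lr.getTrainMethod())
```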

The returned value of 0 represents the Batch Gradient Descent method in OpenCV. If we want to change this to the Mini-Batch Gradient Descent method, we can do so by passing ml.LogisticRegression_MINI_BATCH to the setTrainMethod function, and then proceed to set the size of the mini-batch:
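
Sketched out, these two settings might look as follows (the mini-batch size of 5 is discussed next):

```python
# Switch the training method to mini-batch gradient descent
lr.setTrainMethod(ml.LogisticRegression_MINI_BATCH)

# Process 5 training samples per coefficient update
lr.setMiniBatchSize(5)
```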

Setting the mini-batch size to 5 means that the training data will be divided into mini-batches containing 5 data points each, and the model’s coefficients will be updated after each of these mini-batches is processed in turn. If we were to set the mini-batch size to the total number of samples in the training dataset, this would effectively result in a Batch Gradient Descent operation, since the entire batch of training data would be processed at once at each iteration. 

Next, we shall define the number of iterations that we want to run the chosen training algorithm for, before it terminates:
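
For a small dataset such as ours, a modest iteration count should suffice (the value of 10 below is an assumption):

```python
# Set the number of iterations of the chosen gradient descent method
lr.setIterations(10)
```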

We’re now set to train the logistic regression model on the training data:
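
OpenCV’s train method expects 32-bit floating point samples and labels, with each row holding one sample:

```python
from numpy import float32

# Train the model on the training split of the dataset
lr.train(x_train.astype(float32), ml.ROW_SAMPLE, y_train.astype(float32))
```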

As mentioned earlier, the training process aims to adjust the logistic regression model’s coefficients iteratively to minimize the error between the model predictions and the actual class labels. 

Each training sample we have fed into the model comprises two feature values, denoted by $x_1$ and $x_2$. This means that we should expect the model we have generated to be defined by two coefficients (one per input feature) and an additional coefficient that defines the bias (or intercept). 

Then the probability value, $\hat{y}$, returned by the model can be defined as follows:

$$ \hat{y} = \sigma( \beta_0 + \beta_1 \; x_1 + \beta_2 \; x_2 ) $$

where $\beta_1$ and $\beta_2$ denote the model coefficients, $\beta_0$ the bias, and $\sigma$ the logistic (or sigmoid) function that is applied to the real-valued output of the linear combination of features. 

Let’s print out the learned coefficient values to see whether we retrieve as many as we expect:
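
The learned parameters can be read back with the get_learnt_thetas getter:

```python
# Retrieve the three learned coefficients (the bias and the two feature weights)
print('Learned coefficients: ', lr.get_learnt_thetas())
```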

We find that we retrieve three values as expected, which means that the model that best separates the two classes of samples we are working with can be defined as:

$$ \hat{y} = \sigma( -0.0241 - 0.3461 \; x_1 + 0.0848 \; x_2 ) $$

We can assign a new, unseen data point to either of the two classes by plugging in its feature values, $x_1$ and $x_2$, into the model above. If the probability value returned by the model is > 0.5, we can take it as a prediction for class 0 (the default class). Otherwise, it is a prediction for class 1. 

Let’s go ahead to see how well this model predicts the target class labels by trying it out on the testing part of the dataset:
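
A sketch of the prediction step, together with a simple accuracy computation:

```python
# Predict class labels for the testing data
_, y_pred = lr.predict(x_test.astype(float32))

# Compute and print the percentage of correctly classified test samples
accuracy = (sum(y_pred[:, 0] == y_test) / y_test.size) * 100
print('Accuracy:', accuracy, '%')
```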

We can plot out the ground truth against the predicted classes for the testing data, as well as print out the ground truth and predicted class labels, to investigate any misclassifications:
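
One way to visualize and print the comparison (the side-by-side layout is a choice, not a requirement):

```python
# Plot the ground truth labels next to the model's predictions
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.scatter(x_test[:, 0], x_test[:, 1], c=y_test)
ax1.set_title('Ground truth labels')
ax2.scatter(x_test[:, 0], x_test[:, 1], c=y_pred[:, 0])
ax2.set_title('Predicted labels')
plt.show()

# Print both sets of labels for a direct comparison
print('Ground truth:', y_test)
print('Predicted:', y_pred[:, 0].astype(int))
```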

Test Data Points Belonging to Ground Truth and Predicted Classes, Where a Red Circle Highlights a Misclassified Data Point

In this manner, we can see that one sample originally belonged to class 1 in the ground truth data but has been misclassified as belonging to class 0 in the model’s prediction. 

The entire code listing is as follows:
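
Pulling the sketches above together (with the same assumed helpers and parameter values throughout):

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from numpy import float32
from cv2 import ml
import matplotlib.pyplot as plt

# Generate 100 data points divided equally into 2 Gaussian clusters
x, y = make_blobs(n_samples=100, centers=2, cluster_std=5, random_state=15)

# Plot the dataset, coloring the points by their ground truth labels
plt.scatter(x[:, 0], x[:, 1], c=y)
plt.show()

# Split the data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=10)

# Create an empty logistic regression model
lr = ml.LogisticRegression_create()

# Select the mini-batch gradient descent training method and the mini-batch size
lr.setTrainMethod(ml.LogisticRegression_MINI_BATCH)
lr.setMiniBatchSize(5)

# Set the number of training iterations
lr.setIterations(10)

# Train the model on the training data
lr.train(x_train.astype(float32), ml.ROW_SAMPLE, y_train.astype(float32))

# Print the learned coefficients (the bias and the two feature weights)
print('Learned coefficients: ', lr.get_learnt_thetas())

# Predict class labels for the testing data
_, y_pred = lr.predict(x_test.astype(float32))

# Compute and print the prediction accuracy
accuracy = (sum(y_pred[:, 0] == y_test) / y_test.size) * 100
print('Accuracy:', accuracy, '%')

# Plot the ground truth and predicted labels of the testing data
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.scatter(x_test[:, 0], x_test[:, 1], c=y_test)
ax1.set_title('Ground truth labels')
ax2.scatter(x_test[:, 0], x_test[:, 1], c=y_pred[:, 0])
ax2.set_title('Predicted labels')
plt.show()
```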

In this tutorial, we have considered setting values for two specific training parameters of the logistic regression model implemented in OpenCV. The parameters defined the training method to use and the number of iterations for which we wanted to run the chosen training algorithm during the training process. 

However, these are not the only parameter values that can be set for the logistic regression method. Other parameters, such as the learning rate and the type of regularization to perform, can also be modified to achieve better training accuracy. Hence, we suggest that you explore these parameters and investigate how different values can affect the model’s training and prediction accuracy. 

Further Reading

This section provides more resources on the topic if you want to go deeper.

Books

Websites

Summary

In this tutorial, you learned how to apply OpenCV’s logistic regression algorithm, starting with a custom two-class dataset we generated.

Specifically, you learned:

  • Several of the most important characteristics of the logistic regression algorithm.
  • How to use the logistic regression algorithm on a custom dataset in OpenCV.

Do you have any questions?

Ask your questions in the comments below, and I will do my best to answer.

Get Started on Machine Learning in OpenCV!

Machine Learning in OpenCV

Learn how to use machine learning techniques in image processing projects

...using OpenCV in advanced ways and work beyond pixels

Discover how in my new Ebook:
Machine Learning in OpenCV

It provides self-study tutorials with all working code in Python to turn you from a novice to an expert. It equips you with logistic regression, random forest, SVM, k-means clustering, neural networks, and much more... all using the machine learning module in OpenCV.

Kick-start your machine learning journey with hands-on exercises


See What's Inside
