Difference Between Classification and Regression in Machine Learning

There is an important difference between classification and regression problems.

Fundamentally, classification is about predicting a label and regression is about predicting a quantity.

I often see questions such as:

How do I calculate accuracy for my regression problem?

Questions like this are a symptom of not truly understanding the difference between classification and regression and what accuracy is trying to measure.

In this tutorial, you will discover the differences between classification and regression.

After completing this tutorial, you will know:

  • That predictive modeling is about learning a mapping function from inputs to outputs, a problem called function approximation.
  • That classification is the problem of predicting a discrete class label output for an example.
  • That regression is the problem of predicting a continuous quantity output for an example.

Let’s get started.

Tutorial Overview

This tutorial is divided into 5 parts; they are:

  1. Function Approximation
  2. Classification Predictive Modeling
  3. Regression Predictive Modeling
  4. Classification vs Regression
  5. Convert Between Classification and Regression Problems

Function Approximation

Predictive modeling is the problem of developing a model using historical data to make a prediction on new data where we do not have the answer.

Predictive modeling can be described as the mathematical problem of approximating a mapping function (f) from input variables (X) to output variables (y). This is called the problem of function approximation.

The job of the modeling algorithm is to find the best mapping function we can given the time and resources available.
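
As a minimal sketch of function approximation, the snippet below fits a straight line to a handful of known input/output pairs and then uses it to predict an output for a new input. The data values, the choice of a straight-line model, and the helper name fit_line are illustrative assumptions, not something taken from a specific library or dataset.

```python
# Minimal sketch of function approximation with a single input variable.
# The data and the straight-line model are illustrative assumptions.

def fit_line(xs, ys):
    """Fit y = a * x + b by ordinary least squares and return (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b

# "Historical" data where the answer is known (hypothetical values).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(X, y)

def f(x):
    """The learned approximation of the mapping from X to y."""
    return a * x + b

# Prediction for a new input where we do not have the answer.
print(round(f(6.0), 2))
```

Here f is only an approximation of the true mapping: a different algorithm, or more data, could find a better one.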

Generally, we can divide all function approximation tasks into classification tasks and regression tasks.

Classification Predictive Modeling

Classification predictive modeling is the task of approximating a mapping function (f) from input variables (X) to discrete output variables (y).

The output variables are often called labels or categories. The mapping function predicts the class or category for a given observation.

For example, an email of text can be classified as belonging to one of two classes: “spam” and “not spam”.

  • A classification problem requires that examples be classified into one of two or more classes.
  • A classification problem can have real-valued or discrete input variables.
  • A problem with two classes is often called a two-class or binary classification problem.
  • A problem with more than two classes is often called a multi-class classification problem.
  • A problem where an example is assigned multiple classes is called a multi-label classification problem.

It is common for classification models to predict a continuous value as the probability of a given example belonging to each output class. The probabilities can be interpreted as the likelihood or confidence of a given example belonging to each class. A predicted probability can be converted into a class value by selecting the class label that has the highest probability.

For example, a specific email of text may be assigned the probabilities of 0.1 as being “spam” and 0.9 as being “not spam”. We can convert these probabilities to a class label by selecting the “not spam” label as it has the highest predicted likelihood.
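
The conversion from probabilities to a label can be sketched in a few lines of Python. The function name predict_label and the dictionary layout are assumptions made for illustration.

```python
# Sketch: convert predicted class probabilities into a class label
# by selecting the label with the highest probability.

def predict_label(probabilities):
    """Return the class label with the highest predicted probability."""
    return max(probabilities, key=probabilities.get)

# Probabilities from the email example above.
email_probabilities = {"spam": 0.1, "not spam": 0.9}

label = predict_label(email_probabilities)
print(label)  # not spam
```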

There are many ways to estimate the skill of a classification predictive model, but perhaps the most common is to calculate the classification accuracy.

The classification accuracy is the percentage of correctly classified examples out of all predictions made.

For example, if a classification predictive model made 5 predictions and 3 of them were correct and 2 of them were incorrect, then the classification accuracy of the model based on just these predictions would be:

accuracy = correct predictions / total predictions * 100
accuracy = 3 / 5 * 100
accuracy = 60%
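
The same calculation can be sketched in Python. The label values below are made up so that 3 of the 5 predictions are correct, and the function name classification_accuracy is an assumption for illustration.

```python
# Sketch: classification accuracy as the percentage of correct predictions.

def classification_accuracy(expected, predicted):
    """Return the percentage of predictions that match the expected labels."""
    correct = sum(1 for e, p in zip(expected, predicted) if e == p)
    return 100.0 * correct / len(expected)

# Hypothetical labels: 3 of the 5 predictions match.
expected = ["spam", "not spam", "spam", "not spam", "not spam"]
predicted = ["spam", "not spam", "not spam", "spam", "not spam"]

accuracy = classification_accuracy(expected, predicted)
print(accuracy)  # 60.0
```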

An algorithm that is capable of learning a classification predictive model is called a classification algorithm.

Regression Predictive Modeling

Regression predictive modeling is the task of approximating a mapping function (f) from input variables (X) to a continuous output variable (y).

A continuous output variable is a real value, such as an integer or floating-point value. These are often quantities, such as amounts and sizes.

For example, a house may be predicted to sell for a specific dollar value, perhaps in the range of $100,000 to $200,000.

  • A regression problem requires the prediction of a quantity.
  • A regression can have real valued or discrete input variables.
  • A problem with multiple input variables is often called a multivariate regression problem (in statistics, the stricter term for several inputs is “multiple regression”).
  • A regression problem where input variables are ordered by time is called a time series forecasting problem.

Because a regression predictive model predicts a quantity, the skill of the model must be reported as an error in those predictions.

There are many ways to estimate the skill of a regression predictive model, but perhaps the most common is to calculate the root mean squared error, abbreviated by the acronym RMSE.

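For example, if a regression predictive model made 2 predictions, one of 1.5 where the expected value is 1.0 and another of 3.3 where the expected value is 3.0, then the RMSE would be:

RMSE = sqrt(average(error^2))
RMSE = sqrt(((1.0 - 1.5)^2 + (3.0 - 3.3)^2) / 2)
RMSE = sqrt((0.25 + 0.09) / 2)
RMSE = sqrt(0.17)
RMSE = 0.412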

A benefit of RMSE is that the units of the error score are in the same units as the predicted value.
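
The RMSE for the two predictions above can be checked with a short Python sketch. The function name rmse is an assumption for illustration; only the standard library is used.

```python
# Sketch: root mean squared error for the two example predictions.
from math import sqrt

def rmse(expected, predicted):
    """Return the square root of the mean of the squared errors."""
    squared_errors = [(e - p) ** 2 for e, p in zip(expected, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))

expected = [1.0, 3.0]
predicted = [1.5, 3.3]

error = rmse(expected, predicted)
print(round(error, 3))  # 0.412
```

If the predicted values were dollar amounts, this score would also be in dollars.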

An algorithm that is capable of learning a regression predictive model is called a regression algorithm.

Some algorithms have the word “regression” in their name, such as linear regression and logistic regression. This can be confusing, because linear regression is a regression algorithm whereas logistic regression is a classification algorithm.

Classification vs Regression

Classification predictive modeling problems are different from regression predictive modeling problems.

  • Classification is the task of predicting a discrete class label.
  • Regression is the task of predicting a continuous quantity.

There is some overlap between the algorithms for classification and regression; for example:

  • A classification algorithm may predict a continuous value, but the continuous value is in the form of a probability for a class label.
  • A regression algorithm may predict a discrete value, but the discrete value is in the form of an integer quantity.

Some algorithms can be used for both classification and regression with small modifications, such as decision trees and artificial neural networks. Some algorithms cannot, or cannot easily, be used for both problem types, such as linear regression for regression predictive modeling and logistic regression for classification predictive modeling.

Importantly, the way that we evaluate classification and regression predictions varies and does not overlap, for example:

  • Classification predictions can be evaluated using accuracy, whereas regression predictions cannot.
  • Regression predictions can be evaluated using root mean squared error, whereas classification predictions cannot.
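
One way to see why accuracy does not suit regression is to apply an exact-match test to real-valued predictions. In the sketch below, the expected and predicted values are made up for illustration: every prediction is close, yet none matches exactly, so the "accuracy" is zero while the RMSE shows the errors are small.

```python
# Sketch: exact-match accuracy versus RMSE on real-valued predictions.
# The values are hypothetical and chosen so every prediction is close.
from math import sqrt

expected = [100.0, 150.0, 200.0]
predicted = [101.0, 149.5, 200.2]

# Exact-match "accuracy": counts a prediction only if it equals the target.
exact_matches = sum(1 for e, p in zip(expected, predicted) if e == p)
accuracy = 100.0 * exact_matches / len(expected)

# RMSE: measures how far off the predictions are, in the target's units.
rmse = sqrt(sum((e - p) ** 2 for e, p in zip(expected, predicted)) / len(expected))

print(accuracy)  # 0.0
print(round(rmse, 3))
```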

Convert Between Classification and Regression Problems

In some cases, it is possible to convert a regression problem to a classification problem. For example, the quantity to be predicted could be converted into discrete buckets.

For example, amounts in a continuous range between $0 and $100 could be converted into 2 buckets:

  • Class 0: $0 to $49
  • Class 1: $50 to $100

This is often called discretization and the resulting output variable is a classification where the labels have an ordered relationship (called ordinal).
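
A minimal sketch of this discretization is shown below. The function name discretize and the cut-off of 50 are assumptions: the buckets above do not say where amounts between $49 and $50 belong, so this sketch places everything below $50 in class 0.

```python
# Sketch: convert a continuous dollar amount into one of two class labels.
# Assumption: amounts below the threshold (50) map to class 0, the rest to class 1.

def discretize(amount, threshold=50.0):
    """Map a dollar amount in the range 0 to 100 to class 0 or class 1."""
    if not 0.0 <= amount <= 100.0:
        raise ValueError("amount must be between 0 and 100")
    return 0 if amount < threshold else 1

amounts = [12.50, 49.0, 50.0, 99.99]
labels = [discretize(a) for a in amounts]
print(labels)  # [0, 0, 1, 1]
```

Note that the exact amount is lost in the conversion: 12.50 and 49.0 both become class 0.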

In some cases, a classification problem can be converted to a regression problem. For example, a label can be converted into a continuous range.

Some algorithms do this already by predicting a probability for each class, which in turn could be scaled to a specific range.
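
One possible way to scale a predicted probability to a range is a simple linear mapping, sketched below. The function name scale_probability and the $0 to $100 range are assumptions for illustration; other scalings are possible.

```python
# Sketch: linearly scale a probability in [0, 1] to a target range.
# Assumption: a linear mapping; this is one option, not the only one.

def scale_probability(probability, minimum, maximum):
    """Map a probability in [0, 1] to a quantity between minimum and maximum."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return minimum + probability * (maximum - minimum)

quantity = scale_probability(0.9, 0.0, 100.0)
print(round(quantity, 2))  # 90.0
```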

Alternately, class values can be ordered and mapped to a continuous range:

  • $0 to $49 for Class 1
  • $50 to $100 for Class 2
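
As a rough sketch, the mapping above can be stored as a lookup from class label to range. A class label alone does not identify a single value within its range, so this sketch returns the midpoint of the range. That midpoint choice, and the names CLASS_RANGES and class_to_quantity, are assumptions made for illustration.

```python
# Sketch: map an ordered class label back to a continuous quantity.
# Assumption: the midpoint of each range stands in for the class.

CLASS_RANGES = {
    1: (0.0, 49.0),    # Class 1: $0 to $49
    2: (50.0, 100.0),  # Class 2: $50 to $100
}

def class_to_quantity(label):
    """Return the midpoint of the range assigned to the class label."""
    low, high = CLASS_RANGES[label]
    return (low + high) / 2.0

print(class_to_quantity(1))  # 24.5
print(class_to_quantity(2))  # 75.0
```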

If the class labels in the classification problem do not have a natural ordinal relationship, the conversion from classification to regression may result in surprising or poor performance as the model may learn a false or non-existent mapping from inputs to the continuous output range.

Summary

In this tutorial, you discovered the difference between classification and regression problems.

Specifically, you learned:

  • That predictive modeling is about learning a mapping function from inputs to outputs, a problem called function approximation.
  • That classification is the problem of predicting a discrete class label output for an example.
  • That regression is the problem of predicting a continuous quantity output for an example.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

68 Responses to Difference Between Classification and Regression in Machine Learning

  1. Rizwan ali December 11, 2017 at 7:14 am #

    nice post to clear the basic concepts

  2. shivaprasad December 11, 2017 at 6:47 pm #

    Really a good one sir,i had little bit confusion regarding these two.Thank you very much

  3. Kishore December 11, 2017 at 8:54 pm #

    Very specific and crystal clear.. we need more of these in your books too 🙂

  4. James December 13, 2017 at 12:02 am #

    “Some algorithms have the word “regression” in their name, such as linear regression and logistic regression, which can make things confusing because linear regression is a regression algorithm whereas logistic regression is a classification algorithm”

    Thank you for that clarification, even after all this time in ML that always nagged at me.

  5. Andrey Koch December 15, 2017 at 5:36 am #

    This topic is known to me yet the article made it straight to the point, thanks!

    One thing, in “A classification can have real-valued or discrete input variables.” you probably meant continuous vs. discrete

  6. Derek Amah December 15, 2017 at 5:44 am #

    Very basic and insightful. I experience same misunderstanding from people when hiring talent and sometimes explaining these concepts to people. Will take a cue from your explanation going forward. Thanks !!!!

  7. Monsef December 15, 2017 at 6:12 am #

    Valuable tutorial Jason

    Thank you very much

  8. youssef December 15, 2017 at 8:25 am #

    Thank you very much for clarifying the differences between the two types.

  9. Manika.rao December 15, 2017 at 4:15 pm #

    Thank a lot Sir.its indeed a good post for basics on ml….

  10. JP December 16, 2017 at 3:20 am #

    Hey Jason,

    This is really helpful for beginners.
    Could you provide single example of following section:
    “Convert Between Classification and Regression Problems”

  11. MJ December 18, 2017 at 7:13 am #

    Great explanation of important concepts. Liked your comment on logistic regression being a classification problem. The regression in the name does make it confusing when starting out.

  12. Idrees December 18, 2017 at 5:32 pm #

    I’m actually getting to focus on ML, absolutely new to it. Thanks for your eye-opening and insightful explanations

  13. alejandro Camargo December 20, 2017 at 5:31 am #

    Thank you!

  14. Mark December 29, 2017 at 8:14 am #

    How is it possible to reverse the process of discretizing data given that there is a loss of information? For example, following the same example above values of $0 to $49 would be represented by categorical value of 0.

    Can’t figure out how the reverse is possible without knowing original value (in this case price in USD).

  15. Kamran January 1, 2018 at 8:45 am #

    Any help about CNN and RNN implementation using pima indian dataset in python

  16. JOHN JEFFRY MENDEZ January 22, 2018 at 8:01 pm #

    Great blog! could u please explain how to determine the variance and bias for a model prediction?

  17. doupanpan January 25, 2018 at 5:11 am #

    Thanks for the post, clearly explained. Also, I am thinking about using Python to try these classification. regression, models etc. is there a useful online tutorials to follow ?

  18. Oussama January 25, 2018 at 7:37 am #

    Nice explanation. thanks.

  19. Dhanashree February 1, 2018 at 9:08 pm #

    Thanks very well explained. !!

  20. Phil Mckay February 6, 2018 at 1:56 am #

    Hi Jason:

    I am a patent attorney, used to be a physics animal. I have written several hundred patent applications directed to software, mostly cyber security and cloud security system patents. Lately, like in every field, AI and ML in particular is everywhere. I am not afraid of the math, but the vocabulary was an issue – until I read your post.

    I can’t tell you how much I appreciated that. As you know, vocabulary, at least consistent vocabulary usage, is a bit of an issue in the data science and software worlds. Your post cleared up many questions in just a few minutes of reading – THANK YOU!

    You are an awesome writer and teacher

  21. Wafa February 9, 2018 at 4:04 am #

    Hi,

    Thank you for your great tutorials.
    Could you please give us a tutorial on how to classify images using transfer learning and Tensorflow?
    Or guide me where I can find great tutorial about it like yours?

  22. Ali Shan February 21, 2018 at 6:05 am #

    Appreciate your effort and clearly helpful.
    thanks !

  23. hans March 5, 2018 at 6:39 am #

    simple and valuable , thanks.

  24. KK March 15, 2018 at 6:06 pm #

    very well explained, actually the solution of my problem.
    “How do I calculate accuracy for my regression problem?” this was my actual question but now it’s clear.
    Thanks man !!

  25. Frederick Alfhendra March 30, 2018 at 2:58 am #

    Hello Sir, did you had any suggestion regarding the reference such as Journal or Book that I could use to explain about Regression Predictive Modeling?

    Thank you

  26. Frederick Alfhendra March 30, 2018 at 12:50 pm #

    The question will be, “why we use Regression Predictive Modeling for Stock index forecasting”, any suggestion will be really appreciated, thank you, Sir.

  27. Abhijeet April 1, 2018 at 4:57 pm #

    Thanks Jason for amazing article…you Rock Jason !!!

  28. disouja April 12, 2018 at 7:38 pm #

    please clear with example that what is called classification and what is regression problem. how to get to know the problem is a classification problem or a regression problem.

    • Jason Brownlee April 13, 2018 at 6:39 am #

      Classification is about predicting a label (e.g. ‘red’). Regression is about predicting a quantity (e.g. 100).

      Does that help?

  29. Julio Lee April 24, 2018 at 1:28 pm #

    Fantastic post, thank you for sharing! I was recently training a model as a binary classification problem using sigmoid as a single output. However, I was able to get far better results using MSE rather than binary cross-entropy. Since MSE is mostly used for regression, does this mean I was forced to convert it to a regression problem? This has been tickling my brain for quite some time now….

    • Jason Brownlee April 24, 2018 at 2:51 pm #

      Thanks.

      MSE with a sigmoid output function? Wow, and it worked without error?

      Be careful when evaluating the skill of the model, ensure it is doing what you think.

  30. Rupesh May 2, 2018 at 4:22 pm #

    Thanks………

  31. Danilo May 9, 2018 at 3:25 am #

    Excellent post! I appreciate too much your work Ph.D. Jason Brownlee.
    I have a question. What type of task should I perform if my dependent variable observations are dichotomous, but I need to infer continuous values?

    • Jason Brownlee May 9, 2018 at 6:27 am #

      If you have categorical inputs and require a real-valued output, this sounds like regression, and a challenging case.

      Perhaps you can try modeling it directly as a regression problem and see how you go. You may want to integer encode or one hot encode the inputs.

      • Danilo May 9, 2018 at 7:25 am #

        Thank you very much for your help!
        I have used two variants to determine my continuous output:
        1. The probability of belonging to the positive class in a classification model.
        2. The output of a regression model.
        The accuracy in terms of a CMC curve for the classification model outperformed the regression model. But I am not sure if I have misunderstood some results.
        Is it feasible to assume the probability of belonging to the positive class in a classification model as the similarity to this class? When the classifier outputs the probability (p) to belong to the negative class, I computed the probability to belong to the positive class as 1-p.

        • Jason Brownlee May 9, 2018 at 2:55 pm #

          Predicting class probability is a classification problem. Some algorithms can predict a probability.

          • Danilo May 9, 2018 at 3:17 pm #

            Thank you very much, Dr. Jason Brownlee! Your advice is helpful for my work. You are very gentle sharing your great knowledge!!!!

          • Jason Brownlee May 10, 2018 at 6:26 am #

            I’m glad it helped.
