Supervised and Unsupervised Machine Learning Algorithms

What is supervised machine learning and how does it relate to unsupervised machine learning?

In this post you will discover supervised learning, unsupervised learning and semi-supervised learning. After reading this post you will know:

  • About the classification and regression supervised learning problems.
  • About the clustering and association unsupervised learning problems.
  • Example algorithms used for supervised and unsupervised problems.
  • A problem that sits in between supervised and unsupervised learning called semi-supervised learning.

Let’s get started.

Supervised and Unsupervised Machine Learning Algorithms
Photo by US Department of Education, some rights reserved.

Supervised Machine Learning

The majority of practical machine learning uses supervised learning.

Supervised learning is where you have input variables (X) and an output variable (Y) and you use an algorithm to learn the mapping function from the input to the output.

Y = f(X)

The goal is to approximate the mapping function so well that when you have new input data (X), you can predict the output variable (Y) for that data.

It is called supervised learning because the process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process. We know the correct answers; the algorithm iteratively makes predictions on the training data and is corrected by the teacher. Learning stops when the algorithm achieves an acceptable level of performance.
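
To make the "predict, get corrected, repeat" loop concrete, here is a minimal sketch in plain Python. It assumes a one-variable linear model fitted by gradient descent on mean squared error, which is only one of many ways to learn f. The function name fit_line, the toy data and the settings (learning rate, tolerance) are made up for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, tolerance=1e-6, max_epochs=10000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(max_epochs):
        # The algorithm makes predictions on the training data...
        predictions = [w * x + b for x in xs]
        # ...and the known answers (the "teacher") tell it how wrong it was.
        errors = [p - y for p, y in zip(predictions, ys)]
        mse = sum(e * e for e in errors) / n
        # Learning stops at an acceptable level of performance.
        if mse < tolerance:
            break
        # Otherwise nudge the parameters in the direction that reduces error.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy training data generated from y = 2x + 1 (an assumption for this example).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)

# Use the learned mapping to predict Y for a new, unseen X.
prediction = w * 5.0 + b
print(round(w, 2), round(b, 2), round(prediction, 2))
```

The learned w and b should land close to 2 and 1, so the prediction for X = 5 should be close to 11.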

Supervised learning problems can be further grouped into regression and classification problems.

  • Classification: A classification problem is when the output variable is a category, such as “red” or “blue”, or “disease” or “no disease”.
  • Regression: A regression problem is when the output variable is a real value, such as “dollars” or “weight”.
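
The only difference between the two problem types is the kind of output you are predicting. As a rough sketch, the plain Python below uses the same inputs with two different targets: a category (classification) and a real value (regression). The nearest-neighbor rule is a stand-in chosen for brevity, not an algorithm this post covers, and the measurements, labels and weights are invented.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Return the target of the training example closest to the query."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return train_y[closest]


# Hypothetical inputs: one measurement per person.
measurements = [150, 160, 170, 180]

# Classification target: the output is a category.
diagnoses = ["no disease", "no disease", "disease", "disease"]

# Regression target: the output is a real value (weight in kg).
weights = [50.0, 58.0, 66.0, 77.0]

query = 172
predicted_class = nearest_neighbor_predict(measurements, diagnoses, query)
predicted_weight = nearest_neighbor_predict(measurements, weights, query)
print(predicted_class, predicted_weight)
```

The same function handles both cases here only because a nearest-neighbor lookup simply copies the target of the closest example. Most algorithms need a different loss or output layer for each problem type.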

Some common types of problems built on top of classification and regression include recommendation and time series prediction respectively.

Some popular examples of supervised machine learning algorithms are:

  • Linear regression for regression problems.
  • Random forest for classification and regression problems.
  • Support vector machines for classification problems.

Unsupervised Machine Learning

Unsupervised learning is where you only have input data (X) and no corresponding output variables.

The goal for unsupervised learning is to model the underlying structure or distribution in the data in order to learn more about the data.

This is called unsupervised learning because, unlike supervised learning above, there are no correct answers and there is no teacher. Algorithms are left to their own devices to discover and present the interesting structure in the data.

Unsupervised learning problems can be further grouped into clustering and association problems.

  • Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.
  • Association: An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people who buy X also tend to buy Y.
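
As an illustration of the clustering idea, here is a small plain-Python sketch of the standard k-means loop: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. The customer data and the choice of starting centroids are assumptions made for this example. Real implementations usually pick starting centroids more carefully.

```python
def k_means(points, centroids, max_iterations=100):
    """Cluster points around the given starting centroids."""
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical customers: (visits per month, average spend).
customers = [(1, 10), (2, 12), (1, 11), (9, 80), (10, 85), (8, 78)]

# Start from two of the data points (an arbitrary choice for this sketch).
centroids, clusters = k_means(customers, [customers[0], customers[3]])
print(centroids)
print(clusters)
```

No labels are supplied anywhere. The two groups (occasional low spenders and frequent high spenders) come out of the structure of the data alone.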

Some popular examples of unsupervised learning algorithms are:

  • k-means for clustering problems.
  • Apriori algorithm for association rule learning problems.
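
Association rules of the form "people who buy X also tend to buy Y" are usually scored with two standard measures, support and confidence, and Apriori uses support to prune the itemsets it considers. The sketch below is not Apriori itself. It only computes those two measures for a single rule, using invented shopping baskets and hypothetical helper names.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]


def count_containing(itemset, transactions):
    """Count the transactions that contain every item in itemset."""
    return sum(1 for basket in transactions if itemset <= basket)


def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count_containing(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions with the antecedent, the fraction that also have the consequent."""
    both = count_containing(antecedent | consequent, transactions)
    return both / count_containing(antecedent, transactions)


# Score the rule "people who buy bread also tend to buy butter".
rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print(rule_support, rule_confidence)
```

Here bread and butter appear together in 3 of the 5 baskets, and 3 of the 4 baskets that contain bread also contain butter.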

Semi-Supervised Machine Learning

Problems where you have a large amount of input data (X) and only some of the data is labeled (Y) are called semi-supervised learning problems.

These problems sit between supervised and unsupervised learning.

A good example is a photo archive where only some of the images are labeled (e.g. dog, cat, person) and the majority are unlabeled.

Many real-world machine learning problems fall into this area. This is because it can be expensive or time-consuming to label data, as it may require access to domain experts, whereas unlabeled data is cheap and easy to collect and store.

You can use unsupervised learning techniques to discover and learn the structure in the input variables.

You can also use supervised learning techniques to make best-guess predictions for the unlabeled data, feed those predictions back into the supervised learning algorithm as training data, and use the resulting model to make predictions on new, unseen data.
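
The "predict, feed back, retrain" idea described above is often called self-training or pseudo-labeling. Here is a minimal sketch in plain Python, assuming a very simple nearest-centroid classifier on one-dimensional data. The classifier, the helper names and the numbers are all assumptions for illustration. A practical version would normally keep only the confident guesses.

```python
def fit_centroids(xs, labels):
    """Train a nearest-centroid classifier: one mean value per class."""
    sums, counts = {}, {}
    for x, label in zip(xs, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


# A few labeled examples and a larger pool of unlabeled ones (made-up values).
labeled_x = [1.0, 9.0]
labeled_y = ["cat", "dog"]
unlabeled_x = [1.5, 2.0, 8.0, 8.5]

# Step 1: train on the labeled data only.
model = fit_centroids(labeled_x, labeled_y)

# Step 2: make best-guess predictions for the unlabeled data.
pseudo_labels = [predict(model, x) for x in unlabeled_x]

# Step 3: feed those guesses back in as extra training data and retrain.
model = fit_centroids(labeled_x + unlabeled_x, labeled_y + pseudo_labels)

# Step 4: use the retrained model on new, unseen data.
new_prediction = predict(model, 3.0)
print(pseudo_labels, model, new_prediction)
```

Note that any wrong guesses in step 2 are learned as if they were true labels, which is the main risk of this approach.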


In this post you learned the difference between supervised, unsupervised and semi-supervised learning. You now know that:

  • Supervised: All data is labeled and the algorithms learn to predict the output from the input data.
  • Unsupervised: All data is unlabeled and the algorithms learn the inherent structure from the input data.
  • Semi-supervised: Some data is labeled but most of it is unlabeled and a mixture of supervised and unsupervised techniques can be used.

Do you have any questions about supervised, unsupervised or semi-supervised learning? Leave a comment and ask your question and I will do my best to answer it.

26 Responses to Supervised and Unsupervised Machine Learning Algorithms

  1. Omot August 20, 2016 at 2:32 pm #

    Thanks for this post. That was helpful. My question is how does one determine the correct algorithm to use for a particular problem in supervised learning? Also, can a network trained by unsupervised learning be tested with a new set of data (testing data), or is it just for the purpose of grouping?

    • Jason Brownlee August 21, 2016 at 6:15 am #

      Hi Omot, it is a good idea to try a suite of standard algorithms on your problem and discover what algorithm performs best.

      Normally, an unsupervised method is applied to all data available in order to learn something about that data and the broader problem. You could, say, cluster a “training” dataset and later see which clusters new data is closest to if you wanted to avoid re-clustering the data.

    • angel November 22, 2016 at 9:58 am #

      I need help in solving a problem. I have utilized all resources available and the school can’t find a tutor in this subject. My question is this: What is the best method to choose if you want to train an algorithm that can discriminate between patients with hypertension and patients with hypertension and diabetes? Please help me understand!

  2. Pragya Poonia August 23, 2016 at 1:08 pm #

    This content is really helpful. Can you give some examples of all these techniques with best description?? or a brief introduction of Reinforcement learning with example??

  3. Naveen October 10, 2016 at 8:16 pm #

    Hi Jason,

    Thank you for the summary of the types of ML algorithms.
    How can one use clustering or unsupervised learning for prediction on new data? I have clustered the input data using hierarchical clustering. Now I want to check the membership of new data in the identified clusters. How is this possible? Is there an algorithm available in R?

    • Jason Brownlee October 11, 2016 at 7:23 am #

      Hi Naveen, generally I don’t use unsupervised methods much as I don’t get much value from them in practice.

      You can use the cluster number, cluster centroid or other details as an input for modeling.

  4. Tashrif October 25, 2016 at 9:03 am #

    Could you please give me a real world example of supervised, unsupervised, and semi supervised learning?

    • Jason Brownlee October 26, 2016 at 8:25 am #

      Hi Tashrif,

      Supervised would be when you have a ton of labeled pictures of dogs and cats and you want to automatically label new pictures of dogs and cats.

      Unsupervised would be when you want to see how the pictures structurally relate to each other by color or scene or whatever.

      Semi-supervised is where you have a ton of pictures and only some are labelled and you want to use the unlabeled and the labelled to help you in turn label new pictures in the future.

  5. Frank M November 12, 2016 at 7:38 am #

    This was a really good read, so thanks for writing and publishing it.

    Question for you. I have constructed a Random Forest model, so I’m using supervised learning, and I’m being asked to run an unlabeled data set through it. But I won’t have the actual results of this model, so I can’t determine accuracy on it until I have the actual result of it.

    So my question is… how can I run a set of data through a ML model if I don’t have labels for it?

    For further clarity and context, I’m running a random forest model to predict a binary classification label. I get the first few data points relatively quickly, but the label takes 30 days to become clear.

    Maybe none of this makes sense, but I appreciate any direction you could possibly give.

    Many thanks,


    • Jason Brownlee November 14, 2016 at 7:30 am #

      Thanks Frank. Great question.

      You will need to collect historical data to develop and evaluate your model.

      Once created, it sounds like you will need to wait 30 days before you can evaluate the ongoing performance of the model’s predictions.

  6. Ann November 17, 2016 at 8:29 pm #

    Hi Jason,
    I have written a program to classify whether a customer (client) will subscribe to a term deposit or not.
    Dataset used: bank dataset from the UCI Machine Learning Repository
    Algorithms used: 1. random forest algorithm with CART to generate decision trees and 2. random forest algorithm with HAC4.5 to generate decision trees.

    My question is: how do I determine the accuracy of 1 and 2 and find the best one?

    I am really new to this field, so please ignore my stupidity.
    Thanks in advance

  7. Nihad Almahrooq December 1, 2016 at 6:17 pm #

    Hi Jason, great work you are doing. I wish you the best; you deserve it.

    My question: I want to use ML to solve problems with network infrastructure data, such as missing values, typos and discrepancies. Fundamental knowledge and expertise are essential, though I need some ML direction and more research. Can you provide or shed light on that? And how? If you prefer we can communicate directly at

    Thanks, and please forgive me if the approach seems awkward. I only recently joined your connections, so it may be rushing!

    • Jason Brownlee December 2, 2016 at 8:14 am #

      Hi Nihad, that is an interesting application.

      Machine learning might not be the best approach for fixing typos and such. Nevertheless, the first step would be to collect a dataset and try to deeply understand the types of examples the algorithm would have to learn.

      This post might help you dive deeper into your problem:

      I hope this helps as a start, best of luck.

  8. Nischay December 24, 2016 at 8:11 am #

    Splendid work! A helpful measure for my semester exams. Thanks!!

  9. Sam January 1, 2017 at 4:11 am #

    Hello Jason, great work you are doing. I wish you the best; you deserve it.
    I want to find an online algorithm to cluster scientific workflow data to minimize run time and system overhead, so it can map these workflow tasks to distributed resources like clouds. The clustered data should be mapped to the available resources in a balanced way that guarantees no resource is over-utilized while another resource is idle.

    I came across horizontal clustering and vertical clustering, but these techniques are static and the user must determine the number of clusters and the number of tasks in each cluster in advance …

    • Jason Brownlee January 1, 2017 at 5:25 am #

      Hi Sam,

      Thanks for your support.

      Off-the-cuff, this sounds like a dynamic programming or constraint satisfaction problem rather than machine learning.

  10. Marcus January 6, 2017 at 6:55 am #

    Hi Jason, this post is really helpful for my Cognitive Neural Network revision!

    I have a question of a historical nature, relating to how supervised learning algorithms evolved:
    Some early supervised learning methods allowed the threshold to be adjusted during learning. Why is that not necessary with the newer supervised learning algorithms?

    Is this because they (e.g. the Delta Rule) adjust the weights on a running basis to minimize error, which supersedes the need for threshold adjustment? Or is there something more subtle going on in the newer algorithms that eliminates the need for threshold adjustment? Thank you in advance for any insight you can provide on this.

    • Jason Brownlee January 6, 2017 at 9:14 am #

      I don’t think I have enough context Marcus. It sounds like you may be referring specifically to stochastic gradient descent.

      I’m not really an algorithm historian, I’d refer you to the seminal papers on the topic.

  11. David Lehmann February 17, 2017 at 3:52 am #

    Hi Jason – Thanks so much for the informative post. I think I am missing something basic. Once a model is trained with labeled data (supervised), how does additional unlabeled data help improve the model? For example, how do newly uploaded pictures (presumably unlabeled) to Google Photos help further improve the model (assuming they do so)? Or how does new voice data (again unlabeled) help make a machine learning-based voice recognition system better? I understand conceptually how labeled data could drive a model, but I am unclear how it helps if you don’t really know what the data represents.

    Thanks! Dave

    • Jason Brownlee February 17, 2017 at 10:01 am #

      Great question Dave.

      Generally, we can use unlabelled data to help initialize large models, like deep neural networks.

      More specifically, we can label unlabelled data, have it corroborate the prediction if needed, and use that as input to update or retrain a model to make it better for future predictions.

      Does that help?

      • Dave Lehmann February 18, 2017 at 2:50 am #

        Yes, thanks. So the data ultimately needs to be labeled to be useful in improving the model? Keeping with the Google Photos use case, all the millions of photos uploaded every day then don’t help the model unless someone manually labels them and then runs those through the training? I guess I was hoping there was some way intelligence could be discerned from the unlabeled data (unsupervised) to improve on the original model, but that does not appear to be the case, right? Thanks again for the help – Dave

        • Jason Brownlee February 18, 2017 at 8:43 am #

          There very well may be, I’m just not across it.
