Gentle Introduction to Predictive Modeling

When you’re an absolute beginner, machine learning can be very confusing. Frustratingly so.

Even ideas that seem so simple in retrospect are alien when you first encounter them. There’s a whole new language to learn.

I recently received this question:

So using the iris exercise as an example if I were to pluck a flower from my garden how would I use the algorithm to predict what it is?

It’s a great question.

In this post I want to give a gentle introduction to predictive modeling.

Basics of Predictive Modeling

Photo by Steve Jurvetson, some rights reserved.

1. Sample Data

Data is information about the problem that you are working on.

Imagine we want to identify the species of flower from the measurements of a flower.

The data comprises four flower measurements in centimeters; these are the columns of the data.

Each row of data is one example of a flower that has been measured, together with its known species.

The problem we are solving is to create a model from the sample data that can tell us which species a flower belongs to from its measurements alone.
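As a rough sketch of what such sample data looks like in code, the snippet below stores six rows taken from the Iris dataset as plain Python tuples. The column order (sepal length, sepal width, petal length, petal width, then species) and the helper name `split_inputs_outputs` are assumptions made for illustration, not something defined in this post.

```python
# Each row: sepal length, sepal width, petal length, petal width (all in cm),
# followed by the known species. Six rows only, for illustration.
SAMPLE = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]


def split_inputs_outputs(rows):
    """Separate the measurements (inputs) from the species (outputs)."""
    inputs = [row[:4] for row in rows]
    outputs = [row[4] for row in rows]
    return inputs, outputs


inputs, outputs = split_inputs_outputs(SAMPLE)
print(len(inputs), "flowers with", len(inputs[0]), "measurements each")
print("species seen:", sorted(set(outputs)))
```

The measurements are the inputs and the species is the output we would like a model to produce.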

Sample of Iris flower data


2. Learn a Model

The problem described above is called supervised learning.

The goal of a supervised learning algorithm is to take some data with a known relationship (actual flower measurements and the species of the flower) and to create a model of those relationships.

In this case the output is a category (flower species) and we call this type of problem a classification problem. If the output were a numerical value, we would call it a regression problem.

The algorithm does the learning. The model contains the learned relationships.

The model itself may be a handful of numbers and a way of using those numbers to relate input (flower measurements in centimeters) to an output (the species of flower).

We want to keep the model after we have learned it from our sample data.
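To make the idea of "a handful of numbers" concrete, here is a minimal sketch of one very simple learning algorithm: average the measurements of each species. This nearest-centroid approach is chosen here only because it is easy to read; it is not the method this post prescribes, and the function name `learn_model` and the six sample rows are illustrative assumptions.

```python
from collections import defaultdict

# Six rows from the Iris dataset: four measurements in cm, then the species.
SAMPLE = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]


def learn_model(rows):
    """The 'algorithm': return the average measurements for each species."""
    grouped = defaultdict(list)
    for *measurements, species in rows:
        grouped[species].append(measurements)
    # zip(*flowers) turns a list of rows into columns, one per measurement.
    return {
        species: [sum(column) / len(column) for column in zip(*flowers)]
        for species, flowers in grouped.items()
    }


# The 'model': four numbers per species, learned from the sample data.
model = learn_model(SAMPLE)
for species, averages in model.items():
    print(species, [round(value, 2) for value in averages])
```

Here the algorithm is the averaging procedure, and the model is the resulting dictionary of twelve numbers.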

Create a Predictive Model

Create a predictive model from training data and an algorithm.

3. Make Predictions

We don’t need to keep the training data as the model has summarized the relationships contained within it.

We keep the model learned from data because we want to use it to make predictions.

In this example, we use the model by taking measurements of specific flowers whose species we don’t know.

Our model will read the input (new measurements), perform a calculation of some kind with its internal numbers and make a prediction about which species of flower it happens to be.

The prediction may not be perfect, but if you have good sample data and a robust model learned from that data, it can be quite accurate.
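The sketch below shows one way a prediction could work for a flower plucked from the garden. It assumes a model that has already been learned: the numbers in `MODEL` are per-species averages computed from just six rows of the Iris dataset, so they are illustrative rather than a well-trained model. The `predict` function and the new flower's measurements are hypothetical.

```python
import math

# A previously learned model: average sepal length, sepal width,
# petal length and petal width (cm) per species. The training data
# itself is no longer needed.
MODEL = {
    "setosa": (5.0, 3.25, 1.4, 0.2),
    "versicolor": (6.7, 3.2, 4.6, 1.45),
    "virginica": (6.05, 3.0, 5.55, 2.2),
}


def predict(model, measurements):
    """Return the species whose stored averages are closest to the flower."""
    return min(
        model,
        key=lambda species: math.dist(model[species], measurements),
    )


# Hypothetical measurements of a flower of unknown species.
new_flower = (5.0, 3.4, 1.5, 0.2)
print(predict(MODEL, new_flower))
```

The calculation here is a straight-line distance between the new measurements and each species average; other models use different calculations, but the pattern of measurements in, species out is the same.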

Make Predictions

Use the model to make predictions on new data.

Summary

In this post we have taken a very gentle introduction to predictive modeling.

The three aspects of predictive modeling we looked at were:

  1. Sample Data: the data that we collect that describes our problem with known relationships between inputs and outputs.
  2. Learn a Model: the algorithm that we use on the sample data to create a model that we can later use over and over again.
  3. Make Predictions: the use of our learned model on new data for which we don’t know the output.

We used the example of classifying plant species based on flower measurements.

This is in fact a famous example in machine learning because it’s a good clean dataset and the problem is easy to understand.

Action Step

Take a moment and really understand these concepts.

They are the foundation of any thinking or work that you might do in machine learning.

Your action step is to think through the three aspects (data, model, predictions) and relate them to a problem that you would like to work on.

Any questions at all, please ask in the comments. I’m here to help.

45 Responses to Gentle Introduction to Predictive Modeling

  1. shahzad badar September 8, 2015 at 5:37 pm #

    Indeed a very clear, concise high level overview of Machine learning based predictive modeling, great read, looking forward to subsequent reads that will be focusing on model/hypothesis creation based on data, again at a high level of abstraction leaving out the statistical wizardry for advance reads

  2. Rhymeface September 8, 2015 at 10:26 pm #

    So are there other ways for doing predictive modeling that do not rely on machine learning?

    And additionally: all machine learning algorithms (neural networks, decision trees, SVMs, …) can be considered as part of predictive modeling?

    • Jason Brownlee September 9, 2015 at 5:15 am #

      Good point Rhymeface.

      Machine Learning is the set of tools we use to create our predictive models. We don’t have to use machine learning. For example, the simplest type of prediction is to use the mean value.
      I would rephrase it as predictive modeling is the most common type of problem that we solve with machine learning (e.g. classification and regression problems).

  3. Aiswarya September 9, 2015 at 6:36 pm #

    What other ways are there to do predictive modelling?

    • Jason Brownlee September 13, 2015 at 9:02 am #

      Sure, if we have a regression dataset, I could give the mean value seen so far or the last value seen as a prediction of what to expect next.

      In a classification problem, we could estimate the class as the most frequent class observed.

      These methods are pure stats and generally uninteresting, but are examples of predictive modeling without using machine learning.

  4. Paul S September 12, 2015 at 10:16 pm #

    Your post was very clear and i’m excited to read more from your site. I’m new to machine learning and coding. I have very minimal experience in data visualization. But I have rudimentary knowledge in statistics. I guess at this point, i’d like to know where I should start in learning to create algorithms to learn datasets. thanks!!!

    • Jason Brownlee September 13, 2015 at 5:47 am #

      Great to have you here Paul.

      I hope that I can help you on your machine learning journey.

  5. Jan September 22, 2015 at 11:45 pm #

    Hi,

    Which learning machines can be adopted for prediction?

    I have read your following article
    (http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/#comment-316878)

    Can I adopt all the algorithms that you mentioned ? and where are positioned the support vector machines in your list?

    regards

  6. Ramesh November 12, 2015 at 12:36 am #

    Hello Jason

    This is a good article. Do you have example that shows model created from training data. would we use the machine learning algorithms to create predictive model or we use algorithms after model is created with the new data.

    I am new to machine learning and exploring to use it for fault detection problem.

  7. Justin Fong September 3, 2016 at 8:58 am #

    This was a great post, thanks. I am trying to become a DS and am taking IBMs Big Data University and needed this portion on what is predictive modeling cleared up.

  8. Harshitha P K September 30, 2016 at 8:59 pm #

    This a good post, thank you. If anyone could help me develop a model or an algorithm for a weighted moving window the data samples in the best possible way, it would be really very helpful.

  9. koti November 11, 2016 at 11:14 pm #

    I never read such an amazing post like this! I completely understand the topic in one go. Thanks for the post!!

  10. Stanford August 4, 2017 at 5:22 am #

    This is a good post i must say. I am new to machine learning and exploring to use it for career match or match-making problem. I would like to know which algorithm and technique to use in this type of a problem?

  11. Chanyawat February 28, 2018 at 8:48 pm #

    Could you tell me what is Machine Learning from Predictive Modeling from?

    • Jason Brownlee March 1, 2018 at 6:13 am #

      Machine learning algorithms can be used to develop predictive models.

  12. Chioma March 1, 2018 at 12:24 am #

    This is good news for me. A beginner, and this is what i need. I am still struggling digest it well based on my work — Building a morphological analyser for my language. Needs a more clear direction. Thanks Jason you are a blessing!

  13. Cristiano April 10, 2018 at 11:08 pm #

    Thank you Jason for taking the time to put together this resource, it’s been helpful and really interesting to learn from them.

  14. Sharath May 11, 2018 at 5:50 pm #

    Hello Jason,

    Thank for the write up, I thoroughly enjoyed reading this. I’m itching to read more. Please guide further steps. I enjoy your lessons of ML and I’m getting my hands dirty 🙂

    Cheers Jason!

  15. Amo May 19, 2018 at 3:40 pm #

    Thanks a lot Jason. You keep these posts as simple as they can get but you don’t leave out the most important info needed to get deeper into the subject.

  16. kumanan May 25, 2018 at 5:31 pm #

    How can I apply these to a Beauty salon
    Please give some insights

  17. Mahesh May 29, 2018 at 4:42 pm #

    Hi Jason – I’m slightly confused between ML Model and ML Algorithm. I tend to use these 2 words interchangeably, which may be wrong. Can you please explain with an example?

  18. Murtadha July 3, 2018 at 3:57 pm #

    I am brand new in predictive modeling. From which book can I start to understand modeling?
    i have bachelor degree in engineering so I have learned the basic

  19. Vajradehi September 4, 2018 at 10:19 pm #

    Hi Jason,
    It is very good article I am planning to use Machine learning for mechanical assembly of different parts. Suggesting user which parts need to be picked for easy assembly.

    Can you please share your experience in sequencing. I want to use combination of sequencing with predictive algorithm. Am I thinking in correct direction?

    Regards,

    Vajradehi Yadav

  20. Giang Nguyen October 3, 2018 at 8:12 pm #

    Hi Jason,

    As far as i know, there are two kind of applications that machine learning can be helpful: regression and classification. So how can I use machine learning for regression (not linear regression).

    Thank you very much!!

  21. Izoduwa October 6, 2018 at 11:42 am #

    which of your book contain this explanation

    • Jason Brownlee October 6, 2018 at 11:44 am #

      I don’t have a book on the absolute basic concepts of machine learning. I focus on teaching how to “do” machine learning.

  22. Krishna October 16, 2018 at 6:07 pm #

    Excellent article Jason.

    Please write one article on deploying Machine Learning models in Production

  23. Narendra November 1, 2018 at 6:17 pm #

    Hi Sir Jason Brownlee after reading your blogs I moved my self into an exactly correct direction to do expertise in ML.
    Thank you so much

  24. Asad Khan November 29, 2018 at 12:44 am #

    Dear Jason after reading your books and articles I became expert in machine learning. Kindly write some about big data analytics.

    Best Regards:
    Asad Khan
