Gentle Introduction to Predictive Modeling

By Jason Brownlee on July 22, 2020 in Start Machine Learning

When you’re an absolute beginner, machine learning can be very confusing. Frustratingly so.

Even ideas that seem so simple in retrospect are alien when you first encounter them. There’s a whole new language to learn.

I recently received this question:

So using the iris exercise as an example, if I were to pluck a flower from my garden, how would I use the algorithm to predict what it is?

It’s a great question.

In this post I want to give a gentle introduction to predictive modeling.

1. Sample Data

Data is information about the problem that you are working on.

Imagine we want to identify the species of flower from the measurements of a flower.

The data is comprised of four flower measurements in centimeters. These are the columns of the data.

Each row of data is one example of a flower that has been measured, along with its known species.

The problem we are solving is to create a model from the sample data that can tell us which species a flower belongs to from its measurements alone.

Sample of Iris flower data
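
To make this concrete, here is a minimal sketch of how such sample data could be laid out in Python. The rows are a handful of illustrative examples in the layout of the Iris data, and the column names are assumptions made for this sketch.

```python
# Column names are assumptions for this sketch: four measurements (cm) plus the species.
COLUMNS = ("sepal_length", "sepal_width", "petal_length", "petal_width", "species")

# A few illustrative rows: each row is one measured flower and its known species.
sample_data = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]

# Split each row into inputs (the measurements) and the output (the known species).
inputs = [row[:4] for row in sample_data]
outputs = [row[4] for row in sample_data]

for measurements, species in zip(inputs, outputs):
    print(measurements, "->", species)
```

The split into inputs and outputs is the important part: the measurements are what we will know about a new flower, and the species is what we want to predict.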

2. Learn a Model

The problem described above is called supervised learning.

The goal of a supervised learning algorithm is to take some data with a known relationship (actual flower measurements and the species of the flower) and to create a model of those relationships.

In this case the output is a category (flower species) and we call this type of problem a classification problem. If the output were a numerical value, we would call it a regression problem.

The algorithm does the learning. The model contains the learned relationships.

The model itself may be a handful of numbers and a way of using those numbers to relate input (flower measurements in centimeters) to an output (the species of flower).

We want to keep the model after we have learned it from our sample data.

Create a predictive model from training data and an algorithm.
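
As a rough illustration, the sketch below uses one of the simplest possible learning algorithms: average the measurements of each species. This nearest-centroid idea is a stand-in chosen for this sketch, not necessarily the algorithm used in the iris exercise. The function name `learn_centroids` and the training rows are assumptions.

```python
from collections import defaultdict

# Illustrative training rows: four measurements (cm) followed by the known species.
training_data = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]


def learn_centroids(rows):
    """The 'algorithm': learn one average measurement vector per species."""
    grouped = defaultdict(list)
    for *measurements, species in rows:
        grouped[species].append(measurements)

    model = {}
    for species, examples in grouped.items():
        # Average each of the four columns across this species' examples.
        model[species] = [sum(column) / len(column) for column in zip(*examples)]
    return model


# The 'model': a handful of numbers summarizing the training data.
model = learn_centroids(training_data)
for species, centroid in model.items():
    print(species, [round(value, 2) for value in centroid])
```

Here the algorithm is the averaging procedure and the model is the resulting dictionary of twelve numbers (four per species).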

3. Make Predictions

We don’t need to keep the training data as the model has summarized the relationships contained within it.

We keep the model learned from data because we want to use it to make predictions.
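
One way to picture keeping the model without the training data is to save only its numbers. The sketch below assumes a hypothetical learned model stored as a dictionary of average measurements per species. It uses Python's standard `json` module to turn the model into text and back.

```python
import json

# A hypothetical learned model: average measurements (cm) per species.
# These values are illustrative, not taken from a real training run.
model = {
    "setosa": [5.0, 3.25, 1.4, 0.2],
    "versicolor": [6.7, 3.2, 4.6, 1.45],
    "virginica": [6.05, 3.0, 5.55, 2.2],
}

# Serialize the model to text; this string could be written to a file.
saved = json.dumps(model)

# Later, perhaps in another program, restore the model from the text alone.
restored = json.loads(saved)

print(restored == model)
```

Note that no training rows appear anywhere in this sketch. Everything needed to make predictions is in the saved numbers.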

In this example, we use the model by taking measurements of specific flowers for which we don’t know the species.

Our model will read the input (new measurements), perform a calculation of some kind with its internal numbers and make a prediction about which species of flower it happens to be.

The prediction may not be perfect, but if you have good sample data and a robust model learned from that data, it can be quite accurate.

Use the model to make predictions on new data.
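
To connect this back to the question about the flower plucked from the garden, here is a minimal sketch of the prediction step. It assumes a hypothetical model made of average measurements per species and predicts the species whose averages are closest to the new measurements. The `predict` function, the model values and the new measurements are all made up for illustration.

```python
# A hypothetical learned model: average measurements (cm) per species.
model = {
    "setosa": [5.0, 3.25, 1.4, 0.2],
    "versicolor": [6.7, 3.2, 4.6, 1.45],
    "virginica": [6.05, 3.0, 5.55, 2.2],
}


def squared_distance(a, b):
    """Sum of squared differences between two lists of measurements."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(model, measurements):
    """Return the species whose stored averages are closest to the input."""
    return min(model, key=lambda species: squared_distance(model[species], measurements))


# Made-up measurements of flowers whose species we do not know.
new_flowers = [
    [5.0, 3.4, 1.5, 0.2],
    [6.0, 2.9, 4.5, 1.5],
    [6.5, 3.0, 5.5, 2.0],
]

for flower in new_flowers:
    print(flower, "->", predict(model, flower))
```

So for a flower from your garden, you would take the same four measurements, pass them to the model and read off the predicted species.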

Summary

In this post we have taken a very gentle introduction to predictive modeling.

The three aspects of predictive modeling we looked at were:

Sample Data: the data that we collect that describes our problem with known relationships between inputs and outputs.
Learn a Model: the algorithm that we use on the sample data to create a model that we can later use over and over again.
Making Predictions: the use of our learned model on new data for which we don’t know the output.

We used the example of classifying plant species based on flower measurements.

This is in fact a famous example in machine learning because it’s a good clean dataset and the problem is easy to understand.

Action Step

Take a moment and really understand these concepts.

They are the foundation of any thinking or work that you might do in machine learning.

Your action step is to think through the three aspects (data, model, predictions) and relate them to a problem that you would like to work on.

Any questions at all, please ask in the comments. I’m here to help.

79 Responses to Gentle Introduction to Predictive Modeling

shahzad badar September 8, 2015 at 5:37 pm #

Indeed a very clear, concise, high-level overview of machine-learning-based predictive modeling. Great read. Looking forward to subsequent reads that focus on model/hypothesis creation based on data, again at a high level of abstraction, leaving the statistical wizardry for advanced reads.

Reply
- Jason Brownlee September 8, 2015 at 7:49 pm #
  
  Thanks shahzad.
  
  Reply
Rhymeface September 8, 2015 at 10:26 pm #

So are there other ways for doing predictive modeling that do not rely on machine learning?

And additionally: all machine learning algorithms (neural networks, decision trees, SVMs, …) can be considered as part of predictive modeling?

Reply
- Jason Brownlee September 9, 2015 at 5:15 am #
  
  Good point Rhymeface.
  
  Machine Learning is the set of tools we use to create our predictive models. We don’t have to use machine learning. For example, the simplest type of prediction is to use the mean value.
  I would rephrase it as predictive modeling is the most common type of problem that we solve with machine learning (e.g. classification and regression problems).
  
  Reply
Aiswarya September 9, 2015 at 6:36 pm #

What other ways are there to do predictive modelling?

Reply
- Jason Brownlee September 13, 2015 at 9:02 am #
  
  Sure, if we have a regression dataset, I could give the mean value seen so far or the last value seen as a prediction of what to expect next.
  
  In a classification problem, we could estimate the class as the most frequent class observed.
  
  These methods are pure stats and generally uninteresting, but are examples of predictive modeling without using machine learning.
  
  Reply
- Benjamin W November 25, 2021 at 5:11 pm #
  
  Loved it! Simple enough for a beginner to understand.
  
  Reply
Paul S September 12, 2015 at 10:16 pm #

Your post was very clear and I’m excited to read more from your site. I’m new to machine learning and coding. I have very minimal experience in data visualization, but I have rudimentary knowledge of statistics. At this point, I’d like to know where I should start in learning to create algorithms that learn from datasets. Thanks!!!

Reply
- Jason Brownlee September 13, 2015 at 5:47 am #
  
  Great to have you here Paul.
  
  I hope that I can help you on your machine learning journey.
  
  Reply
Jan September 22, 2015 at 11:45 pm #

Hi,

Which learning machines can be adopted for prediction?

I have read your following article
(https://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/#comment-316878)

Can I adopt all the algorithms that you mentioned? And where are the support vector machines positioned in your list?

regards

Reply
Ramesh November 12, 2015 at 12:36 am #

Hello Jason

This is a good article. Do you have an example that shows a model created from training data? Would we use the machine learning algorithms to create the predictive model, or do we use the algorithms on new data after the model is created?

I am new to machine learning and exploring to use it for fault detection problem.

Reply
Justin Fong September 3, 2016 at 8:58 am #

This was a great post, thanks. I am trying to become a DS and am taking IBMs Big Data University and needed this portion on what is predictive modeling cleared up.

Reply
Harshitha P K September 30, 2016 at 8:59 pm #

This is a good post, thank you. If anyone could help me develop a model or an algorithm for a weighted moving window the data samples in the best possible way, it would be really very helpful.

Reply
- Jason Brownlee October 1, 2016 at 8:01 am #
  
  Thanks Harshitha.
  
  Reply
koti November 11, 2016 at 11:14 pm #

I never read such an amazing post like this! I completely understand the topic in one go. Thanks for the post!!

Reply
- Jason Brownlee November 12, 2016 at 7:20 am #
  
  Thanks koti.
  
  Reply
Stanford August 4, 2017 at 5:22 am #

This is a good post i must say. I am new to machine learning and exploring to use it for career match or match-making problem. I would like to know which algorithm and technique to use in this type of a problem?

Reply
- Jason Brownlee August 4, 2017 at 7:04 am #
  
  We must discover it through trial and error:
  https://machinelearningmastery.com/a-data-driven-approach-to-machine-learning/
  
  Reply
Chanyawat February 28, 2018 at 8:48 pm #

Could you tell me how Machine Learning differs from Predictive Modeling?

Reply
- Jason Brownlee March 1, 2018 at 6:13 am #
  
  Machine learning algorithms can be used to develop predictive models.
  
  Reply
Chioma March 1, 2018 at 12:24 am #

This is good news for me. I am a beginner, and this is what I need. I am still struggling to digest it well based on my work: building a morphological analyser for my language. I need a clearer direction. Thanks Jason, you are a blessing!

Reply
- Jason Brownlee March 1, 2018 at 6:14 am #
  
  Hang in there!
  
  Reply
Cristiano April 10, 2018 at 11:08 pm #

Thank you Jason for taking the time to put together this resource, it’s been helpful and really interesting to learn from them.

Reply
- Jason Brownlee April 11, 2018 at 6:39 am #
  
  Thanks, I’m glad it helped.
  
  Reply
Sharath May 11, 2018 at 5:50 pm #

Hello Jason,

Thanks for the write-up, I thoroughly enjoyed reading this. I’m itching to read more. Please guide me on further steps. I enjoy your lessons on ML and I’m getting my hands dirty 🙂

Cheers Jason!

Reply
- Jason Brownlee May 12, 2018 at 6:28 am #
  
  Thanks.
  
  Here’s a great place to get started:
  https://machinelearningmastery.com/start-here/#getstarted
  
  Reply
Amo May 19, 2018 at 3:40 pm #

Thanks a lot Jason. You keep these posts as simple as they can get but you don’t leave out the most important info needed to get deeper into the subject.

Reply
- Jason Brownlee May 20, 2018 at 6:35 am #
  
  Thanks.
  
  Reply
kumanan May 25, 2018 at 5:31 pm #

How can I apply these to a Beauty salon
Please give some insights

Reply
- Jason Brownlee May 26, 2018 at 5:49 am #
  
  Start by defining your problem:
  https://machinelearningmastery.com/how-to-define-your-machine-learning-problem/
  
  Reply
Mahesh May 29, 2018 at 4:42 pm #

Hi Jason – I’m slightly confused between ML Model and ML Algorithm. I tend to use these 2 words interchangeably, which may be wrong. Can you please explain with an example?

Reply
- Jason Brownlee May 30, 2018 at 6:33 am #
  
  This is a common question that I answer here:
  https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-a-model-and-an-algorithm
  
  Reply
Murtadha July 3, 2018 at 3:57 pm #

I am brand new to predictive modeling. Which book can I start with to understand modeling?
I have a bachelor’s degree in engineering, so I have learned the basics.

Reply
- Jason Brownlee July 4, 2018 at 8:19 am #
  
  This process will help you work through new predictive modeling projects:
  https://machinelearningmastery.com/start-here/#process
  
  Reply
Vajradehi September 4, 2018 at 10:19 pm #

Hi Jason,
It is a very good article. I am planning to use machine learning for the mechanical assembly of different parts, suggesting to the user which parts need to be picked for easy assembly.

Can you please share your experience in sequencing. I want to use combination of sequencing with predictive algorithm. Am I thinking in correct direction?

Regards,

Vajradehi Yadav

Reply
- Jason Brownlee September 5, 2018 at 6:39 am #
  
  Perhaps here is a good place to start:
  https://machinelearningmastery.com/start-here/#lstm
  
  Reply
Giang Nguyen October 3, 2018 at 8:12 pm #

Hi Jason,

As far as I know, there are two kinds of applications where machine learning can be helpful: regression and classification. So how can I use machine learning for regression (not linear regression)?

Thank you very much!!

Reply
- Jason Brownlee October 4, 2018 at 6:15 am #
  
  Sure.
  
  Reply
Izoduwa October 6, 2018 at 11:42 am #

Which of your books contains this explanation?

Reply
- Jason Brownlee October 6, 2018 at 11:44 am #
  
  I don’t have a book on the absolute basic concepts of machine learning. I focus on teaching how to “do” machine learning.
  
  Reply
Krishna October 16, 2018 at 6:07 pm #

Excellent article Jason.

Please write one article on deploying Machine Learning models in Production

Reply
- Jason Brownlee October 17, 2018 at 6:46 am #
  
  Perhaps this will help:
  https://machinelearningmastery.com/deploy-machine-learning-model-to-production/
  
  Reply
Narendra November 1, 2018 at 6:17 pm #

Hi Sir Jason Brownlee after reading your blogs I moved my self into an exactly correct direction to do expertise in ML.
Thank you so much

Reply
- Jason Brownlee November 2, 2018 at 5:46 am #
  
  Well done!
  
  Reply
Asad Khan November 29, 2018 at 12:44 am #

Dear Jason, after reading your books and articles I became an expert in machine learning. Kindly write something about big data analytics.

Best Regards:
Asad Khan

Reply
- Jason Brownlee November 29, 2018 at 7:43 am #
  
  Thanks for the suggestion.
  
  Reply
Don Arias January 9, 2019 at 7:40 pm #

Thanks for the clear and limpid introduction like the water of the rock God bless you

Reply
- Jason Brownlee January 10, 2019 at 7:48 am #
  
  I’m glad it helped.
  
  Reply
Suraj May 23, 2019 at 7:14 am #

Dear Jason Bro.
For a few weeks I have been following each of your posts… and thanks would not be enough…. you are simply amazing…

Could you share the list of black box models (especially predictive models), please?

Reply
- Jason Brownlee May 23, 2019 at 2:29 pm #
  
  Thanks.
  
  Yes, start here:
  https://machinelearningmastery.com/start-here/#algorithms
  
  Reply
Suraj May 23, 2019 at 7:15 am #

Another question:
Is IoT a predictive model?

Reply
- Jason Brownlee May 23, 2019 at 2:30 pm #
  
  IoT means internet of things. It is not a model, it is a network of devices.
  
  Learn more here:
  https://en.wikipedia.org/wiki/Internet_of_things
  
  Reply
nicholas heimpel June 13, 2019 at 7:14 am #

Hi Jason, thanks for the article!

I have two questions for you.

1) How does what is being referred to in this article differ from a more classical approach to statistics (e.g. logistic regression)? In my field we collect a sample, apply statistics to the data, and draw conclusions from the data. To me this seems the same as steps 1, 2, and 3, respectively.

2) Is there a way to make the process you described recursive? In other words, with repeated sampling, dynamically adjust predictions in a sort of bayesian fashion? I am imagining a prediction that is repeatedly adjusted and improved from exposure to new data, eventually approaching the “true” parameter. I am interested in doing some of this kind of modeling, so any suggestions for python libraries or ML techniques are welcome!

Best,
Nic

Reply
- Jason Brownlee June 13, 2019 at 2:29 pm #
  
  Great questions.
  
  There is a lot of overlap. The main difference in applied machine learning is the shift in focus away from a descriptive model towards a predictive model. E.g. predictive skill at the expense of interpretability or result-first (ml) rather than model-first (stats).
  
  A good example is in stats we start with the idea of using a linear regression or a logistic regression then beat the data into shape to meet the expectations/requirements of our pre-chosen model. In ML, we don’t care so much about what the model is, only in what works best.
  
  Sure, models can be re-fit and reevaluated as new observations or new re-sampling of the data are performed. This may give you some ideas:
  https://machinelearningmastery.com/spot-check-machine-learning-algorithms-in-python/
  
  Reply
abdi July 7, 2019 at 5:59 pm #

The most helpful info I have ever seen. Thanks and respect, Jason.

I have 6 months of historical mobile network data, so I need to predict this time series data
using nonlinear autoregressive techniques, but I’m confused about how to extract the training and test datasets. Could anyone please help with my MATLAB simulation?

Reply
- Jason Brownlee July 8, 2019 at 8:39 am #
  
  I don’t have MATLAB examples, but you can find Python examples here:
  https://machinelearningmastery.com/start-here/#deep_learning_time_series
  
  Reply
sandipan sarkar July 19, 2019 at 4:14 am #

Hello Jason,
I have finished a data science course from “JIGSAW ACADEMY BASED IN BANGALORE(INDIA)” but I still had doubts about the meaning of “Predictive modelling”. After reading this article I have found a new source of inspiration.
This is the most magical line, which explains everything: “Your action step is to think through the three aspects (data, model, predictions) and relate them to a problem that you would like to work on.”

Best Regards
Sandipan Sarkar

Reply
- Jason Brownlee July 19, 2019 at 9:23 am #
  
  Thanks!
  
  Reply
Cherinet Mores August 4, 2019 at 9:54 pm #

Jason Brownlee, thank you for helping young developers. We appreciate your effort to help people who are interested. For now, I have one question. Would you try to solve this: “How to apply a genetic algorithm to the learning phase of
a neural network using backpropagation”? Thank you very much.

Reply
- Jason Brownlee August 5, 2019 at 6:52 am #
  
  I don’t have an example of using a GA to find neural network weights, sorry.
  
  I hope to have an example in the future.
  
  Reply
Cherinet Mores August 4, 2019 at 9:56 pm #

It is to solve a regression problem like house prices.

Reply
Toshi August 9, 2019 at 10:36 pm #

I think there is a typo at “We don’t need to keen the training data”.

Fantastic website! I’m learning a lot, thanks!

Reply
- Jason Brownlee August 10, 2019 at 7:17 am #
  
  Thanks, fixed!
  
  Reply
Zia September 14, 2019 at 12:58 pm #

Thanks for such a good explanation of predictive modeling. Can you give a link to some code?

Reply
- Jason Brownlee September 15, 2019 at 6:16 am #
  
  Thanks.
  
  Here’s an example with code:
  https://machinelearningmastery.com/machine-learning-in-python-step-by-step/
  
  Reply
jhon September 23, 2019 at 12:33 pm #

If I had a single data series and no inputs, which technique should I apply, and what Python code, to make predictions? I have only a single variable over time, and I want to predict from it…

Reply
- Jason Brownlee September 23, 2019 at 1:46 pm #
  
  Perhaps this will help you to frame your problem:
  https://machinelearningmastery.com/how-to-define-your-machine-learning-problem/
  
  And this to help you make a prediction:
  https://machinelearningmastery.com/make-predictions-scikit-learn/
  
  Reply
mukul kumar February 9, 2020 at 3:32 am #

Sir, I have no credit card. How can I purchase your book? Please suggest.

Reply
- Jason Brownlee February 9, 2020 at 6:26 am #
  
  I also support PayPal.
  
  Reply
Guy Mak May 26, 2020 at 1:32 pm #

This is a well explained “basic” concept of predictive modelling. I salute you!!!

Reply
- Jason Brownlee May 27, 2020 at 7:40 am #
  
  Thanks!
  
  Reply
Ariya Watthanakarnkitikun February 8, 2021 at 2:06 am #

Thank you.

Reply
- Jason Brownlee February 8, 2021 at 7:03 am #
  
  You’re welcome!
  
  Reply
Shiv Malhotra April 22, 2021 at 10:24 pm #

Nice explanation of Predictive modelling

Reply
- Jason Brownlee April 23, 2021 at 5:02 am #
  
  Thanks.
  
  Reply
Akash V B December 28, 2023 at 11:33 pm #

Good explanation. It will be helpful for our study.
Thank you sir.

Reply
- James Carmichael December 29, 2023 at 10:10 am #
  
  Hi Akash…You are very welcome! We appreciate your support!
  
  Reply
Raakhi May 17, 2024 at 3:22 pm #

Very Crisp and to the point explanation. Thanks a lot !!

Reply
- James Carmichael May 18, 2024 at 7:35 am #
  
  Thank you for your feedback Raakhi! We appreciate your support!
  
  Reply
