Tour of Real-World Machine Learning Problems

By Jason Brownlee on September 5, 2016 in Start Machine Learning

Real-world examples make the abstract description of machine learning concrete.

In this post you will go on a tour of real-world machine learning problems. You will see how machine learning can actually be used in fields like education, science, technology and medicine.

Each machine learning problem listed also includes a link to the publicly available dataset. This means that if a particular machine learning problem interests you, you can download the dataset and start practicing immediately.

Real World Machine Learning (photo by SMI Eye Tracking, some rights reserved)

Most Popular Research Datasets

The following machine learning problems are the most popular on the University of California, Irvine (UCI) Machine Learning Repository, the website that has traditionally hosted datasets used by the machine learning research community.

Iris dataset. Given flower measurements in centimeters predict the species of iris.
Adult dataset. Given census data predict whether an individual will earn more than $50,000 a year.
Wine dataset. Given a chemical analysis of wines predict the origin of the wine.
Car evaluation dataset. Given details about cars predict the estimated safety of the car.
Breast Cancer Wisconsin dataset. Given the results of a diagnostic test on breast tissue, predict whether the mass is malignant or benign.
Abalone dataset. Given the measurements of abalone predict the age of the abalone.
Wine Quality dataset. Given various measurements of wine predict the quality of the wine.
Heart Disease dataset. Given the results of various diagnostic tests on a patient predict the amount of heart disease in the patient.
Poker Hand dataset. Given a database of poker hands predict the quality of the hand.
Human activity recognition using smartphones dataset. From smartphone movement data predict the type of activity performed by the person holding the smartphone.
Forest fires dataset. Given meteorological and other factors predict the burned area of forest fires.
Internet Advertisements dataset. Given the details of images on web pages predict whether an image is an advertisement or not.
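
As one way to start practicing on a problem like the first one in the list, the sketch below parses rows in the comma-separated layout of the Iris data file and predicts a species with a 1-nearest-neighbor rule. The rows in SAMPLE are a small illustrative subset, and the helper names load_rows and predict_species are made up for this example. In practice you would read the full file downloaded from the repository and would likely reach for a library such as scikit-learn.

```python
import csv
import io
import math

# Illustrative rows in the layout of the Iris data file:
# sepal length, sepal width, petal length, petal width (all in cm), species.
# This is a tiny hand-picked subset, not the full dataset.
SAMPLE = """\
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
"""


def load_rows(text):
    """Parse CSV text into a list of (measurements, species) pairs."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:  # skip blank lines
            continue
        *measurements, species = record
        rows.append(([float(value) for value in measurements], species))
    return rows


def predict_species(train, measurements):
    """Return the species of the training row closest to the measurements."""
    nearest = min(train, key=lambda row: math.dist(row[0], measurements))
    return nearest[1]


train = load_rows(SAMPLE)
print(predict_species(train, [5.0, 3.4, 1.5, 0.2]))  # prints Iris-setosa
```

With only six training rows this is a toy, but the same two steps (load the rows, then fit or query a model) apply to the other datasets in the list.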

Final Word

We took a whirlwind tour of 20 real-world machine learning problems.

These are actual problems posed or investigated by science and business organizations around the world.

What’s even more exciting is that these diverse problems have publicly available datasets and are also widely studied and understood.

This means you can download the data right now and explore the problem by implementing your own model, or reproduce someone else’s from a paper or blog post.

24 Responses to Tour of Real-World Machine Learning Problems

shivaprasad October 27, 2017 at 3:02 am #

I am very much impressed by this article, sir. It really helped a lot. Thank you, sir.

- Jason Brownlee October 27, 2017 at 5:25 am #
  
  Thanks.
  
Paul January 18, 2018 at 9:10 am #

Dear Mr. Jason,
Hundreds of thousands of students decide to take up machine learning, but more than half of them drop out due to sheer fear of the subject's complexity. You, on the other hand, did a fantastic job explaining the subject with such ease. I just wanted to extend a warm gesture of gratitude. Thanks a lot for helping me and thousands of others like me. Thank you.

- Jason Brownlee January 18, 2018 at 10:16 am #
  
  Thanks for your kind words Paul.
  
Aimee November 28, 2018 at 11:57 am #

Hi Jason! 🙂

I’m planning on playing around with the poker data set above and was going to try it with LDA, CART and finally Gradient Boosted Decision Trees (GBDT) with XGBoost, but I’m concerned about the classification process since some hands could fit into more than one class. Ideally, you want to predict the best possible hand out of multiple possibilities so I wasn’t quite sure how this may be done. Logically, I guess, you’d somehow determine all possible classes a hand could fit in and then use the class with the greatest value as the final answer since the classes increase as the hand improves. Any suggestions on this approach? What other models would you suggest trying for multi-class classification?

Thanks! Love your books so far!!! 😀

- Jason Brownlee November 28, 2018 at 2:52 pm #
  
  Sounds like an interesting problem, sorry, I’m not familiar with it. I’m hesitant to make suggestions.
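
The idea raised in the question above (list every class a hand fits, then keep the highest) can be sketched as a small rule-based labeler. This is only an illustration, not a model and not anything from the original post. It assumes the 0 to 9 ordinal class encoding and the (suit, rank) card encoding with ace = 1 that the UCI documentation gives for the Poker Hand dataset, and the function name best_hand_class is made up. In the dataset each hand carries a single label that is already its best class, so a standard multi-class classifier can be trained on it directly.

```python
from collections import Counter

# Assumed class encoding, following the UCI Poker Hand documentation.
HAND_NAMES = [
    "nothing", "one pair", "two pairs", "three of a kind", "straight",
    "flush", "full house", "four of a kind", "straight flush", "royal flush",
]

ROYAL_RANKS = [1, 10, 11, 12, 13]  # ace, ten, jack, queen, king


def best_hand_class(cards):
    """Return the highest class for five (suit, rank) cards, rank 1-13, ace = 1."""
    suits = [suit for suit, _ in cards]
    ranks = sorted(rank for _, rank in cards)
    counts = sorted(Counter(ranks).values(), reverse=True)

    flush = len(set(suits)) == 1
    consecutive = len(set(ranks)) == 5 and ranks[-1] - ranks[0] == 4
    straight = consecutive or ranks == ROYAL_RANKS  # ace may also play high

    # Collect every class the hand qualifies for, then keep the highest one.
    candidates = [0]
    if counts[0] >= 2:
        candidates.append(1)
    if counts[0] >= 2 and counts[1] >= 2:
        candidates.append(2)
    if counts[0] >= 3:
        candidates.append(3)
    if straight:
        candidates.append(4)
    if flush:
        candidates.append(5)
    if counts == [3, 2]:
        candidates.append(6)
    if counts[0] == 4:
        candidates.append(7)
    if straight and flush:
        candidates.append(8)
    if flush and ranks == ROYAL_RANKS:
        candidates.append(9)
    return max(candidates)


full_house = [(1, 2), (2, 2), (3, 2), (1, 5), (2, 5)]
print(HAND_NAMES[best_hand_class(full_house)])  # prints full house
```

A full house also qualifies as one pair, two pairs and three of a kind here, and taking the maximum resolves the overlap because the classes increase as the hand improves.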
  
Fredrick Ughimi February 13, 2019 at 6:54 pm #

Awesome! Thank you, Jason.

- Jason Brownlee February 14, 2019 at 8:41 am #
  
  Thanks, I’m glad it helped.
  
Santosh June 10, 2019 at 3:48 am #

Hi Jason,

Your knowledge is very vast and details over here are excellent. Thanks a lot.

I was looking for prediction models of application behavior, for example to predict when an application may crash or when it may start behaving differently.

Any help on the same would be excellent.

- Jason Brownlee June 10, 2019 at 7:38 am #
  
  Perhaps try searching on scholar.google.com
  
  - Santosh June 11, 2019 at 4:10 am #
    
    Thanks a lot. Let me search over there.
    
    - Jason Brownlee June 11, 2019 at 8:01 am #
      
      You’re welcome.
      
Gunasekaran September 6, 2019 at 2:05 pm #

Thanks Jason for the wonderful tip. I am from a non-computer-science background. I hear cool things about data science, so I wanted to learn machine learning. I just wanted to ask you a few questions. I can see a lot of POCs, research projects and sample datasets to practice machine learning, but:
If I get a job as a data scientist, what level of work would I be doing?
Is it using existing libraries to come up with a model, or inventing new algorithms?
If the big companies have ready-made drag-and-drop models available on the cloud platforms, what is the need for a data scientist there?

- Jason Brownlee September 7, 2019 at 5:12 am #
  
  Regarding jobs/roles, this might help:
  https://machinelearningmastery.com/machine-learning-tribe/
  
  Yes, existing libraries like scikit-learn are recommended and will do all the hard work:
  https://machinelearningmastery.com/start-here/#python
  
  Models are easy; preparing the data and discovering which model is appropriate (via experimentation/prototyping) requires humans/domain knowledge/intuition/data scientists.
  
  Great questions!
  
Santosh September 8, 2019 at 5:34 am #

Thanks Jason for all the input on ML. I was browsing through different study material but could not find how an ML model stores the information of a trained model. Is it a binary file created by pickle, or does it have its own database where it memorizes the patterns to predict on the next dataset?

Any study material would be helpful. Thanks once again in advance.

- Jason Brownlee September 9, 2019 at 5:07 am #
  
  Different models have a different internal representation.
  
  For example CART is a decision tree, a neural network is a set of weights, etc.
  
  The model-specific representation is saved to file.
  
  Does that help?
  
  - Santosh September 17, 2019 at 1:30 am #
    
    This helped a lot. Thanks. Where do we get this mapping? Once models are trained and saved using pickle, they are stored as a binary file.
    
    - Jason Brownlee September 17, 2019 at 6:32 am #
      
      If you use pickle, then the internal representation does not matter as pickle will handle the saving and loading.
      
      - Santosh September 17, 2019 at 3:41 pm #
        
        Thanks once again for your input.
      - Jason Brownlee September 18, 2019 at 5:55 am #
        
        You’re welcome.
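
The exchange above about internal representations and pickle can be sketched with the standard library alone. The two "models" below are hypothetical stand-ins written as plain Python data: a tiny decision tree as nested dictionaries and a list of weights. The feature names, thresholds and weight values are invented for illustration, and predict_tree is a made-up helper. The point is that pickle writes whatever object it is given to a binary file and returns an equal object on load, so the caller never handles the byte layout directly.

```python
import os
import pickle
import tempfile

# Hypothetical fitted models expressed as plain data (all values are invented).
tree = {
    "feature": "petal_length",
    "threshold": 2.5,
    "left": "Iris-setosa",
    "right": {
        "feature": "petal_width",
        "threshold": 1.7,
        "left": "Iris-versicolor",
        "right": "Iris-virginica",
    },
}
weights = [0.4, -1.2, 0.7]


def predict_tree(node, sample):
    """Walk the nested-dict tree until a leaf (a plain string) is reached."""
    while isinstance(node, dict):
        branch = "left" if sample[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node


# Save both representations to one binary file, then load them back.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as handle:
    pickle.dump({"tree": tree, "weights": weights}, handle)
with open(path, "rb") as handle:
    restored = pickle.load(handle)

sample = {"petal_length": 1.4, "petal_width": 0.2}
print(predict_tree(restored["tree"], sample))  # prints Iris-setosa
```

The restored tree and weights compare equal to the originals, which is all the calling code needs. Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.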
J Chouinard March 3, 2021 at 8:59 am #

Thank you Jason for this post. It gives motivation to look at different applications of Machine Learning before diving into it.

- Jason Brownlee March 3, 2021 at 1:54 pm #
  
  You’re welcome.
  
Suganya February 26, 2022 at 2:37 am #

Hello Mr. Jason, thanks a lot for sharing your intelligence with us. God will bless you for your good work.

- James Carmichael February 26, 2022 at 12:30 pm #
  
  Thank you for the feedback Suganya!
  