Hi, Jason here. I’m the guy behind Machine Learning Mastery.
My goal is to help you get started, make progress and kick butt with machine learning.
I teach a top-down and results-first approach designed for developers and engineers.
This is unlike most academic textbooks and university courses.
You may be feeling overwhelmed. You may have a lot of questions.
I created this page for you. It is your starting point.
Take your time. Bookmark this page. Find the answers to your questions.
Table of Contents
What do you need help with? Here are some quick links:
- How Do I Get Started?
- Applied Machine Learning Process
- Machine Learning Algorithms
- Study Machine Learning Algorithms
- Weka Machine Learning
- Python Machine Learning
- R Machine Learning
- Deep Learning
- Long Short-Term Memory (LSTM)
- Time Series Forecasting
- Frequently Asked Questions (FAQ)
- Need More Help?
How Do I Get Started?
The most common question I’m asked is: “how do I get started?”
My best advice for getting started in machine learning is broken down into a 5-step process:
- Step 1: Adjust Mindset. Believe you can practice and apply machine learning.
- Step 2: Pick a Process. Use a systematic process to work through problems.
- Step 3: Pick a Tool. Select a tool for your level and map it onto your process.
- Step 4: Practice on Datasets. Select datasets to work on and practice the process.
- Step 5: Build a Portfolio. Gather results and demonstrate your skills.
For more on this top-down approach, see:
Many of my students have used this approach to go on and do well in Kaggle competitions and get jobs as Machine Learning Engineers and Data Scientists.
Applied Machine Learning Process
The benefits of machine learning are the predictions and the models that make them.
To have skill at applied machine learning means knowing how to consistently and reliably deliver high-quality predictions on problem after problem. You need to follow a systematic process.
Below is a 5-step process that you can follow to consistently achieve above average results on predictive modeling problems:
- Step 1: Define your problem.
- Step 2: Prepare your data.
- Step 3: Spot-check algorithms.
- Step 4: Improve results.
- Step 5: Present results.
For a good summary of this process, see the posts:
- Applied Machine Learning Process
- How to Use a Machine Learning Checklist to Get Accurate Predictions
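The five steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming scikit-learn is installed; the iris dataset, the three candidate algorithms, and the 10-fold test harness are my own choices for the example, not a fixed recipe.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Step 1: define the problem (here: classify iris species from four measurements).
X, y = load_iris(return_X_y=True)

# Step 2: prepare the data. Scaling lives inside a pipeline so that each
# cross-validation fold is standardized using only its own training split.
def prepared(model):
    return make_pipeline(StandardScaler(), model)

# Step 3: spot-check a handful of algorithms with the same test harness.
candidates = {
    "logistic_regression": prepared(LogisticRegression(max_iter=1000)),
    "k_nearest_neighbors": prepared(KNeighborsClassifier()),
    "decision_tree": prepared(DecisionTreeClassifier(random_state=7)),
}
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)
results = {
    name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
    for name, model in candidates.items()
}

# Step 4 (improve results) would tune the most promising candidates further.
# Step 5: present the results, best first.
for name, score in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: mean accuracy {score:.3f}")
```

The point of the sketch is the harness, not the scores: every algorithm is evaluated the same way, so the comparison is fair.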
Machine Learning Algorithms
Machine learning is about machine learning algorithms.
You need to know what algorithms are available for a given problem, how they work, and how to get the most out of them.
Here’s how to get started with machine learning algorithms:
- Step 1: Discover the different types of machine learning algorithms.
- Step 2: Discover the foundations of machine learning algorithms.
- Step 3: Discover how top machine learning algorithms work.
You can see all machine learning algorithm posts here. Below is a selection of some of the most popular tutorials.
Study Machine Learning Algorithms
Machine learning algorithms make up a big part of applied machine learning.
There is a lot of benefit in studying machine learning algorithms and learning how to get the most out of them.
Below is a simple 5-step process that you can use to study and learn any machine learning algorithm.
- Step 1: Create lists of machine learning algorithms
- Step 2: Research machine learning algorithms
- Step 3: Create your own algorithm descriptions
- Step 4: Investigate algorithm behavior
- Step 5: Implement machine learning algorithms
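As an illustration of Step 5, here is a small from-scratch sketch of one simple algorithm, k-nearest neighbors, using only the Python standard library. The function names and the tiny dataset are made up for the example; treat it as a learning exercise rather than a production implementation.

```python
import math
from collections import Counter

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_rows, train_labels, query, k=3):
    """Predict the majority label among the k training rows closest to query."""
    distances = sorted(
        (euclidean_distance(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Tiny made-up dataset: two well-separated clusters.
train_rows = [[1.0, 1.1], [1.2, 0.9], [0.8, 1.0], [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_rows, train_labels, [1.1, 1.0]))
print(knn_predict(train_rows, train_labels, [5.0, 5.0]))
```

Writing an algorithm out like this is a good way to investigate its behavior (Step 4), for example by changing k and watching how predictions near the cluster boundary change.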
For a detailed overview of this approach see the post:
Weka Machine Learning
Weka is a platform that you can use to get started in applied machine learning.
It has a graphical user interface, meaning that no programming is required, and it offers a suite of state-of-the-art algorithms.
Here’s how you can get started with Weka:
- Step 1: Discover the features of the Weka platform.
- Step 2: Discover how to get around the Weka platform.
- Step 3: Discover how to deliver results with Weka.
You can see all Weka machine learning posts here. Below is a selection of some of the most popular tutorials.
Prepare Data in Weka
- How To Load CSV Machine Learning Data in Weka
- How to Better Understand Your Machine Learning Data in Weka
- How to Normalize and Standardize Your Machine Learning Data in Weka
- How To Handle Missing Values In Machine Learning Data With Weka
- How to Perform Feature Selection With Machine Learning Data in Weka
Weka Algorithm Tutorials
Python Machine Learning
Python is one of the fastest-growing platforms for applied machine learning.
You can use the same tools, such as pandas and scikit-learn, for both the development and the operational deployment of your model.
Below are the steps that you can use to get started with Python machine learning:
- Step 1: Discover Python for machine learning.
- Step 2: Discover the ecosystem for Python machine learning.
- Step 3: Discover how to work through problems using machine learning in Python.
You can see all Python machine learning posts here. Below is a selection of some of the most popular tutorials.
Prepare Data in Python
Machine Learning in Python
- Evaluate the Performance of Machine Learning Algorithms
- Metrics To Evaluate Machine Learning Algorithms in Python
- Spot-Check Classification Machine Learning Algorithms in Python with scikit-learn
- Spot-Check Regression Machine Learning Algorithms in Python with scikit-learn
- How To Compare Machine Learning Algorithms in Python with scikit-learn
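To make the idea of using the same tools in development and deployment concrete, here is a minimal sketch assuming pandas and scikit-learn are installed. The data and column names are invented for the example, and pickle is just one possible way to move a fitted model between environments.

```python
import pickle

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Made-up training data; the column names are illustrative only.
train = pd.DataFrame({
    "hours_studied": [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0],
    "hours_slept": [4.0, 5.0, 5.0, 6.0, 7.0, 7.0, 8.0, 8.0],
    "passed": [0, 0, 0, 0, 1, 1, 1, 1],
})
features = ["hours_studied", "hours_slept"]

# Development: bundle data preparation and the model into one object.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])
pipeline.fit(train[features], train["passed"])

# Deployment: serialize the fitted pipeline, then load it where predictions are served.
saved = pickle.dumps(pipeline)
restored = pickle.loads(saved)

new_rows = pd.DataFrame({"hours_studied": [1.5, 8.5], "hours_slept": [4.5, 8.0]})
print(restored.predict(new_rows))
```

Because the scaler travels inside the pipeline, new data is prepared exactly the same way at prediction time as it was during training.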
R Machine Learning
R is a platform for statistical computing and is one of the most popular platforms among professional data scientists.
It’s popular because of the large number of techniques available, and because of excellent interfaces to these methods such as the powerful caret package.
Here’s how to get started with R machine learning:
- Step 1: Discover the R platform and why it is so popular.
- Step 2: Discover machine learning algorithms in R.
- Step 3: Discover how to work through problems using machine learning in R.
You can see all R machine learning posts here. Below is a selection of some of the most popular tutorials.
Data Preparation in R
Applied Machine Learning in R
Deep Learning
Deep learning is a fascinating and powerful field.
State-of-the-art results are coming from the field of deep learning and it is a sub-field of machine learning that cannot be ignored.
Here’s how to get started with deep learning:
- Step 1: Discover what deep learning is all about.
- Step 2: Discover the best tools and libraries.
- Step 3: Discover how to work through problems and deliver results.
You can see all deep learning posts here. Below is a selection of some of the most popular tutorials.
- Crash Course On Multi-Layer Perceptron Neural Networks
- Crash Course in Convolutional Neural Networks for Machine Learning
- Crash Course in Recurrent Neural Networks for Deep Learning
Convolutional Neural Networks
- Handwritten Digit Recognition using Convolutional Neural Networks in Python with Keras
- Object Recognition with Convolutional Neural Networks in the Keras Deep Learning Library
- Predict Sentiment From Movie Reviews Using Deep Learning
Recurrent Neural Networks
Long Short-Term Memory (LSTM)
Long Short-Term Memory (LSTM) Recurrent Neural Networks are designed for sequence prediction problems and are a state-of-the-art deep learning technique for challenging prediction problems.
Here’s how to get started with LSTMs in Python:
- Step 1: Discover the promise of LSTMs.
- Step 2: Discover where LSTMs are useful.
- Step 3: Discover how to use LSTMs on your project.
You can see all LSTM posts here. Below is a selection of some of the most popular tutorials using LSTMs in Python with the Keras deep learning library.
Data Preparation for LSTMs
- How to Reshape Input Data for Long Short-Term Memory Networks
- How to One Hot Encode Sequence Data
- How to Remove Trends and Seasonality with a Difference Transform
- How to Scale Data for Long Short-Term Memory Networks
- How to Prepare Sequence Prediction for Truncated BPTT
- How to Handle Missing Timesteps in Sequence Prediction Problems
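Several of these preparation steps can be shown together on a toy series. The sketch below assumes NumPy is installed and uses a made-up series and a hypothetical window length of 3; it differences the data, scales it, and reshapes it into the three-dimensional [samples, timesteps, features] layout that Keras LSTM layers expect.

```python
import numpy as np

# Made-up univariate series with an upward trend.
series = np.array([10.0, 12.0, 15.0, 19.0, 24.0, 30.0, 37.0])

# Difference transform: remove the trend by modeling the change between steps.
diffed = np.diff(series)  # one value shorter than the original series

# Scale to [0, 1]. In a real project, compute low and high on the training split only.
low, high = diffed.min(), diffed.max()
scaled = (diffed - low) / (high - low)

# Reshape into [samples, timesteps, features] for an LSTM input layer.
timesteps = 3  # hypothetical window length
samples = len(scaled) // timesteps
X = scaled[: samples * timesteps].reshape(samples, timesteps, 1)
print(X.shape)

# Invert both transforms to recover values in the original units.
unscaled = scaled * (high - low) + low
restored = np.concatenate(([series[0]], series[0] + np.cumsum(unscaled)))
print(restored)
```

Keeping the inverse transforms next to the forward ones matters, because forecasts made on differenced and scaled data have to be converted back before they can be evaluated.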
Modeling with LSTMs
- Generative Long Short-Term Memory Networks
- Stacked Long Short-Term Memory Networks
- Encoder-Decoder Long Short-Term Memory Networks
- CNN Long Short-Term Memory Networks
- Diagnose Overfitting and Underfitting of LSTM Models
- How to Make Predictions with Long Short-Term Memory Models
LSTM for Time Series
XGBoost
XGBoost is a highly optimized implementation of gradient boosted decision trees.
It is popular because it is being used by some of the best data scientists in the world to win machine learning competitions.
Here’s how to get started with XGBoost:
- Step 1: Discover the Gradient Boosting Algorithm.
- Step 2: Discover XGBoost.
- Step 3: Discover how to get good at delivering results with XGBoost.
You can see all XGBoost posts here. Below is a selection of some of the most popular tutorials.
- How to Configure the Gradient Boosting Algorithm
- Tune Learning Rate for Gradient Boosting with XGBoost in Python
- Stochastic Gradient Boosting with XGBoost and scikit-learn in Python
- How to Tune the Number and Size of Decision Trees with XGBoost in Python
- How to Best Tune Multithreading Support for XGBoost in Python
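The tuning ideas in these tutorials can be sketched without installing XGBoost at all. The example below deliberately swaps in scikit-learn's own GradientBoostingClassifier as a stand-in, on synthetic data, with a small grid I picked for illustration. XGBoost's scikit-learn wrapper accepts similarly named parameters (learning_rate, n_estimators, max_depth, subsample), so the same pattern should carry over, but check the XGBoost documentation before relying on that.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic binary classification data, used only to make the sketch runnable.
X, y = make_classification(n_samples=300, n_features=10, n_informative=5, random_state=7)

# subsample < 1.0 gives stochastic gradient boosting: each tree is fit on a
# random fraction of the training rows.
model = GradientBoostingClassifier(subsample=0.8, random_state=7)

# Search the trade-off between learning rate, number of trees, and tree depth.
param_grid = {
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [50, 100],
    "max_depth": [2, 4],
}
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=7)
search = GridSearchCV(model, param_grid, scoring="accuracy", cv=folds, n_jobs=1)
search.fit(X, y)

print(f"best accuracy: {search.best_score_:.3f}")
print(f"best parameters: {search.best_params_}")
```

Smaller learning rates generally need more trees to reach the same skill, which is why the two parameters are searched together rather than one at a time.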
Time Series Forecasting
Time series forecasting is an important topic in business applications.
Many datasets contain a time component, but the topic of time series is rarely covered in much depth from a machine learning perspective.
Here’s how to get started with Time Series Forecasting:
- Step 1: Discover Time Series Forecasting.
- Step 2: Discover Time Series as Supervised Learning.
- Step 3: Discover how to get good at delivering results with Time Series Forecasting.
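Step 2 is easier to see in code. Below is a minimal sliding-window sketch in plain Python; the function name series_to_supervised and the sample values are my own choices for the illustration.

```python
def series_to_supervised(series, n_lags=1):
    """Frame a univariate series as (inputs, output) rows using a sliding window.

    Each row uses the previous n_lags observations as inputs and the
    observation that follows them as the value to predict.
    """
    rows = []
    for i in range(n_lags, len(series)):
        rows.append((list(series[i - n_lags:i]), series[i]))
    return rows

series = [10, 20, 30, 40, 50]
for inputs, output in series_to_supervised(series, n_lags=2):
    print(inputs, "->", output)
```

Once the series is framed this way, any standard regression algorithm can be applied to it, as long as the order of the rows is respected when splitting the data.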
You can see all Time Series Forecasting posts here. Below is a selection of some of the most popular tutorials.
Data Preparation Tutorials
- How to Make Baseline Predictions for Time Series Forecasting with Python
- How to Check if Time Series Data is Stationary with Python
- How to Create an ARIMA Model for Time Series Forecasting with Python
- How to Grid Search ARIMA Model Hyperparameters with Python
- How to Work Through a Time Series Forecast Project
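A baseline prediction is worth having before any ARIMA or LSTM model. Here is a small sketch of a persistence forecast evaluated with walk-forward validation, in plain Python. The function names and the data are invented for the example.

```python
import math

def persistence_forecast(history):
    """Naive baseline: predict that the next value equals the last observed one."""
    return history[-1]

def walk_forward_rmse(series, n_test, forecast):
    """Evaluate a forecast function one step at a time over the last n_test points."""
    history = list(series[:-n_test])
    squared_errors = []
    for actual in series[-n_test:]:
        predicted = forecast(history)
        squared_errors.append((actual - predicted) ** 2)
        history.append(actual)  # the true value becomes available for the next step
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up monthly values.
series = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
baseline_rmse = walk_forward_rmse(series, n_test=4, forecast=persistence_forecast)
print(f"persistence RMSE: {baseline_rmse:.3f}")
```

Any more sophisticated model can be passed in as the forecast argument and scored with the same harness. If it cannot beat the persistence score, it is not adding value.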
Frequently Asked Questions (FAQ)
This section lists frequently asked questions and my best answers.
What programming language should I use for machine learning?
The specific programming language or platform does not matter.
- I strongly believe that the best thing to focus on is how to work through machine learning problems end-to-end (learn more).
- That being said, I think that if you’re not a strong programmer, Weka is the best place to start because you can work through problems without writing a line of code (learn more).
- I think Python is excellent for developing models that can run in production, and it is a growing platform (learn more).
- I think R might be the most powerful platform, but it requires learning a new programming language. I think the sweet spot for R is one-off projects and R&D projects (learn more).
Also, this post might help:
If pressed to answer, I would recommend that you start with Python (learn more).
How do I get a job without a degree?
I teach a top-down and results-first approach to machine learning.
This means that you very quickly learn how to work through predictive modeling problems and deliver results.
As part of this process, I teach a method of developing a portfolio of completed projects. This demonstrates your skill and gives you a platform from which to take on ever more challenging projects.
It is this ability to deliver results, and the projects that demonstrate it, that will get you a position.
Businesses only use credentials as a shortcut; they want results more than anything else.
Here’s more information on the portfolio approach:
Here’s more information on why you don’t need a degree:
Can you be my mentor or coach?
Thanks for asking.
I would love to help, but I just don’t have the capacity.
I do offer a structured and top-down approach to machine learning self-study. You can learn more about it here:
I am happy to continue to answer any machine learning questions you might have by email (one question at a time please).
Why do you focus on Python for machine learning?
I like to use different tools depending on the project.
Recently I have been focusing more attention on Python-based tools and libraries.
It seems that Python may be emerging as a dominant platform. Skills in Python for machine learning are in great demand. I am just serving this need.
See the post:
Can you answer a question about a blog post?
Please ask your question in the comments of the blog post.
Why don’t you have a post or book on […]?
I will get to it eventually, I hope.
Until then, contact me and let me know about the topic you want me to cover.
What should I do if I’ve found an error?
Please contact me so that I can correct it.
What mathematical background do I need for machine learning?
I’m not your guy.
My mission is to help you get started and get good at applied machine learning using a “learn by doing” philosophy. I teach using a top-down and results-first approach.
If you want the bottom-up theory-first approach to machine learning, I would recommend a textbook or a multi-year graduate program. It is a path to theoretical machine learning and academia.
That being said, it is generally recognized that eventually you will need to know your way around the intersection of these fields:
- Linear Algebra
Don’t make the beginner’s mistake of thinking you need to start here. Read this:
Also, read this post:
What is your background?
I have a bunch of degrees in computer science and artificial intelligence and I have worked many years in the tech industry in teams where your code has to work and be maintainable.
Academia was a bad fit for me, but I loved to research and to write. I think of myself as a good engineer who really wants to help other people get started and get good at machine learning, without wasting years of their life “getting ready to get started”.
What machine learning project should I work on?
My best advice is to work on problems or with technology that most interests you.
Here are some more specific suggestions:
Consider working through a standard machine learning dataset:
- Practice Machine Learning with Datasets from the UCI Machine Learning Repository
- 7 Time Series Datasets for Machine Learning
Consider working on some more advanced datasets from competitive machine learning:
- Tour of Real-World Machine Learning Problems
- 10 Challenging Machine Learning Time Series Forecasting Problems
Consider working on problems that matter to you:
Consider devising your own projects:
What research topic should I work on?
I don’t know.
My focus is on industrial machine learning.
I think the best person to talk to about research topics is your research advisor.
Best of luck with your project.
Can I use your code in my own project?
Yes, but understand that it was developed and provided for educational purposes.
I take no responsibility for the code, what it might do or how you might use it.
Can you consult on our project?
Sorry, I no longer have the capacity or interest.
It takes too much time and mental space away from writing new tutorials and books, which is the most effective way I can help the most people.
Can you recommend someone?
Sorry, I want to stay away from the whole recruiting thing.
From what I have experienced, the great people are always fully engaged. Your job is to find them and offer them something more interesting.
Can you help me with my project?
Sorry, I don’t have the capacity to get involved in your project at the level you need or at a level to do a good job.
I am happy to answer any specific questions you have about machine learning.
Can you read, review, collaborate or help with my research paper?
I no longer consider myself an academic. I cannot give you expert academic advice.
I am happy to answer any specific questions you have about machine learning.
What school, university, or course should I take?
I don’t know.
My focus is industrial machine learning; I no longer have an opinion on schools and courses.
If it looks good to you, go for it.
Remember, you do not need a higher degree to do very well in applied machine learning.
Can you read, review or debug my code?
I’m eager to help, but I just don’t have the capacity to debug code for you.
I am happy to make some suggestions:
- Consider aggressively cutting the code back to the minimum required. This will help you isolate the problem and focus on it.
- Consider cutting the problem back to just one or a few simple examples.
- Consider finding other similar code examples that do work and slowly modify them to meet your needs. This might expose your misstep.
- Consider posting your question and code to StackOverflow.
Can you help me set up or debug my development environment?
I’m eager to help, but I just don’t have the capacity.
My material is generally intended for those that know their way around their own workstation and know how to install software.
Check these tutorials for setting up your environment:
- How to Setup a Python Environment for Machine Learning and Deep Learning with Anaconda
- How to Install a Python 3 Environment on Mac OS X for Machine Learning and Deep Learning
- How to Create a Linux Virtual Machine For Machine Learning Development With Python 3
Consider posting your question and issue to StackOverflow.
Need More Help?
I’m here to help you become awesome at applied machine learning.
If you still have questions and need help, you have some options:
- Ebooks: I sell a catalog of Ebooks that show you how to get results with machine learning, fast.
- Blog: I write a lot about applied machine learning on the blog; try the search feature.
- Contact: You can contact me with your question, but one question at a time please.