Last Updated on October 3, 2016
The 5 Most Common Mistakes That Beginners Make
And How To Avoid Them.
I help beginners get started in machine learning.
But I see the same mistakes in both mindset and action again and again.
In this post, you will discover the 5 most common ways that I see beginners slip up when getting started in machine learning.
I firmly believe that anyone can get started and do really well with applied machine learning.
Hopefully, you can identify yourself in one or more of the traps below and take some corrective action to get back on course.
Let’s get started.
1) Don’t Start With Theory
The traditional approach to teaching machine learning is bottom-up.
- Work hard to learn the background in math.
- Work hard to learn the theory of machine learning.
- Work hard to implement algorithms from scratch.
- ??? (insert magic here)
- Finally start using machine learning (your goal!).
This approach is slow. It’s hard. It’s designed for academics who want to extend the state of the art.
It is not designed for practitioners who want a result.
You know you are caught in this trap if you think or say things like:
- I need to complete this course in linear algebra first.
- I need to go back and get a Ph.D. first.
- I have to read this textbook first.
The Way Out
How does learning 4 years of math or esoteric algorithm theory get you to where you want to be?
You are more likely to stop. To fail. To not get any closer to your goal.
The solution is to flip the model.
If the valuable contribution of machine learning to the market is a set of accurate predictions, then learn how to model problems and make accurate predictions. Start here.
Then get really damn good at it.
Read, steal, harness the theory if you need it, but only in service of your goal. Only if it makes you better at delivering value.
2) Don’t Study All of Machine Learning
Machine learning is a very large field of study.
It is the automation of learning processes with computers and has deep overlap with Artificial Intelligence.
From esoteric learning theory to robotics. The field is massive.
The field is way too big for you to take on all of it.
You know you’ve succumbed to this trap if you think things like:
- I need to learn about each new technique mentioned on a new site.
- I need to learn about computer vision, natural language processing, speech, etc. first.
- I need to know everything about everything.
The Way Out
Pick one small corner and focus on it.
Then narrow it down again.
The most valuable area of machine learning is predictive modeling. Creating models from data to make predictions.
Next, focus on a type of predictive modeling that is most relevant or interesting to you.
Then stick with it.
Maybe you choose by technique, such as deep learning. Or maybe you choose by problem type, like recommender systems.
Maybe you’re not sure, so just pick one anyway. Get good or at the very least proficient.
Then, later, circle back to another area.
3) Don’t Fiddle Around With Algorithms
Machine learning is really about the algorithms.
There are a lot of algorithms. And each algorithm is a complex system and its own little field of study. Its own ecosystem.
You can lose yourself in an algorithm. And people do.
They’re called academics.
You’re in this trap if you find yourself saying:
- I need to know why it works before I use it.
- I need to deeply understand the hyperparameters first.
- I need to explain the cause and effect when tuning.
The Way Out
Algorithms are not results. They are a means to a result.
In fact, machine learning algorithms are a commodity.
Swap them out. Try out a ton of them on your problem. Tune them some, but move on.
You can learn more about algorithms to get a better result, but know when to stop.
Use a systematic process. Design tuning experiments and automate their execution and analysis.
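Here is a minimal sketch of what swapping algorithms and automating a tuning experiment can look like. It assumes Python with scikit-learn, which is my choice for illustration only, and it uses the small iris dataset that ships with the library as a stand-in for your problem. The three candidate algorithms and the grid of `n_neighbors` values are arbitrary picks, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Stand-in problem (an assumption): the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Candidate algorithms, treated as interchangeable parts.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "k_nearest_neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

# Use the same folds for every algorithm so the scores are comparable.
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Step 1: spot-check every candidate with identical evaluation.
results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
    results[name] = scores.mean()

for name, mean_accuracy in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {mean_accuracy:.3f}")

# Step 2: a small, automated tuning experiment on one candidate.
# The step name "kneighborsclassifier" is generated by make_pipeline.
search = GridSearchCV(
    estimator=candidates["k_nearest_neighbors"],
    param_grid={"kneighborsclassifier__n_neighbors": [1, 3, 5, 7, 9]},
    cv=folds,
    scoring="accuracy",
)
search.fit(X, y)
print("best params:", search.best_params_)
print(f"best cross-validated accuracy: {search.best_score_:.3f}")
```

The point is the shape of the process, not the numbers: one evaluation harness, many algorithms, a bounded tuning budget, then move on.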
Machine learning is all about good use of algorithms, but applied machine learning is not just fiddling with algorithms.
Focus on the goal of delivering a result from each project, that is, a set of predictions or a model that can make them.
4) Don’t Implement Everything From Scratch
You can learn a lot from implementing algorithms from scratch.
Sometimes you even need to implement a technique, because there is no suitable or available implementation.
But, generally, you don’t have to and you shouldn’t.
Your implementation will probably suck. Sorry.
- It will have bugs.
- It will be slow.
- It will be a memory hog.
- It will not deal with edge cases well.
- It might even be wrong.
You’re in this trap if:
- You’re writing code to load a CSV file (what the hell!?)
- You’re writing code for a standard algorithm like linear regression.
- You’re writing code for cross-validation or hyperparameter tuning.
The Way Out
- Use a general-purpose library used by tens or hundreds of thousands of other developers that handles all the edge cases and is known to be correct.
- Use a highly optimized library that squeezes every last cycle and every last byte of memory from your hardware.
- Use a graphical user interface for your own projects and avoid code altogether.
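To make the contrast concrete, here is a rough sketch of the three tasks from the trap list above (loading a CSV, fitting linear regression, running cross-validation) done with existing libraries. It assumes Python with pandas and scikit-learn, which are my picks for illustration. The CSV content and its column names (`size`, `rooms`, `price`) are made up and inlined so the example runs on its own.

```python
import io

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Hypothetical data, generated from price = 2*size + 10*rooms + 30.
# In practice you would pass a file path to pd.read_csv instead.
csv_text = """size,rooms,price
50,1,140
60,2,170
70,2,190
80,3,220
90,3,240
100,4,270
110,3,280
120,4,310
130,5,340
140,4,350
"""

# Loading a CSV: one call, with parsing and type inference handled for you.
data = pd.read_csv(io.StringIO(csv_text))
X = data[["size", "rooms"]]
y = data["price"]

# Linear regression: one line to create, one line to fit.
model = LinearRegression()
model.fit(X, y)
print("coefficients:", model.coef_, "intercept:", model.intercept_)

# Cross-validation: 5 folds, scored with R^2, no hand-written splitting code.
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print(f"mean R^2 across folds: {scores.mean():.3f}")
```

Because the made-up data is exactly linear, the fitted coefficients should come out very close to the values used to generate it. Real data will not be this tidy, but the library calls stay the same.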
Implementing everything every time you want to use it is a very slow way to get started in machine learning.
If you’re implementing for learning, then be honest with yourself and separate that from learning how to deliver value with applied machine learning.
5) Don’t Change Tools All The Time
There are a lot of great machine learning tools.
In fact, great tools, along with data availability and fast hardware, are why we are seeing a renaissance in machine learning.
But you can fall into the trap of jumping to each new tool you stumble across.
You’re in this trap if you find yourself:
- Using each new tool you hear about.
- Learning a new tool or language every week or month.
- Getting halfway through learning a library and leaving it behind for a new one.
The Way Out
Learn and use new tools.
But be strategic.
Integrate new tools into your systematic process for working through machine learning problems.
You’ll be a lot more efficient in working through problems if you pick one of the major platforms and stick with it, at least until you are good or proficient with it.
The top 3 platforms I recommend are:
There are others and there are more specialty tools if that is your area.
Follow-through is the difference between a hobbyist and a professional.
In this post, you discovered the 5 most common mistakes that I see made by beginners in machine learning.
Again, they were:
- Don’t Start With Theory.
- Don’t Study All of Machine Learning.
- Don’t Fiddle Around With Algorithms.
- Don’t Implement Everything From Scratch.
- Don’t Change Tools All The Time.
Have you fallen into any of these traps?
Do you need help getting out?
Leave a comment. I’m here to help.