Last Updated on June 18, 2016
There is no single right way to get into machine learning. We all learn in slightly different ways and have different objectives for what we want to do with or for machine learning.
A common goal is to get productive with machine learning quickly. If that is your goal then this post highlights five common mistakes programmers make on the path to quickly becoming productive machine learning practitioners.
1. Put Machine Learning on a pedestal
Machine learning is just another bag of techniques that you can use to create solutions to complex problems.
Because it is a burgeoning field, machine learning is typically communicated in academic publications and textbooks for postgraduate students. This gives it the appearance that it is elite and impenetrable.
A mindset shift is required to be effective at machine learning, from technology to process, from precision to “good enough”, but the same could be said for other complex methods that programmers are interested in adopting.
2. Write Machine Learning Code
Starting in machine learning by writing code can make things difficult because it means that you are solving at least two problems rather than one: how a technique works so that you can implement it and how to apply the technique to a given problem.
It is much easier to work on one problem at a time and leverage machine learning and statistical environments and libraries of algorithms to learn how to apply a technique to a problem. This allows you to spot check a variety of algorithms relatively quickly and tune the one or two that look promising, rather than investing large amounts of time interpreting ambiguous research papers containing algorithm descriptions.
Implementing an algorithm can be treated as a separate project to be completed at a later time, such as for a learning exercise or if the prototype system needs to be put into operation. Learn one thing at a time. I recommend starting with a GUI-based machine learning framework, whether you’re a programmer or not.
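As a rough sketch of what spot checking looks like, the harness below scores every candidate with the same k-fold cross-validation and ranks the results. Everything in it is an assumption made for illustration: the names `spot_check` and `cross_val_accuracy`, the toy data, and the two deliberately trivial stand-in models. In practice the candidates would come from a library rather than be hand-written; any object with `fit` and `predict` methods slots in.

```python
import random
from collections import Counter


class MajorityClass:
    """Baseline stand-in: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class OneNearestNeighbour:
    """Stand-in: predicts the label of the closest training point."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            self.y_[min(range(len(self.X_)), key=lambda i: sq_dist(row, self.X_[i]))]
            for row in X
        ]


def cross_val_accuracy(make_model, X, y, k=5, seed=0):
    """Mean accuracy over k folds; make_model() returns a fresh, unfitted model."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in fold])
        correct = sum(p == y[i] for p, i in zip(preds, fold))
        scores.append(correct / len(fold))
    return sum(scores) / k


def spot_check(candidates, X, y):
    """Score every candidate the same way and rank them, best first."""
    results = {name: cross_val_accuracy(make, X, y) for name, make in candidates.items()}
    return sorted(results.items(), key=lambda item: item[1], reverse=True)


# Toy data (an assumption): two well-separated clusters, 20 points each.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
X += [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(20)]
y = [0] * 20 + [1] * 20

ranking = spot_check({"majority": MajorityClass, "1-nn": OneNearestNeighbour}, X, y)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

The point of the design is that the evaluation procedure is fixed once and every algorithm passes through it unchanged, so the ranking compares like with like before any time is spent on tuning.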
3. Doing Things Manually
A process surrounds applied machine learning including problem definition, data preparation and presentation of results, among other tasks. These processes along with the testing and tuning of algorithms can and should be automated.
Automation is a big part of modern software development for builds, tests and deployment. There is great advantage in scripting data preparation, algorithm testing and tuning and the preparation of results in order to gain the benefits of rigor and speed of improvement. Remember and reuse the lessons learned in professional software development.
The failure to start with automation (such as Makefiles or a similar build system) is likely due to the fact that many programmers come to machine learning from books and courses that have less focus on the applied nature of the field. In fact, bringing automation to applied machine learning is a huge opportunity for programmers.
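A minimal sketch of a scripted experiment follows, assuming a synthetic data set and a one-parameter model as stand-ins for real data preparation and a real algorithm. The function names (`prepare_data`, `evaluate`, `run_experiment`) and the config keys are hypothetical. The idea it illustrates is that every step from data to results is driven by one config, so the whole run can be repeated exactly.

```python
import json
import random
import statistics


def prepare_data(seed, n=200):
    """Stand-in for data preparation: synthetic (x, y) pairs with y = 3x + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 3.0 * x + rng.gauss(0, 1)) for x in xs]


def evaluate(data, slope):
    """Mean absolute error of the one-parameter model y = slope * x."""
    return statistics.fmean(abs(y - slope * x) for x, y in data)


def run_experiment(config):
    """Run every step from data to results, driven only by the config."""
    data = prepare_data(config["seed"])
    scores = {str(s): evaluate(data, s) for s in config["slopes"]}
    best = min(scores, key=scores.get)
    return {"config": config, "scores": scores, "best_slope": best}


# The config is the single record of what was run; keep it with the results.
config = {"seed": 42, "slopes": [2.0, 2.5, 3.0, 3.5]}
report = run_experiment(config)
print(json.dumps(report, indent=2))
```

Because the seed and the candidate parameters live in the config rather than in someone's memory, rerunning the script with the same config reproduces the same report, and a build tool can call it like any other target.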
4. Reinvent Solutions to Common Problems
Hundreds and thousands of people have likely implemented the algorithm you are implementing before you, or have solved a problem type similar to the problem you are solving. Exploit their lessons learned.
There is a wealth of knowledge out there on solving applied machine learning problems. Granted, much of it may be tied up in books and research publications, but you can access it. Do your homework: search Google, Google Books and Google Scholar, and reach out to the machine learning community.
If you are implementing an algorithm:
- Do you have to implement it? Can you reuse an existing open source algorithm implementation in a library or tool?
- Do you have to implement from scratch? Can you code review, learn from or port an existing open source implementation?
- Do you have to interpret the canonical algorithm description? Are there algorithm descriptions in other books, papers, theses, or blog posts that you can review and learn from?
If you are addressing a problem:
- Do you have to test all algorithms on the problem? Can you exploit studies on this or similar problem instances of the same general type that suggest algorithms and algorithm classes that perform well?
- Do you have to collect your own data? Are there publicly available data sets or APIs that you can use directly or as a proxy for your problem to quickly learn which methods are likely to perform well?
- Do you have to optimize the parameters of the algorithm? Are there heuristics for configuring the algorithm presented in papers or studies of the algorithm?
What would be your strategy if you have a problem with a programming library or a specific type of data structure? Use the same tactics in the field of machine learning. Reach out to the community and ask for resources that you may be able to exploit to accelerate your learning and progress on your project. Consider forums and Q&A sites to start with and contact academics and specialists as the next step.
5. Ignoring the Math
You do not need the mathematical theory to get started, but maths is a big part of machine learning. The reason for this is it provides perhaps the most efficient and unambiguous way to describe problems and the behaviors of systems.
Ignoring the mathematical treatments of algorithms can lead to problems such as having a limited understanding of a method or adopting a limited interpretation of an algorithm. For example, many machine learning algorithms have an optimization at their core that is incrementally updated. Knowing about the nature of the optimization being solved (for example, whether the function is convex) allows you to use efficient optimization algorithms that exploit this knowledge.
Internalizing the mathematical treatment of algorithms is slow and comes with mastery. Particularly if you are implementing advanced algorithms from scratch, including the internal optimization algorithms, take the time to learn the algorithm from a mathematical perspective.
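To make the convexity point concrete, here is a small sketch using plain gradient descent on two one-dimensional functions chosen purely for illustration: the convex f(x) = (x - 2)^2 and the non-convex g(x) = x^4 - 4x^2 + x. The functions, step size and starting points are all assumptions. On the convex function both starting points reach the same minimum; on the non-convex one they settle in different local minima.

```python
def gradient_descent(grad, x0, lr=0.01, steps=2000):
    """Plain gradient descent with a fixed step size."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def convex_grad(x):
    """Gradient of the convex function f(x) = (x - 2)^2."""
    return 2 * (x - 2)


def nonconvex_grad(x):
    """Gradient of the non-convex function g(x) = x^4 - 4x^2 + x."""
    return 4 * x**3 - 8 * x + 1


# Same two starting points for both functions.
convex_a = gradient_descent(convex_grad, -3.0)
convex_b = gradient_descent(convex_grad, 3.0)
nonconvex_a = gradient_descent(nonconvex_grad, -3.0)
nonconvex_b = gradient_descent(nonconvex_grad, 3.0)

print("convex:    ", convex_a, convex_b)
print("non-convex:", nonconvex_a, nonconvex_b)
```

This is why the convexity question matters in practice: for a convex problem a simple local method is enough to find the global minimum, while for a non-convex one the answer depends on where you start, so restarts or other strategies are needed.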
In this post you learned about 5 common mistakes that programmers make when getting started in machine learning. The five lessons are:
- Don’t put machine learning on a pedestal
- Don’t write machine learning code
- Don’t do things manually
- Don’t reinvent solutions to common problems
- Don’t ignore the math
UPDATE: Continue the conversation on HackerNews and DataTau.
Typo in the summary:
“Don’t put machine learning is put on a pedestal”
Did you mean Don’t ignore the math in 5?
Heck, I will try it – even if I suck at math, it sounds somewhat motivating: “you do not need the full theory” 🙂
100% agree with point 3, in particular your comment that it is a great opportunity for software engineers. One difference between statistics 20 years ago and machine learning today is that it has become a software engineering problem as much as it is a stats problem – and absolutely, automation is a life-saver there, if only because it allows you to replicate results later. I have had some very unpleasant moments where I “lost” a model because it wasn’t scripted, and I was unable to replicate the steps that led to a particular result a couple of days later.
I am on the fence on point 2. On one hand, I agree that implementing algorithms from scratch is a great way to waste time; on the other hand, you could make the argument that, just like you need to dig a bit into the math to truly understand an algorithm, knowing how the implementation works is another angle into understanding what is really going on (and potentially easier than the math route for a software engineer). I am always a bit worried about people applying existing implementations without fully understanding what is happening. I would say that by default, if there is an existing implementation and you know the algorithm well, use it; otherwise, if you want to learn an approach, dig into the math AND the implementation – you don’t fully understand an algorithm until you’ve attempted to write it yourself 😉
“If that is your goal than this post highlights.. ” – should be “then”.
“learning by writing code can be make things difficult”
“Hundreds and thousands of people have likely implemented the algorithm you are implemented “
Thanks a lot martin, fixed. I need to take a lot more care proofing (or hire you!).
I arrived at similar conclusions when I started working on machine learning algorithms a few years ago. Coming from a heavy database/development background, it felt like a monumental task to take on.
With all the available libraries and frameworks, it is surprising how much of the mathematical modelling is already abstracted for you.
I do recommend, though, subscribing to a few Coursera lectures on the topic, which provide great insight into benchmarking, measurement and the whole “researchy” approach (design => build => measure => iterate) towards building a machine learning system.
Very nice write up!
Good list of points. Coming from a pure Computer Science background, I think I made the mistake of delving too deep into, let’s say, the top 10 algorithms used in ML, their inner workings and their mechanics, rather than using the pre-built ML libraries available in various toolkits to solve real-life problems.
A very common and easy mistake to make!
Hi, you mentioned ML Q&A sites and ML forums. Can you suggest a couple?
You have some kind of typo: “Starting in machine learning by writing code can make things difficult because it means that you are solving at least two problems rather than one: how a techniques so that you can implement it and how to apply the technique to a given problem.” I think you meant to say “how a technique works”.
Thanks Jamieson, fixed.
“rather than investing large amounts of time interpreting ambiguous research papers containing algorithm descriptions”
Ha ha ha. Yes indeed. God you should read through the pretentious crap the financial world produces. Endless academic papers promising appetising profit from obscure techniques which turn out to be just that. “Academic”.
Hi, it’s interesting to read the details, but I keep thinking about the term Machine Learning.
In this digital world, everyone works on system/computer applications.
At the outset, could the term Machine Learning be confused with learning assembly language, or with learning how some older/legacy machines work and how they can be tuned to improve performance?
The details are important, but can come later.
Using the programming analogy, you can start with Python, then learn C then learn ASM. You do not (and should not!?!?!) start with ASM, you will likely give up in frustration, and you can make a lot more progress with Python in the same time it would take.
Does Java have libraries that support machine learning?
Yes, take a look at this post:
What book do you recommend? And also which language is suitable?
Python is a great platform to start with adio. You can get started here:
Nice article for an introduction to Machine Learning.
A reference link to this article has been added here:
Thanks, I hope it helps.
Thank you for the article. I would add that another common mistake is to wait for the perfect moment to start implementing and “playing around” with the concepts and algorithms. Many people enroll in this or that course, read thousands of books and are always shifting from one source to the other and fail to execute.
I call this “getting ready to start”. It’s just fancy procrastination.
What is the hardest part in applied Machine Learning? It seems that all you need is just a few lines of code, data preparation and result checking.
I wrote a post on this topic: