Last Updated on June 7, 2016
In this post I lay out a concrete self-study roadmap for applied machine learning that you can use to orient yourself and figure out your next step.
I think a lot about frameworks and systematic approaches (as evidenced on my blog). I would consider this post a vast expansion of my previous thoughts on a self-study program in the post “Self-Study Guide to Machine Learning” that really struck a chord in the community.
Let’s jump in…
Machine Learning Roadmap
Machine learning is a huge field of study. There are so many algorithms, theories, techniques and classes of problems to learn about that it does feel overwhelming.
Machine learning is also deeply interdisciplinary. You can jump from material pitched at programmers to material pitched at statisticians, and it is frustrating when so much prior knowledge is assumed.
What is needed is a structured approach: a roadmap for which topics in machine learning to study and at what level of detail, one that also integrates popular resources like books and open courses.
The structured approach addresses the overwhelm by focusing attention on what you need to learn when you need to learn it. It addresses the frustration by sequencing the presentation of material with a focus on the practical side, tailored for engineers and programmers.
A roadmap lets you orient yourself in terms of where you are and where you want to be.
Self-Study is the Path
Self-study means at your own pace, on your own terms and on your own schedule.
Self-study is the best way to learn machine learning. That does not mean you have to do it all on your own, far from it. It means learning things the most efficient way for you and leveraging the best courses, books and guides available on the internet.
Self-study is also compatible with more formal courses like undergraduate and post-graduate studies. It means actively integrating the material into your own knowledge base and owning that process. In owning the process, you can dive deeper into the areas that interest you the most.
Machine learning is an applied discipline, like programming. Studying the theory is important, but you must put in the time to apply the theory. You must practice. This is critical. You need to build up the intuitions for the processes, algorithms and problems.
The structured approach to studying machine learning is broken down into four levels of competence:
- Beginner
- Novice
- Intermediate
- Advanced
The four levels are delineated by the problems that people at each level face and the learning objectives they have. In turn, each level has a different set of activities to pursue on the path towards its objective.
Problems at Each Level
Each competency level faces a different set of problems, as follows:
- Beginner: Confused about what machine learning actually is. Overwhelmed with the vast amount of information available. Frustrated at the unspecified assumed prior knowledge in most of the information that is available.
- Novice: Daunted by the mathematical descriptions of algorithms. Struggling with the application of machine learning to problems. Lost looking for problems to investigate with machine learning.
- Intermediate: Bored with the introductory material. Hungry for more detail and deeper insight. Aching to demonstrate and push their knowledge and skills.
- Advanced: Obsessed with getting the most from systems and solutions. Seeking opportunities for greater contribution. Inspired to push the boundaries.
Each level in the hierarchy of competencies has a single objective and many tasks that can be pursued toward that objective. Those objectives are as follows:
- Beginner: Develop a clear foundation and start your journey into the field.
- Novice: Develop and practice the process of applied machine learning.
- Intermediate: Develop a deeper understanding of algorithms, problems and tools.
- Advanced: Develop extensions to the field such as algorithms, problems and tools.
The objectives of each level define the types of activities to pursue for those objectives to be met. You can design your own activities (and you are strongly encouraged to), although the following are the suggested activities for each level.
Beginner:
- Discover the “whys” of machine learning (i.e. why machine learning matters and why it matters to you).
- Identify your self-limiting beliefs that may be holding you back (e.g. no degree, not good at mathematics).
- Investigate the basic definitions and concepts for the field (e.g. machine learning problems and algorithms).
Novice:
- Study and learn the steps in the process of applied machine learning.
- Understand enough of the details of a tool or library to work through the steps of applied machine learning (basic familiarity is enough).
- Practice the process of applied machine learning end-to-end on problems.
Intermediate:
- Perform small, focused investigations into algorithms, problems and tools.
- Sharpen the skills of applied machine learning through participating in and learning from competitive machine learning.
Advanced:
- Develop extensions to algorithms, problems and tools in a structured way.
- Engage and make contributions to the community.
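As a rough illustration of what practicing the process end-to-end on a problem can look like, here is a minimal sketch in plain Python. The steps it walks through (prepare data, hold out a test set, fit a baseline and a simple model, compare them) are my assumption about a typical workflow, since the steps are not enumerated here. The toy dataset and every function name are hypothetical.

```python
import random


def make_toy_data(n=200, seed=1):
    """Hypothetical dataset: one numeric feature x, label 1 when x > 5."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        rows.append((x, 1 if x > 5 else 0))
    return rows


def train_test_split(rows, test_fraction=0.3, seed=1):
    """Shuffle a copy of the rows and hold out a fraction for evaluation."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_majority_baseline(train):
    """Baseline: always predict the most common training label."""
    labels = [y for _, y in train]
    return max(set(labels), key=labels.count)


def fit_threshold(train):
    """Simple model: choose the cut point on x that best fits the training labels."""
    best_t, best_correct = None, -1
    for t, _ in train:
        correct = sum((1 if x > t else 0) == y for x, y in train)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def accuracy(predict, rows):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def run_experiment():
    """Run every step once and report test accuracy for both models."""
    train, test = train_test_split(make_toy_data())
    majority = fit_majority_baseline(train)
    threshold = fit_threshold(train)
    return {
        "baseline": accuracy(lambda x: majority, test),
        "threshold": accuracy(lambda x: 1 if x > threshold else 0, test),
    }


if __name__ == "__main__":
    print(run_experiment())
```

Comparing against a trivial baseline is a deliberate choice in this sketch: it gives you a reference point for judging whether the learned model is actually adding anything.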
How to Use
This roadmap is a useful tool that you can use in a variety of ways on your path towards machine learning mastery:
- Learning Guide: Use it as a linear guide of objectives and activities for you to complete. Patience and hard work will carry you to the advanced level.
- Streamlined Guide: Use it as a linear guide as above, but narrow the objectives to a specific area of machine learning that you are looking to master, rather than the broader domain of applied machine learning. This could be a specific problem or class of algorithm.
- Information Filter: The roadmap can be used to filter information and resources you come across. This is a powerful use case because you can quickly assess whether a blog post, article or book is relevant for your level in the journey.
This Path is Right For You!
I have designed this guide for other engineers and programmers.
- You may know how to program.
- You may work (or have worked) professionally as an engineer or programmer.
- You may be an undergraduate or postgraduate student.
- You are interested in machine learning or data science.
- You may be working with machine learning and data.
This approach is tailored for programmers and engineers who are already familiar with the process of developing and building out systems. They have a computational or logical approach to thinking and think in terms of systems. Programmers particularly are already familiar with the power of automation and with the complexity and characterization of algorithms.
This approach has been effective for both professional programmers and students studying engineering, computer science or similar disciplines.
- You do not need to be a programmer or a good programmer. You can use off-the-shelf tools like Weka that have graphical user interfaces to work through machine learning problems and apply machine learning algorithms.
- You do not need to be a mathematician or statistician. You can pick up only the statistics, probability and linear algebra that you need for a given algorithm, when it comes time to study that algorithm.
- You can read guides and books and take open courses. They fit comfortably into the four competency levels. A given book may be a perfect reference for the novice or intermediate level, or may span both levels. Similarly, a course may fit neatly into a given level or may span two or more levels, giving a sample of a variety of machine learning activities.
I recommend that you focus your scope on classification and regression type problems and the relevant algorithms and tools. These are the two most common underlying machine learning problems that most other problems can be reduced to.
There are subfields of Machine Learning such as Computer Vision, Natural Language Processing, Recommender Systems and Reinforcement Learning. These areas can be reduced to classification and regression problems, and their study fits neatly into the roadmap structure presented here. I would suggest not diving into these fields until you are at the intermediate level.
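To make the difference between classification and regression concrete, here is a minimal sketch in plain Python that applies the same nearest-neighbor idea to both: the classifier predicts a category, the regressor predicts a number. The toy data and the function names are hypothetical and chosen only for illustration. This is one simple way to show the distinction, not a recommended algorithm.

```python
from collections import Counter


def nearest(train, x, k):
    """Return the k training rows whose feature value is closest to x."""
    return sorted(train, key=lambda row: abs(row[0] - x))[:k]


def knn_classify(train, x, k=3):
    """Classification: predict the most common label among the k nearest rows."""
    votes = Counter(label for _, label in nearest(train, x, k))
    return votes.most_common(1)[0][0]


def knn_regress(train, x, k=3):
    """Regression: predict the mean target value of the k nearest rows."""
    neighbors = nearest(train, x, k)
    return sum(target for _, target in neighbors) / len(neighbors)


# Hypothetical data: (hours of daily sunlight, label or numeric target).
plant_type = [(1.0, "fern"), (2.0, "fern"), (3.0, "fern"),
              (7.0, "cactus"), (8.0, "cactus"), (9.0, "cactus")]
plant_height_cm = [(1.0, 10.0), (2.0, 12.0), (3.0, 14.0),
                   (7.0, 30.0), (8.0, 32.0), (9.0, 34.0)]

if __name__ == "__main__":
    print(knn_classify(plant_type, 2.5))       # prints a category
    print(knn_regress(plant_height_cm, 2.5))   # prints a number
```

The only difference between the two functions is how the neighbors are combined: a vote for a label versus an average of numeric values.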
I have a few pragmatic principles that might help you make fast and useful progress towards your goals in machine learning. They really frame the roadmap.
- Machine Learning is a journey. You need to know where you are now and where you are trying to get to. It’s going to take time and hard work, but there is plenty of help available to you.
- Create semi-formal work products. Write down what you learn and discover along the way in the form of blog posts, technical reports, and code repositories. You will quickly amass a portfolio of demonstrated skill and knowledge for you and others to reflect on.
- Just-in-time learning. Do not learn complex topics until just before you need them. For example, learn just enough probability or linear algebra to understand the algorithm you’re working on; do not take a three-year course in stats and maths before you start machine learning.
- Leverage existing skills. If you can code, implement algorithms to understand them rather than studying the math. Use languages you know. Focus on the one thing you are learning; don’t complicate it by learning a new language, tool or library at the same time.
- Mastery is an ideal. Mastery of machine learning requires continuous learning. You can never actually reach it; you can only continue to study, learn and improve.
Below are 3 tips to effectively get the most out of this guide and your journey in machine learning:
- Start with a small project that you can complete in one hour.
- Aim to complete one project per week in order to build up and maintain your momentum and a workspace of projects that you can build upon.
- Share your results on your blog, Facebook, Google+, GitHub or wherever you can, to demonstrate your interest and your growing skills and knowledge, and to get feedback.
Take a moment and write down:
- What level do you think you are at and what are you struggling with?
- What level do you want to be at and what do you want to be able to do?