Where does theory fit into a top-down approach to studying machine learning?
In the traditional approach to teaching machine learning, theory comes first, and you need an extensive background in mathematics to be able to understand it. In my approach to teaching machine learning, I start with teaching you how to work problems end-to-end and deliver results.
So where does the theory fit?
In this post you will discover what we really mean when we talk about “theory” in machine learning. Hint: it’s all about the algorithms.
You will discover that once you get skilled at working through problems and delivering results, you will develop a compulsion to dive deeper in order to gain better understanding and better results. Nobody will be able to hold you back.
Finally, you will discover 5 techniques that you can use when you are practicing machine learning on standard datasets to incrementally build up your understanding of machine learning algorithms.
Discover how machine learning algorithms work, including kNN, decision trees, Naive Bayes, SVM, ensembles and much more, in my new book, with 22 tutorials and examples in Excel.
Learn Theory Last, Not First
The way machine learning is taught to developers is crap.
It is taught bottom-up. This is crap if you are a developer who is primarily interested in using machine learning as a tool to solve problems rather than being a researcher in the field.
The traditional approach requires that you learn all of the prerequisite mathematics like linear algebra, probability and statistics before learning the theory of algorithms. You’re lucky if you ever go near a working implementation of an algorithm or discuss how to work a problem end-to-end and deliver a working, reliable and accurate predictive model.
I teach a top-down approach to learning machine learning. In this approach we 1) start by learning a systematic process for working through problems end-to-end, 2) map the process onto “best of breed” machine learning tools and platforms, then 3) complete targeted practice on test datasets.
You can learn more about my approach to teaching top-down machine learning in the post “Machine Learning for Programmers: Leap from developer to machine learning practitioner”.
So where does theory fit into this process?
If the model is flipped, then theory is taught later. But what theory are we talking about and how exactly do you learn that theory when you are practicing on test datasets?
The Theory is Really All About Algorithms
The field of machine learning is theory-dense.
It’s dense because there is a tradition of describing and explaining concepts mathematically.
This is useful because mathematical descriptions can be very concise, cutting down on the ambiguity. They also lend themselves to analysis by leveraging the techniques from the context in which they are described (e.g. a probabilistic understanding of a process).
A lot of these tangential mathematical techniques are often bundled in with the description of machine learning algorithms. For someone who just wants to build a superficial understanding of a method to be able to configure and apply it, this feels overwhelming. Frustratingly so.
It is frustrating if you do not have the grounding to be able to parse and understand the description of an algorithm. It’s frustrating because in a field like computer science, algorithms are described all the time, but those descriptions are intended for fast comprehension (e.g. for desk checking) and implementation.
For example, we know that when learning what a hash table is and how to use it, we almost never need to know the specifics of the hashing function in our day-to-day work. But we also know what a hashing function is and where to go to learn more about hashing function specifics and how to write our own. Why can’t machine learning work like that?
The bulk of the “theory” one encounters in machine learning is related to machine learning algorithms. If you ask any beginner about why they are frustrated with the theory, you will learn that it is in relation to learning how to understand or use a specific machine learning algorithm.
Here, “algorithms” means something broader than processes for creating a predictive model. It also refers to algorithms for selecting features, engineering new features, transforming data and estimating the accuracy of a model on unseen data (e.g. cross validation).
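As one illustration of an “algorithm” that is not a predictive model, here is a minimal sketch of how k-fold cross validation might split a dataset into train and test rows. It uses plain Python only. The function name `k_fold_indices` and its parameters are hypothetical and chosen for illustration; they are not taken from any particular library.

```python
import random


def k_fold_indices(n_rows, k=5, seed=1):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_rows))
    # Shuffle once with a fixed seed so the folds are random but repeatable.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled rows into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Every row that is not in the held-out fold is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(n_rows=10, k=5):
        print(sorted(test), "held out;", len(train), "rows used for training")
```

Each row is held out exactly once, so averaging a model's accuracy over the folds gives an estimate of how it might perform on unseen data.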
So learning theory last really means learning about machine learning algorithms.
A Compulsion To Dive Into Theory
I generally advise targeted practice on well known machine learning datasets.
This is because well known machine learning datasets, like those on the UCI Machine Learning Repository, are easy to work with. They are small so they fit into memory and can be processed on your workstation. They are also well studied and understood so you have a baseline for comparison.
You can learn more about targeted practice of machine learning datasets in the post “Practice Machine Learning with Small In-Memory Datasets from the UCI Machine Learning Repository”.
Understanding machine learning algorithms fits into this process. The reason is that in the pursuit of getting results with standard machine learning algorithms, you are going to run into limitations. You are going to want to know how to get more out of a given algorithm, how to best configure it, or how it actually works.
This curiosity and need to know more will drive you into studying the theory of machine learning algorithms. You will be compelled to piece together an understanding of the algorithms in order to achieve better results.
We see this same effect in young developers from varied backgrounds who eventually end up studying the code of open source projects, textbooks and even research papers in order to hone their craft. The need to be a better, more capable programmer drives them to it.
If you are curious and motivated to succeed, you cannot resist studying the theory.
5 Techniques To Understand Machine Learning Algorithms
The time will come to dive into machine learning algorithms as part of your targeted practice.
When that time comes, there are a number of techniques and templates that you can use to shortcut the process.
In this section you will discover 5 techniques that you can use to understand the theory of machine learning algorithms, fast.
1) Create Lists of Machine Learning Algorithms
When you are just starting out you may feel overwhelmed by the large number of algorithms available.
Even when spot testing algorithms, you may be unsure of which algorithms to include in your mix (hint, be diverse).
An excellent trick you can use when starting out is to keep track of the algorithms you read about. These lists can be as simple as the name of the algorithm, and can increase in complexity as your interest and curiosity build.
Capture details like the problem type to which they are suited (classification or regression), related algorithms, and taxonomic class (decision tree, kernel, etc.). When you see the name of an algorithm that is new to you, add it to your list. When you start a new problem, try some algorithms you have never used before. Mark a check next to algorithms you have used before. And so on.
Controlling the names of algorithms in lists gives you power. This ridiculously simple tactic can help you get on top of the overwhelm. Examples of where your simple algorithm lists can save you a lot of time and frustration are:
- Ideas of algorithms to try on new and different problem types (time series, rating systems, etc.)
- Algorithms that you can investigate to learn more about how to apply.
- Get a handle on algorithm types by category (trees, kernels, etc.).
- Avoid the problem of fixating on a favorite algorithm.
Start by creating lists of algorithms. Open a spreadsheet and get started.
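If you prefer code to a spreadsheet, the same idea can be sketched in a few lines of Python. This is only an illustration: the entries, the field names (`problem_types`, `category`, `used`) and the helper functions `untried` and `to_csv` are hypothetical choices, and you would adapt them to the details you care about.

```python
import csv
import io

# Hypothetical starter list. The fields follow the details suggested above:
# problem type, taxonomic category, and whether you have used the algorithm.
ALGORITHMS = [
    {"name": "k-Nearest Neighbors", "problem_types": "classification|regression",
     "category": "instance-based", "used": True},
    {"name": "Naive Bayes", "problem_types": "classification",
     "category": "bayesian", "used": True},
    {"name": "Random Forest", "problem_types": "classification|regression",
     "category": "ensemble", "used": False},
    {"name": "Support Vector Machine", "problem_types": "classification|regression",
     "category": "kernel", "used": False},
]


def untried(algorithms, problem_type):
    """Return names of algorithms suited to problem_type that are not yet marked as used."""
    return [a["name"] for a in algorithms
            if problem_type in a["problem_types"].split("|") and not a["used"]]


def to_csv(algorithms):
    """Serialize the list as CSV text that can be opened in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "problem_types", "category", "used"])
    writer.writeheader()
    writer.writerows(algorithms)
    return buffer.getvalue()


if __name__ == "__main__":
    print("Try next on a regression problem:", untried(ALGORITHMS, "regression"))
    print(to_csv(ALGORITHMS))
```

A query like `untried(ALGORITHMS, "regression")` is one way to avoid fixating on a favorite algorithm when you start a new problem.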
See the post “Take Control By Creating Targeted Lists of Machine Learning Algorithms” for more information on this tactic.
2) Research Machine Learning Algorithms
When you want to know more about a machine learning algorithm you need to research it.
The main reasons you will want to research an algorithm are to learn how to configure it and to learn how it works.
Research is not just for academics. A few simple tips can take you a long way in gathering information on a given machine learning algorithm.
The key is diversity of information sources. The following is a short list of the types of sources you can consult for information on an algorithm you are researching.
- Authoritative sources like textbooks, lecture notes, slides and overview papers.
- Seminal sources like the papers and articles in which the algorithm was first described.
- Leading-edge sources that describe state-of-the-art extensions and experiments on the algorithm.
- Heuristic sources like those that come out of machine learning competitions, posts on Q&A websites and conference papers.
- Implementation sources such as open source code for tools and libraries, blog posts and technical reports.
You do not need to be a PhD researcher nor a machine learning algorithm expert.
Take your time and pick over many sources collecting facts on a machine learning algorithm you are trying to figure out. Focus on the practical details you can apply or understand and leave the rest.
For more information on researching machine learning algorithms see the post “How to Research a Machine Learning Algorithm”.
3) Create Your Own Algorithm Descriptions
Machine learning algorithm descriptions you will discover in your research will be incomplete and inconsistent.
An approach that you can use is to put together your own mini algorithm descriptions. This is another very simple and very powerful tactic.
You can design a standard algorithm description template with only those details that are useful to you in getting the most from algorithms, like algorithm usage heuristics, pseudo-code listings, parameter ranges and resource lists.
You can then use the same algorithm description template across a number of key algorithms and start to build up your own little algorithm encyclopedia that you can refer to on future projects.
Some questions you might like to use in your own algorithm description template include:
- What are the standard abbreviations used for the algorithm?
- What is the objective or goal for the algorithm?
- What is the pseudo-code or flowchart description of the algorithm?
- What are the heuristics or rules of thumb for using the algorithm?
- What are useful resources for learning more about the algorithm?
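A template like this can live in a document or spreadsheet, but it can also be sketched as a small data structure. The following is a minimal sketch, assuming one field per question above. The class name `AlgorithmDescription`, the method `missing_fields` and the example entry are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class AlgorithmDescription:
    """One entry in a personal algorithm encyclopedia (one field per template question)."""
    name: str
    abbreviations: list = field(default_factory=list)
    objective: str = ""
    pseudocode: str = ""
    heuristics: list = field(default_factory=list)
    resources: list = field(default_factory=list)

    def missing_fields(self):
        """List the template fields that are still empty."""
        return [key for key, value in vars(self).items() if not value]


# Illustrative, partially completed entry.
knn = AlgorithmDescription(
    name="k-Nearest Neighbors",
    abbreviations=["kNN"],
    objective="Predict from the k most similar training examples.",
    heuristics=["Rescale inputs so that no single feature dominates the distance."],
)

if __name__ == "__main__":
    print("Still to research:", knn.missing_fields())
```

Using the same structure for every algorithm makes the gaps in your understanding easy to see, which in turn tells you what to research next.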
You will be surprised at how useful and practical these descriptions can be. For example, I used this approach to write a book of nature-inspired algorithm descriptions that I still refer back to years later.
For more on how to create effective algorithm description templates, see the post “How to Learn a Machine Learning Algorithm”.
For more information on my book of algorithms described using a standard algorithm description template, see “Clever Algorithms: Nature-Inspired Programming Recipes”.
4) Investigate Algorithm Behavior
Machine learning algorithms are complex systems that are sometimes best understood by their behaviors on actual datasets.
By designing small experiments on machine learning algorithms using small datasets you can learn a lot about how an algorithm works, its limitations and how to configure it in ways that may transfer to exceptional results on other problems.
A simple procedure that you can use to investigate a machine learning algorithm is as follows:
- Select an algorithm that you would like to know more about (e.g. random forests).
- Identify a question about that algorithm you would like answered (e.g. the effect of the number of trees).
- Design an experiment to find an answer to that question (e.g. try different numbers of trees on a few binary classification problems and chart the relationship with classification accuracy).
- Execute the experiment and write up your results so that you can make use of them in the future.
- Repeat the process.
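The procedure above can be sketched as a small, repeatable script. To keep the sketch free of dependencies, it swaps the random forest example for k-nearest neighbors and asks how the parameter “k” affects accuracy. The dataset is synthetic (two overlapping 2D Gaussian blobs), and every function name and setting here is a hypothetical choice made for illustration. A single hold-out split on one dataset gives a noisy answer, so a real investigation would repeat this across several datasets and random seeds.

```python
import random


def make_blobs(n_per_class=100, seed=1):
    """Generate a synthetic two-class 2D dataset: two overlapping Gaussian blobs."""
    rng = random.Random(seed)
    rows = []
    for label, center in ((0, 0.0), (1, 1.5)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            rows.append((point, label))
    rng.shuffle(rows)
    return rows


def squared_distance(a, b):
    """Squared Euclidean distance between two 2D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def knn_predict(train, point, k):
    """Classify point by majority vote among its k nearest training rows (k should be odd)."""
    nearest = sorted(train, key=lambda row: squared_distance(row[0], point))[:k]
    votes_for_one = sum(label for _, label in nearest)
    return 1 if votes_for_one * 2 > k else 0


def accuracy_for_k(rows, k, n_test=60):
    """Hold out the first n_test rows and report classification accuracy on them."""
    test, train = rows[:n_test], rows[n_test:]
    correct = sum(knn_predict(train, point, k) == label for point, label in test)
    return correct / len(test)


def run_experiment(k_values=(1, 3, 5, 9, 15, 25)):
    """Answer the question: how does k relate to hold-out accuracy on this dataset?"""
    rows = make_blobs()
    return {k: accuracy_for_k(rows, k) for k in k_values}


if __name__ == "__main__":
    for k, acc in run_experiment().items():
        print(f"k={k:>2}  accuracy={acc:.3f}")
```

Writing up the printed table alongside the question you asked gives you a result you can refer back to on future projects.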
This is one of the truly exciting aspects of applied machine learning: through your own simple investigations you can achieve surprising and state-of-the-art results.
For more information on how to study algorithms from their behavior, see the post “How To Investigate Machine Learning Algorithm Behavior”.
5) Implement Machine Learning Algorithms
You cannot get more intimate with a machine learning algorithm than by implementing it.
In implementing a machine learning algorithm from scratch you will be confronted with the myriad of micro-decisions that go into a given implementation. You may decide to cover some up with rules of thumb or expose them all as parameters to the user.
Below is a repeatable process that you can use to implement machine learning algorithms from scratch.
- Select a programming language, one that you are most familiar with is probably best.
- Select an algorithm to implement, start with something easy (see below for a list).
- Select a problem to test your implementation on as you develop, 2D data is good for visualizing (even in Excel).
- Research the algorithm and leverage many and diverse sources of information (e.g. read tutorials, papers, other implementations, and so on).
- Unit test the algorithm to confirm your understanding and validate the implementation.
Start small and build confidence.
For example, 3 algorithms that you could select for your first machine learning algorithm implementation from scratch are:
- Linear Regression using Gradient Descent
- k-Nearest Neighbor (see my tutorial in Python)
- Naive Bayes (see my tutorial in Python)
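To give a feel for the first item, here is a minimal sketch of simple linear regression fit with batch gradient descent, together with a small unit test on 2D data where the true line is known. The function names, learning rate and number of epochs are assumptions made for this illustration; tuning them is one of the micro-decisions you will meet in your own implementation.

```python
def fit_linear_regression(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = b0 + b1 * x by batch gradient descent on mean squared error."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every row using the current coefficients.
        errors = [(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        # Gradients of mean squared error with respect to b0 and b1.
        grad_b0 = 2.0 / n * sum(errors)
        grad_b1 = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        # Step against the gradient.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


def predict(coefficients, x):
    """Predict y for a single x using fitted (b0, b1)."""
    b0, b1 = coefficients
    return b0 + b1 * x


def test_recovers_known_line():
    """Unit test: data generated from y = 1 + 2x should give back b0 near 1 and b1 near 2."""
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.0 + 2.0 * x for x in xs]
    b0, b1 = fit_linear_regression(xs, ys)
    assert abs(b0 - 1.0) < 1e-3
    assert abs(b1 - 2.0) < 1e-3


if __name__ == "__main__":
    test_recovers_known_line()
    print("coefficients:", fit_linear_regression([0.0, 1.0, 2.0, 3.0, 4.0],
                                                 [1.0, 3.0, 5.0, 7.0, 9.0]))
```

Testing against data with a known answer is a simple way to confirm both your understanding of the algorithm and the correctness of the implementation before moving to real datasets.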
For more information on how to implement machine learning algorithms, see the post “How to Implement a Machine Learning Algorithm”.
Also see the posts:
- “Benefits of Implementing Machine Learning Algorithms From Scratch”
- “Don’t Start with Open-Source Code When Implementing Machine Learning Algorithms”
Theory is Not Just For the Mathematicians
Machine learning is not just for the mathematical elite. You can learn how machine learning algorithms work and how to get the most from them without diving deep into multivariate statistics.
You do not need to be good at math.
As we saw in the techniques section, you can start with algorithm lists and transition deeper into algorithm research, descriptions and algorithm behavior.
You can go very far with these methods without diving much at all into the math.
You do not need to be an academic researcher.
Research is not just for academics. Anyone can read books and papers and compile their own understanding of a topic like a specific machine learning algorithm.
Your biggest breakthroughs will come when you take on the persona of “the scientist” and start experimenting on machine learning algorithms as though they were complex systems in need of study. You will discover all kinds of interesting quirks in behavior that may not even be documented.
Pick one of the techniques listed above and get started.
I mean today, now.
Unsure where to start?
Here are 5 great ideas of where you could start:
- Make a list of 10 machine learning algorithms for classification (take a look at my tour of algorithms to get some ideas).
- Find five books that give detailed descriptions of Random Forests.
- Create a five-slide presentation on Naive Bayes using your own algorithm description template.
- Open Weka and see how the “k” parameter affects accuracy of k-nearest neighbor on the iris flowers data set.
- Implement linear regression using stochastic gradient descent.
Did you take action? Enjoy this post? Leave a comment below.