Machine learning can be difficult to understand when getting started. There are a lot of algorithms and processes that are prescribed and used, many with hard-to-penetrate explanations of how and why they work.
It can feel overwhelming.
An approach that you can use to get a handle on machine learning algorithms and practices is to implement them from scratch. This will give you a deep understanding of how the algorithm works and of all the micro-decision points within the method that can be parameterized or modified to tune it to a specific problem.
In this post you will discover the benefits and limitations of implementing machine learning algorithms from scratch and how you can accelerate this process by completing algorithm tutorials.
Implement Machine Learning Algorithms
Implementing machine learning algorithms from scratch can give you a deep understanding of the algorithm and a sense of confidence and ownership that are difficult to achieve by just applying the method.
The benefits of implementing algorithms from scratch are:
- Understanding: You will gain a deep appreciation for how the algorithm works. You will understand how the mathematical description of the method relates to the vectors and matrices of numbers that your code operates on. You will also know how all of the parameters are used, what their effects are, and even have insights into how the method could be further parameterized to specialize it for a problem.
- Starting Point: Your implementation will provide the basis for more advanced extensions and even an operational system that uses the algorithm. Your deep knowledge of the algorithm and your implementation gives you the advantage of knowing the space and time complexity of your own code, compared with using an opaque off-the-shelf library.
- Ownership: The implementation is your own, giving you confidence with the method and ownership over how it is realized as a system. It is no longer just a machine learning algorithm, but a method that is now in your toolbox.
Once you have implemented an algorithm you can explore making improvements to the implementation. Some examples of improvements you could explore include:
- Experimentation: You can expose many of the micro-decisions you made in the algorithm's implementation as parameters and perform studies on variations of those parameters. This can lead to new insights and disambiguation of algorithm implementations that you can share and promote.
- Optimization: You can explore opportunities to make the implementation more efficient by using tools, libraries, different languages, different data structures, patterns and internal algorithms. Your knowledge of algorithms and data structures from classical computer science can be very beneficial in this type of work.
- Specialization: You may explore ways of making the algorithm more specific to a problem. This can be required when creating production systems and is a valuable skill. Making an algorithm more problem specific can also lead to increases in efficiency (such as running time) and efficacy (such as accuracy or other performance measures).
- Generalization: Opportunities can be created by making a specific algorithm more general. Programmers (like mathematicians) are uniquely skilled in abstraction and you may be able to see how the algorithm could be applied to more general cases of a class of problem or other problems entirely.
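As an illustration of the experimentation idea above, here is a minimal sketch of a k-nearest neighbors classifier in Python in which two micro-decisions, the number of neighbors and the distance measure, are exposed as parameters. The function names (`euclidean`, `manhattan`, `knn_predict`) and the toy data are assumptions made for this example, not part of any particular library, and the tie-breaking behavior is just one of several reasonable choices.

```python
from collections import Counter
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Predict a label for query by majority vote among the k nearest rows.

    Both k and distance are exposed as parameters (hypothetical design
    choice for this sketch) so that variations can be studied.
    """
    # Rank training rows by their distance to the query point.
    ranked = sorted(zip(train_X, train_y), key=lambda row: distance(row[0], query))
    # Count the labels of the k closest rows and return the most common one.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only.
train_X = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
train_y = ["a", "a", "b", "b"]

# A tiny study: vary the exposed parameters and observe the prediction.
for k in (1, 3):
    for distance in (euclidean, manhattan):
        label = knn_predict(train_X, train_y, (2.0, 2.0), k=k, distance=distance)
        print(k, distance.__name__, label)
```

Because the decisions are parameters rather than hard-coded choices, a study of their effect becomes a simple loop over the candidate values.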
Limitations of Implementing Algorithms
Implementing algorithms from scratch is an approach we have discussed before. It is one of the project types in my Small Projects Methodology. In this project type, I suggest that you perform your own literature survey and investigate how the algorithm works first, before implementing it.
This further led to the algorithm description template, which gave you a tool for describing a machine learning algorithm effectively so that you deeply understand it.
The problem with all of this is that it is very time consuming. Researching an algorithm involves finding, reading and summarizing a large number of books, sample code and research papers and can take a good academic researcher many days to weeks of time to complete.
If you consider that you may want to implement a dozen machine learning algorithms, you could easily be required to invest more than half a year of your time.
Additionally, your own implementations may have bugs that are difficult for you to find (these algorithms have a way of working in spite of bugs, with only degraded performance to show for it). You may also get caught up with non-intuitive leaps in the mathematics that must be understood before you can implement the method in code.
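One way to reduce the risk of such silent bugs, offered here as a sketch rather than a prescription, is to check your implementation against a small case you can work out by hand. The example below assumes a simple least-squares line fit; the function name `fit_line` and the data are hypothetical and chosen so that the correct answer is known in advance.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Points generated exactly from y = 2x + 1, so the expected
# coefficients are known without running any code.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)

# If a bug creeps in (for example, a wrong mean), these checks fail loudly
# instead of the model quietly producing slightly worse predictions.
assert abs(slope - 2.0) < 1e-9
assert abs(intercept - 1.0) < 1e-9
```

A check like this does not prove the implementation correct, but it turns at least some performance-degrading mistakes into visible failures.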
Short-Cut The Process With Tutorials
You can short-cut this process by following along with and completing tutorials.
Machine learning algorithm tutorials explain the method and show you how to implement an algorithm step-by-step from scratch such that by the end, you have a working implementation. You get all of the benefits of an implementation of an algorithm from scratch, without having to research and decipher textbooks and academic papers.
A good tutorial has a number of principles:
- Step-by-Step: The guide is broken into discrete steps, leading the reader one step at a time through the material and building on previous steps as it progresses.
- Modular: The implementation is broken down into modular parts that are shown and demonstrated independently before being drawn together into a final working demonstration of the whole algorithm.
- Slow: The implementation is slow, introducing one new thing in each progressive step so that the whole can be understood as the sum of the parts.
- Code: The tutorial provides a complete working example for each step and for the whole tutorial. It’s obvious, but easy to forget or to get wrong by not testing the code. All code must be explained and it must execute.
- References: Additional resources and reading must be provided for those readers that want to dive deeper into the material.
- Extensions: After completing the tutorial, there must be suggestions for additional exercises the reader can take on if they are interested in taking the implementation further. A suggestion must be made as to how additional advanced elements can be integrated or how problems with the provided implementation can be addressed.
This is a popular way to learn algorithms and data structures in programming, and it is an approach that is easily overlooked because of its simplicity. As such, there are few good machine learning algorithm tutorials available.
A good resource for Python programmers is the book Programming Collective Intelligence: Building Smart Web 2.0 Applications. This book takes you step-by-step through the creation of a number of machine learning systems from scratch.
In this post you discovered the benefits of implementing machine learning algorithms from scratch and the confidence and sense of ownership it can provide over complex algorithms.
You discovered the limitations of the approach and how much time may be required in researching, distilling and summarizing algorithms from textbooks and research papers before they can be implemented.
Finally, you discovered that a short-cut is to follow machine learning algorithm tutorials that show you how to implement algorithms from scratch, giving you the benefits while sparing you the research.
I am currently preparing a collection of machine learning tutorials on how to implement algorithms from scratch. If this interests you, leave a comment and let me know.