Why Implement a Machine Learning Algorithm From Scratch

Why would you ever implement machine learning algorithms from scratch when there are so many provided in existing APIs?

This is a great question, and one you should consider before you write the first line of code.

In this post you will discover a variety of interesting and even thought-provoking answers to this question.

The answers in this post are summarized from the Quora question titled “Why is there a need to manually implement machine learning algorithms when there are many advanced APIs like tensorflow available?”



Two Major Reasons To (re)Implement an Algorithm

I think that all of the answers can be broken down into two camps:

  1. Self-Study, where an algorithm is implemented as a learning exercise.
  2. Operational Requirements, where an algorithm is implemented to meet the needs of a production system.

Implement Algorithms for Self-Study

Charles Gee gives a great answer from the self-study perspective. He comments:

… suppose that instead of talking about machine learning algorithms, we were talking about sorting algorithms. Sure, many data structures have a sort function that requires little to no coding, but would you really hire a programmer who couldn’t do a bubblesort? selectionsort? insertionsort? mergesort? quicksort? binary search tree?

Charles describes 4 different use cases where it may be highly desirable to implement a machine learning algorithm from scratch:

  • As a beginner in the machine learning realm.
  • As a researcher in the machine learning realm.
  • As a teacher in the machine learning realm.
  • As a user of these machine learning algorithms.
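To make the learning-exercise case concrete, here is a minimal sketch of what "from scratch" can look like: simple linear regression fit by ordinary least squares in plain Python, with no libraries. The function names and the toy data are illustrative assumptions and do not come from any of the Quora answers.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def fit_simple_linear_regression(xs, ys):
    """Estimate intercept and slope for y = intercept + slope * x.

    Uses the closed-form least-squares solution:
    slope = covariance(x, y) / variance(x).
    """
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(x, intercept, slope):
    """Predict y for a single input x."""
    return intercept + slope * x


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

intercept, slope = fit_simple_linear_regression(xs, ys)
print(intercept, slope)              # 1.0 2.0
print(predict(6, intercept, slope))  # 13.0
```

Writing even a small model like this forces you to handle the details a library hides, such as what happens when every x value is identical and the variance is zero.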

Implement Algorithms for Operational Requirements

Xavier Amatriain focuses on this topic in his answer. He comments:

Let me start by saying that I do believe that any team should default to re-using existing implementations. … However, there are also many reasons why a company may decide [to] implement their own version of an ML algorithm.

Xavier lists 5 reasons to implement a machine learning algorithm, as follows:

  • Performance. Open source implementations may be too general and not efficient enough for specific use cases.
  • Correctness. There may be bugs or limitations in the open source implementations for specific use cases (such as larger-scale datasets).
  • Programming Language. Implementations may be limited to specific programming languages.
  • Integration. There may be a need to integrate an algorithm implementation into the infrastructure of existing production systems.
  • Licensing. There may be limitations imposed by the choice of open source license.

Summary

In this post, you discovered that there are two major reasons why you might want to implement an algorithm from scratch.

  1. To learn more about how the algorithm works for self-study.
  2. To customize the implementation of the algorithm for a production system.

Further Reading

I have posted a number of times on the benefits of implementing machine learning algorithms from scratch.

