Why Implement a Machine Learning Algorithm From Scratch

By Jason Brownlee on August 13, 2019 in Code Algorithms From Scratch

Why would you ever implement machine learning algorithms from scratch when there are so many provided in existing APIs?

This is a great question, and one you should consider before you write that first line of code.

In this post you will discover a variety of interesting and even thought-provoking answers to this question.

The answers in this post are summarized from the Quora question titled: “Why is there a need to manually implement machine learning algorithms when there are many advanced APIs like tensorflow available?”

Figure: Why Implement a Machine Learning Algorithm From Scratch. Photo by psyberartist, some rights reserved.

Two Major Reasons To (re)Implement an Algorithm

I think that all of the answers can be broken down into two camps:

Self-Study, where an algorithm is implemented as a learning exercise.
Operational Requirements, where an algorithm is implemented to meet the needs of a production system.

Implement Algorithms for Self-Study

Charles Gee gives a great answer from the self-study perspective. He comments:

… suppose that instead of talking about machine learning algorithms, we were talking about sorting algorithms. Sure, many data structures have a sort function that requires little to no coding, but would you really hire a programmer who couldn’t do a bubblesort? selectionsort? insertionsort? mergesort? quicksort? binary search tree?
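To make the analogy concrete, here is a minimal sketch of one of the sorts Charles names, written by hand and checked against Python's built-in sorted(). The function name bubble_sort and the sample data are my own illustrative choices and are not taken from the answer.

```python
def bubble_sort(values):
    """Return a new list with the items of `values` in ascending order."""
    items = list(values)  # copy so the caller's list is left untouched
    # After each pass, the largest remaining item has "bubbled" to position `end`.
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if items[i] > items[i + 1]:
                # Neighbours are out of order, so swap them.
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:
            break  # a pass with no swaps means the list is already sorted
    return items


data = [5, 2, 9, 1, 5, 6]  # hypothetical sample data
print(bubble_sort(data))
print(bubble_sort(data) == sorted(data))
```

In practice you would still call sorted(). The point of the exercise is knowing what the built-in is doing for you.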

Charles describes 4 different use cases where it may be highly desirable to implement a machine learning algorithm from scratch:

As a beginner in the machine learning realm.
As a researcher in the machine learning realm.
As a teacher in the machine learning realm.
As a user of these machine learning algorithms.
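As an illustration of the beginner's case, here is a minimal sketch of what "from scratch" can look like in plain Python with no libraries. I have chosen simple linear regression as the example; the function names and the toy data are my own assumptions and do not come from Charles's answer.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def fit_simple_linear_regression(xs, ys):
    """Estimate intercept and slope for y = intercept + slope * x by least squares."""
    x_mean, y_mean = mean(xs), mean(ys)
    # Sum of products of deviations (unnormalised covariance of x and y).
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Sum of squared deviations of x (unnormalised variance of x).
    variance = sum((x - x_mean) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_mean - slope * x_mean
    return intercept, slope


def predict(xs, intercept, slope):
    """Apply the fitted line to new inputs."""
    return [intercept + slope * x for x in xs]


# Toy data generated from y = 1 + 2x (hypothetical, for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
intercept, slope = fit_simple_linear_regression(xs, ys)
print(intercept, slope)
print(predict([6, 7], intercept, slope))
```

Writing even a small model like this forces you to work through the means, the deviations, and the division that a library call would otherwise hide.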

Implement Algorithms for Operational Requirements

Xavier Amatriain focuses on this topic in his answer. He comments:

Let me start by saying that I do believe that any team should default to re-using existing implementations. … However, there are also many reasons why a company may decide to implement their own version of an ML algorithm.

Xavier lists 5 reasons to implement a machine learning algorithm, as follows:

Performance. Open source implementations may be too general and not efficient enough for specific use cases.
Correctness. There may be bugs or limitations in the open source implementations for specific use cases (such as larger scale datasets).
Programming Language. Implementations may be limited to specific programming languages.
Integration. There may be a need to integrate an algorithm implementation into the infrastructure of existing production systems.
Licensing. There may be limitations imposed by the choice of open source license.

Summary

In this post, you discovered that there are two major reasons why you might want to implement an algorithm from scratch:

To learn more about how the algorithm works for self-study.
To customize the implementation of the algorithm for a production system.
