Last Updated on December 18, 2016
Practitioners of practical subjects can suffer from math envy.
This is where they think that mathematicians are smarter than they are and that they cannot excel in a subject until they “know the math”.
I have seen this first hand, and I have seen it stop people from getting started.
In this post, I want to convince you that you can get started and make great progress in machine learning without being strong in mathematics.
Get Started and Learn by Doing
I didn’t learn boolean logic before I started programming.
I just started programming, and you probably did too.
I followed an empirical path that involved trial and error. It was slow and I wrote a lot of bad code, but I was passionately interested and I could see progress.
As I built larger and more complicated software programs I devoured textbooks because they let me build my programs better. I hunted for conceptual and practical tools I could use to overcome the limitations I was actually experiencing.
This was a powerful learning tool. If I had started out programming by being forced to learn boolean logic or concepts like polymorphism, my passion would never have been ignited.
The Danger Zone
I like it when my programs don’t work. It means I have to roll up my sleeves and really understand what is going on.
You can get a long way by copying and pasting code without really understanding it. You only need to understand blocks of code as functional units that do a thing you need done. Glue enough of them together and you have a program that solves the problem you need solved.
This empirical hackery is a great way to learn fast, but a terrifying way to build production systems. This is an important distinction to make. The often-discussed “danger zone” is when systems built from empirical learning are made operational and the author does not really know how they work or what the results actually mean.
This is a very real problem. For an example, look at the I.T. systems and webpages of some small businesses that put up with this level of work.
In my mind, a prototype is a ball of copy-pasted mud held together with sticky tape that might sketch out what a solution could look like.
An operational system or a system that produces results or decisions used operationally has no surprises. You feel comfortable having an all day code review with the team picking over every line of code.
You can get started in machine learning today, empirically. Three options available to you are:
- Learn to drive a tool like scikit-learn, R or WEKA.
- Use libraries that provide algorithms and write little programs.
- Implement algorithms yourself from tutorials and books.
More than a set of options, these can form the path of the technician from beginner to intermediate, learning the mathematics required for each technique just-in-time.
Define small problems, solve them methodically and present the results of what you have learned on your blog. You will start to build up some momentum following this process.
There will be interesting algorithms that you will want to know more about, such as what a particular parameter actually does when you change it or how to get better results from a particular algorithm.
This will drive you to want (need) to understand how that technique really works and what it is doing. You may draw pictures of data flow and transformations, but eventually you will need to internalize the vector or matrix representations and transformations that are occurring, because they are the best tools we have to describe clearly and unambiguously what is going on.
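As a small illustration of poking at a parameter empirically, here is a minimal sketch of k-nearest neighbors in plain Python. It is not taken from any particular library; the toy data, the function name knn_predict and the choice of query point are all made up for this example. The idea is simply to change k and watch the prediction move.

```python
import math
from collections import Counter

# Hypothetical toy data, invented for illustration: (features, label).
TRAIN = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.8, 1.1), "a"),
    ((3.0, 3.0), "b"),
    ((3.2, 2.9), "b"),
    ((2.9, 3.3), "b"),
    ((1.1, 1.3), "b"),  # a deliberately noisy point inside the "a" cluster
]


def knn_predict(train, query, k):
    """Predict a label by majority vote among the k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    query = (1.1, 1.25)
    for k in (1, 3, 5, 7):
        print(f"k={k}: predicted {knn_predict(TRAIN, query, k)}")
```

With this particular data, k=1 follows the single noisy neighbor, k=3 and k=5 side with the surrounding cluster, and k=7 uses every point, so the larger class wins. Seeing that happen is the kind of observation that makes you want to understand what the parameter is really doing.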
You can remain the empiricist. I call this the path of the technician.
You can build up an empirical intuition of which methods to use and how to use them. You can also learn just enough algebra to be able to read algorithm descriptions and turn them into code.
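To give a feel for how little algebra that can mean, here is a minimal sketch that turns the textbook description of a least-squares line fit into code. The description reads: slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2), and intercept = mean_y - slope * mean_x. The function name fit_line and the sample numbers are assumptions made up for this example.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: how x and y vary together around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: how x varies around its own mean.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical data that lies exactly on y = 1 + 2x.
    intercept, slope = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(intercept, slope)
```

Each line of the function maps onto one term of the written description, which is about all the mathematics you need to read in order to get a working implementation. Note that the sketch assumes the x values are not all identical, otherwise the denominator is zero.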
There is a path here for the skilled technician to create tools, plug-ins and even operational systems that use machine learning.
The technician is contrasted with the theoretician at the other end of the scale. The theoretician can:
- Internalize existing methods.
- Propose extensions to existing methods.
- Devise entirely new methods.
The theoretician may be able to demonstrate the capability of a method in the abstract, but is likely insufficiently skilled to turn the methods into code beyond prototype demonstration systems at best.
You can learn as little or as much mathematics as you like, just in time. Focus on your strengths and be honest about your limitations.
Mathematics is Critical, Later
If you have to learn linear algebra just-in-time, why not learn it more completely up front and understand the machine learning methods at this deep level from the beginning?
This is certainly an option, perhaps the most efficient option, which is why it is the path used to teach in university. It’s just not the only option available to you.
Just like learning to program by starting with logic and abstract concepts, internalizing machine learning theory may not be the most efficient way for you to get started.
In this post, you learned that there is a path available for the technician separate from that of the theoretician.
You learned that the technician can learn the mathematical representations and descriptions of machine learning algorithms just-in-time. You also learned that the danger zone for the technician is overconfidence and the risk of putting systems into production that are poorly understood.
This might be a controversial post. Leave a comment and let me know what you think.