Practitioners of practical subjects can suffer from math envy.

This is where they think that mathematicians are smarter than they are and that they cannot excel in a subject until they “know the math”.

I have seen this first hand, and I have seen it stop people from getting started.

In this post, I want to convince you that you can get started and make great progress in machine learning without being strong in mathematics.

## Get Started and Learn by Doing

I didn’t learn boolean logic before I started programming.

I just started programming and you probably did too.

I followed an empirical path that involved trial and error. It was slow and I wrote a lot of bad code, but I was passionately interested and I could see progress.

As I built larger and more complicated software programs I devoured textbooks because they let me build my programs better. I hunted for conceptual and practical tools I could use to overcome the limitations I was actually experiencing.

This was a powerful learning tool. If I had started out programming by being forced to learn boolean logic or concepts like polymorphism, my passion would never have been ignited.

## The Danger Zone

I like it when my programs don’t work. It means I have to roll up my sleeves and really understand what is going on.

You can get a long way by copying and pasting code without really understanding it. You only need to understand blocks of code as functional units that do a thing you need done. Glue enough of them together and you have a program that solves the problem you need solved.

This empirical hackery is a great way to learn fast, but a terrifying way to build production systems. This is an important distinction to make. The often spoken of “danger zone” is when systems built from empirical learning are made operational and the author does not really know how they work or what the results actually mean.

This is a very real problem. For example, take a look at some I.T. systems and webpages of small businesses that put up with this level of work.

In my mind, a prototype is a ball of copy-pasted mud held together with sticky tape that might sketch out what a solution could look like.

An operational system or a system that produces results or decisions used operationally has no surprises. You feel comfortable having an all day code review with the team picking over every line of code.

## The Technician

You can get started in machine learning today, empirically. Three options available to you are:

- Learn to drive a tool like scikit-learn, R or WEKA.
- Use libraries that provide algorithms and write little programs.
- Implement algorithms yourself from tutorials and books.
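As a rough sketch of the third option, here is what a from-scratch k-nearest neighbors classifier might look like in plain Python. The function names (`euclidean`, `knn_predict`) and the toy data are made up for illustration, and this is only one simple way to write the algorithm, not a reference implementation.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest neighbors."""
    # Pair each training point with its label, then sort by distance to the query.
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labeled "a" and "b".
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_X, train_y, (1.1, 1.0)))
print(knn_predict(train_X, train_y, (5.1, 5.0)))
```

Nothing here needs more math than the distance formula, which is the point: you can get a working method first and dig into why it works later.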

More than a set of options, this can be the path of the technician from beginner to intermediate: learning the mathematics required for each technique just-in-time.

Define small problems, solve them methodically and present the results of what you have learned on your blog. You will start to build up some momentum following this process.

There will be interesting algorithms that you will want to know more about, such as what a particular parameter actually does when you change it or how to get better results from a particular algorithm.
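One way to explore what a parameter does is simply to change it and watch what happens. The sketch below is a made-up example, not tied to any particular library: it runs plain gradient descent on the toy function f(w) = (w - 3)^2 with three different learning rates. The function name `gradient_descent` and the chosen values are assumptions for illustration.

```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    """Minimize f(w) = (w - 3)^2, whose minimum is at w = 3."""
    w = start
    for _ in range(steps):
        gradient = 2 * (w - 3)          # derivative of (w - 3)^2
        w -= learning_rate * gradient   # step against the gradient
    return w


# Try a small, a moderate and a too-large learning rate.
for lr in (0.01, 0.1, 1.1):
    w = gradient_descent(lr)
    print(f"learning_rate={lr}: w={w:.4f}, distance from minimum={abs(w - 3):.4f}")
```

With the small rate the search is still far from the minimum after 50 steps, the moderate rate gets very close, and the large rate overshoots further on every step and diverges. Seeing that empirically is often what prompts the question of why, and that is when the math becomes worth learning.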

This will drive you to want (need) to understand how that technique really works and what it is doing. You may draw pictures of data flow and transformations, but eventually, you will need to internalize the vector or matrix representations and transformations that are occurring, because they are the best tools we have to clearly and unambiguously describe what is going on.
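To make that concrete, here is a small sketch of how a matrix description lines up with code a programmer would write anyway. It uses a linear model as the example, which is my choice for illustration. The first function computes predictions with explicit nested loops; the second follows the compact notation `y_hat = X w + b`, where `X` is a list of rows, `w` a weight vector and `b` a bias. All names and numbers are hypothetical.

```python
def predict_with_loops(rows, weights, bias):
    """Linear model predictions written as explicit nested loops."""
    predictions = []
    for row in rows:
        total = bias
        for value, weight in zip(row, weights):
            total += value * weight
        predictions.append(total)
    return predictions


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict_matrix_form(X, w, b):
    """The same computation, read directly off y_hat = X w + b."""
    return [dot(x_i, w) + b for x_i in X]


# Hypothetical toy inputs: three rows with two features each.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w = [0.5, -1.0]
b = 2.0

print(predict_with_loops(X, w, b))
print(predict_matrix_form(X, w, b))
```

Both functions produce the same predictions. The matrix form is just a shorter and less ambiguous way of writing down the loops.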

You can remain the empiricist. I call this the path of the technician.

You can build up an empirical intuition of which methods to use and how to use them. You can also learn just enough algebra to be able to read algorithm descriptions and turn them into code.

There is a path here for the skilled technician to create tools, plug-ins and even operational systems that use machine learning.

The technician is contrasted to the theoretician at the other end of the scale. The theoretician can:

- Internalize existing methods.
- Propose extensions to existing methods.
- Devise entirely new methods.

The theoretician may be able to demonstrate the capability of a method in the abstract, but is likely insufficiently skilled to turn the methods into code beyond prototype demonstration systems at best.

You can learn as little or as much mathematics as you like, just in time. Focus on your strengths and be honest about your limitations.

## Mathematics is Critical, Later

If you have to learn linear algebra just-in-time, why not learn it more completely up front and understand the machine learning methods at this deep level from the beginning?

This is certainly an option, and perhaps the most efficient one, which is why it is the path used to teach in university. It’s just not the only option available to you.

Just like learning to program by starting with logic and abstract concepts, internalizing machine learning theory may not be the most efficient way for you to get started.

In this post, you learned that there is a path available for the technician separate from that of the theoretician.

You learned that the technician can learn the mathematical representations and descriptions of machine learning algorithms just-in-time. You also learned that the danger zone for the technician is overconfidence and the risk of putting systems into production that are poorly understood.

This might be a controversial post. Leave a comment and let me know what you think.

This is exactly what I did at Cal. Started off with the technician route, struggled and studied through really intense ML classes, learning as I went. The math was definitely intimidating, but over time I got a lot better at it. While I’ve got a ways to go before becoming the theoretician, the more I do, the more intuition I develop. Also, learning theory is a lot more fun once you have some sense of why you need it. Learning the limitations of simple hacking is a strong motivator for doing more math and CS stuff.

Thanks for sharing Rishi, it’s great to hear that you had a similar experience.

Great advice!

Thanks Alex!

Awesome post. You spoke to the core of my existence as a software engineer and also the route I have been taking in machine learning. I appreciate you taking the time to inspire me to keep pressing on.

I’m really glad to hear that Mike! Hang in there mate.

Thanks for this post. I found it by searching for an answer to the question “why I don’t get machine learning”. I guess you can say I am on the “theoretician” path right now, trying to learn machine learning in university without having sound math nor statistics background. It is a very daunting experience. Reading textbooks in machine learning without understanding most of the mathematics/statistics terminology is like reading in a language you don’t know. But I am able to solve problems with scikit-learn and your reminder of how I started learning programming helps me to keep going.

Great to hear, hang in there Stefan and ask lots of questions!

An enormous source of encouragement for a person with little background in Statistics & Probability!!

No words can fully describe my heart-felt gratitude towards your contribution to the quest of Machine Learning for all!

God bless!

Great advice.

I started the Coursera course on machine learning. While it is a great course, I am almost giving up on it, due to the theory intensive approach. I definitely have learnt a lot. But as a programmer, I would swear by your approach.

Thanks!

But what if one wanted to improve on the math needed in ML, to get from math novice to advanced? What are the main areas to study (foundations/fundamentals/prelim concepts and definitions)? Do you have a post on this?

Thanks!

The project centric approach is by far the most effective. Mathematics makes sense and actually sticks when it’s applied in context.

I just love this post simply because I am a programmer first who wants to understand the problem and solve it with “code”. I do not understand a lot of stats/maths/linear algebra.

Your top down approach should be the go to approach for programmers in this field.

Another reason for top-down is this. Once you hit the math, it perks your ears to the applicable concepts when they come. I took stats once & remember thinking “Eigenvectors? When the heck am I gonna use these?”. Now I’m going back, re-learning and eating it up.

I personally like the “sweep and mop” approach. Sweep: take in the basics of math (even some quickie YouTube vids); learn the basics of ML, enough to use scikit-learn. Now start doing, and 1h each day mop: start from the beginning, and thoroughly learn it all. That is: (1) quickie top-down, (2) do, (3) thorough bottom-up.

There still seem to be a lot of people out there who believe they can never become proficient at mathematics. This really holds a lot of people back and it shouldn’t.

That could be, in part, because of math anxiety (I wrote about this recently http://www.mlopt.com/?p=73). It could also be due to the abstract manner in which math is taught in schools. It isn’t easy to learn abstract concepts, but it’s also unnecessary in this case. You can learn the math effectively by applying it to a concrete problem. That means diving into the machine learning, and dealing with the math as it is presented. In my view it is a mistake to avoid the math, especially early in the learning process when the student is becoming acquainted with the algorithms.

Thank you so, so much for your piece on math anxiety. As someone who just started their computer science track (undergrad), and aspires to go into artificial intelligence, with very little background in math and terrible math anxiety, this helped a lot. I know that if I can do it in the code, I should be able to do it on paper. Yet I’m trying to avoid a calculus/math course at all cost. This really motivated me to do better and just keep practicing.

Hang in there!

I have a B.S. in Mathematics, and I agree with your approach. I certainly don’t remember everything I studied in college, and I have spent my career as a software developer learning things as-needed (just-in-time). And you are right, once you have the motivation to learn something, because it solves an immediate problem you have, you will learn it. I also agree with James above that people should not fear math – however, math anxiety is real, possibly the result of how we teach math to kids. Another point of view that your post alludes to is, spend your time solving the RIGHT problem; there are lots of problems out there, so spend your time wisely. Once you are sure your time will be well spent, the learning will be much easier.

Great, thank you Jason

You’re welcome Michael.

Hello, currently I’m taking a MOOC from CalTech and it’s math intensive and sometimes I got very angry and frustrated for not understanding the concepts and the math involved, it makes me feel dumb! Thanks for this post, it gives me some motivation to give it a second chance.

I’m glad it helped Adolfo.

I do agree with your very practical approach. A new area like machine learning is quite exciting for people right from a researcher to a software professional. For me, machine learning provides me a new way to address my challenging problems. Formal learning will provide me a good foundation to understand machine learning better. But I am quite keen on applying it to solve my problems. While I can pick up theory by putting in extra effort, I would love to do it just-in-time, when and how much is needed. Yes, I am motivated to learn/apply new stuff wearing a practical hat. Thanks, Ravi

Thanks Ravi.

I am interested in Machine Learning and I am going to follow the above approach. Thank you

I’m glad to hear that, hang in there!