Programmers Can Get Into Machine Learning

In this post I want to show you that programmers can get into machine learning. I will show you that learning machine learning can be just like learning any other piece of high technology. We’ll compare learning machine learning to learning to program in the first place, which may have been an even larger challenge.

A Designer Wants to Code

Pretend you are a designer, say a young web designer. You make web designs in Photoshop or something and maybe snip up designs and turn them into CSS. You hang around programmers and maybe you have a little coding envy. You think you might want to learn how to code. Rightly or wrongly, you think that CSS and HTML are “practically coding”, that it’s all creative expression, that programming is just another medium or outlet for your creativity.

You jump on Quora or StackOverflow and ask a question like “I’m a designer, how can I learn how to code?”

You get answers back that are all over the map. You get seemingly seasoned and expert programmers gifting you free advice like “learn ANSI C and pointers”, “learn binary”, “buy a book on ASM”, “start with LISP”. Maybe an eloquent communicator drops by and writes a long and convincing reply that you really should buy and read Knuth’s The Art of Computer Programming, Vols. 1-3. You take the advice, buy the books on Amazon and get as far as the introduction before regretting the purchase, hating yourself for not being smart enough and giving up on your interest in learning to code, only to repeat the same cycle in 3 months.

What happened? The advice seems like good advice.

The problem is the advice is retrospective. The advice is from programmers thinking about their craft and how people who are already programmers can be better programmers. The advice does not consider an absolute beginner, an interested amateur looking to dip their toe in and see if they want to go for a swim.

Now I agree, the world is a little different now. This problem has been identified and there are great services out there tackling exactly this problem, i.e. teaching people how to code.

Sure, maybe we learn pointers, binary, ASM and LISP and even read sections of Knuth (no one really reads it cover to cover, right?), but that comes later. How did you start out? I started out hacking things together, trying things out, experimenting, creating and learning. I dove deep into the technical detail later because I wanted to create bigger, better and more powerful programs. I didn’t start in the technical detail. I think this experience might be similar for a lot of programmers out there. Was it like this for you? Leave a comment.

A Programmer Wants to do Machine Learning

Now, if you’re reading this, you’re probably a programmer or some kind of developer. Think about your interest in machine learning. Have you had a look at some of the responses from seasoned and expert machine learners gifting you free advice on how to get started?

I’ve been searching for and reading this advice and some is good, much of it is not. I’ve collected some samples below:

What math skills are required to learn machine learning?

  • You absolutely will need to be comfortable with basic linear algebra (manipulating vectors and matrices) and working with logarithmic and exponential functions.
  • You’ll need to know Linear Algebra through Eigenvectors if you want things to be “easy”.

Is a strong background in maths a total requisite for ML?

  • You do want to have some familiarity with probability, linear algebra, linear programming, and multivariable calculus.

What skills are needed for machine learning jobs?

  • First, you need to have a decent CS/Math background. ML is an advanced topic so most textbooks assume that you have that background.
  • Statistics, Probability, distributed computing, and Statistics.

There is some really great advice in there, but is this advice appropriate for the absolute beginner? Is it appropriate for the programmer dipping their toe in the water to test the temperature?

Maybe people are asking the wrong questions. Also, I’ve cherry-picked snippets of answers that suggest one needs maths before they can get started in machine learning. The point I want to make is that a beginner is going to focus on what they don’t have and what they can’t do. They may give up before they have even tried.

I absolutely agree that having a strong grasp of linear algebra and probability will provide an excellent base for getting into machine learning. I absolutely agree that The Elements of Statistical Learning is a tremendous book on multiple levels. I just don’t think that the first step for a programmer interested in machine learning should be to take a math course or read a dense theoretical treatment of the field. I actually strongly recommend against it.

Two Machine Learning Fields

There are two sides to machine learning:

  • Practical Machine Learning: This is about querying databases, cleaning data, writing scripts to transform data, gluing algorithms and libraries together, and writing custom code to squeeze reliable answers from data to satisfy difficult and ill-defined questions. It’s the mess of reality.
  • Theoretical Machine Learning: This is about math and abstraction and idealized scenarios and limits and beauty and informing what is possible. It is a whole lot neater and cleaner and removed from the mess of reality.
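To make the practical side a little more concrete, here is a minimal sketch of the "clean the data, then glue an algorithm onto it" loop. Everything in it is an assumption made for illustration: the tiny CSV, the column names, and the choice of a 1-nearest-neighbour rule are hypothetical and not taken from any real project. It uses only the Python standard library.

```python
import csv
import io
import math

# Hypothetical raw data: a small CSV with stray whitespace and bad values.
RAW = """height_cm,weight_kg,label
170, 65,adult
 95,14,child
,70,adult
180,80,adult
110,20,child
abc,18,child
"""


def load_clean_rows(text):
    """Parse the CSV and drop rows whose numeric fields are missing or malformed."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        try:
            features = (float(record["height_cm"]), float(record["weight_kg"]))
        except (TypeError, ValueError):
            continue  # skip rows that cannot be converted to numbers
        rows.append((features, record["label"].strip()))
    return rows


def predict_nearest(rows, query):
    """Return the label of the cleaned row closest to the query point."""
    nearest = min(rows, key=lambda row: math.dist(row[0], query))
    return nearest[1]


rows = load_clean_rows(RAW)
print(len(rows), "usable rows")
print(predict_nearest(rows, (100.0, 16.0)))
```

Most of the code deals with the data rather than the algorithm, which is typical of the practical side. In real work the prediction step would more likely come from an existing library than from hand-written code.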

The practical side cannot have frameworks and rigour without the theoretical side. The theoretical side does not have meaning or motivation without the practical side. This dichotomy is false; it’s really a sea of tools and equations, but stay with me.

As a programmer, you may have a penchant for the practical side, but you will reach the limits as a “technician” and will require an understanding of the theoretical side to be able to effectively improvise. You must read mathematical treatments of algorithms, you must work through dense textbooks. That is what it takes to do well in the field. And that is the problem: this is the advice practitioners freely give to beginners. It is idealized, suited to intermediate learners, and inappropriate for beginners.

Programmers Like Power Tools

I think it’s useful for an experienced programmer to think of machine learning like an advanced programming topic, like threading (stay with me now).

If you want to get into threading you just write some multithreaded programs and get an idea of the power it can unleash. You make all kinds of blunders, but some of the things you prototype work and you get glimpses of what is possible. If you decide this is for you, you can read books and go deeper.

You can use libraries of existing multithreaded constructs, you can write your own, you can go deep and learn about some of the more mathematical subjects behind the various threading constructs. You let your interest drive your learning and eventually you can credibly build and deploy production multithreaded code. It’s a process, not a step change.
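As a taste of that first "just write some multithreaded programs" step, here is a minimal sketch in Python (an assumed language; the post does not name one). It leans on an existing construct, the standard library's ThreadPoolExecutor. The fetch function is a hypothetical stand-in for an I/O-bound call and simply sleeps.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item):
    """Hypothetical stand-in for an I/O-bound call; sleeps instead of doing real work."""
    time.sleep(0.1)
    return item * 2


items = list(range(8))

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # pool.map returns results in the same order as the inputs.
    results = list(pool.map(fetch, items))
elapsed = time.perf_counter() - start

print(results)
print(f"{elapsed:.2f}s with threads; sequential would take about {0.1 * len(items):.1f}s")
```

A prototype like this gives the glimpse of what is possible. The blunders (shared state, locks, ordering) show up once the tasks stop being independent, and that is where the books come in.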

Now obviously, machine learning is a larger and more diverse area, but the general stepped strategy is the one I advocate and will elaborate in a future post.

You don’t go from being a beginner to running a team and putting machine learning systems into production. The danger zone is real. You can and will learn just enough to be dangerous. But the self-discipline you learned from wielding the power to program machines (along with your process of code reviews, peers and mentors, and common sense) limits those very real dangers.

Just like learning to program, learning machine learning is a journey where the learning does not end; mastery really means continuous education. Learning to read equations, turn them into code, and write your own to frame your problems may be rest stops along the way, if you’re interested.
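As one small illustration of turning an equation into code, take mean squared error, MSE = (1/n) Σ (y_i − ŷ_i)², a common way to score predictions. The example is chosen here for illustration and is not from the post. A minimal Python sketch, with hypothetical function and argument names:

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # The generator is the sum over i of (y_i - y_hat_i)^2; dividing by n gives the mean.
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)


print(mean_squared_error([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))
```

Each symbol in the equation maps onto one piece of the code: the sum becomes sum(), the squared difference becomes the expression inside it, and 1/n becomes the final division.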

Resources

I’ve listed some resources if you want to keep thinking on this issue. It’s a little deep and I’m sure we can generate some good discussions.

  • Scroll up and read through some of the answers to the StackOverflow questions listed above. There are answers there that discourage programmers unless they know maths, but there are also other really encouraging answers that will really lift you up.
  • Why becoming a data scientist might be easier than you think: A Gigaom post that points to case studies for the general opportunity for the analytically inclined (like programmers) to start from scratch in Data Science and quickly become internationally competitive.
  • Is mathematics necessary for programming? Interestingly, I think the arguments for and against are very relevant and a useful perspective.

This is a pretty charged post and I’m really interested in what you think. Discussing this issue with friends, I do hear a lot about the danger zone and the options for progression for “the technician” machine learning apprentice. I will follow up on these two topics in future posts.

What do you think? Is there any substance to my proposed similarity between learning to program and a programmer learning machine learning?

18 Responses to Programmers Can Get Into Machine Learning

  1. Shantnu December 1, 2013 at 10:12 am #

    Good points.

    I think the problem with most “experts” is, they hang out with their own kind all the time, they get a narrow view of reality. Pretty soon, they lose touch with the outside world, and forget how hard it is for those just starting in. Like elephants/horses with eye flaps, they can only see in one direction.

    I was reading a blog recently, where a guy was moaning how the people were sick of “all these Bootstrap websites that look the same.” What he meant of course was, that he was sick. The problem is, the average user doesn’t look at hundreds of web apps a month. They look at maybe 4-5, of which 3 are Gmail, Facebook and Yahoo/Twitter/something. So unless your website is aimed at professional designers, your target market will not be “sick of seeing 100s of similar websites”.

    I disagree with the technician danger zone you mention. To me, it sounds like “Oh, we spent five years studying this, how dare this upstart noob try to upstage us. Remember your place, kid.” But I will withhold judgement till you write your blog posts.

    You make some really good points about how beginners are being pushed towards books that are completely irrelevant to them. I think there may be 2 reasons. One is a simple lack of empathy. Most programmers are not social people, and their social skills are poor at best (though why they don’t develop them, I don’t know).

    The 2nd, more insidious reason, the one that really gets me, and the reason I stopped going to many forums, is this “macho culture”. Real Men (TM) program in C, Assembly and Lisp. They all have Phds in Maths, and spend their spare time writing beautiful code in Haskell; none of this fancy Javascript for us, thanks.

    But I will shut up, as it looks like my comment is getting as long as your blog.

    Good blog, keep the posts coming.

    • jasonb December 2, 2013 at 8:04 am #

      Hi Shantnu, great comment.
      I think you’re spot on when you say experts spend time with experts and forget what it’s like starting out. I think this can explain why beginners are given advice that is more appropriate for intermediate to advanced levels. Your analogy with the guy and bootstrapped websites looking the same is apt.
      I think there is a danger zone, that is, people could learn just enough to be overconfident and build systems or generate results that they cannot interpret, which could do more harm than good to themselves or their organization. My point is that similar danger zones exist in other technical fields, such as programming! It’s an interesting thought and I don’t have it all straight in my head yet; it might make a great future blog post.
      Thanks again for your thoughts, they’re very much appreciated.

  2. qnaguru February 17, 2014 at 7:33 pm #

    For a newbie I guess best place to start is with Tools – let the Tools do the Statistics for you.
    Once you get the hang of it, you can start uncovering what the Tool is actually doing underneath…

    I read something on these lines perhaps on this site or elsewhere 🙂 And i agree.

  3. Kartik January 8, 2015 at 4:58 pm #

    hey, jason
    I have been following your blog since a long time.
    this is how I am trying to learn machine learning.
    Tell me if I am on the right path.
    I have started with coursera ML andrew-NG course.

    what I do is after I have watched the video I try out the exercise i.e mostly implement the whole algorithm and code the equations and pass the clean data through it.
    so, that I can understand the code and equations.

    After that I implement the same exercise using sklearn library which I find pretty easy to use.

    then may be couple of small already implemented projects using the library.
    some from the book machine learning in action.

    I have not used the tools yet, I have downloaded orange .. but not used it yet.
    I am also thinking of participating in Kaggle competitions, or downloading some data and doing some analysis and some web scraping projects.

    I am not learning linear algebra, probability, statistics as I have basic knowledge of it but, not in depth. but I pretty much can understand the algorithm.

    Am I on the right track ???

    • Jason Brownlee January 9, 2015 at 5:17 am #

      Hi Kartik, it sounds like you’re doing very well.

      One additional thing you could do is to share the results of your small projects on a blog or github and start building up a public portfolio. This would help to show you how far you have come and demonstrate what you are capable of.

  4. Simon Shen January 23, 2015 at 1:06 pm #

    My name is Simon, I am a computer engineering master student in University of Florida. This semester I am focusing on machine learning and data mining by taking courses as “Pattern Recognition” and “Big data ecosystem”. I am just a beginner. Your blog is really good! I hope I can communicate with you during study!

    • Jason Brownlee January 23, 2015 at 1:41 pm #

      Reach out any time Simon. Good luck with your study!

  5. Phil Marneweck August 24, 2015 at 6:04 pm #

    If you don’t want to do the run of the mill thing like recognize faces where some one has already written the code for it with clear examples of its usage, you are dead in the water without the maths and statistics ,

    I have tried various blogs books and online courses which claim they use minimal maths … pffffffft. Linear algebra is not minimal maths, neither is the statistics.

    Using other peoples implementations of ML algorithms is not simple or clear either because if you get down to it there are a lot of ways each ML algorithm can be implemented and the choices made by the implementers has an effect on how or why an ML algorithm will work for your case or not.

    I would say start with the math and then go on to the statistics and only then look at ML or you will be very frustrated very quickly. A few months spent on the maths and statistics will stand you in good stead in the future with or without ML.

  6. Pravallika August 16, 2016 at 2:15 pm #

    HI Jason,
    You said that we must have a strong background in CS / Mathematics, but I did Engineering in the Electrical and Electronics stream, and I have good knowledge of Mathematics. Can I get into Machine Learning even though I do not have any experience?

    • Jason Brownlee August 17, 2016 at 9:48 am #

      Absolutely.

      Focus on learning how to work through a machine learning problem end to end and delivering value. The specific tools and languages are just a means towards this end.

  7. Praveen January 19, 2017 at 6:59 pm #

    Hello Jason,

    I follow all your clear strategies you discuss in your site. I feel sorry for missing this blog in ur site, really useful and the analogy u have given is quite true and also happy that I’m following same path 🙂 …. Also, ur book ‘Master Machine Learning Algorithms’ is really good. Your hard work is well appreciated.

  8. Tom Aronson April 29, 2017 at 11:06 pm #

    One of the large problems in Machine Learning, Information Technology, or other fields is the company fascination with recruiters or “headhunters”. Typically there are job descriptions with multiple bullet points. Headhunters represent their employer clients. Many, and it has been my experience to even say most, have no idea what they are selling. So possibly a beginner, or even an experienced developer or person who may be pretty strong in math, statistics, whatever reads these points. Of course most have the requirement: “Phd in Math, Statistics, Physics, Computer Science or other technical discipline” required. So of course those without a doctorate quit there. This is a screen so the company and headhunter don’t waste their valuable time (sarcasm implied here.)

    I think the solution is to learn the material at the level you can and increase your knowledge as the post here suggests. Then try to discover where you can learn on the job and also contribute through the hidden job market – through meetup groups, researching those in the industry and other companies who work with ML and related. I have several Masters degrees in technical areas and many years of Information Technology experience and have taken many math, statistics, and other courses. I cannot make a real dent using the advertised and headhunter route so I have to use different ways also – not easy.

    Keep up the good posts Jason.

    • Jason Brownlee April 30, 2017 at 5:30 am #

      Thanks for sharing, great points.

      I would also point out that there are millions of small-medium companies that need people with these skills where you can talk to the managers directly during the interview process and level with them about delivering real value on low hanging fruit.

      Let the fortune 500 have the phd-level guys and the fortune 5,000,000 can get on with getting results.

  9. Jason Holtkamp May 9, 2017 at 12:33 pm #

    Thanks so much for writing this. Your analogies really struck a chord with me because it reminded me of myself during the first few weeks and months of learning to code. I initially got into programming by teaching myself and eventually fell in love with it, so after a while I did a software engineering bootcamp (Hack Reactor). Now I work as a javascript engineer.

    Making the jump from JS engineer to ML engineer used to seem like an insurmountable goal because I don’t have a degree in math/stats/computer science, but the more I read, the more I’m encouraged to start working toward it. In fact, it was only about 2 years ago that I thought it was impossible to make the jump from management consultant to software developer (I was wrong).

    Definitely bookmarking this site and sending the link to friends. Thanks again Jason.

    • Jason Brownlee May 10, 2017 at 8:41 am #

      Thanks Jason, I’m so glad to hear that.

      Stick with it! Focus on getting results and back-fill understanding later.

  10. Vlad September 15, 2017 at 8:10 pm #

    amen for an article that speaks the truth. started learning python a few weeks ago (first time learning to code), motivated by all the accomplishments of machine learning. useless to say, the guys where i took the course were trying to teach python as if i was a programmer learning a second or n-th type of code. which was not the case, on the contrary, programming didn’t work for me at a fragile age, i’ve thought that was sad and insisted. So here i am, saying thank you for a great blog. You keep my motivation alive, in spite of the mistakes other teachers made.
