Best Machine Learning Resources for Getting Started

This was a really hard post to write because I wanted it to be really valuable.

I sat down with a blank page and asked myself a hard question: what are the very best libraries, courses, papers and books I would recommend to an absolute beginner in the field of Machine Learning?

I really agonized over what to include and what to exclude. I had to work hard to put myself in the shoes of a programmer and beginner at machine learning and think about what resources would best benefit them.

I picked the best for each type of resource. If you are a true beginner and excited to get started in the field of machine learning, I hope you find something useful. My suggestion would be to pick one thing, one book or one library and read it cover to cover or work through all of the tutorials. Pick one and stick to it, then once you master it, pick another and repeat. Let’s get into it.

Programming Libraries

I am an advocate of “learn just enough to be dangerous and start trying things”.

This is how I learned to program and I’m sure many other people learned that way too. Know your limitations and exploit your strengths. If you know how to program, leverage that to get deep into machine learning fast. Then have the discipline to go and learn the math for the technique before you implement it in a production system.

Find a library and read the documentation, follow the tutorials and start trying things out. The following are the best open source machine learning programming libraries out there. I don’t think they are all suitable for using in your production system, but they are ideal for learning, exploring and prototyping.

Start with a library in a language you know well then move on to other more powerful libraries. If you’re a good programmer, you know you can move from language to language reasonably easily. It’s all the same logic, just differing syntax and APIs.

  • R Project for Statistical Computing: This is an environment and a lisp-like scripting language. All the stats functionality you could ever want is provided in R, including amazing plotting. The Machine Learning category on CRAN (think: third-party Machine Learning packages) has code written by leaders in the field with state of the art methods, as well as anything else you can think of. Learning R is a must if you want to prototype and explore quickly. It just might not be the first place you start.
  • WEKA: This is a Data Mining workbench providing an API and a number of command line and graphical user interfaces for the whole data mining lifecycle. You can prepare data, visualize, explore, and build classification, regression and clustering models. Many algorithms are built in, and more are available as third party plugins. Unrelated to WEKA, Mahout is a good Java framework for Machine Learning on Hadoop infrastructure if that is more your thing. If you’re new to big data and machine learning, stick with WEKA and learn one thing at a time.
  • Scikit Learn: Machine Learning in Python built on top of NumPy and SciPy. If you are a Python or a Ruby programmer, this is the library for you. It’s friendly, powerful and comes with excellent documentation. Orange would be a good alternative if you’d like to try something else.
  • Octave: If you are familiar with Matlab or you’re a NumPy programmer looking for something different, consider Octave. It is an environment for numerical computing just like Matlab and makes it easy to write programs to solve linear and non-linear problems, such as those that underlie most machine learning algorithms. If you have an engineering background, this might be a good place for you to start.
  • BigML: Maybe you don’t want to do any programming. You can drive tools like WEKA completely without programming. You can go one step further and use services like BigML that offer machine learning interfaces on the web where you can explore building models all in the browser.

Pick a platform and use it to do your practical machine learning education. Don’t just read, do.
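
As a rough sketch of what “start trying things” might look like, here is a small first experiment, assuming you chose Python and have scikit-learn installed. It trains a k-nearest-neighbors classifier on the iris dataset that ships with the library and scores it on held-back data. The function name `evaluate_knn`, the choice of algorithm and dataset, and the parameter values (5 neighbors, a 25% holdout, a fixed seed) are illustrative assumptions, not recommendations from this post.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def evaluate_knn(n_neighbors=5, test_size=0.25, seed=0):
    """Train a k-nearest-neighbors classifier on iris; return holdout accuracy.

    The function name and default values are hypothetical, chosen for illustration.
    """
    # Load the small iris flower dataset bundled with scikit-learn.
    X, y = load_iris(return_X_y=True)

    # Hold back part of the data so the model is scored on unseen examples.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )

    # Fit the classifier on the training portion only.
    model = KNeighborsClassifier(n_neighbors=n_neighbors)
    model.fit(X_train, y_train)

    # Fraction of held-back examples that were classified correctly.
    return accuracy_score(y_test, model.predict(X_test))


if __name__ == "__main__":
    print(f"Holdout accuracy: {evaluate_knn():.2f}")
```

From here you could change `n_neighbors`, swap in a different classifier, or load your own data. Tinkering with a working example like this is the kind of doing I mean.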

Video Courses

Video is a very popular way to get started in machine learning.

I watch a lot of machine learning videos on YouTube and VideoLectures.Net. The risk is that all you will do is consume and fail to take action. I recommend always taking notes when watching a video, even if you discard the notes later. I also recommend trying out whatever it is you’re learning in the lecture.

Frankly, none of the video courses I have seen are really suitable for a true beginner. They all presuppose a working knowledge of at least linear algebra and probability theory, and more.

Andrew Ng’s Stanford lectures are probably the best place to start for a course, otherwise there are one-off videos I recommend.

  • Stanford Machine Learning: Available via Coursera and taught by Andrew Ng. In addition to enrolling, you can watch all the lectures anytime and get the handouts and lecture notes from the actual Stanford CS229 course. The course includes homework and quizzes and focuses on linear algebra and using Octave.
  • Caltech Learning from Data: Available via edX and taught by Yaser Abu-Mostafa. All the lectures and materials are available on the Caltech site. Again, like the Stanford class, you can take it at your own pace and complete the homework and assignments. It covers similar subjects, goes into a little more detail and is more mathematical. The homework is probably too challenging for a beginner.
  • Machine Learning Category on VideoLectures.Net: This is an easy place to drown in the overload of content. Look for videos that seem interesting and try them out. Bail if it’s at the wrong level or take notes if you’re enjoying it. I find I keep coming back to refresh myself on topics and to pick up entirely new topics. Also, it’s great to see what the masters of the field actually look like.
  • “Getting In Shape For The Sport Of Data Science” – Talk by Jeremy Howard: A talk to a local R users group on the practical process for doing well in competitive machine learning. This is very valuable because so few people talk about what it’s actually like to work on a problem and how to do it. I not-so-secretly fantasise about funding a web reality TV show that follows participants in machine learning competitions. That’s how into it I am!

Overview Papers

If you are not used to reading research papers, you will find the language very stiff. A paper is like a snippet of a textbook, but describes an experiment or some other frontier of the field. Nevertheless, there are some papers that you might find interesting if you are looking to get started in machine learning.

  • The Discipline of Machine Learning: A white paper defining the discipline of Machine Learning by Tom Mitchell. This was a piece of the argument Mitchell used to convince the President of CMU to create a standalone Machine Learning department for a subject that will still be around in 100 years (also see this short interview with Tom Mitchell).
  • A Few Useful Things to Know about Machine Learning: This is a great paper because it pulls back from specific algorithms and motivates a number of important issues such as feature selection, generalizability and model simplicity. This is all good stuff to get right and think clearly about from the beginning.

I’ve only listed two important papers, because reading papers can really bog you down.

Beginner Machine Learning Books

There are a lot of machine learning books and very few are written for beginners.

What is a beginner really?

Most likely you’re coming to machine learning from another field, most likely computer science, programming or statistics. Even then, most books expect you to have a grounding in at least linear algebra and probability theory.

Nevertheless, there are a few books out there that encourage eager programmers to get started by teaching the minimum intuition for an algorithm and pointing to tools and libraries so that you can run off and try things out.

Most notable are Programming Collective Intelligence, Machine Learning for Hackers and Data Mining: Practical Machine Learning Tools and Techniques, for Python, R and Java respectively. If in doubt, grab one of these three books!

Books for Machine Learning Beginners

  • Programming Collective Intelligence: Building Smart Web 2.0 Applications: This book was written for you, dear programmer. It’s light on theory, heavy on code examples and practical web problems and solutions. Buy it, read it, do the exercises.
  • Machine Learning for Hackers: I’d recommend this book after reading Programming Collective Intelligence (above). It again provides worked examples that are practical, but it has more of a data analysis flavor and uses R. I really like this book!
  • Machine Learning: An Algorithmic Perspective: This book is like a more advanced version of Programming Collective Intelligence (above). It has similar aims (get programmers started in Machine Learning), but it includes maths and references as well as examples and snippets in Python. I’d recommend reading this after reading Programming Collective Intelligence if you’re still interested.
  • Data Mining: Practical Machine Learning Tools and Techniques: I actually started with this book. It was the first edition, around the year 2000. I was a Java programmer and this book and the companion library WEKA provided a perfect environment for me to try things out, implement my own algorithms as plug-ins and generally practice Machine Learning and the broader process of Data Mining. I highly recommend this book and this path.
  • Machine Learning: This is an old book and does include formulas and lots of references. It’s a textbook but is also very accessible with grounded motivations for each algorithm.

A lot of people bang on about some great machine learning textbooks. I do too, and they are great. I just don’t think they are a great place for a beginner to start.

Further Reading

I thought deeply about this post and I also went off and looked at other people’s lists of resources to make sure I didn’t miss anything important.

For completeness, here are some other great lists of resources around the web for getting started in machine learning.

Have you read or used any of the resources here?

What did you think?

Did I leave out a critically useful resource for a programmer interested in getting started in machine learning?

Please leave a comment and let me know about it!

52 Responses to Best Machine Learning Resources for Getting Started

  1. Shantnu November 27, 2013 at 8:02 am

    Thanks Jason, that looks useful. Though to be honest, I avoid most books/courses written by academics on principle. So I will continue following your blog for the time being. 🙂

    A question: In your language recommendation, you recommend R and Octave. Aren’t they statistical / signal processing? Is machine learning like signal processing?

  2. jasonb November 27, 2013 at 8:43 am

    Thanks for the support Shantnu!
    Good practice with textbooks, they are written for undergrads or more typically graduate level courses. They are not a good place to start if you’re an interested programmer.

    I suggest that you start with a language you are familiar with and find a machine learning library for that language. Alternatively, there are multi-platform tools like WEKA that provide a user interface to start playing around.

    Octave, like Matlab, is used for DSP and related engineering subjects. R is a statistical language and environment. Both are very powerful in the right hands. I’d only recommend exploring Machine Learning with either of these languages/environments if you are already familiar with them. Otherwise you’ll be learning two things at once (language and machine learning) and making your life unnecessarily difficult.

    You could phrase machine learning tasks as DSP problems, I have seen that done most in the area of neural networks. If that perspective helps you, then give it a try.

    I hope this helps!

  3. Shantnu November 27, 2013 at 10:44 am

    It does, thanks.

    Looking forward to more posts…

  4. niket December 17, 2013 at 4:35 am

    Thanks for sharing your experience, Jason.

    I have been learning/studying ML by trial and error for last 1+ year. After a lot of courses/videos/books pickups and drops I can distill my experience as below:

    Courses I zeroed in are the following:
    1. ML course by Andrew Ng @ coursera
    2. Learning from data by Yaser.

    Taking these two courses in close succession really helped me a lot. Of course I keep coming back to them again and again aided by materials of Stanford CS229.

    Books: I felt that the first two books mentioned by you were not really helpful in a constructive way. But they really helped me to realise that ML can’t be learnt by blindly following these algorithmic approaches. It was great to see the outputs but the ever ringing question “why” really kept frustrating me. And then optimum dose of Linear algebra, Probability and Statistics @ Khan academy and Stats trilogy by Ani Adhikari @ Edx really helped me a lot in building up the foundation.

    It has been an iterative effort and most probably it will continue that way.

    Thanks again.

    • jasonb December 17, 2013 at 7:58 am

      Great comment @niket.
      Nice work for getting through both courses, Yaser’s is significantly more challenging.

      I agree that the algorithms will not click (the why) until you have an intuition for what they are doing. The methods are explained using mathematics, and so that is the intuition we adopt. I think that beginners can get a long way with spatial metaphors and analogy before having to dig down into the maths. I also think there are plenty of methods (instance based, trees, etc) where you can get along just fine with such intuitions.

      I’m keen to hear your thoughts on this @niket.

    • Abhirup February 18, 2018 at 6:55 pm

      Hey Niket , can you please link the course that you mentioned that is statistics trilogy by ani adhikari as a reply !

      Thank you

  5. Deepan Prabhu Babu January 30, 2014 at 5:04 am

    You forgot to add, http://www.youtube.com/playlist?list=PLD0F06AA0D2E8FFBA to video lectures category.
    Mathematical monk has one of the simplest and straight forward video tutorials on machine learning.

  6. ankit May 3, 2014 at 4:35 am

    The link sent on subscribing for downloading resource guide does not work. kindly help

    • jasonb May 3, 2014 at 7:33 am

      I have sent it to you directly.

  7. Nilesh May 29, 2014 at 2:51 am

    Thank you very much Jason. I am new to this area but keenly interested to go in details, so I was finding out from where to start. Your inputs related to books, video lectures and other resources motivates to go further. I am hoping to get a help from you whenever I stuck. Thanks once again.

  8. Harshal July 26, 2014 at 4:45 am

    Hi, Jasonb
    Your site is best for beginners as well as master.

    I’m working on machine learning in that specially Reinforcement Learning I want programming for RL with its implementation basics. I hope that you will explore this area??

    • jasonb July 26, 2014 at 7:48 am

      I have not covered reinforcement learning, but I can look to cover it in the future.

  9. Mohammad September 19, 2014 at 7:39 pm

    Hi
    don’t bother to start coding on my lower level and a few projects from MATLAB to progressively harder to define in advance the field path to take.
    tnx.

  10. Daniel September 20, 2014 at 10:56 pm

    Thanks for this post. Excellent! I very much appreciate that you took the time to put it together. I’m a few weeks in to Andrew’s class. I was wondering where to go from there. This is awesome.

  11. Joef C October 23, 2014 at 2:28 am

    hey! great post. I’ve been looking for this kind of list. I already started with Practical Machine Learning under Data Science specialization in Coursera and it’s kind of hard to keep up with the terminologies they’re using.

    again, great post!

    • jasonb October 23, 2014 at 7:38 am

      Thanks Joef. Good luck with your course.

  12. Sidd December 7, 2014 at 11:12 pm

    hi JasonB, I have gone through andrew Ng coursera course and statistical learning by Stanford but I am still finding it difficult to read research papers due to mathematics, so can you suggest some mathematical resources that can help me get quickly started in understanding papers. thank you.

  13. vivek December 14, 2014 at 12:42 pm

    Great post

  14. Vincent March 10, 2015 at 7:21 pm

    Great list of resources, tnx!
    One of my favorite machine learning books is ‘Pattern Classification’ from Richard Duda.
    This was the first book I read in this field and I think it provides beginners with the perfect mix between theoretical background and practical thinking.
    My top-4 machine learnings books is http://www.visiondummy.com/machine-learning-books/

  15. Petru July 28, 2015 at 12:07 pm

    Thank you very much sir. 🙂

  16. Manash August 11, 2015 at 6:56 am

    Hi Jason,

    Nice Post. I like “read it cover to cover or work through all of the tutorials.”.

    Thanks a lot for the suggestions and resources.

  17. Kazem Jahanbakhsh September 4, 2015 at 7:37 am

    In the last 8 years, I have been actively doing research in the fields of Machine Learning, Data Mining & NLP. Worked on different ML/DM/NLP problems such as: human mobility prediction, online advertising, fraud detection, predicting elections and so on.

    One of the questions that I had at early times was to pick and study a high quality ML/DM books who cover the fundamental concepts/theories of ML/DM. Of course, the area is broad and lots of time you need to read papers to stay updated & learn about the latest results.

    So, for generating the Top ML/DM books, we crawled the web and collected data and signals for 100’s of ML/DM books. Next, we implemented a Ranking algorithm by which we could rank the top ML/DM books.

    Our goal was to rank the books objectively so that the buyers could have a data-driven, objective and fair list of top books. This is an ongoing project where we’ll continue improving the quality of our ranking algorithm.

    You can find the top 16 Machine Learning, Data Mining and Natural Language Processing Books from here:

    http://www.aioptify.com/topmldmbooks.php?utm_source=machinelearningmastery&utm_medium=cpm&utm_campaign=contentpromotion

    Please let us know if you have any suggestions on ranking or questions.

  18. Guy Hudara November 18, 2015 at 1:44 am

    Hi,
    Thank you for this post.

    I couldn’t download the “Machine Learning Resource Guide”

    Thanks
    H Guy

  19. Paul February 16, 2016 at 4:19 am

    This was great–thanks! Here’s another Machine Learning Tutorial: https://www.praetorian.com/blog/machine-learning-tutorial

    This post gives an example of machine learning on binary data. If you’re familiar, the author also released a tech challenge on the topic at https://mlb.praetorian.com

  20. Ash March 12, 2016 at 12:51 pm

    Great blog!

    I am new to the ML field. After Andrew’s class, I started reading these books and I found them quite helpful.

    1) Introduction to statistical learning
    2) Elements of statistical learning
    3) Applied predictive modeling

    I take each book seriously as I read through each line, take notes and research every concept I don’t understand. This journey has led me to dive into probability and statistics as well as time series analysis.
    ML is so addicting and I just can’t have enough of it. The more I learn, the more I want to learn.
    My plan is to start competing in Kaggle in few months. I have been using this blog as a source of motivation and information. Thanks Jason!!

  21. ankit July 14, 2016 at 3:14 am

    Machine learning introduction

    https://www.youtube.com/watch?v=gj4GsbkyndY

  22. Bimal July 19, 2016 at 3:12 pm

    Thank you Jason for your incredible work with this site..
    I am a machine learning enthusiast and have found this blog very resourceful and helpful. I will try my best to follow your suggestion on getting started with learning this vast field of Machine intelligence and learning.
    Hope to get help and valuable information from you in future as well.

    • Jason Brownlee July 20, 2016 at 5:15 am

      You’re welcome, I’m glad you found it useful Bimal.

  23. limin October 9, 2016 at 1:45 pm

    Thank you Jason for sharing. I am a beginner at ML and I’m not familiar with any of the languages you list. Could you recommend some books based on C/C++? I’m not good at English. Hope I make myself clear.

    • Jason Brownlee October 10, 2016 at 7:40 am

      Sorry limin, I don’t know good books for c/cpp programmers.

      You may wish to take a closer look at underlying libraries like LINPACK or LAPACK and LIBSVM.

  24. sim November 4, 2016 at 1:00 am

    Thank you!

  25. hapham November 22, 2016 at 8:27 pm

    Nice share, I also started with Ng Andrew course 6 months ago.

  26. Dr.Sharmi Sankar January 4, 2017 at 7:35 pm

    Hi Jason,

    I am a researcher working on network traffic and I felt your book on mastery with R was helping a lot to accomplish my task on my analysis of the traffic. Kindly let me know if I could ponder more on LIBSVM/LINEARSVM/SMO and other interesting features in Weka…

    • Jason Brownlee January 5, 2017 at 9:17 am

      I’m glad to hear it Sharmi.

      I’m a big fan of LIBSVM. I have not used it through Weka, sorry.

  27. Krishna June 19, 2017 at 6:04 am

    Hi,

    I am healthcare consultant and can see lot of scope of predictive analysis/ genome analytics in different healthcare IT solution, I had a background in Microsoft technologies, but from past 3 years I am concentrating more on functional aspects of things. I read few posts and blogs on ML and getting more and more interested to enroll in a course. Would like to start with – “ML course of coursera by AndrewNg”.

    My biggest concern is that I have never worked on programming languages like Matlab or Octave, also don’t have a background on R or Python.

    Could, you please suggest me if I should take up the course and get started with it? I am pretty fine with linear algebra and calculus, though.

    Thanks,
    Krishna

  28. Manchun Kumar September 20, 2017 at 12:45 am

    Thank you for sharing great resources to learn data science. I am looking for Data Science Training, Some suggest me to join https://www.janbasktraining.com/data-science for Data Science Training. please suggest me This is best for data Science learning or not.

  29. Libby Murphy March 19, 2018 at 4:57 pm

    You said it was hard to write this article for you but I want to assure you your efforts and time were worth it. Thanks a lot for such helpful information. I truly found it helpful. My specialization is not machine learning but writing different types of paper, like cheap resume and so on, but I’ve become interested in this subject a few months ago. I have a question for you: how much time does it take to learn at least basic stuff if you’re completely new? Time really matters to me. If I want to learn something, I want to become a pro as soon as possible.

    • Jason Brownlee March 20, 2018 at 6:12 am

      Many of my readers become effective in a few weeks of practice – as in, they can begin adding value in their business.

  30. Eric March 22, 2018 at 5:33 pm

    Great content! Also, Experfy has a learning track on machine learning, with courses from beginning to advanced level, and also a blog that shares ideas and info about data science and machine learning. Feel free to check it out: )
    https://www.experfy.com/training/tracks/machine-learning-training-certification

  31. Ramu April 5, 2018 at 3:23 am

    Hi Jason,

    I found the free ML tutorial series on this site to be quite easy to follow and understand: https://greatlearningforlife.com/?s=machine+learning

  32. varun verma February 22, 2019 at 5:44 pm

    Hi,

    I need your opinion on the book:
    “hands on machine learning with scikit-learn and tensorflow”

    I am going through this book and I would like to know how do you rate this book.

    Thanks in advance.

    Regards,
    Varun

  33. Brij Bhushan April 19, 2021 at 3:01 pm

    Hi, Great article! So well written -This is very helpful article for everyone. I wish to read more article from you! Thanks for sharing this valuable information!

  34. M K Singh May 21, 2022 at 10:00 pm

    Hi, Great article! So well written -This is very helpful article for everyone. I wish to read more article from you! Thanks for sharing this valuable information!

    • James Carmichael May 21, 2022 at 11:31 pm

      Thank you for the feedback and support M K Singh!
