Self-Study Guide to Machine Learning

There are lots of things you can do to learn about machine learning.

There are resources like books and courses you can follow, competitions you can enter and tools you can use.

In this post I want to put some structure around these activities and suggest a loose ordering of what to tackle when in your journey from programmer to machine learning master.

Four Levels of Machine Learning

Consider four levels of competence in machine learning. This is a model to help us think about the resources and activities available and when a good time to tackle them might be.

  1. Beginner
  2. Novice
  3. Intermediate
  4. Advanced

I want to separate beginner from novice here because I want to show that an absolute beginner (a programmer with an interest in the field) has a path before them if they choose.

We are going to tour through each of these four levels and look at resources and activities that can help someone at one level learn more and progress their understanding and skill levels.

The breakdown is just a suggestion. An activity or resource listed under one level may well be useful and appropriate at the level before or after it.

I think the overall structure is useful, and I’m keen to hear what you think. Leave a comment below with your thoughts.

Credited to pugetsoundphotowalks, some rights reserved

Beginner

A beginner is a programmer with an interest in machine learning. They may have started reading a book or a Wikipedia page, or taken a few lessons in a course, but they don’t really “get it” yet. They’re frustrated because the advice they are getting is aimed at intermediate and advanced practitioners.

Beginners need a gentle introduction, away from code, textbooks and courses. They need the whys, whats and hows pointed out first to lay the foundation for novice-level material.

Some activities and resources for the absolute beginner are:

Novice

A novice has had some contact with the field of Machine Learning. They have read a book or taken a course. They know they are interested and they want to know more. They are starting to get it and want to start to get things done.

Novices need something to do. They need to be put into action to have the material grounded and integrated into existing knowledge structures like the programming languages they know or the problems they are used to solving.

Some activities and resources for the novice are:

  • Complete a Course: Take and complete a course like the Stanford Machine Learning course. Take a lot of notes, complete the homework if possible, ask a lot of questions.
  • Read some Books: Not textbooks, but friendly books like those listed above targeted at beginner programmers.
  • Learn a Tool: Learn to drive a tool or library like Scikit-Learn, WEKA, R or similar. Specifically, learn how to use an algorithm you have read or learned about in a book or course. See it in action and get used to trying things out as you learn them.
  • Write Some Code: Implement a simple algorithm like a perceptron, k-nearest neighbour or linear regression. Write little programs to demystify methods and learn all the micro-decisions required to make them work.
  • Complete Tutorials: Follow and complete tutorials. Start building up a directory of small projects that you have completed with datasets, scripts and even source code you can look back on, read and think about.
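As a rough illustration of the “Write Some Code” activity, here is a minimal sketch of k-nearest neighbour written from scratch. The choice of Python, the function names (euclidean, knn_predict) and the toy dataset are all assumptions made for this example, and it is only one of many reasonable ways to write it.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_rows, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest rows."""
    # Pair each training row's distance to the query with that row's label.
    distances = [(euclidean(row, query), label)
                 for row, label in zip(train_rows, train_labels)]
    # Sort by distance only, then keep the labels of the k closest rows.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Majority vote; on a tie, the label seen first among the nearest wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up toy data (hypothetical): two well-separated clusters.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
y = ["a", "a", "a", "b", "b", "b"]

if __name__ == "__main__":
    print(knn_predict(X, y, (1.1, 1.0)))
    print(knn_predict(X, y, (5.1, 5.0)))
```

Even in a program this small there are micro-decisions to make: which distance measure to use, what value of k to pick, and how to break ties in the vote. Those are the kinds of details that writing the code yourself brings to the surface.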

Intermediate

An intermediate has read some books and completed some courses. They know how to drive some tools and have written a bunch of code, both implementing simple algorithms and completing tutorials. They are now breaking out on their own, devising their own projects to learn new techniques and interacting with and learning from the greater community.

The intermediate is learning how to implement and wield algorithms accurately, competently and robustly. They are also building the habit of spending a lot of time with the data up front: cleaning it, summarizing it and thinking about the types of questions that it can answer.

Some activities and resources for the intermediate are:

  • Small Projects: Devise small programming projects and experiments where machine learning can be used to solve a problem. This is like designing and executing your own tutorials in order to explore a technique you’re interested in. You may implement an algorithm or link to a library that provides the algorithm. Learn more about small projects.
  • Data Analysis: Get used to exploring and summarizing datasets. Automate reports, know which tools to use when, and look for data you can explore, clean, and on which you can practice techniques and communicate something interesting.
  • Read Textbooks: Read and internalize textbooks on machine learning. This may well require the skills to grok mathematical descriptions of techniques and to recognize the formalisms that describe classes of problems and algorithms.
  • Write Plugins: Write plugins and packages for open source machine learning platforms and libraries. This is an exercise in learning how to write robust, production-level algorithm implementations. Use your own plugins on projects, ask for code reviews from the community and work to get the code included in the platform if possible. Getting feedback and learning is the goal.
  • Competitions: Participate in machine learning competitions, such as those associated with conferences or offered on platforms like Kaggle. Get involved in discussions, ask questions, learn how other practitioners are approaching the problem. Add to your repository of projects, methods and code from which you can draw.
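To make the “Data Analysis” activity a little more concrete, here is a minimal sketch of an automated summary report using only the Python standard library. The sample data, the column names and the summarize function are all made up for illustration. In practice you would likely point something like this at a real file and use whichever tool you already know.

```python
import csv
import io
import statistics

# Made-up sample data (hypothetical) standing in for a real CSV file.
SAMPLE_CSV = """height_cm,weight_kg,label
170,65,a
182,80,b
,72,a
165,58,a
175,,b
"""


def summarize(csv_text):
    """Summarize each column: missing count plus numeric stats or value counts."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    report = {}
    for column in rows[0].keys():
        raw = [row[column] for row in rows]
        present = [value for value in raw if value != ""]
        summary = {"missing": len(raw) - len(present)}
        try:
            numbers = [float(value) for value in present]
        except ValueError:
            # Not every value parses as a number: treat the column as categorical.
            counts = {}
            for value in present:
                counts[value] = counts.get(value, 0) + 1
            summary["counts"] = counts
        else:
            if numbers:  # skip the stats for a column that is entirely empty
                summary["min"] = min(numbers)
                summary["max"] = max(numbers)
                summary["mean"] = statistics.mean(numbers)
                summary["median"] = statistics.median(numbers)
        report[column] = summary
    return report


if __name__ == "__main__":
    for name, summary in summarize(SAMPLE_CSV).items():
        print(name, summary)
```

A script like this can be rerun whenever the data changes, which is the sense in which the report is automated. Counting missing values per column is a deliberately simple stand-in for the cleaning step.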

Advanced

An advanced practitioner has written a lot of code, either integrating machine learning algorithms or implementing algorithms themselves. They may have competed in competitions or written plugins. They have read the textbooks, completed the courses and have a broad knowledge of the field, as well as deep knowledge of a few key techniques that they prefer.

The advanced practitioner builds, deploys and maintains production systems that use machine learning. They keep abreast of new developments in the field and eagerly seek out and learn the nuances of a method and tips passed around by other frontline practitioners like themselves.

Some activities and resources for the advanced practitioner are:

  • Customizing Algorithms: Modify algorithms to meet your needs, which may involve implementing customizations outlined in conference and journal papers for similar problem domains.
  • New Algorithms: Devise entirely new methods based on the underlying formalisms to meet the challenges you encounter. This is more about getting the best results possible than about advancing the frontier of the field.
  • Case Studies: Read and even recreate case studies completed for machine learning competitions and by other practitioners. These “how I did it” papers and posts are usually chock full of subtle pro tips for data preparation, feature engineering and technique usage.
  • Methodology: Systematize your processes, whether formally or just for yourself. At this point you have a way to approach problems and get results, and you are actively looking for ways to further refine and improve that process with tips, best practices and new and better techniques.
  • Research: Attend conferences, read research papers and monographs, and have conversations with experts in the field. You may write up some of your work and submit it for publication, or just drop it in a blog post and get back to work.

Mastery is continuous; the learning does not end. One could pause and detour at any point along this journey and become the “competition guy” or the “pro library guy”. In fact, I expect such detours to be the norm.

This breakdown could be read as a linear path of the technician’s journey from beginner to advanced level, and it is intentionally programmer-centric. I’m keen to hear criticisms of this reading so that I can make it better. The breakdown is just my suggestion of the types of activities to tackle if you find yourself hungering for more at a given level.

So what level are you at, and what are you going to take on next? Leave a comment!

UPDATE: Continue the discussion on Reddit.

32 Responses to Self-Study Guide to Machine Learning

  1. jasonb December 22, 2013 at 7:50 pm

    Join the discussion of this guide on reddit.com/r/MachineLearning.

  2. Hamed Abdollahpour May 8, 2014 at 5:45 pm

    Thanks. It’s been really useful for me.

    • jasonb May 8, 2014 at 7:02 pm

      Great to hear that @Hamed, thanks.

  3. Ruma Sinha June 20, 2016 at 2:58 pm

    Hi,

    I have been an Oracle ERP technical consultant for the last 14 years and did my master’s in Mathematics from 1997 to 2002.

    I consciously made the switch to analytics and started studying for it.

    I started with R and Machine Learning. This post will help me plan my path for Machine Learning.

    Thanks

  4. tarun July 22, 2016 at 11:01 am

    Hi Jason, a nice note that sheds light on an advanced and fast-growing subject. My question: I come from a Java background and don’t know Python, but most ML books I see use Python. Please suggest what I should do to gain expertise in ML. I’m a novice with no background in ML.

  5. Yunhan September 28, 2016 at 12:52 pm

    Hi Jason, I was wondering if you could be so kind to provide some sources for ‘how I did it’ papers? As far as I know, there are some on Kaggle, and there is Netflix competition, but I want more of them. Thanks in advance!

  6. Raghavendra. R. Vastrad December 15, 2016 at 4:24 pm

    Hi Jason,

    I do work as Sr. Linux System Administrator.
    Love to take up Machine Learning as career path. But I am not sure how to proceed and learn about Machine Learning.
    Could you please guide me in this regard.

    Regards: Raghavendra. R. Vastrad.

  7. Asif Ameer January 3, 2017 at 3:54 am

    Great Explanation dear Jason Brownlee, indeed you have done a Great Job….

  8. Francis January 15, 2017 at 1:00 pm

    Thanks, Jason

    I am a complete beginner with an interest in Machine Learning. What I am trying to do is relate machine learning to my own field. I am not in the Internet field. I just want to use my instrument data to extract deeper information that cannot be obtained directly from the instrument’s operating principle. I have also tried to use this deeper information to enhance the resolution or capacity of the instrument. I hope you can offer some advice.

  9. Ravindra May 15, 2017 at 11:39 pm

    Hi Jason,

    I am planning to learn machine learning. What would be the best programming language (Python/R) to use?

    Thanks

  10. Azhaar August 4, 2017 at 2:56 pm

    Great post Jason. I can put myself in the novice level. Can you please share some resources to start learning/practicing automating reports? Many thanks.

    • Jason Brownlee August 4, 2017 at 3:46 pm

      Great!

      Perhaps this post will help you define your prediction problem:
      https://machinelearningmastery.com/how-to-define-your-machine-learning-problem/

      • Azhaar Hussain August 4, 2017 at 3:50 pm

        I was referring to automating reports part you mentioned in the intermediate level “Data Analysis: Get used to exploring and summarizing datasets. Automate reports, know which tools to use when, and look for data you can explore, clean, and on which you can practice techniques and communicate something interesting.”

        • Jason Brownlee August 5, 2017 at 5:43 am

          Ah, I see, sorry for my confusion.

          This is very specific to the problem/data. I would recommend looking for reports where you are adding, or could add, a forecast or classification that would add value.

          Alternately, you could take an existing report that does some stats, and re-write it using R or Python to get familiar with the libs and basic stats.

          I hope that helps.

  11. Felix January 17, 2018 at 8:49 am

    Hi @Jason, great article, I have started a blog about tech, want to post this article on my blog with your permission and to avoid any copyright issues the website will be cited as a source. let me know if this is okay with you. Thanks.

  12. Jesús Martínez January 31, 2018 at 12:36 am

    Great post, Jason. It is very useful to have a clear outline of the stages one should aim for in order to gain mastery in the Machine Learning field. As usual, your posts are full of wisdom. Congratulations! 🙂

  13. Taylor Bishop September 5, 2018 at 12:11 am

    Thanks for helping me learn more about machine learning. I had no idea that someone who is considered to be an intermediate can make their own projects to learn new techniques. This sounds useful if these techniques could improve their programming or even develop the basics for new innovations in the future.

    • Jason Brownlee September 5, 2018 at 6:41 am

      I’m happy it helped. You don’t need permission, dive in and make a mess! Learn things!

  14. Muhammad iqbal April 27, 2019 at 12:02 am

    HI! this is MUHAMMAD IQBAL from INDIA completed masters in Computer Application.
    I was thinking of doing a PG again in AI, ML, and DL. But, after reading your blog, I decided not to go through the time-consuming curriculum and started my own way of learning.
    I have completed the course offered by Jose Partilla on udemy about Python for machine learning and Data Science that gave me a lot brief about ML and Data Science.
    and now, I am learning Data Science by UCSanDiego: Data Science Course.
    Your Blog is awesome.
    Keep it UP!
    Keep being Awesome! I will be saying Congratulations!
    A vigorous teacher, Quality unmatched.

    thanks, thanks, and thanks a lot!

    • Jason Brownlee April 27, 2019 at 6:33 am

      Thanks, and good luck!

      I’m here to offer advice and help, if I can.

  15. Jeetech Academy March 7, 2022 at 2:08 pm

    I have to thank you for the time i spent on this especially great reading !! I really liked each part and also bookmarked you for new information on your site.

  16. Lukas July 6, 2022 at 8:10 am

    I just wanted to thank you for putting together these resources for someone with no background and no clear idea where to start learning.

    I did my M.Sc in structural biology & biochemistry, and with the advent of AlphaFold2 and the incredible work going on at DeepMind, I feel my field rapidly shrinking and my expertise going obsolete in the face of ML constructs that can reduce the work of an entire graduate thesis to mere minutes of simulation.

    Exciting and turbulent times.

    • James Carmichael July 7, 2022 at 6:43 am

      Thank you Lukas for the support and feedback! We greatly appreciate it!
