How to Research a Machine Learning Algorithm

By Jason Brownlee on August 12, 2019 in Machine Learning Algorithms

Algorithms are a big part of the field of machine learning.

You need to understand what algorithms are out there, and how to use them effectively.

A shortcut to this knowledge is to review what is already known about an algorithm, that is, to research it.

In this post you will discover the importance of researching machine learning algorithms and the 5 different sources that you can use to accelerate your understanding of machine learning algorithms.


Research Machine Learning Algorithms
Photo by Anders Sandberg, some rights reserved

Why Research Machine Learning Algorithms

You need to understand algorithms to master machine learning.

Machine learning algorithms are not like other algorithms that you may be familiar with, such as sorting algorithms.

Not only are machine learning algorithms data-dependent, but they are adaptive. Often the heart of a given machine learning algorithm is an optimization process that is stochastic, meaning it has elements of randomness. This makes machine learning algorithms more difficult to analyze, and makes it harder to draw firm conclusions about best-case and worst-case performance.
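To make the point about randomness concrete, here is a minimal sketch using only the Python standard library. The toy data, the function name sgd_fit_slope and the learning rate are all invented for illustration; they do not come from any particular library or paper. The sketch fits a single slope with stochastic gradient descent and repeats the fit with different random seeds, so you can see that the same algorithm on the same data gives slightly different answers each time.

```python
import random
import statistics

def sgd_fit_slope(xs, ys, seed, epochs=5, lr=0.01):
    """Fit y ~ w * x with stochastic gradient descent and return w.

    Hypothetical helper written for this illustration only.
    """
    rng = random.Random(seed)        # private generator, so a seed reproduces a run
    w = rng.uniform(-1.0, 1.0)       # random starting point
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)           # visit the examples in a random order
        for i in order:
            error = w * xs[i] - ys[i]
            w -= lr * error * xs[i]  # gradient of 0.5 * error**2 with respect to w
    return w

# Toy data, roughly y = 2x with a little fixed noise (made up for this sketch).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Same algorithm, same data, ten different seeds.
slopes = [sgd_fit_slope(xs, ys, seed) for seed in range(10)]
print("mean slope:", statistics.mean(slopes))
print("std dev   :", statistics.stdev(slopes))
```

Because a single run is only one sample from a distribution of possible results, it is usually safer to summarize several runs (for example with a mean and standard deviation) than to judge the algorithm from one run.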

You need to apply, implement or think deeply about algorithms to understand them.

You can describe how an algorithm works as a mathematical recipe, but to understand its behaviors in practice, you must study it in action. You can do this by experimenting on an algorithm, applying it to a lot of problems and distilling out how it behaves and how to expose and exploit these behaviors in the face of different problem types.

Alternatively, the shortcut that you can take is to dive into what other people have understood about the algorithm before you.

You need background on the algorithms which only comes from researching them.

5 Sources To Use When Researching Algorithms

Researching a machine learning algorithm requires a systematic investigation of the algorithm from multiple sources.

This may sound scarier than it actually is. Your goal is to build up your own consistent understanding of different machine learning algorithms. That understanding is personal to you, and it requires collating interpretations of a given algorithm from multiple sources.

Different sources can be used for different purposes, so you need to pick and choose those sources carefully and purposefully.

Start with a clear idea of why you want to research a given machine learning algorithm, and then pick those sources that can best answer the questions that you have.

There are 5 different sources that you can use in your research of a machine learning algorithm. We will review each in turn.

1. Authoritative Sources

Authoritative sources provide expert interpretations and descriptions of algorithms.

They are useful for getting up to speed on an algorithm fast as the explanations are often rigorous and somewhat standardized, at least within the material.

The descriptions can also be dense, often steeped in mathematics and focused on the theoretical side using academic language. In this way, they can be difficult to penetrate without sufficient background.

Examples of authoritative sources include:

Textbooks such as those used in graduate machine learning courses.
Lecture notes and slides, such as those presented during graduate machine learning courses.
Overview papers such as those that make up academic compendiums on a topic.

2. Seminal Sources

Seminal sources are the expert sources and the original descriptions of the algorithms.

Seminal sources are good for getting inside the head of the original author or describer of a machine learning algorithm and teasing out the intent of algorithm parameters and processes.

These sources are almost always academic and theoretical and only occasionally include useful usage information.

Examples of seminal sources include:

Conference papers and journal articles.
Technical reports that might precede or supplement the original publications on the method.

3. Leading Edge Sources

Many algorithms are the subject of ongoing research. This may take the form of extensions, deeper investigation or even simple application and comparison of the method to other methods.

I call these sources leading edge because they expose useful new and state-of-the-art information about a machine learning algorithm.

Leading edge sources can be used to get a good idea of what problems related to an algorithm are currently being worked on. These may represent interesting or difficult sub-processes within the algorithm of which you can take note.

Often leading edge sources are dense and technical and will require much work on your behalf to interpret the intent of the work and extract salient details that help you better understand the algorithm.

Examples of leading edge sources include:

Conference papers and journal articles.
Conference talks such as plenaries and perhaps workshops.

4. Usage Heuristics Sources

Usage heuristics and best practices are probably the key type of information you are interested in when researching a machine learning algorithm for practical and applied purposes.

Usage heuristic sources provide an expert description for how to use a given machine learning algorithm in practice. They are good for practical usage advice such as parameter configurations, suggested data preparation steps and even advice on how to adapt and scale the algorithm for specific classes of problems.

Often details are missing from these sources that must be inferred or sought by directly contacting authors. Don’t expect to be able to easily reproduce the results from these sources. Instead, focus on extracting heuristics that you can use to guide your own use of the algorithm.

Examples of usage heuristics sources include:

Papers that describe the results from machine learning competitions, like KDD Cup and Kaggle.
“What I did” blog posts and forum posts related to machine learning competitions.
Question and Answer websites such as Cross Validated and other machine learning community sites.
Application conference papers.

5. Implementation Sources

You may be interested in researching an algorithm because you want to implement it. In addition to the other sources listed above, you should consult implementation sources.

These are sources that are prepared by experts or semi-experts that provide implementations of machine learning algorithms as examples, in libraries and tools. The samples may be released under a permissive or open source license for you to learn from.

These sources are good to get ideas on how given machine learning algorithms can be translated into an executable and usable system.

Examples of implementation sources include:

Open source projects such as libraries and tools.
Posts on relevant machine learning blogs.
Technical reports prepared by graduate students or research labs.

Often, implementations on blog posts are provided for tutorial and understanding purposes and may not be written for speed or scalability. Open source algorithm implementations you find in libraries and tools are often highly optimized and are not written for readability.

Research is Not Just For Academics

You can research machine learning algorithms. Do not be scared off by the formal academic language and medium of papers and articles.

You do not need to be a PhD researcher or a machine learning algorithm expert.

You can read the papers, books and algorithm implementations just as well as anyone.

Often the problem with a difficult-to-read paper lies with the author and not with you, the reader. It is very hard to write a good technical treatment of an algorithm or a piece of research, and those good sources are gems when you find them.

Action Steps

In this post you discovered the importance of researching machine learning algorithms and 5 sources that you can use to find the information you need on machine learning algorithms.

The next step is to practice your newfound skills.

Select an algorithm that you want to research.
Consider what you want to know about the algorithm and select the sources that can best answer your questions from the list above.
Systematically research the algorithm. Start with Google Scholar and type in the algorithm name if you are looking for papers. Start with a Google search of GitHub and type in the algorithm name if you are looking for algorithm implementations.
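If you research many algorithms, the search step above can be scripted. The following is a small sketch, not an official client for either service: the function name research_urls is hypothetical, and the URL patterns (a q parameter on Google Scholar, and a Google web search restricted with site:github.com) are assumptions based on how those search pages are commonly addressed.

```python
from urllib.parse import urlencode

def research_urls(algorithm_name):
    """Build starting-point search URLs for one algorithm.

    Hypothetical helper; the URL patterns are assumptions, not a documented API.
    """
    query = algorithm_name.strip()
    return {
        # Papers: Google Scholar search on the algorithm name.
        "papers": "https://scholar.google.com/scholar?" + urlencode({"q": query}),
        # Implementations: Google search limited to GitHub pages.
        "implementations": "https://www.google.com/search?"
        + urlencode({"q": query + " site:github.com"}),
    }

for purpose, url in research_urls("random forest").items():
    print(purpose, url)
```

You can paste the printed URLs into a browser, or keep them in your notes alongside the questions you want each source to answer.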

Share what you learn.

32 Responses to How to Research a Machine Learning Algorithm

Ming Huang October 20, 2015 at 1:17 pm #

The five resources are very useful, and they cover how to learn ML theory, application and implementation.

Thanks a lot!

- Jason Brownlee December 23, 2015 at 9:36 am #
  
  Thanks Ming, you’re very welcome!
  
- Sarfaraz December 20, 2020 at 5:47 pm #
  
  Thanks Ming, your website is very informative and I appreciate your hard work.
  
  - Jason Brownlee December 21, 2020 at 6:36 am #
    
    You’re welcome.
    
Vikram Bajaj June 27, 2016 at 5:31 pm #

Fantastic step-by-step approach! Very informative 🙂
Thank you!

- Jason Brownlee June 28, 2016 at 6:11 am #
  
  I’m glad you found it useful, Vikram.
  
Rick Shah November 3, 2016 at 7:06 am #

So what do you do if you don’t have access to the paper you need due to paywalls…?

- Jason Brownlee November 3, 2016 at 8:04 am #
  
  Great question Rick.
  
  Here are my 5 off-the-cuff tactics for getting any paper:
  
  – Search for the home pages of the authors. Many authors put a pre-print on their home page.
  – Search arXiv, for the same reason as above.
  – Search Google Scholar, as it can often find the pre-prints for you. Click the button “All n versions” to see everything Google can see; often there is a PS or PDF in there.
  – Email the authors. They will always send you a copy – they just want to be read.
  – Email researchers in the field. They will often have access through the paywall or have the paper already.
  
  - Shefeek April 30, 2019 at 10:38 am #
    
    Hi Jason,
    
    Thanks for breaking it down so clearly. What’s your take on paperswithcode.com?
    
    - Jason Brownlee April 30, 2019 at 2:27 pm #
      
      I’m not familiar with it sorry, what is it exactly?
      
Tameru December 6, 2016 at 7:48 am #

Hi Jason, all your posts are really interesting. What I like about your tutorials is that they are easy to understand. Thank you! I appreciate it.

- Jason Brownlee December 6, 2016 at 9:54 am #
  
  Thanks Tameru!
  
  - Suleman Khan January 17, 2018 at 9:47 pm #
    
    Great Sir,
    Very very helpful and great for those who want to learn from scratch.
    Once again, a great contribution.
    
    - Jason Brownlee January 18, 2018 at 10:08 am #
      
      Thanks, I’m glad to hear that.
      
Monique January 27, 2018 at 5:31 am #

Thanks for the post!
I have a question. When do you feel that what you have found is enough? I mean, I start browsing and then I have several tabs open, and sometimes I struggle to filter and process all the content. There are so many resources out there (which is great), but it is difficult to know how much you should read. Can I ask how you approach this?
Thanks,
Monique

- Jason Brownlee January 27, 2018 at 6:00 am #
  
  I often have a reading stage and a writing stage. I stop reading when I feel like I am not getting anything new.
  
  Does that help?
  
HH February 7, 2018 at 11:18 am #

What are autoencoders in deep learning in general, and how does an autoencoder work, please?

- Jason Brownlee February 8, 2018 at 8:20 am #
  
  Good question, I hope to cover them in detail in the future.
  
Aniket Saxena February 8, 2018 at 12:52 am #

Hello Jason,

Have you ever made your own machine learning algorithm? If yes, how did you plan your path to get to that new machine learning algorithm?

Can you please give me an idea or refer me to some material? After reading some of the math behind these algorithms in “Elements of Statistical Learning”, I have also become interested in building my own ML algorithm.

I do know that it’s very hard, but some ideas would help me a lot.

- Jason Brownlee February 8, 2018 at 8:30 am #
  
  I did and I think it is very hard to do well. I don’t think I did it well. I do not think I can give effective advice sorry.
  
Benya Jamiu May 17, 2018 at 11:57 pm #

It’s nice reading your advice on how to master the algorithms, because I had been confused rather than convinced until I read your write-up. I’m on the way to choosing, but I’m a hungry lion who loves many good things. All algorithms are good, but implementing them all is time consuming.
Please, I tried to download the list you mentioned and I submitted my email through the box, but I received no reply with a link to download it. Kindly get back to me.

- Jason Brownlee May 18, 2018 at 6:25 am #
  
  I’m sorry to hear that. Perhaps check all of your email folders or try a different email address?
  
  If you still cannot find it, use the contact page to email me.
  
Bindeshwar Singh Kushwaha August 14, 2018 at 7:27 pm #

The article gives a very good overview of how to start studying and researching machine learning. Thank you Jason.

- Jason Brownlee August 15, 2018 at 5:58 am #
  
  Thanks.
  
Sandhya November 23, 2018 at 9:27 pm #

Very interesting and easy to understand articles, Jason. Thanks a lot. It’s a big help.
I have narrowed down and read a few research papers and decided to learn neural networks for my research. But it is not easy to get step-by-step information from research papers. What do you suggest?

- Jason Brownlee November 24, 2018 at 6:31 am #
  
  Perhaps start with a textbook, such as the “deep learning” book.
  
ahmet April 17, 2019 at 10:39 pm #

I’m a graduate student who has been working mainly on computer security for some time and on machine learning for 1.5 years. Most of the time it is better to read blogs (machinelearningmastery.com is best) and books than academic papers. Books can give you mathematical insight into well-known algorithms, while blogs and application conference papers can give you application-specific information on model construction. I don’t recommend mainly looking at journal papers to learn machine learning, because it can be overwhelming for new people like me.

- Jason Brownlee April 18, 2019 at 8:45 am #
  
  Thanks for sharing.
  
Omar Alsabbagh March 21, 2020 at 4:08 pm #

Thank you,
Great article as always

- Jason Brownlee March 22, 2020 at 6:51 am #
  
  Thanks!
  
Nawal May 24, 2020 at 4:34 am #

Jason, where will you put LSTM?

- Jason Brownlee May 24, 2020 at 6:16 am #
  
  Put where?
  