What is Machine Learning?

You’re interested in Machine Learning and maybe you dabble in it a little.

If you talk about Machine Learning with a friend or colleague one day, you run the risk of someone actually asking you:

So, what is machine learning?

The goal of this post is to give you a few definitions to think about and a handy one-liner definition that is easy to remember.

We will start out by getting a feeling for the standard definitions of Machine Learning taken from authoritative textbooks in the field. We’ll finish up by working out a developers definition of machine learning and a handy one-liner that we can use anytime we’re asked: What is Machine Learning?

Authoritative Definitions

Let’s start out by looking at four textbooks on Machine Learning that are commonly used in university-level courses.

These are our authoritative definitions and lay our foundation for deeper thought on the subject.

I chose these four definitions to highlight some useful and varied perspectives on the field. Through experience, we’ll learn that the field really is a mess of methods and choosing a perspective is key to making progress.

Mitchell’s Machine Learning

Tom Mitchell in his book Machine Learning provides a definition in the opening line of the preface:

The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience.

I like this short and sweet definition and it is the basis for the developers definition we come up with at the end of the post.

Note the mention of “computer programs” and the reference to “automated improvement“. Write programs that improve themselves, it’s provocative!

In his introduction he provides a short formalism that you’ll see much repeated:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Don’t let the definition of terms scare you off, this is a very useful formalism.

We can use this formalism as a template and put E, T, and P at the top of columns in a table and list out complex problems with less ambiguity. It could be used as a design tool to help us think clearly about what data to collect (E), what decisions the software needs to make (T) and how we will evaluate its results (P). This power is why it is oft repeated as a standard definition. Keep it in your back pocket.

Elements of Statistical Learning

The Elements of Statistical Learning: Data Mining, Inference, and Prediction was written by three Stanford statisticians and self-described as a statistical framework to organize their field of inquiry.

In the preface is written:

Vast amounts of data are being generated in many fields, and the statisticians’s job is to make sense of it all: to extract important patterns and trends, and to understand “what the data says”. We call this learning from data.

I understand the job of a statistician is to use the tools of statistics to interpret data in the context of the domain. The authors seem to include all of the field of Machine Learning as aids in that pursuit. Interestingly, they chose to include “Data Mining” in the subtitle of the book.

Statisticians learn from data, but software does too and we learn from the things that the software learns. From the decisions made and the results achieved by various machine learning methods.

Pattern Recognition

Bishop in the preface of his book Pattern Recognition and Machine Learning comments:

Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science. However, these activities can be viewed as two facets of the same field…

Reading this, you get the impression that Bishop came at the field from an engineering perspective and later learned and leveraged the Computer Science take on the same methods. Pattern recognition is an engineering or signal processing term.

This is a mature approach and one we should emulate. More broadly, regardless of the field that lays claim to a method, if it suits our needs by getting us closer to an insight or a result by “learning from data”, then we can decide to call it machine learning.

An Algorithmic Perspective

Marsland provides adopts the Mitchell definition of Machine Learning in his book Machine Learning: An Algorithmic Perspective.

He provides a cogent note in his prologue that motivates his writing the book:

One of the most interesting features of machine learning is that it lies on the boundary of several different academic disciplines, principally computer science, statistics, mathematics, and engineering. …machine learning is usually studied as part of artificial intelligence, which puts it firmly into computer science …understanding why these algorithms work requires a certain amount of statistical and mathematical sophistication that is often missing from computer science undergraduates.

This is insightful and instructive.

Firstly, he underscores the multidisciplinary nature of the field. We were getting a feeling for that from the above definition, but he draws a big red underline for us. Machine Learning draws from all manner of information sciences.

Secondly, he underscores the danger of sticking to a given perspective too tightly. Specifically, the case of a the algorithmist who shies away from the mathematical inner workings of a method.

No doubt, the counter case of the statistician that shies away from the practical concerns of implementation and deployment is just as limiting.

Venn Diagram

Drew Conway created a nice Venn Diagram in September 2010 that might help.

In his explanation, he comments: Machine Learning = Hacking + Math & Statistics

Data Science Venn Diagram

Data Science Venn Diagram. Credited to Drew Conway,  Creative Commons licensed as Attribution-NonCommercial.

He also describes the Danger Zone as Hacking Skills + Expertise.

Here, he is referring to those people that know enough to be dangerous. They can access and structure data, they know the domain and they can run a method and present results, but don’t understand what the results mean. I think this is what Marsland may have been hinting at.

Developers Definition of Machine Learning

We now turn to the need to break all of this down to nuts and bolts for us developers.

We first look at complex problems that resist our decomposition and procedural solutions. This frames the power of machine learning. We then work out a definition that sits well with us developers that we can use whenever we’re asked, “So, What is Machine Learning?” by other developers.

Complex Problems

As a developer, you will eventually encounter classes of problems that stubbornly resist a logical and procedural solution.

What I mean is, there are classes of problems where it is not feasible or cost-effective to sit down and write out all the if statements needed to solve the problem.

Sacrilege!” I hear your developer’s brain shout.

It’s true.

Take the every-day case of the decision problem of discriminating spam email from non-spam email. This is an example used all the time when introducing machine learning. How would you write a program to filter emails as they come into your email account and decide to put them in the spam folder or the inbox folder?

You’d probably start out by collecting some examples and having a look at them and a deep think about them. You’d look for patterns in the emails that are spam and those that are not. You’d think about abstracting those patterns so that your heuristics would work with new cases in the future. You’d ignore odd emails that will never be seen again. You’d go for easy wins to get your accuracy up and craft special things for the edge cases. You’d review the email frequently over time and think about abstracting new patterns to improve the decision making.

There’s a machine learning algorithm in there, amongst all that, except it was executed by you the programmer rather than the computer. This manually derived hardcoded system would only be as good as the programmer’s ability to extract rules from the data and implement them in the program.

It could be done, but it would take a lot of resources and be a maintenance nightmare.

Machine Learning

In the example above, I’m sure your developer brain, that part of your brain that ruthlessly seeks to automate, could see the opportunity for automating and optimizing the meta-process of extracting patterns from examples.

Machine learning methods are this automated process.

In our spam/non-spam example, the examples (E) are emails we have collected. The task (T) was a decision problem (called classification) of marking each email as spam or not, and putting it in the correct folder. Our performance measure (P) would be something like accuracy as a percentage (correct decisions divided by total decisions made multiplied by 100) between 0% (worst) and 100% (best).

Preparing a decision making program like this is typically called training, where collected examples are called the training set and the program is referred to as a model, as in a model of the problem of classifying spam from non-spam. As developers, we like this terminology, a model has state and needs to be persisted, training is a process that is performed once and is maybe rerun as needed, classification is the task performed. It all makes sense to us.

We can see that some of the terminology used in the above definitions does not sit well for programmers. Technically, all the programs we write are automations, commenting that machine learning automatically learns is not meaningful.

Handy One-liner

So, let’s see if we can use these pieces and construct a developers definition of machine learning. How about:

Machine Learning is the training of a model from data that generalizes a decision against a performance measure.

Training a model suggests training examples. A model suggests state acquired through experience. Generalizes a decision suggests the capability to make a decision based on inputs and anticipating unseen inputs in the future for which a decision will be required. Finally, against a performance measure suggests a targeted need and directed quality to the model being prepared.

I’m no poet, can you come up with a more accurate or more succinct developers definition of Machine Learning?
Share your definition in the comments below.

Further Reading

I’ve linked to resources throughout this post, but I have listed some useful resources below if you thirst for more reading.

Books

The following are the four textbooks from which definitions were drawn.

Also, Drew Conway has a book in collaboration with John Myles White that is practical and fun to read titled Machine Learning for Hackers.

Question and Answer Sites

There are some interesting discussions on Q&A websites about what exactly machine learning is, below are some picks.

I’ve thought hard about all of this, and my definition is coloured by the books I’ve read and the experiences I’ve had. Let me know if it’s useful.

Leave a comment and let us all know how you understand the field. What is Machine Learning to you? Do you know of any further resources we could fall back to?
Let me know in the comments below.

70 Responses to What is Machine Learning?

  1. Avatar
    Vikash January 2, 2014 at 12:50 pm #

    Thanks for collecting the quotes and coming up with your own in the process. Insightful and enjoyed it. Thanks for posting.

    • Avatar
      James Carmichael January 7, 2022 at 7:55 am #

      You very welcome Vikash!

  2. Avatar
    qnaguru February 17, 2014 at 12:34 am #

    Wonderful introduction to Machine Learning – Programmers get that!

  3. Avatar
    hugo June 30, 2015 at 1:41 am #

    Muito bom ter pessoas que gostem de compartilhar conhecimento.Obrigado

  4. Avatar
    hugo June 30, 2015 at 1:50 am #

    Só mais uma coisa.Qual a relação entre aprendizado de máquina e estatística?

    • Avatar
      Lucas Lopes January 19, 2018 at 10:33 am #

      Como descrito na parte do diagrama de Venn, é possível desenvolver aplicações que aprendem através de dados, sem necessariamente dominar estatística. Mas as chances de criar um modelo que não extrai a real informação dos dados, é alta.

      • Avatar
        Jason Brownlee January 20, 2018 at 8:15 am #

        Not sure I agree.

        I think you can drive a car without understanding how an engine works, or solve a business problem with code without understanding the theory of computation.

        • Avatar
          steveK November 21, 2019 at 5:43 am #

          True, but sometimes understanding helps. I “hacked” my Kia Sorento to get incredible mileage over hilly terrain. It was the last two of eight hours of driving that day to get my daughter to a writer’s retreat. I drove the plateau and slight up hills at 45 mph, a bit below the speed limit. When I got near where to go downhill, I slowed down and used the clutchless shift to go down. At one point, a truck was behind me, and that was just where there was a second driving lane, so I let him by. Otherwise, I found that by downshifting on the hills, I would approach the curves at a more reasonable speed, without braking, and I wouldn’t overcompensate on the curve. I got 26.5 mpg on an AWD that is rated for 24 highway. Part of my success was gravity, but using engine friction was part of my success! So, that’s not a lot of understanding of the engine, but I did understand the system.

          • Avatar
            Jason Brownlee November 21, 2019 at 6:13 am #

            Yes, but you did not need that understanding to get started, or even drive across country.

            It’s a great next step once the basics are covered and value is being delivered.

  5. Avatar
    Idrees April 2, 2016 at 1:29 am #

    The stress of PhD research and research assistantship had taken me off- guard. I took my time ti go through your post today. Honestly this piece lighten my head to gift of knowledge. Thank Dr Jason.

  6. Avatar
    AnalyticAscent May 7, 2016 at 3:51 pm #

    This is good to know, been struggling to explain to my family what my career path is in terms they can understand 🙂

  7. Avatar
    Yap May 11, 2016 at 8:51 pm #

    Here is mine handy one-liner: ML is a decision problem that needs to be explored from data against a measure outcome.

  8. Avatar
    Jim Kitzmiller July 19, 2016 at 4:50 am #

    Machine learning is the art and science of creating computer software that gets more accurate results after being used repeatedly.

  9. Avatar
    Yasmina October 20, 2016 at 8:04 am #

    it’s very well explain Dr Jason, I work on prediction of crbonation depth of concrete by using artificial neural natwork, Ive prepared algorithm with collaboration of my collegue, but computer programming is not easy for civil engineer. I’ll read your course in future, thanks

  10. Avatar
    Van-Hau Nguyen November 7, 2016 at 5:17 pm #

    Thanks for the great posts. You have been doing a lot of interesting works.
    How could you keep your enthusiasm in ML?

    • Avatar
      Jason Brownlee November 8, 2016 at 9:51 am #

      Thanks Van-Hau Nguyen.

      It keeps me interested because every day is a new challenge. There are no “right” answers and there are always more things to learn and improve upon.

      Also, I love to help beginners get started and see how easy it is to apply.

  11. Avatar
    Julien March 22, 2017 at 1:22 pm #

    This great! As a non-programmer, my one-liner might be something like: Machine Learning is using data to create a model and then use that model to make predictions.

    Am I understanding it correctly or is there something missing in my one-liner?

  12. Avatar
    Gokul Krishnaraj March 28, 2017 at 11:08 pm #

    Solid ML material for references, started with a search on predictive analytics query and I am here motivated to look all the way to the 101 of your posts. Great work, Thanks.

  13. Avatar
    Bolanle April 14, 2017 at 12:12 am #

    my one-liner will be making a better prediction by using computer algorithms to train data for maximum accuracy.

  14. Avatar
    Avornyo Charles May 23, 2017 at 5:54 am #

    Thank for very much! I love it.
    I will like to know if you need to be a very good python programming in other to use machine learning techniques.
    I realize, when I was doing my Masters thesis. My supervise asked to implement a model my self and I needed to modify a package in python to make the model work. However, I struggle allot because of the fact that, I am not good at object oriented programming in python.
    I like to do a PhD in Machine Learning applied to Health. However, I am current thinking if I will survive because of my mediocre programming skills.
    What do you think I should do?

    • Avatar
      Jason Brownlee May 23, 2017 at 8:00 am #

      I do not think you need to be an excellent programmer to be able to deliver useful and valuable results with applied machine learning.

      See this post:
      https://machinelearningmastery.com/machine-learning-for-programmers/

      Generally, I would recommend you focus on learning how to get good at working through predictive modeling problems end to end and delivering a result using libraries like sklearn and tools like Weka.

    • Avatar
      Rafael Espericueta August 23, 2017 at 8:53 pm #

      Practice your Python programming skills, and they will improve.
      And it’s even fun! 🙂

  15. Avatar
    Phillip June 21, 2017 at 6:27 am #

    Thank you Jason,
    I’m just starting my journey in ML and your articles are very enlightening and easy to understand.

  16. Avatar
    Ankit Mistry August 23, 2017 at 12:30 pm #

    Checkout this video on layman understanding of machine learning :

    https://www.youtube.com/watch?v=RaDFiMd-Amg

  17. Avatar
    Gaurav Jain December 8, 2017 at 3:14 am #

    Thanks for such a beautiful blog! 🙂

  18. Avatar
    Jesús December 13, 2017 at 2:42 am #

    What a great article, Jason! Congratulations!

  19. Avatar
    Ayyappan February 6, 2018 at 7:30 am #

    Some time i think in the traditional programing, for a developers “program” is the primary focus but but in Machine learning program the focus shift to data.
    A program work like the same no matter how much / long data is tunneled through, but the machine learning programs becomes smarter as much/long we tunnel data through.

  20. Avatar
    Pravin February 27, 2018 at 12:46 am #

    Thanks for the lovely introduction!

  21. Avatar
    Shravan May 1, 2018 at 11:53 am #

    Wonderful kick start to understand machine learning covering lot of material. I think with experience it makes more sense of what is all said.

  22. Avatar
    Christopher July 3, 2018 at 4:46 am #

    Very nice piece!!!

    You asked for feedback:

    Your final definition seems to me to lack any reference to computers or programming. Though implied for those who think in computer jargon, I’d add it in: “Machine Learning is computer training of a model from data that generalizes a decision against a performance measure.”

    Then, I’d ask, is this limited to labeled data? A pre-existing model? If not, wouldn’t it be, “training TO a model”? I think that would include both cases.??

    Finally, in Conway’s Venn Diagram, I see no mention of ethics. This may be the biggest danger for mechanics (developers) and for the future, and deserves mention in every article on AI. IMHO Thanks!

  23. Avatar
    Jesus Suniaga July 12, 2018 at 5:27 am #

    My concept is that machine learning is a field to create useful mathematical models from data. (the word useful involves measure )

  24. Avatar
    Efstathios Chatzikyriakidis December 10, 2018 at 8:29 pm #

    “Machine Learning (ML) is the science of learning from seen data in order to create models that will either extract from it hidden information or recognize, generate and predict unseen data.”

    How do you think?

    It leads to supervised / unsupervised methods, discriminative / generative models, classification / regression tasks.

  25. Avatar
    Sridhar February 1, 2019 at 10:54 pm #

    Thank You, Jason. I am new to Data Science and was looking for a better way to learn Machine Learning. Your blog is nice and actually has shown me a well-structured way to start looking into ML.

    My definition is, Machine Learning is the science of generalizing a model based on the data available and used that model to predict future patterns.

    What do you think?

  26. Avatar
    Rushil Patel May 24, 2019 at 7:15 am #

    Hi, read your machine blog. It’s really nice.
    I have also started writing a blog on machine learning. Can you please check it and give feedback so I can improve.
    link: https://learn-ml.com/

  27. Avatar
    Debesh Choudhury September 4, 2019 at 12:31 pm #

    Simple and very good introduction of ML indeed .. I like Jim Kitzmiller’s comment – “Machine learning is the art and science of creating computer software that gets more accurate results after being used repeatedly”.

  28. Avatar
    Abdul-Aziz January 6, 2020 at 3:39 am #

    I really discover how fascinating Machine learning is , After reading your blog.
    Thanks for this amazing introduction.

  29. Avatar
    Sumit Das April 22, 2020 at 3:33 pm #

    Great Post

  30. Avatar
    Prabuddha Dissanyakae May 11, 2020 at 11:21 am #

    Great stuff, I just wonder your thinking towards the others, its amazing.

  31. Avatar
    Al_Hasib November 9, 2020 at 4:01 pm #

    Well write Brother…

  32. Avatar
    Fernando Augusto Deheza Zambrana February 26, 2021 at 1:43 pm #

    Las definiciones, como aproximaciones relativas, ofrecen puntos de vista diversos que constituyen información importante para el propósito de construir un modelo conceptual multidimensional. Gracias por compartir.

  33. Avatar
    Edy Gianez Silva June 12, 2022 at 7:52 pm #

    Hi, Jason! How are you?

    It’s been two years since i decided to mess with data science and only now that i find your lessons i think that i find a way to go through that

    So, probably you’ll see me quite a lot here 🙂

    My definition of ML is:

    “ML is when we use a set of resources (computational and theoric) to build a tool that help us to make decisions in face of complex problems”

    I try to address the following concerns with this definition:

    Complex problems
    – the reason why we need a machine
    – sets the boundaries of the learning space

    Computational resources
    – We basically (and roughly speaking) use computers to store and manipulate data. When trying to make a a computer learn, we elevate these fundamental things to the higher power
    – store data ->big data -> distribuited storage, etc.
    – manipulate data -> use of gpu, parallel processing, quantic processing, etc
    – better algorithms to handle computations

    Theoric resources
    – Reveals the multidisciplinarity of the field: math, stats, information theory, software eng, all guys comes to help us to learn from data

    Build a tool
    – Machine learning results in a model
    – A model is made by a representation of the problem we’re handling and that will be evaluate and optimize (i’m thinking in Palacio’s framework), once it’s an approximate picture of the real problem-world
    – We can use severel techniques to figure out what is the hidden structure that lays behind the data set (our problem-world distillied in numbers and texts and images, etc).
    – We can call these techniques learning algorithms, objective functions and optimization functions

    Make a decision
    – the goal of all this is to infer a response that guide us thorugh a better choice when a new and complex situation is presented to us
    – our model will help us to make that generalization from a known situation to this new one

    Sorry to take so much space,

    Best regards

    Edy

    • Avatar
      James Carmichael June 13, 2022 at 11:23 am #

      Hi Edy…please narrow your query to a single question so that we may better assist you.

      • Avatar
        Edy June 28, 2022 at 4:20 am #

        Hi, James! How are you?

        I was thinking about this definition of ML:

        “ML is when we use a set of resources (computational and theoric) to build a tool that help us to make decisions in face of complex problems”

        Does it make sense?

        Thanks

  34. Avatar
    Edy Gianez Silva June 13, 2022 at 4:13 pm #

    Hello,

    So, i was wondering with this definition on ML is ok:

    “ML is when we use a set of resources (computational and theoric) to build a tool that help us to make decisions in face of complex problems”

    Thanks

Leave a Reply