What is a Hypothesis in Machine Learning?

Supervised machine learning is often described as the problem of approximating a target function that maps inputs to outputs.

This process is often characterized as searching through a hypothesis space and evaluating candidate hypotheses.

The discussion of hypotheses in machine learning can be confusing for a beginner, especially when “hypothesis” has a distinct, but related meaning in statistics (e.g. statistical hypothesis testing) and more broadly in science (e.g. scientific hypothesis).

In this post, you will discover the difference between a hypothesis in science, in statistics, and in machine learning.

After reading this post, you will know:

  • A scientific hypothesis is a provisional explanation for observations that is falsifiable.
  • A statistical hypothesis is an explanation about the relationship between data populations that is interpreted probabilistically.
  • A machine learning hypothesis is a candidate model that approximates a target function for mapping inputs to outputs.

Let’s get started.

A Gentle Introduction to Hypotheses in Machine Learning
Photo by Bernd Thaller, some rights reserved.

Overview

This tutorial is divided into four parts; they are:

  1. What Is a Hypothesis?
  2. Hypothesis in Statistics
  3. Hypothesis in Machine Learning
  4. Review of Hypothesis

What Is a Hypothesis?

A hypothesis is an explanation for something.

It is a provisional idea, an educated guess that requires some evaluation.

A good hypothesis is testable; it can be either true or false.

In science, a hypothesis must be falsifiable, meaning that there exists a test whose outcome could mean that the hypothesis is not true. The hypothesis must also be framed before the outcome of the test is known.

… not any hypothesis will do. There is one fundamental condition that any hypothesis or system of hypotheses must satisfy if it is to be granted the status of a scientific law or theory. If it is to form part of science, an hypothesis must be falsifiable.

— Pages 61-62, What Is This Thing Called Science?, Third Edition, 1999.

A good hypothesis fits the evidence and can be used to make predictions about new observations or new situations.

The hypothesis that best fits the evidence and can be used to make predictions is called a theory, or is part of a theory.

  • Hypothesis in Science: Provisional explanation that fits the evidence and can be confirmed or disproved.

What Is a Hypothesis in Statistics?

Much of statistics is concerned with the relationship between observations.

Statistical hypothesis tests are techniques used to calculate a test statistic that quantifies an observed “effect.” The statistic can then be interpreted in order to determine how likely it is to observe an effect at least that large if a relationship does not exist.

If this likelihood (the p-value) is very small, then it suggests that the effect is probably real. If it is large, then we may have observed a statistical fluctuation, and the effect is probably not real.

For example, we may be interested in evaluating the relationship between the means of two samples, e.g. whether the samples were drawn from the same distribution or not, whether there is a difference between them.

One hypothesis is that there is no difference between the population means, based on the data samples.

This is a hypothesis of no effect, called the null hypothesis. We can use a statistical hypothesis test to either reject this hypothesis or fail to reject (retain) it. We don’t say “accept” because the outcome is probabilistic and could still be wrong, even if only with a low probability.

… we develop a hypothesis and establish a criterion that we will use when deciding whether to retain or reject our hypothesis. The primary hypothesis of interest in social science research is the null hypothesis

— Pages 64-65, Statistics In Plain English, Third Edition, 2010.

If the null hypothesis is rejected, then we assume the alternative hypothesis that there exists some difference between the means.

  • Null Hypothesis (H0): Suggests no effect.
  • Alternative Hypothesis (H1): Suggests some effect.

Statistical hypothesis tests don’t comment on the size of the effect, only the likelihood of the presence or absence of the effect in the population, based on the observed samples of data.
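As a rough sketch of the idea, the example below tests the null hypothesis that two samples share the same mean. It uses a permutation test rather than a classical test such as Student's t-test, simply because it needs nothing beyond the standard library. The samples, the seed, the 2,000 shuffles and the 0.05 threshold are all assumptions made for the demonstration.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(sample_a, sample_b, n_permutations=2000, seed=1):
    """Estimate a two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are interchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least_as_extreme + 1) / (n_permutations + 1)

# Made-up samples: the second is drawn from a distribution with a shifted mean.
data_rng = random.Random(1)
sample_1 = [data_rng.gauss(50, 5) for _ in range(100)]
sample_2 = [data_rng.gauss(55, 5) for _ in range(100)]

alpha = 0.05  # assumed significance threshold
effect, p_value = permutation_test(sample_1, sample_2)
print(f"observed difference in means: {effect:.2f}")
print(f"p-value: {p_value:.4f}")
if p_value < alpha:
    print("Reject H0: the samples probably have different means.")
else:
    print("Fail to reject H0: no evidence of a difference.")
```

Note that the p-value and the observed difference are reported separately: the first speaks to whether an effect is present, the second to how large it is.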

  • Hypothesis in Statistics: Probabilistic explanation about the presence of a relationship between observations.

What Is a Hypothesis in Machine Learning?

Machine learning, specifically supervised learning, can be described as the desire to use available data to learn a function that best maps inputs to outputs.

Technically, this is a problem called function approximation, where we are approximating an unknown target function (that we assume exists) that can best map inputs to outputs on all possible observations from the problem domain.

An example of a model that approximates the target function and performs mappings of inputs to outputs is called a hypothesis in machine learning.
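To make this concrete, here is a minimal sketch. The target function is written out explicitly only so that we can generate data from it; in a real problem it is unknown. The linear form, the noise level and the names target_function, fit_line and hypothesis are all assumptions for illustration.

```python
import random

def target_function(x):
    # Assumed for illustration only; in practice this function is unknown.
    return 2.0 * x + 1.0

# Observations from the problem domain: inputs with noisy outputs.
rng = random.Random(1)
inputs = [rng.uniform(0, 10) for _ in range(50)]
outputs = [target_function(x) + rng.gauss(0, 0.5) for x in inputs]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(inputs, outputs)

def hypothesis(x):
    # The learned model: one candidate approximation of the target function.
    return slope * x + intercept

print(f"hypothesis: y = {slope:.2f} * x + {intercept:.2f}")
print(f"target(4.0) = {target_function(4.0):.2f}, hypothesis(4.0) = {hypothesis(4.0):.2f}")
```

The fitted line is the hypothesis. It only approximates the target function because it was estimated from a finite, noisy sample.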

The choice of algorithm (e.g. neural network) and the configuration of the algorithm (e.g. network topology and hyperparameters) define the space of possible hypotheses that the model may represent.

Learning for a machine learning algorithm involves navigating the chosen space of hypotheses toward the best, or a good enough, hypothesis for approximating the target function.

Learning is a search through the space of possible hypotheses for one that will perform well, even on new examples beyond the training set.

— Page 695, Artificial Intelligence: A Modern Approach, Second Edition, 2009.

This framing of machine learning is common and helps to understand the choice of algorithm, the problem of learning and generalization, and even the bias-variance trade-off. For example, the training dataset is used to learn a hypothesis and the test dataset is used to evaluate it.

A common notation is used where lowercase-h (h) represents a given specific hypothesis and uppercase-h (H) represents the hypothesis space that is being searched.

  • h (hypothesis): A single hypothesis, e.g. an instance or specific candidate model that maps inputs to outputs and can be evaluated and used to make predictions.
  • H (hypothesis set): A space of possible hypotheses for mapping inputs to outputs that can be searched, often constrained by the choice of the framing of the problem, the choice of model and the choice of model configuration.
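The notation can be sketched in a few lines of code. Here H is a deliberately tiny hypothesis space of lines through the origin, one per candidate weight, and learning is a brute-force search for the h with the lowest training error. The training data, the weight grid and the helper names are made up for this example.

```python
# Training data (made up for illustration): outputs are roughly 3 * input.
training_data = [(1.0, 3.1), (2.0, 5.9), (3.0, 9.2), (4.0, 11.8)]

def make_hypothesis(weight):
    """Return one specific hypothesis h(x) = weight * x."""
    def h(x):
        return weight * x
    return h

# H: the hypothesis space, one candidate per weight in 0.0, 0.5, ..., 5.0.
candidate_weights = [i * 0.5 for i in range(11)]
hypothesis_space = {w: make_hypothesis(w) for w in candidate_weights}

def mean_squared_error(h, data):
    return sum((h(x) - y) ** 2 for x, y in data) / len(data)

# Learning as search: evaluate every h in H and keep the best one.
best_weight = min(
    hypothesis_space,
    key=lambda w: mean_squared_error(hypothesis_space[w], training_data),
)
best_h = hypothesis_space[best_weight]

print(f"size of H: {len(hypothesis_space)}")
print(f"best h: y = {best_weight} * x")
print(f"prediction for x = 2.0: {best_h(2.0)}")
```

Real algorithms search far larger spaces and rarely enumerate them, but the roles of h and H are the same.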

The choice of algorithm and algorithm configuration involves choosing a hypothesis space that is believed to contain a hypothesis that is a good or best approximation for the target function. This is very challenging, and it is often more efficient to spot-check a range of different hypothesis spaces.

We say that a learning problem is realizable if the hypothesis space contains the true function. Unfortunately, we cannot always tell whether a given learning problem is realizable, because the true function is not known.

— Page 697, Artificial Intelligence: A Modern Approach, Second Edition, 2009.

Finding a good hypothesis is a hard problem, so we choose to constrain the hypothesis space, both in terms of its size and in terms of the complexity of the hypotheses that are evaluated, in order to make the search process tractable.

There is a tradeoff between the expressiveness of a hypothesis space and the complexity of finding a good hypothesis within that space.

— Page 697, Artificial Intelligence: A Modern Approach, Second Edition, 2009.
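A quick calculation shows why constraining the space matters. The sketch below counts the hypotheses in two spaces over n Boolean inputs: every possible Boolean function, and a restricted space of pure conjunctions in which each input appears as-is, negated, or not at all. The restricted space is an assumption chosen for illustration, not something prescribed above.

```python
def count_boolean_functions(n_inputs):
    # A truth table has 2**n rows and each row can output 0 or 1.
    return 2 ** (2 ** n_inputs)

def count_conjunctions(n_inputs):
    # Each input is used as-is, negated, or left out of the conjunction.
    return 3 ** n_inputs

print("inputs  all_boolean_functions  conjunctions_only")
for n in range(1, 7):
    print(f"{n:>6}  {count_boolean_functions(n):>21}  {count_conjunctions(n):>17}")
```

The unrestricted space can represent any mapping but grows doubly exponentially with the number of inputs, while the restricted space stays far smaller at the cost of being unable to express many functions.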

  • Hypothesis in Machine Learning: Candidate model that approximates a target function for mapping examples of inputs to outputs.

Review of Hypothesis

We can summarize the three definitions again as follows:

  • Hypothesis in Science: Provisional explanation that fits the evidence and can be confirmed or disproved.
  • Hypothesis in Statistics: Probabilistic explanation about the presence of a relationship between observations.
  • Hypothesis in Machine Learning: Candidate model that approximates a target function for mapping examples of inputs to outputs.

We can see that a hypothesis in machine learning draws upon the definition of a hypothesis more broadly in science.

Just like a hypothesis in science is an explanation that covers available evidence, is falsifiable and can be used to make predictions about new situations in the future, a hypothesis in machine learning has similar properties.

A hypothesis in machine learning:

  1. Covers the available evidence: the training dataset.
  2. Is falsifiable (kind of): a test harness is devised beforehand and used to estimate performance and compare it to a baseline model to see if it is skillful or not.
  3. Can be used in new situations: make predictions on new data.
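The three properties above can be sketched with a small test harness. The dataset, the 150/50 split, the mean-predicting baseline and the use of mean absolute error are all assumptions for the demonstration; a real project would choose these to suit the problem.

```python
import random

rng = random.Random(7)

# Hypothetical dataset: y = 3x + 2 plus Gaussian noise (assumed for the demo).
xs = [rng.uniform(0, 10) for _ in range(200)]
data = [(x, 3.0 * x + 2.0 + rng.gauss(0, 1.0)) for x in xs]

# The test harness is fixed before any model is fit: 150 train rows, 50 test rows.
train, test = data[:150], data[150:]

def fit_line(rows):
    """Least-squares line fit on the training rows; returns a hypothesis."""
    mean_x = sum(x for x, _ in rows) / len(rows)
    mean_y = sum(y for _, y in rows) / len(rows)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    variance = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_baseline(rows):
    """Naive baseline: always predict the mean training output."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y

def mean_absolute_error(h, rows):
    return sum(abs(h(x) - y) for x, y in rows) / len(rows)

hypothesis = fit_line(train)   # 1. covers the available evidence
baseline = fit_baseline(train)

# 2. Falsifiable (kind of): the hypothesis is skillful only if it beats the baseline.
model_mae = mean_absolute_error(hypothesis, test)
baseline_mae = mean_absolute_error(baseline, test)
is_skillful = model_mae < baseline_mae

print(f"baseline MAE: {baseline_mae:.2f}")
print(f"hypothesis MAE: {model_mae:.2f}")
print(f"skillful: {is_skillful}")

# 3. Can be used in new situations: predict for an unseen input.
print(f"prediction for x = 12.0: {hypothesis(12.0):.2f}")
```

If the hypothesis had failed to beat the baseline on the held-out data, that would be evidence against it, which is the sense in which it is falsifiable.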

Did this post clear up your questions about what a hypothesis is in machine learning?
Let me know in the comments below.

Summary

In this post, you discovered the difference between a hypothesis in science, in statistics, and in machine learning.

Specifically, you learned:

  • A scientific hypothesis is a provisional explanation for observations that is falsifiable.
  • A statistical hypothesis is an explanation about the relationship between data populations that is interpreted probabilistically.
  • A machine learning hypothesis is a candidate model that approximates a target function for mapping inputs to outputs.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

46 Responses to What is a Hypothesis in Machine Learning?

  1. Tebogo Mogaleemang, March 4, 2019 at 8:57 am

    Thanks for sharing, well explained.

  2. Jac, March 5, 2019 at 9:27 pm

    Many thanks, helped a lot.

  3. Igor Matutinovic, March 7, 2019 at 7:30 pm

    Excellent explanation – thank you!

  4. Chris Fleshner, March 8, 2019 at 6:20 am

    Once a candidate model has been proven adequate, doesn’t the usage of that candidate yield probabilistic results? Just wanted to be sure I’m not confusing the statistical hypothesis with the Machine Learning definition.

    • Jason Brownlee, March 8, 2019 at 8:01 am

      Not sure I follow, can you elaborate?

      The candidate model is reliable until the assumptions of the model/hypothesis vary, e.g. the distribution of the data changes.

  5. Nagdev Amruthnath, March 8, 2019 at 6:21 am

    Very good article explaining the different types of hypothesis. Clicking that share button now!

  6. Urlish, March 8, 2019 at 6:34 am

    Thanks

    • Jason Brownlee, March 8, 2019 at 8:02 am

      I’m glad it helped.

      • izrahayu Che Hashim, March 8, 2019 at 8:52 am

        Thanks..helped a lot

    • Partha S Nayak, April 10, 2019 at 7:41 am

      The most clean and clear explanation of hypothesis I came across. The single word “effect” has got all in statistical hypothesis, so nicely presented by Dr Jason

  7. Bob, March 8, 2019 at 10:01 am

    Clear and concise. Great work!

  8. Rao, March 8, 2019 at 4:23 pm

    Nice explanation

  9. Renan Macedo, March 9, 2019 at 12:09 pm

    Fantastic explanation! This is very important to apply the results in a real world,

  10. Divya mannemoni, August 15, 2019 at 12:18 am

    Why are we restricting the hypothesis space in machine learning?

    • Jason Brownlee, August 15, 2019 at 8:11 am

      To speed up the search/fit and actually get a model. Otherwise the search space is practically infinite.

  11. Hannes, November 9, 2019 at 4:17 pm

    Thank you chason!
    It helped a lot!
    Could you further explain the concept of a specific and general hypothesis please?

  12. Nuhil, January 24, 2020 at 8:13 am

    Hi Jason,
    Thanks for the article. Could you please elaborate in the scope of Statistical Hypothesis, if we want to know whether “Two Data Samples” are from the same “Distribution/Population” or not – then should we perform hypothesis tests (e.g. P-value) between:

    A. Sample 1 with Population/Distribution 1 – Then observe the likelihood
    B. Sample 1 with Sample 2 – Then observe the likelihood
    C. Sample 1 with Population/Distribution 1 (Population) AND Sample 2 with Population/Distribution 1 (Population) – Then observe the likelihood

  13. Mahmoud Abbasi, April 30, 2020 at 4:17 pm

    Your website is great! Thanks for the useful posts!

  14. Chuks, June 15, 2020 at 4:14 pm

    Your explanation is great. It shed more light on the subject. Thanks a lot

  15. Ramesh Ravula, June 15, 2020 at 7:40 pm

    Never thought about it.Thanks for the explanation.

  16. Sharan Salian, July 11, 2020 at 11:41 pm

    A beautifully crafted explanation introducing all co-related disciplines. Loved how passages from different books are used a better balance of depth and simplicity. Thank you sir.

  17. Shubham Goel, August 29, 2020 at 4:18 am

    Thank you for such a wonderful and explanatory article.

  18. Yusto M. Yustas, March 23, 2021 at 7:53 pm

    Thank you very much for sharing this valuable information

  19. JG, July 3, 2021 at 5:02 pm

    Great explanation Jason !

    about the hypothesis infinitum space search, hypothesis evidence fits on current data explanation , falsifiable test about hypothesis true (evaluation on new data), and finally hypothesis as a way to construct a theory or model or ideas corpus to makes new predictions !!

    very abstracts inspiring ideas!

    Well done!

    Science and machine learning seem to fit pretty well this scenario, but statically hypothesis seems something different!

    Why do you restrict machine learning hypothesis description to supervised learning ?

    Thanks

    • Jason Brownlee, July 4, 2021 at 5:59 am

      Thanks!

      I prefer supervised learning, it’s perhaps more useful in “business”.

  20. Sarah, July 20, 2022 at 3:45 am

    Wow I was reading a paper and after reading your article realized i was trying to apply the ‘scientific’ hypothesis to the paper when in reality i needed to apply the machine learning hypothesis to understand. You really helped me understand the concept of hypothesis and model in the machine learning space. So much makes sense now. Thank You!!

    • James Carmichael, July 20, 2022 at 9:07 am

      Thank you for the great feedback and support Sarah!
