Controlled Experiments in Machine Learning

Systematic experimentation is a key part of applied machine learning.

Machine learning methods are complex and resist formal analysis. Therefore, we must learn about the behavior of algorithms on our specific problems empirically. We do this using controlled experiments.

In this tutorial, you will discover the important role that controlled experiments play in applied machine learning.

After completing this tutorial, you will know:

  • The need for systematic discovery via controlled experiments.
  • The need to repeat experiments in order to control for the sources of variance.
  • Examples of experiments performed in machine learning and the challenge and opportunity they represent.

Let’s get started.

Tutorial Overview

This tutorial is divided into three parts; they are:

  1. Systematic Experimentation
  2. Controlling For Variance
  3. Experiments in Machine Learning

Systematic Experimentation

In applied machine learning, you must become the scientist and perform systematic experiments.

The answers to questions that you care about, such as what algorithm works best on your data or which input features to use, can only be found through the results of experimental trials.

This is due mainly to the fact that machine learning methods are complex and resist formal methods of analysis.

[…] many learning algorithms are too complex for formal analysis, at least at the level of generality assumed by most theoretical treatments. As a result, empirical studies of the behavior of machine learning algorithms must retain a central role.

The Experimental Study of Machine Learning, 1991.

In statistics, the choice of a type of experiment is called experimental design, and there are many types of experiments to choose from. For example, you may have heard of the randomized double-blind placebo-controlled experiment as the gold standard for evaluating the effectiveness of medical treatments.

Applied machine learning is special in that we have complete control over the experiment and we can run as few or as many trials as we wish on our computer. Because of the ease of running experiments, it is important that we are running the right types of experiments.

In the natural sciences, one can never control all possible variables. […] As a science of the artificial, machine learning can usually avoid such complications.

Machine Learning as an Experimental Science, Editorial, 1998.

The type of experiment we wish to perform is called a controlled experiment.

These are experiments where all known independent variables are held constant except one, which is modified in order to determine its impact on the dependent variable; each variable of interest is varied in turn. The results are compared to a baseline, or no-treatment condition, called a “control.” This could be the result of a baseline method like persistence or the Zero Rule algorithm, or the default configuration of the method.

As normally defined, an experiment involves systematically varying one or more independent variables and examining their effect on some dependent variables. Thus, a machine learning experiment requires more than a single learning run; it requires a number of runs carried out under different conditions. In each case, one must measure some aspect of the system’s behavior for comparison across the different conditions.

Machine Learning as an Experimental Science, Editorial, 1998.
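The idea of comparing a treatment against a control can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the synthetic dataset, the threshold classifier, and every function name below are assumptions introduced for the example. The dataset and the train/test split are held constant, and only the method changes.

```python
import random
from collections import Counter

def make_dataset(n, seed):
    """Synthetic 1-D binary problem (assumed for illustration only)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 is centered at 3.0, class 0 at 0.0, both with unit spread.
        rows.append((rng.gauss(3.0 * label, 1.0), label))
    return rows

def zero_rule(train):
    """Control: always predict the most common label in the training data."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: majority

def threshold_rule(train):
    """Treatment: cut at the midpoint between the two class means."""
    means = []
    for c in (0, 1):
        xs = [x for x, label in train if label == c]
        means.append(sum(xs) / len(xs))
    cut = (means[0] + means[1]) / 2.0
    return lambda x: 1 if x > cut else 0

def accuracy(predict, rows):
    """Fraction of rows whose label is predicted correctly."""
    return sum(predict(x) == label for x, label in rows) / len(rows)

# Held constant: the data and the train/test split.
data = make_dataset(400, seed=1)
train, test = data[:280], data[280:]

# Varied: only the method being evaluated.
control = accuracy(zero_rule(train), test)
treatment = accuracy(threshold_rule(train), test)

print(f"control (Zero Rule):   {control:.3f}")
print(f"treatment (threshold): {treatment:.3f}")
print(f"lift over control:     {treatment - control:+.3f}")
```

Reporting the lift over the control, rather than the raw score alone, makes it clear how much of the skill is due to the method and how much comes for free from the class balance.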

Controlling For Variance

In many ways, experiments with machine learning methods have more in common with simulation studies, such as those in physics, than with evaluating medical treatments.

As such, the results of a single experiment are probabilistic and subject to variance.

There are two main types of variance that we seek to understand in our controlled experiments; they are:

  • Variance in the data, such as the data used to train the learning algorithm and the data used to evaluate its skill.
  • Variance in the model, which arises from randomness in the learning algorithm, such as random initial weights in neural networks, the bootstrap samples drawn in bagging, the shuffled order of data in stochastic gradient descent, and so on.

A result from a single run or trial of a controlled experiment would be misleading given these sources of variance.
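To see the two sources of variance separately, one option is to vary one seed while holding the other fixed. The sketch below is only an illustration under stated assumptions: the synthetic data, the toy learner (a cut point with a random start and perceptron-style updates), and the names make_data and train_and_score are all hypothetical.

```python
import random
import statistics

def make_data(n, rng):
    """Synthetic 1-D binary problem (assumed for illustration only)."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append((rng.gauss(1.5 * y, 1.0), y))
    return rows

def train_and_score(data_seed, model_seed):
    """Train a tiny stochastic learner and return its test accuracy."""
    data = make_data(300, random.Random(data_seed))
    train, test = data[:200], data[200:]

    rng = random.Random(model_seed)
    # Model randomness: random initial cut point and shuffled training order.
    cut = rng.uniform(-1.0, 3.0)
    order = train[:]
    rng.shuffle(order)
    for x, y in order:
        pred = 1 if x > cut else 0
        # Perceptron-style update: move the cut point after each mistake.
        cut += 0.05 * (pred - y)

    hits = sum((1 if x > cut else 0) == y for x, y in test)
    return hits / len(test)

# Vary only the data sample; the model seed is held constant.
data_only = [train_and_score(data_seed=s, model_seed=0) for s in range(20)]

# Vary only the model randomness; the data sample is held constant.
model_only = [train_and_score(data_seed=0, model_seed=s) for s in range(20)]

print(f"data varies:  min={min(data_only):.2f} max={max(data_only):.2f} "
      f"stdev={statistics.stdev(data_only):.3f}")
print(f"model varies: min={min(model_only):.2f} max={max(model_only):.2f} "
      f"stdev={statistics.stdev(model_only):.3f}")
```

Any single score from either list could be reported as "the" result, which is why a single trial can mislead.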

The experiment must control for these sources of variance. This is done by repeating the experimental trial multiple times in order to elicit the range of variance, so that we can report both the expected result and the variance in the expected result, e.g. the mean and a confidence interval.
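As a rough sketch of reporting a mean and confidence interval from repeated trials, the example below repeats a trial 30 times with different seeds. The run_trial function is a hypothetical stand-in for one train-and-evaluate run, and the interval uses a normal approximation (z = 1.96), which is an assumption; with few repeats, a Student's t critical value would give a slightly wider interval.

```python
import math
import random
import statistics

def run_trial(seed):
    """One hypothetical trial: score a fixed threshold on a fresh data sample."""
    rng = random.Random(seed)
    n = 100
    hits = 0
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(2.0 * y, 1.0)
        pred = 1 if x > 1.0 else 0
        hits += (pred == y)
    return hits / n

def summarize(scores, z=1.96):
    """Return the mean and an approximate 95% interval for the mean score."""
    mean = statistics.mean(scores)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, mean - z * sem, mean + z * sem

# Repeat the trial with a different seed each time.
scores = [run_trial(seed) for seed in range(30)]
mean, lower, upper = summarize(scores)

print(f"accuracy: {mean:.3f} "
      f"(95% CI {lower:.3f} to {upper:.3f}, n={len(scores)})")
```

Reporting the number of repeats alongside the interval lets a reader judge how much to trust the estimate.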

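In simulation studies, such as Monte Carlo methods, repeating an experiment and averaging the results reduces the variance of the estimated quantity; techniques that shrink this variance further are known as variance reduction.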

Experiments in Machine Learning

Experimentation is a key part of applied machine learning.

This is both a challenge to beginners who must learn some rigor and an exciting opportunity for discovery and contribution.

Let’s make this concrete with some examples of the types of controlled experiments you may need to perform:

  • Choose-Features Experiments. When determining what data features (input variables) are most relevant to a model, the independent variables may be the input features and the dependent variable might be the estimated skill of the model on unseen data.
  • Tune-Model Experiments. When tuning a machine learning model, the independent variables may be the hyperparameters of the learning algorithm and the dependent variable might be the estimated skill of the model on unseen data.
  • Compare-Models Experiments. When comparing the performance of machine learning models, the independent variables may be the learning algorithms themselves with a specific configuration and the dependent variable might be the estimated skill of the model on unseen data.
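A tune-model experiment of the kind listed above might look like the following sketch. It is an illustration under assumptions, not a recommended setup: the synthetic data, the plain-Python k-nearest-neighbors classifier, and the candidate values of k are all made up for the example. The hyperparameter k is the independent variable, and the mean test accuracy over repeated trials is the dependent variable.

```python
import random
import statistics

def make_data(n, rng):
    """Synthetic 1-D binary problem (assumed for illustration only)."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append((rng.gauss(1.5 * y, 1.0), y))
    return rows

def knn_accuracy(k, seed):
    """Test accuracy of k-nearest neighbors on the data sample for this seed."""
    data = make_data(150, random.Random(seed))
    train, test = data[:100], data[100:]
    hits = 0
    for x, y in test:
        # Find the k training rows closest to x and take a majority vote.
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        pred = 1 if 2 * votes > k else 0
        hits += (pred == y)
    return hits / len(test)

# Independent variable: k (odd values avoid tied votes).
# Held constant: the same 10 seeds, and so the same data samples, for every k.
results = {}
for k in (1, 5, 15):
    scores = [knn_accuracy(k, seed) for seed in range(10)]
    results[k] = (statistics.mean(scores), statistics.stdev(scores))

for k, (mean, stdev) in results.items():
    print(f"k={k:>2}: mean accuracy={mean:.3f} (stdev={stdev:.3f})")
```

Reusing the same seeds for every value of k means each configuration sees identical data samples, so differences in the mean score can be attributed to the hyperparameter rather than to the data. The same pattern carries over to choose-features and compare-models experiments by swapping what is varied in the outer loop.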

What makes the experimental focus of applied machine learning so exciting is twofold:

  • Discovery. You can discover what works best for your specific problem and data. A challenge and an opportunity.
  • Contribution. You can make broader discoveries in the field, without any specialized knowledge other than rigorous and systematic experimentation.

Using off-the-shelf tools and careful experimental methods, you can make discoveries and contributions.

In summary, machine learning occupies a fortunate position that makes systematic experimentation easy and profitable. […] Although experimental studies are not the only path to understanding, we feel they constitute one of machine learning’s brightest hopes for rapid scientific progress, and we encourage other researchers to join in our field’s evolution toward an experimental science.

The Experimental Study of Machine Learning, 1991.

Summary

In this tutorial, you discovered the important role that controlled experiments play in applied machine learning.

Specifically, you learned:

  • The need for systematic discovery via controlled experiments.
  • The need to repeat experiments in order to control for the sources of variance.
  • Examples of experiments performed in machine learning and the challenge and opportunity they represent.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

4 Responses to Controlled Experiments in Machine Learning

  1. Adrien Pavao December 12, 2019 at 4:32 pm #

    Very interesting. Thank you for sharing references on this subject.

  2. Jesuino Vieira Filho April 19, 2022 at 4:55 am #

    When we are running successive experiments to test the effectiveness of different steps in our machine learning system, is it common to discard alternatives with inferior results for any further comparison?

    For example, let’s say I have two steps. The first has three options and the second has five:

    – Step 1: S1O1, S1O2, S1O3
    – Step 2: S2O1, S2O2, S2O3, S2O4, S2O5

    Assuming that S1O2 is the best option for step 1 when performing step two, can I ignore S1O1 and S1O3 or should I test all combinations?
