Blog - Page 76 of 137

17 Statistical Hypothesis Tests in Python (Cheat Sheet)

By Jason Brownlee on November 7, 2021 in Statistics 99

Quick-reference guide to the 17 statistical hypothesis tests that you need in applied machine learning, with sample code in Python. Although there are hundreds of statistical hypothesis tests that you could use, there is only a small subset that you may need to use in a machine learning project. In this post, you will discover […]

How to Reduce Variance in a Final Machine Learning Model

By Jason Brownlee on April 27, 2021 in Ensemble Learning 40

A final machine learning model is one trained on all available data and then used to make predictions on new data. A problem with most final models is that they suffer from variance in their predictions. This means that each time you fit a model, you get a slightly different set of parameters that in […]

How to Develop a Skillful Machine Learning Time Series Forecasting Model

By Jason Brownlee on August 5, 2019 in Deep Learning for Time Series 42

You are handed data and told to develop a forecast model. What do you do? This is a common situation; far more common than most people think. Perhaps you are sent a CSV file. Perhaps you are given access to a database. Perhaps you are starting a competition. The problem can be reasonably well defined: […]

Taxonomy of Time Series Forecasting Problems

By Jason Brownlee on August 5, 2019 in Deep Learning for Time Series 40

When you are presented with a new time series forecasting problem, there are many things to consider. The choice that you make directly impacts each step of the project from the design of a test harness to evaluate forecast models to the fundamental difficulty of the forecast problem that you are working on. It is […]

11 Classical Time Series Forecasting Methods in Python (Cheat Sheet)

By Jason Brownlee on November 16, 2023 in Time Series 365

Let’s dive into how machine learning methods can be used for the classification and forecasting of time series problems with Python. But first let’s go back and appreciate the classics, where we will delve into a suite of classical methods for time series forecasting that you can test on your forecasting problem prior to exploring […]

Statistics for Machine Learning (7-Day Mini-Course)

By Jason Brownlee on August 8, 2019 in Statistics 328

Statistics for Machine Learning Crash Course. Get on top of the statistics used in machine learning in 7 Days. Statistics is a field of mathematics that is universally agreed to be a prerequisite for a deeper understanding of machine learning. Although statistics is a large field with many esoteric theories and findings, the nuts and […]

How to Code the Student’s t-Test from Scratch in Python

By Jason Brownlee on August 8, 2019 in Statistics 52

Perhaps one of the most widely used statistical hypothesis tests is the Student’s t-test. Because you may use this test yourself someday, it is important to have a deep understanding of how the test works. As a developer, this understanding is best achieved by implementing the hypothesis test yourself from scratch. In this tutorial, […]

How to Configure the Number of Layers and Nodes in a Neural Network

By Jason Brownlee on August 6, 2019 in Deep Learning Performance 74

Artificial neural networks have two main hyperparameters that control the architecture or topology of the network: the number of layers and the number of nodes in each hidden layer. You must specify values for these parameters when configuring your network. The most reliable way to configure these hyperparameters for your specific predictive modeling problem is […]

How to Calculate McNemar’s Test to Compare Two Machine Learning Classifiers

By Jason Brownlee on August 8, 2019 in Statistics 99

The choice of a statistical hypothesis test is a challenging open problem for interpreting machine learning results. In his widely cited 1998 paper, Thomas Dietterich recommended McNemar’s test in those cases where it is expensive or impractical to train multiple copies of classifier models. This describes the current situation with deep learning models that […]

The Role of Randomization to Address Confounding Variables in Machine Learning

By Jason Brownlee on July 31, 2020 in Statistics 8

A large part of applied machine learning is about running controlled experiments to discover what algorithm or algorithm configuration to use on a predictive modeling problem. A challenge is that there are aspects of the problem and the algorithm, called confounding variables, that cannot be controlled (held constant) and must instead be controlled for. An example is […]
