Scatter Plot of Dataset With Linear Model and Prediction Interval

Prediction Intervals for Machine Learning

A prediction from a machine learning perspective is a single point that hides the uncertainty of that prediction. Prediction intervals provide a way to quantify and communicate the uncertainty in a prediction. They are different from confidence intervals that instead seek to quantify the uncertainty in a population parameter such as a mean or standard […]

Continue Reading 10
Confidence Intervals for Machine Learning

Confidence Intervals for Machine Learning

Much of machine learning involves estimating the performance of a machine learning algorithm on unseen data. Confidence intervals are a way of quantifying the uncertainty of an estimate. They can be used to add a bounds or likelihood on a population parameter, such as a mean, estimated from a sample of independent observations from the […]

Continue Reading 6
A Gentle Introduction to the Bootstrap Method

A Gentle Introduction to the Bootstrap Method

The bootstrap method is a resampling technique used to estimate statistics on a population by sampling a dataset with replacement. It can be used to estimate summary statistics such as the mean or standard deviation. It is used in applied machine learning to estimate the skill of machine learning models when making predictions on data […]

Continue Reading 10
A Gentle Introduction to k-fold Cross-Validation

A Gentle Introduction to k-fold Cross-Validation

Cross-validation is a statistical method used to estimate the skill of machine learning models. It is commonly used in applied machine learning to compare and select a model for a given predictive modeling problem because it is easy to understand, easy to implement, and results in skill estimates that generally have a lower bias than […]

Continue Reading 18
Introduction to Nonparametric Statistical Significance Tests in Python

Introduction to Nonparametric Statistical Significance Tests in Python

In applied machine learning, we often need to determine whether two data samples have the same or different distributions. We can answer this question using statistical significance tests that can quantify the likelihood that the samples have the same distribution. If the data does not have the familiar Gaussian distribution, we must resort to nonparametric […]

Continue Reading 6