Blog - Page 43 of 136

Probability Decision Surface for Logistic Regression on a Binary Classification Task

Plot a Decision Surface for Machine Learning Algorithms in Python

By Jason Brownlee on August 26, 2020 in Python Machine Learning 19

Classification algorithms learn how to assign class labels to examples, although their decisions can appear opaque. A popular diagnostic for understanding the decisions made by a classification algorithm is the decision surface. This is a plot that shows how a trained machine learning algorithm predicts across a coarse grid spanning the input feature space. A decision […]
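
As a rough illustration of the grid idea in this teaser, here is a minimal sketch. The synthetic dataset, the logistic regression model, and the 0.1 grid step are assumptions made for illustration and are not taken from the post itself.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Synthetic two-feature binary dataset (assumed for illustration).
X, y = make_blobs(n_samples=200, centers=2, n_features=2, random_state=1)

# Fit the classifier whose decisions we want to inspect.
model = LogisticRegression()
model.fit(X, y)

# Build a coarse grid that spans the range of each input feature.
x1 = np.arange(X[:, 0].min() - 1, X[:, 0].max() + 1, 0.1)
x2 = np.arange(X[:, 1].min() - 1, X[:, 1].max() + 1, 0.1)
xx, yy = np.meshgrid(x1, x2)
grid = np.column_stack([xx.ravel(), yy.ravel()])

# Predict a class label for every grid point, then restore the grid shape.
zz = model.predict(grid).reshape(xx.shape)

# Probability of class 1 at every grid point, for a probability surface.
pp = model.predict_proba(grid)[:, 1].reshape(xx.shape)
```

Passing `xx`, `yy`, and `zz` (or `pp`) to matplotlib's `contourf` would render the surface; the plotting step is left out so the sketch runs without a display.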

A Gentle Introduction to Computational Learning Theory

By Jason Brownlee on September 7, 2020 in Probability 11

Computational learning theory, or statistical learning theory, refers to mathematical frameworks for quantifying learning tasks and algorithms. These are sub-fields of machine learning that a machine learning practitioner does not need to know in great depth in order to achieve good results on a wide range of problems. Nevertheless, it is a sub-field where having […]

Scatter Plot of Number of Times Pregnant vs. Plasma Glucose Numerical Variables by Class Label

How to use Seaborn Data Visualization for Machine Learning

By Jason Brownlee on August 19, 2020 in Python Machine Learning 19

Data visualization provides insight into the distribution and relationships between variables in a dataset. This insight can be helpful in selecting data preparation techniques to apply prior to modeling and the types of algorithms that may be most suited to the data. Seaborn is a data visualization library for Python that runs on top of […]

Histogram of Examples in Each Class in the Glass Multi-Class Classification Dataset

Multi-Class Imbalanced Classification

By Jason Brownlee on January 5, 2021 in Imbalanced Classification 63

Imbalanced classification problems are those prediction tasks where the distribution of examples across class labels is not equal. Most imbalanced classification examples focus on binary classification tasks, yet many of the tools and techniques for imbalanced classification also directly support multi-class classification problems. In this tutorial, you will discover how to use the tools of imbalanced […]
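
To make "not equal" concrete, a small sketch that summarizes the class distribution of a label list follows. The helper name `class_distribution` and the three-class label list are hypothetical and used only for illustration.

```python
from collections import Counter


def class_distribution(labels):
    """Return {label: (count, percentage)} for a sequence of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: (n, 100.0 * n / total) for label, n in sorted(counts.items())}


# Hypothetical multi-class labels with a skewed distribution.
labels = [0] * 70 + [1] * 20 + [2] * 10
dist = class_distribution(labels)

for label, (count, pct) in dist.items():
    print(f"class={label} count={count} percent={pct:.1f}")
```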

Line Plot of Expected vs. Births Predicted Using XGBoost

How to Use XGBoost for Time Series Forecasting

By Jason Brownlee on March 19, 2021 in XGBoost 135

XGBoost is an efficient implementation of gradient boosting for classification and regression problems. It is both fast and efficient, performing well, if not the best, on a wide range of predictive modeling tasks and is a favorite among data science competition winners, such as those on Kaggle. XGBoost can also be used for time series […]

Box and Whisker Plots of Classification Accuracy vs Repeats for k-Fold Cross-Validation

Repeated k-Fold Cross-Validation for Model Evaluation in Python

By Jason Brownlee on August 26, 2020 in Python Machine Learning 52

The k-fold cross-validation procedure is a standard method for estimating the performance of a machine learning algorithm or configuration on a dataset. A single run of the k-fold cross-validation procedure may result in a noisy estimate of model performance. Different splits of the data may result in very different results. Repeated k-fold cross-validation provides a […]
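
One way to sketch the repeated procedure is with scikit-learn's `RepeatedStratifiedKFold`. The synthetic dataset, the logistic regression model, and the choice of 10 folds with 3 repeats are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Synthetic binary classification dataset (assumed for illustration).
X, y = make_classification(n_samples=200, n_features=10, random_state=1)

# 10-fold cross-validation repeated 3 times with different splits each repeat.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)

model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, scoring="accuracy", cv=cv)

# One score per fold per repeat; the mean is the reported estimate.
print(f"folds evaluated: {len(scores)}")
print(f"accuracy: {scores.mean():.3f} ({scores.std():.3f})")
```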

Line Plot of Mean Accuracy for Cross-Validation k-Values With Error Bars (Blue) vs. the Ideal Case (red)

How to Configure k-Fold Cross-Validation

By Jason Brownlee on August 26, 2020 in Python Machine Learning 49

The k-fold cross-validation procedure is a standard method for estimating the performance of a machine learning algorithm on a dataset. A common value for k is 10, although how do we know that this configuration is appropriate for our dataset and our algorithms? One approach is to explore the effect of different k values on […]
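
A sensitivity check over k might look like the sketch below. The dataset, the model, and the candidate k values are assumptions for illustration and are not the configuration used in the post.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic binary classification dataset (assumed for illustration).
X, y = make_classification(n_samples=100, n_features=10, random_state=1)

results = {}
for k in (2, 5, 10):
    # Shuffle before splitting so the folds do not depend on row order.
    cv = KFold(n_splits=k, shuffle=True, random_state=1)
    scores = cross_val_score(
        LogisticRegression(max_iter=1000), X, y, scoring="accuracy", cv=cv
    )
    results[k] = (scores.mean(), scores.std())
    print(f"k={k:2d} mean={scores.mean():.3f} std={scores.std():.3f}")
```

Comparing the means and spreads across k gives a rough sense of how sensitive the estimate is to the configuration.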

Nested Cross-Validation for Machine Learning with Python

By Jason Brownlee on November 20, 2021 in Python Machine Learning 160

The k-fold cross-validation procedure is used to estimate the performance of machine learning models when making predictions on data not used during training. This procedure can be used both when optimizing the hyperparameters of a model on a dataset, and when comparing and selecting a model for the dataset. When the same cross-validation procedure and […]
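
A minimal sketch of nesting a hyperparameter search inside an outer evaluation loop follows. The dataset, the model, the grid of `C` values, and the fold counts are all assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

# Synthetic binary classification dataset (assumed for illustration).
X, y = make_classification(n_samples=150, n_features=10, random_state=1)

# Inner loop: picks hyperparameters using only the outer training folds.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=1)
# Outer loop: estimates the performance of the whole tuning procedure.
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="accuracy",
    cv=inner_cv,
)

# Each outer fold refits the search, so tuning never sees the held-out fold.
nested_scores = cross_val_score(search, X, y, scoring="accuracy", cv=outer_cv)
print(f"nested accuracy: {nested_scores.mean():.3f} ({nested_scores.std():.3f})")
```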

LOOCV for Evaluating Machine Learning Algorithms

By Jason Brownlee on August 26, 2020 in Python Machine Learning 51

The Leave-One-Out Cross-Validation, or LOOCV, procedure is used to estimate the performance of machine learning algorithms when they are used to make predictions on data not used to train the model. It is a computationally expensive procedure to perform, although it results in a reliable and unbiased estimate of model performance. Although simple to use […]
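
The procedure can be sketched with scikit-learn's `LeaveOneOut` splitter. The small synthetic dataset and the logistic regression model are assumptions for illustration; the dataset is kept small because one model is fit per example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Small synthetic dataset (assumed), since LOOCV fits one model per example.
X, y = make_classification(n_samples=50, n_features=5, random_state=1)

cv = LeaveOneOut()
model = LogisticRegression(max_iter=1000)

# Each score is 1.0 or 0.0: whether the single held-out example was correct.
scores = cross_val_score(model, X, y, scoring="accuracy", cv=cv)
print(f"models fit: {len(scores)}")
print(f"accuracy: {scores.mean():.3f}")
```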

Train-Test Split for Evaluating Machine Learning Algorithms

By Jason Brownlee on August 26, 2020 in Python Machine Learning 77

The train-test split procedure is used to estimate the performance of machine learning algorithms when they are used to make predictions on data not used to train the model. It is a fast and easy procedure to perform, the results of which allow you to compare the performance of machine learning algorithms for your predictive […]
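
A minimal sketch of the split using scikit-learn's `train_test_split` follows. The synthetic dataset, the 25 percent test size, and the logistic regression model are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification dataset (assumed for illustration).
X, y = make_classification(n_samples=100, n_features=10, random_state=1)

# Hold out 25 percent of the rows; stratify to keep class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1, stratify=y
)

# Fit on the training portion only, then score on the held-out portion.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"train={len(X_train)} test={len(X_test)} accuracy={accuracy:.3f}")
```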
