Stacking, or stacked generalization, is an ensemble machine learning algorithm. It uses a meta-learning algorithm to learn how to best combine the predictions from two or more base machine learning algorithms. The benefit of stacking is that it can harness the capabilities of a range of well-performing models on a classification or regression task and […]

## 4 Types of Classification Tasks in Machine Learning

Machine learning is a field of study concerned with algorithms that learn from examples. Classification is a task that requires the use of machine learning algorithms that learn how to assign a class label to examples from the problem domain. An easy-to-understand example is classifying emails as “spam” or “not spam.” […]

## What Is Argmax in Machine Learning?

Argmax is a mathematical function that you may encounter in applied machine learning. For example, you may see “argmax” or “arg max” used in a research paper to describe an algorithm. You may also be instructed to use the argmax function in your algorithm implementation. This may be the first time that you encounter […]
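As a rough illustration of the idea, argmax returns the position of the largest value rather than the value itself. The sketch below is a minimal pure-Python version; the `argmax` helper and the example probabilities are hypothetical and are not taken from the post.

```python
def argmax(values):
    """Return the index of the largest value (the first one if there are ties)."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical predicted probabilities for three classes
probabilities = [0.1, 0.7, 0.2]

# The predicted class is the index holding the largest probability
predicted_class = argmax(probabilities)
print(predicted_class)
```

In practice, a library routine such as NumPy's `argmax` would usually be used instead of a hand-written helper.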

## 4 Distance Measures for Machine Learning

Distance measures play an important role in machine learning. They provide the foundation for many popular and effective machine learning algorithms like k-nearest neighbors for supervised learning and k-means clustering for unsupervised learning. Different distance measures must be chosen and used depending on the type of data. As such, it is important to know […]
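The excerpt does not name the four measures, so the sketch below simply assumes three commonly used ones (Euclidean, Manhattan, and Hamming) to show how the choice depends on the data type. The function names and sample vectors are illustrative only.

```python
from math import sqrt

def euclidean_distance(a, b):
    """Straight-line distance between two real-valued vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Sum of absolute differences, often used for grid-like or integer features."""
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming_distance(a, b):
    """Fraction of positions that differ, suited to binary or categorical vectors."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Hypothetical example vectors
print(euclidean_distance([0, 0], [3, 4]))
print(manhattan_distance([0, 0], [3, 4]))
print(hamming_distance([0, 1, 1, 0], [0, 0, 1, 1]))
```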

## Basic Data Cleaning for Machine Learning (That You Must Perform)

Data cleaning is a critically important step in any machine learning project. In tabular data, there are many different statistical analysis and data visualization techniques you can use to explore your data in order to identify data cleaning operations you may want to perform. Before jumping to the sophisticated methods, there are some very basic […]

## A Gentle Introduction to the Fbeta-Measure for Machine Learning

The Fbeta-measure is a configurable single-score metric for evaluating a binary classification model based on the predictions made for the positive class. The Fbeta-measure is calculated using precision and recall. Precision is a metric that calculates the percentage of positive-class predictions that are correct. Recall calculates the percentage of correct predictions for the positive class […]
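To make the calculation concrete, here is a minimal sketch using the standard definition, Fbeta = (1 + beta^2) * precision * recall / (beta^2 * precision + recall). The `fbeta_score` helper and the counts passed to it are hypothetical and chosen only for illustration.

```python
def fbeta_score(tp, fp, fn, beta=1.0):
    """Compute F-beta from true positive, false positive, and false negative counts."""
    # Precision: share of positive predictions that were correct
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were found
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator

# Hypothetical counts: precision = 0.8, recall = 0.5
f1 = fbeta_score(tp=8, fp=2, fn=8, beta=1.0)
f2 = fbeta_score(tp=8, fp=2, fn=8, beta=2.0)    # weights recall more heavily
f05 = fbeta_score(tp=8, fp=2, fn=8, beta=0.5)   # weights precision more heavily
print(f1, f2, f05)
```

With these counts precision exceeds recall, so the recall-weighted F2 comes out lower than F1 and the precision-weighted F0.5 comes out higher.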

## Standard Machine Learning Datasets for Imbalanced Classification

An imbalanced classification problem is a problem that involves predicting a class label where the distribution of class labels in the training dataset is skewed. Many real-world classification problems have an imbalanced class distribution; therefore, it is important for machine learning practitioners to get familiar with working with these types of problems. In this tutorial, […]

## Results for Standard Classification and Regression Machine Learning Datasets

It is important that beginner machine learning practitioners practice on small real-world datasets. So-called standard machine learning datasets contain actual observations, fit into memory, and are well studied and well understood. As such, they can be used by beginner practitioners to quickly test, explore, and practice data preparation and modeling techniques. A practitioner can confirm […]

## Arithmetic, Geometric, and Harmonic Means for Machine Learning

Calculating the average of a variable or a list of numbers is a common operation in machine learning. It is an operation you may use every day either directly, such as when summarizing data, or indirectly, such as a smaller step in a larger procedure when fitting a model. The average is a synonym for […]
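The three means named in the title can be sketched in a few lines of plain Python. This is a minimal illustration that assumes strictly positive inputs (required for the geometric and harmonic means); the function names and sample data are hypothetical.

```python
from math import exp, log

def arithmetic_mean(values):
    """Sum of the values divided by their count."""
    return sum(values) / len(values)

def geometric_mean(values):
    """n-th root of the product, computed via logs; values must be positive."""
    return exp(sum(log(v) for v in values) / len(values))

def harmonic_mean(values):
    """Count divided by the sum of reciprocals; values must be positive."""
    return len(values) / sum(1.0 / v for v in values)

# Hypothetical sample data
data = [1, 2, 4]
print(arithmetic_mean(data))
print(geometric_mean(data))
print(harmonic_mean(data))
```

Python's built-in `statistics` module also provides `mean`, `geometric_mean`, and `harmonic_mean` for the same purpose.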