Search results for "mutual information"

Information Gain and Mutual Information for Machine Learning

By Jason Brownlee on December 10, 2020 in Probability 58

Information gain calculates the reduction in entropy or surprise from transforming a dataset in some way. It is commonly used in the construction of decision trees from a training dataset, by evaluating the information gain for each variable, and selecting the variable that maximizes the information gain, which in turn minimizes the entropy and best […]
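As a rough illustration of the idea in this teaser, the sketch below computes information gain as the entropy of a set of class labels minus the size-weighted entropy of the groups produced by a split. The function names (`entropy`, `information_gain`) and the toy labels are assumptions made for this example, not code taken from the article.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of `labels` minus the size-weighted entropy of its `groups`."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Hypothetical toy data: 20 labels split into two groups by some variable.
labels = [0] * 13 + [1] * 7
group_a = [0] * 7 + [1] * 1   # mostly class 0, so low entropy
group_b = [0] * 6 + [1] * 6   # evenly mixed, so entropy of 1 bit
gain = information_gain(labels, [group_a, group_b])
print(f"Information gain: {gain:.3f} bits")
```

When choosing a split for a decision tree, the same calculation would be repeated for each candidate variable and the variable with the largest gain kept.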

Plot of Probability Distribution vs Entropy

A Gentle Introduction to Information Entropy

By Jason Brownlee on July 13, 2020 in Probability 51

Information theory is a subfield of mathematics concerned with transmitting data across a noisy channel. A cornerstone of information theory is the idea of quantifying how much information there is in a message. More generally, this can be used to quantify the information in an event and a random variable, called entropy, and is calculated […]

How to Develop an Information Maximizing GAN (InfoGAN) in Keras

By Jason Brownlee on January 18, 2021 in Generative Adversarial Networks 48

The Generative Adversarial Network, or GAN, is an architecture for training deep convolutional models for generating synthetic images. Although remarkably effective, the default GAN provides no control over the types of images that are generated. The Information Maximizing GAN, or InfoGAN for short, is an extension to the GAN architecture that introduces control variables that […]

Box and Whisker Plots of Accuracy of Singles Model Fit On Selected Features vs. Ensemble

How to Develop a Feature Selection Subspace Ensemble in Python

By Jason Brownlee on April 27, 2021 in Ensemble Learning 27

Random subspace ensembles consist of the same model fit on different randomly selected groups of input features (columns) in the training dataset. There are many ways to choose groups of features in the training dataset, and feature selection is a popular class of data preparation techniques designed specifically for this purpose. The features selected by […]

Bar Chart of the Input Features (x) vs. the Mutual Information Feature Importance (y)

How to Perform Feature Selection for Regression Data

By Jason Brownlee on August 18, 2020 in Data Preparation 47

Feature selection is the process of identifying and selecting a subset of input variables that are most relevant to the target variable. Perhaps the simplest case of feature selection is the case where there are numerical input variables and a numerical target for regression predictive modeling. This is because the strength of the relationship between […]

How to Perform Feature Selection With Numerical Input Data

By Jason Brownlee on August 18, 2020 in Data Preparation 30

Feature selection is the process of identifying and selecting a subset of input features that are most relevant to the target variable. Feature selection is often straightforward when working with real-valued input and output data, such as by using Pearson’s correlation coefficient, but can be challenging when working with numerical input data and a categorical […]

How to Choose a Feature Selection Method For Machine Learning

By Jason Brownlee on August 20, 2020 in Data Preparation 281

Feature selection is the process of reducing the number of input variables when developing a predictive model. It is desirable to reduce the number of input variables to both reduce the computational cost of modeling and, in some cases, to improve the performance of the model. Statistical-based feature selection methods involve evaluating the relationship between […]

Bar Chart of the Input Features (x) vs The Chi Squared Feature Importance (y)

How to Perform Feature Selection with Categorical Data

By Jason Brownlee on August 18, 2020 in Data Preparation 111

Feature selection is the process of identifying and selecting a subset of input features that are most relevant to the target variable. Feature selection is often straightforward when working with real-valued data, such as by using Pearson’s correlation coefficient, but can be challenging when working with categorical data. The two most commonly used feature selection […]

Histogram of Two Different Probability Distributions for the Same Random Variable

How to Calculate the KL Divergence for Machine Learning

By Jason Brownlee on November 1, 2019 in Probability 76

It is often desirable to quantify the difference between probability distributions for a given random variable. This occurs frequently in machine learning, when we may be interested in calculating the difference between an actual and observed probability distribution. This can be achieved using techniques from information theory, such as the Kullback-Leibler Divergence (KL divergence), or […]
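For readers who want a concrete picture of the quantity named in this teaser, here is a minimal sketch of the KL divergence between two discrete distributions defined over the same events. The function name `kl_divergence` and the two example distributions are assumptions for illustration, and the sketch assumes every probability in `q` is non-zero.

```python
from math import log2


def kl_divergence(p, q):
    """KL(P || Q) in bits for two discrete distributions over the same events.

    Assumes every q[i] is non-zero; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical distributions over three events.
p = [0.10, 0.40, 0.50]
q = [0.80, 0.15, 0.05]

kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
print(f"KL(P || Q) = {kl_pq:.3f} bits")
print(f"KL(Q || P) = {kl_qp:.3f} bits")
```

The two directions give different values, which is why the KL divergence is described as a divergence rather than a distance.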

Train Neural Networks With Noise to Reduce Overfitting

By Jason Brownlee on August 6, 2019 in Deep Learning Performance 33

Training a neural network with a small dataset can cause the network to memorize all training examples, in turn leading to overfitting and poor performance on a holdout dataset. Small datasets may also represent a harder mapping problem for neural networks to learn, given the patchy or sparse sampling of points in the high-dimensional input […]
