# Archive | Better Deep Learning

## How to Create a Random-Split, Cross-Validation, and Bagging Ensemble for Deep Learning in Keras

Ensemble learning refers to methods that combine the predictions from multiple models. It is important in ensemble learning that the models comprising the ensemble are individually good yet make different prediction errors. Predictions that are good in different ways can result in a prediction that is both more stable and often better than the predictions of any […]
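
As a rough sketch of the random-split approach (illustrative, not the tutorial's exact code: the make_blobs problem, the layer sizes, and the helper name fit_member are all assumptions), each member is trained on its own random subset of the data and the members' predicted probabilities are averaged:

```python
# Illustrative random-split ensemble: fit several members, each on a
# different random subset of the data, then average their probabilities.
from numpy import mean, argmax
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

X, y = make_blobs(n_samples=1000, centers=3, n_features=2, random_state=2)
X, testX, y, testy = train_test_split(X, to_categorical(y), test_size=0.5, random_state=1)

def fit_member(X, y):  # hypothetical helper, not from the tutorial
    # each member sees a different random 90% of the available data
    trainX, _, trainy, _ = train_test_split(X, y, train_size=0.9)
    model = Sequential([Dense(25, activation='relu'), Dense(3, activation='softmax')])
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    model.fit(trainX, trainy, epochs=50, verbose=0)
    return model

members = [fit_member(X, y) for _ in range(5)]
# average probabilities across members, then take the most likely class
yhat = mean([m.predict(testX, verbose=0) for m in members], axis=0)
print('ensemble accuracy: %.3f' % (argmax(yhat, axis=1) == argmax(testy, axis=1)).mean())
```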

## How to Reduce the Variance of Deep Learning Models in Keras Using Model Averaging Ensembles

Deep learning neural network models are highly flexible nonlinear algorithms capable of learning a nearly infinite number of mapping functions. A frustration with this flexibility is the high variance in a final model. The same neural network model trained on the same dataset may find one of many different possible “good enough” solutions each time […]
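
A minimal sketch of the idea (illustrative; the make_blobs problem and layer sizes are assumptions): fit the identical architecture on the identical training set several times, so members differ only through random initialization and shuffling, then average their predictions:

```python
# Illustrative model-averaging ensemble: every member is trained on the
# same data; run-to-run differences come only from stochastic training.
from numpy import mean, argmax
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

X, y = make_blobs(n_samples=500, centers=3, n_features=2, random_state=2)
trainX, testX, trainy, testy = train_test_split(X, to_categorical(y), test_size=0.5, random_state=1)

def fit_model():
    model = Sequential([Dense(25, activation='relu'), Dense(3, activation='softmax')])
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    model.fit(trainX, trainy, epochs=100, verbose=0)
    return model

members = [fit_model() for _ in range(10)]
# averaging the members' probabilities damps the variance of any single run
yhat = argmax(mean([m.predict(testX, verbose=0) for m in members], axis=0), axis=1)
print('averaged accuracy: %.3f' % (yhat == argmax(testy, axis=1)).mean())
```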

## Ensemble Methods for Deep Learning Neural Networks to Reduce Variance and Improve Performance

Deep learning neural networks are nonlinear methods. They offer increased flexibility and can scale in proportion to the amount of training data available. A downside of this flexibility is that they learn via a stochastic training algorithm, which means that they are sensitive to the specifics of the training data and may find a different […]
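
That run-to-run variance is easy to see directly. A small illustrative experiment (the make_moons problem, repeat count, and layer sizes are assumptions) repeats the same fit and reports the spread of hold-out accuracy:

```python
# Illustrative: repeat the same fit several times and report the spread of
# hold-out accuracy; the differences come only from stochastic training.
from numpy import mean, std
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

X, y = make_moons(n_samples=200, noise=0.3, random_state=1)
trainX, testX, trainy, testy = train_test_split(X, y, test_size=0.5, random_state=1)

scores = []
for _ in range(10):
    model = Sequential([Dense(25, activation='relu'), Dense(1, activation='sigmoid')])
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit(trainX, trainy, epochs=100, verbose=0)
    _, acc = model.evaluate(testX, testy, verbose=0)
    scores.append(acc)
print('accuracy: mean=%.3f std=%.3f' % (mean(scores), std(scores)))
```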

## Introduction to Regularization to Reduce Overfitting of Deep Learning Neural Networks

Training a deep neural network that can generalize well to new data is a challenging problem. A model with too little capacity cannot learn the problem, whereas a model with too much capacity can learn it too well and overfit the training dataset. Both cases result in a model that does not generalize well. A […]
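
One standard way to constrain a model with too much capacity is to penalize large weights. A minimal sketch using Keras's built-in L2 kernel regularizer (the layer sizes and penalty value are arbitrary):

```python
# Illustrative: L2 weight regularization (weight decay) added per layer via
# kernel_regularizer; the penalty discourages large weights and so limits
# how sharply the model can fit the training data.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.regularizers import l2

model = Sequential([
    Dense(128, activation='relu', kernel_regularizer=l2(0.01)),
    Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam')
```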

## How to Improve Deep Learning Model Robustness by Adding Noise

Adding noise to an underconstrained neural network model with a small training dataset can have a regularizing effect and reduce overfitting. Keras supports the addition of Gaussian noise via a separate layer called the GaussianNoise layer. This layer can be used to add noise to an existing model. In this tutorial, you will discover how […]
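
A minimal sketch of the layer in use (the layer sizes and standard deviation are arbitrary and should be tuned):

```python
# Illustrative: GaussianNoise adds zero-mean Gaussian noise to its inputs,
# and it is only active during training, acting as a regularizer.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, GaussianNoise

model = Sequential([
    GaussianNoise(0.1),          # perturb the raw inputs
    Dense(32, activation='relu'),
    GaussianNoise(0.1),          # or perturb hidden activations
    Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam')
```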

## Train Neural Networks With Noise to Reduce Overfitting

Training a neural network with a small dataset can cause the network to memorize all training examples, in turn leading to poor performance on a holdout dataset. Small datasets may also represent a harder mapping problem for neural networks to learn, given the patchy or sparse sampling of points in the high-dimensional input space. One […]
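
One simple way to apply this idea in data space, sketched here with NumPy (the helper add_noisy_copies, the noise scale, and the stand-in data are all illustrative assumptions): expand a small training set with jittered copies of its own examples.

```python
# Illustrative: augment a small training set with noisy copies of itself,
# smoothing over the sparse sampling of the input space.
import numpy as np

def add_noisy_copies(X, y, n_copies=4, stddev=0.05):
    # stack the original data with n_copies jittered versions of it
    Xs = [X] + [X + np.random.normal(0.0, stddev, X.shape) for _ in range(n_copies)]
    return np.vstack(Xs), np.concatenate([y] * (n_copies + 1))

X = np.random.rand(50, 2)          # stand-in for a small training set
y = (X.sum(axis=1) > 1.0).astype(int)
bigX, bigy = add_noisy_copies(X, y)
print(bigX.shape, bigy.shape)      # (250, 2) (250,)
```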

## How to Stop Training Deep Neural Networks At the Right Time Using Early Stopping

A problem with training neural networks is in the choice of the number of training epochs to use. Too many epochs can lead to overfitting of the training dataset, whereas too few may result in an underfit model. Early stopping is a method that allows you to specify an arbitrarily large number of training epochs […]
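
A minimal sketch of the EarlyStopping callback (the make_moons problem, layer sizes, and patience value are assumptions):

```python
# Illustrative: set epochs far too high and let EarlyStopping halt training
# once validation loss stops improving, keeping the best weights seen.
from sklearn.datasets import make_moons
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

X, y = make_moons(n_samples=200, noise=0.3, random_state=1)
model = Sequential([Dense(25, activation='relu'), Dense(1, activation='sigmoid')])
model.compile(loss='binary_crossentropy', optimizer='adam')

es = EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True)
history = model.fit(X, y, validation_split=0.3, epochs=2000, verbose=0, callbacks=[es])
print('stopped after %d epochs' % len(history.history['loss']))
```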

## A Gentle Introduction to Early Stopping to Avoid Overtraining Deep Learning Neural Network Models

A major challenge in training neural networks is how long to train them. Too little training will mean that the model will underfit the training and test sets. Too much training will mean that the model will overfit the training dataset and have poor performance on the test set. A compromise is to train […]
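
One version of that compromise, sketched below, is to train for longer than needed but checkpoint only the weights that performed best on a validation set (the dataset, layer sizes, and filename are illustrative):

```python
# Illustrative: ModelCheckpoint with save_best_only keeps only the weights
# with the lowest validation loss seen so far, so training past the best
# epoch does no harm to the saved model.
from sklearn.datasets import make_moons
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import ModelCheckpoint

X, y = make_moons(n_samples=200, noise=0.3, random_state=1)
model = Sequential([Dense(25, activation='relu'), Dense(1, activation='sigmoid')])
model.compile(loss='binary_crossentropy', optimizer='adam')

mc = ModelCheckpoint('best_model.keras', monitor='val_loss', save_best_only=True)
model.fit(X, y, validation_split=0.3, epochs=500, verbose=0, callbacks=[mc])
best = load_model('best_model.keras')  # the model from the best epoch
```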

## How to Reduce Overfitting With Dropout Regularization in Keras

Dropout regularization is a computationally cheap way to regularize a deep neural network. Dropout works by probabilistically removing, or “dropping out,” inputs to a layer, which may be input variables in the data sample or activations from a previous layer. It has the effect of simulating a large number of networks with very different network […]
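
A minimal sketch of the Dropout layer (the 0.5 rate and layer sizes are arbitrary):

```python
# Illustrative: Dropout randomly zeroes a fraction of the previous layer's
# activations during training only; at inference the layer passes through.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

model = Sequential([
    Dense(64, activation='relu'),
    Dropout(0.5),                 # drop 50% of activations each update
    Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam')
```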