Archive | Better Deep Learning

Line Plot of Train and Test Set Accuracy Over Training Epochs for Deep MLP with ReLU with 15 Hidden Layers

How to Fix the Vanishing Gradients Problem Using the Rectified Linear Unit (ReLU)

The vanishing gradients problem is one example of unstable behavior that you may encounter when training a deep neural network. It describes the situation where a deep multilayer feed-forward network or a recurrent neural network is unable to propagate useful gradient information from the output end of the model back to the layers near the […]
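As a quick sketch of the fix, the deep MLP from the plot above can be built with ReLU hidden layers and He weight initialization, which keeps gradients useful much deeper than sigmoid or tanh activations would. The 15-layer depth mirrors the plot caption; the layer width, input shape, and optimizer here are illustrative assumptions rather than the tutorial's exact configuration.

```python
# Minimal sketch: a deep MLP using ReLU with He initialization to
# mitigate vanishing gradients. Layer width, input shape, and optimizer
# are illustrative assumptions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
# First hidden layer also defines the (assumed) two-feature input.
model.add(Dense(5, input_dim=2, activation='relu', kernel_initializer='he_uniform'))
# Fourteen more ReLU hidden layers, for 15 in total as in the plot.
for _ in range(14):
    model.add(Dense(5, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
```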

Continue Reading
Line Plot of Rectified Linear Activation for Negative and Positive Inputs

A Gentle Introduction to the Rectified Linear Unit (ReLU) for Deep Learning Neural Networks

In a neural network, the activation function is responsible for transforming the summed weighted input from the node into the activation of the node or output for that input. The rectified linear activation function is a piecewise linear function that will output the input directly if it is positive; otherwise, it will output zero. It has […]
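In code, the rectified linear function is a one-liner; the sketch below (a hypothetical relu() helper, not a library function) makes the piecewise linear behavior explicit.

```python
# Minimal sketch of the rectified linear activation function.
def relu(x):
    # Output the input directly if it is positive, otherwise output zero.
    return max(0.0, x)

# The function is linear for positive inputs and zero elsewhere.
for value in [-10.0, -0.5, 0.0, 0.5, 10.0]:
    print(value, relu(value))
```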

Continue Reading
Line Plot of Cosine Annealing Learning Rate Schedule

How to Develop a Snapshot Ensemble Deep Learning Neural Network in Python With Keras

Model ensembles can achieve lower generalization error than single models but are challenging to develop with deep learning neural networks given the computational cost of training each single model. An alternative is to train multiple model snapshots during a single training run and combine their predictions to make an ensemble prediction. A limitation of this […]
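The approach can be sketched as a Keras callback that anneals the learning rate with a cosine schedule and saves a snapshot at the end of each cycle, when the rate is at its lowest. The class name, file naming, and hyperparameters below are illustrative assumptions, assuming the TensorFlow Keras backend.

```python
# Minimal sketch: cosine annealing with a model snapshot saved per cycle.
from math import cos, pi
from tensorflow.keras import backend
from tensorflow.keras.callbacks import Callback

class SnapshotEnsemble(Callback):
    def __init__(self, n_epochs, n_cycles, lr_max):
        super().__init__()
        self.epochs_per_cycle = n_epochs // n_cycles
        self.lr_max = lr_max

    def cosine_annealing(self, epoch):
        # Decay from lr_max toward zero within each cycle, then restart.
        cycle_position = epoch % self.epochs_per_cycle
        return self.lr_max / 2 * (cos(pi * cycle_position / self.epochs_per_cycle) + 1)

    def on_epoch_begin(self, epoch, logs=None):
        backend.set_value(self.model.optimizer.lr, self.cosine_annealing(epoch))

    def on_epoch_end(self, epoch, logs=None):
        # Save one snapshot at the end of every cycle.
        if (epoch + 1) % self.epochs_per_cycle == 0:
            self.model.save('snapshot_%d.h5' % ((epoch + 1) // self.epochs_per_cycle))
```

Passed to fit() via the callbacks argument, e.g. callbacks=[SnapshotEnsemble(300, 6, 0.01)], this yields six saved models from a single training run.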

Continue Reading
Four Scatter Plots of the Circles Dataset Varied by the Amount of Statistical Noise

Impact of Dataset Size on Deep Learning Model Skill And Performance Estimates

Supervised learning is challenging, although the depths of this challenge are often learned then forgotten or willfully ignored. This must be the case, because dwelling too long on this challenge may result in a pessimistic outlook. In spite of the challenge, we continue to wield supervised learning algorithms and they perform well in practice. Fundamental […]

Continue Reading
Visualization of Stacked Generalization Ensemble of Neural Network Models

How to Develop a Stacking Ensemble for Deep Learning Neural Networks in Python With Keras

Model averaging is an ensemble technique where multiple sub-models contribute equally to a combined prediction. Model averaging can be improved by weighting the contribution of each sub-model to the combined prediction by the expected performance of that sub-model. This can be extended further by training an entirely new model to learn how to best combine […]
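The extension can be sketched as a meta-learner fit on the sub-models' stacked predictions; fit_stacked_model() below is a hypothetical helper, and logistic regression as the meta-model is an illustrative assumption.

```python
# Minimal sketch of stacked generalization: sub-model predictions become
# input features for a higher-level meta-model.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_stacked_model(members, X_val, y_val):
    # Stack each fitted sub-model's predictions side by side as meta-features.
    meta_X = np.hstack([model.predict(X_val) for model in members])
    meta_model = LogisticRegression()
    meta_model.fit(meta_X, y_val)
    return meta_model

def stacked_prediction(members, meta_model, X):
    meta_X = np.hstack([model.predict(X) for model in members])
    return meta_model.predict(meta_X)
```

Fitting the meta-model on a hold-out set, rather than on the sub-models' own training data, avoids simply learning to trust overfit predictions.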

Continue Reading
Line Plot of Learning Curves of Model Accuracy on the Train and Test Datasets over Each Training Epoch

How to Develop a Weighted Average Ensemble for Deep Learning Neural Networks

A model averaging ensemble combines the predictions from each model equally and often results in better performance on average than a given single model. Sometimes there are very good models that we would like to contribute more to an ensemble prediction, and perhaps less skillful models that may be useful but should contribute less to an […]
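The weighting itself reduces to a few lines of NumPy; weighted_ensemble_predict() below is a hypothetical helper, and in practice the weights would be found by a search on a hold-out set.

```python
# Minimal sketch of a weighted average ensemble for classification,
# assuming each member returns class probabilities from predict().
import numpy as np

def weighted_ensemble_predict(members, weights, X):
    # Per-member probabilities: shape (n_members, n_samples, n_classes).
    all_probs = np.array([model.predict(X) for model in members])
    # Weighted sum across the member axis.
    summed = np.tensordot(all_probs, np.array(weights), axes=((0,), (0,)))
    # The most probable class under the weighted combination.
    return np.argmax(summed, axis=1)
```

Setting all weights equal recovers plain model averaging, which makes the equal-weight ensemble a useful baseline when tuning.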

Continue Reading
Line Plot Showing Single Model Accuracy (blue dots) vs Accuracy of Ensembles of Varying Size With a Horizontal Voting Ensemble

How to Develop a Horizontal Voting Deep Learning Ensemble to Reduce Variance

Predictive modeling problems where the training dataset is small relative to the number of unlabeled examples are challenging. Neural networks can perform well on these types of problems, although they can suffer from high variance in model performance as measured on training or hold-out validation datasets. This makes choosing which model to use as […]
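The technique can be sketched in a few lines: save the model at the end of each of the final epochs of a single training run, then average the saved models' predictions. The per-epoch file names and epoch window below are illustrative assumptions.

```python
# Minimal sketch of a horizontal voting ensemble over a contiguous block
# of per-epoch model snapshots.
import numpy as np
from tensorflow.keras.models import load_model

def horizontal_voting_predict(epoch_start, epoch_end, X):
    # Load the models saved at the end of each epoch in the window,
    # e.g. via model.save('model_%d.h5' % epoch) during training.
    members = [load_model('model_%d.h5' % e) for e in range(epoch_start, epoch_end)]
    # Average the predicted probabilities and vote for the top class.
    probs = np.mean([m.predict(X) for m in members], axis=0)
    return np.argmax(probs, axis=1)
```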

Continue Reading