Archive | Better Deep Learning

Figure: Loss and Accuracy Learning Curves on the Train and Test Sets for an MLP on Problem 1

How to Improve Performance With Transfer Learning for Deep Learning Neural Networks

An interesting benefit of deep learning neural networks is that they can be reused on related problems. Transfer learning refers to a technique in which a model developed for a different but somehow similar problem is reused, partly or wholly, to accelerate the training and improve the performance of a model on the problem […]
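As a rough sketch of the idea in Keras (the saved file name 'source_model.h5' and the three-class target problem are assumptions for illustration), the layers of a model fit on the related problem can be reused, and optionally frozen, before adding a new output layer for the problem of interest:

```python
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense

# load a model previously fit on the related (source) problem
# 'source_model.h5' is a hypothetical file name
source_model = load_model('source_model.h5')

# reuse all layers except the output layer, freezing the transferred weights
model = Sequential()
for layer in source_model.layers[:-1]:
    layer.trainable = False  # keep the source weights fixed during training
    model.add(layer)

# add a new output layer for the target problem (assumed here to have 3 classes)
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```

Leaving the reused layers trainable instead (weight initialization rather than freezing) is the other common variant of the same sketch.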


How to Avoid Exploding Gradients in Neural Networks With Gradient Clipping

Training a neural network can become unstable given the choice of error function, learning rate, or even the scale of the target variable. Large updates to weights during training can cause a numerical overflow or underflow, often referred to as “exploding gradients.” The problem of exploding gradients is more common with recurrent neural networks, such […]
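In Keras, gradient clipping is configured on the optimizer. A minimal sketch (the threshold values are arbitrary, and `model` is assumed to be an already-defined Keras model) showing the two forms, clipping by norm and clipping by value:

```python
from tensorflow.keras.optimizers import SGD

# rescale the gradient vector if its L2 norm exceeds 1.0 (illustrative value)
opt_norm = SGD(learning_rate=0.01, clipnorm=1.0)

# or clip each gradient value element-wise to the range [-0.5, 0.5]
opt_value = SGD(learning_rate=0.01, clipvalue=0.5)

model.compile(loss='mean_squared_error', optimizer=opt_norm)
```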

Figure: Line Plot for Supervised Greedy Layer-Wise Pretraining Showing Model Layers vs Train and Test Set Classification Accuracy on the Blobs Classification Problem

How to Develop Deep Learning Neural Networks With Greedy Layer-Wise Pretraining

Training deep neural networks was traditionally challenging, as the vanishing gradient problem meant that weights in layers close to the input layer were not updated in response to errors calculated on the training dataset. An innovation and important milestone in the field of deep learning was greedy layer-wise pretraining, which allowed very deep neural networks to […]
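A minimal sketch of the supervised variant in Keras, assuming a compiled Sequential base model with at least one hidden layer and one-hot encoded training arrays `trainX`, `trainy` (all names are assumptions): the output layer is temporarily removed, a new hidden layer is inserted beneath it, and the deeper model is refit.

```python
from tensorflow.keras.layers import Dense

def add_pretrained_layer(model, trainX, trainy):
    # remove and remember the current output layer
    output_layer = model.layers[-1]
    model.pop()
    # insert a new hidden layer just before the output layer
    model.add(Dense(10, activation='relu', kernel_initializer='he_uniform'))
    model.add(output_layer)
    # refit the deeper model on the same data
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit(trainX, trainy, epochs=100, verbose=0)
```

Calling this function repeatedly grows the network one layer at a time, which is the greedy, layer-wise part of the procedure.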

Figure: Line Plots of KL Divergence Loss and Classification Accuracy over Training Epochs on the Blobs Multi-Class Classification Problem

How to Choose Loss Functions When Training Deep Learning Neural Networks

Deep learning neural networks are trained using the stochastic gradient descent optimization algorithm. As part of the optimization algorithm, the error for the current state of the model must be estimated repeatedly. This requires the choice of an error function, conventionally called a loss function, that can be used to estimate the loss of the […]
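In Keras the loss function is specified when compiling the model. The calls below are separate illustrative examples of the conventional pairings of problem type to loss (only one compile call would be used for a given model, which is assumed to be already defined):

```python
# regression: mean squared error
model.compile(optimizer='sgd', loss='mean_squared_error')

# binary classification: binary cross-entropy
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

# multi-class classification: categorical cross-entropy, or KL divergence
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
model.compile(optimizer='sgd', loss='kullback_leibler_divergence', metrics=['accuracy'])
```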

Figure: Line Plots of Train and Test Accuracy for a Suite of Learning Rates on the Blobs Classification Problem

Understand the Impact of Learning Rate on Model Performance With Deep Learning Neural Networks

Deep learning neural networks are trained using the stochastic gradient descent optimization algorithm. The learning rate is a hyperparameter that controls how much to change the model in response to the estimated error each time the model weights are updated. Choosing the learning rate is challenging as a value too small may result in a […]
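One way to see the effect is to fit otherwise-identical models over a suite of learning rates and compare their learning curves. A sketch of that experiment, assuming a hypothetical `build_model()` helper and pre-split arrays `trainX`, `trainy`, `testX`, `testy`:

```python
from tensorflow.keras.optimizers import SGD

histories = {}
for lr in [1.0, 0.1, 0.01, 0.001, 1e-4]:
    model = build_model()  # hypothetical helper returning a fresh, uncompiled model
    model.compile(loss='categorical_crossentropy',
                  optimizer=SGD(learning_rate=lr),
                  metrics=['accuracy'])
    # record train and test accuracy per epoch for later plotting
    histories[lr] = model.fit(trainX, trainy,
                              validation_data=(testX, testy),
                              epochs=200, verbose=0)
```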


How to Configure the Learning Rate Hyperparameter When Training Deep Learning Neural Networks

The weights of a neural network cannot be calculated using an analytical method. Instead, the weights must be discovered via an empirical optimization procedure called stochastic gradient descent. The optimization problem addressed by stochastic gradient descent for neural networks is challenging and the space of solutions (sets of weights) may be composed of many good […]
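In Keras the learning rate, along with refinements such as momentum and a schedule that lowers the rate when progress stalls, is configured on the optimizer and via callbacks. An illustrative sketch with arbitrary values (the model and data arrays are assumed to be defined):

```python
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.callbacks import ReduceLROnPlateau

# fixed learning rate with momentum (values chosen for illustration)
opt = SGD(learning_rate=0.01, momentum=0.9)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

# drop the learning rate by a factor of 10 when the validation loss plateaus
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5, min_lr=1e-5)
model.fit(trainX, trainy, validation_data=(testX, testy),
          epochs=200, callbacks=[reduce_lr], verbose=0)
```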

Figure: Line Plots of Classification Accuracy on Train and Test Datasets With Different Batch Sizes

How to Control the Speed and Stability of Training Neural Networks With Gradient Descent Batch Size

Neural networks are trained using gradient descent where the estimate of the error used to update the weights is calculated based on a subset of the training dataset. The number of examples from the training dataset used in the estimate of the error gradient is called the batch size and is an important hyperparameter that […]
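In Keras the batch size is an argument to fit(). The three canonical settings, sketched below with assumed training arrays `trainX` and `trainy`:

```python
# batch gradient descent: error estimated over the whole training set per update
model.fit(trainX, trainy, epochs=200, batch_size=len(trainX))

# stochastic gradient descent: weights updated after every single example
model.fit(trainX, trainy, epochs=200, batch_size=1)

# minibatch gradient descent: a compromise, e.g. 32 examples per update
model.fit(trainX, trainy, epochs=200, batch_size=32)
```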

Figure: Line Plot of Classification Accuracy of an MLP With Batch Normalization After the Activation Function on Train and Test Datasets Over Training Epochs

How to Accelerate Learning of Deep Neural Networks With Batch Normalization

Batch normalization is a technique designed to automatically standardize the inputs to a layer in a deep learning neural network. Once implemented, batch normalization has the effect of dramatically accelerating the training process of a neural network, and in some cases improves the performance of the model via a modest regularization effect. In this tutorial, […]
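In Keras this is the BatchNormalization layer, inserted between existing layers. A minimal sketch for a hypothetical two-input binary classification model, placing the layer after a hidden layer's activation:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization

model = Sequential()
model.add(Dense(50, input_dim=2, activation='relu'))
model.add(BatchNormalization())  # standardize the outputs of the previous layer
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
```

Placing the layer before rather than after the activation is the other common configuration, and both are worth comparing in practice.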
