Long Short-Term Memory (LSTM) is a type of recurrent neural network that can learn the order dependence between items in a sequence. LSTMs have the promise of being able to learn the context required to make predictions in time series forecasting problems, rather than having this context pre-specified and fixed. Given the promise, there is […]
Search results for "Long Short Term Memory Networks"
A Gentle Introduction to Long Short-Term Memory Networks by the Experts
Long Short-Term Memory (LSTM) networks are a type of recurrent neural network capable of learning order dependence in sequence prediction problems. This is a behavior required in complex problem domains like machine translation, speech recognition, and more. LSTMs are a complex area of deep learning. It can be hard to get your hands around what […]
Understanding Simple Recurrent Neural Networks in Keras
This tutorial is designed for anyone looking for an understanding of how recurrent neural networks (RNN) work and how to use them via the Keras deep learning library. While the Keras library provides all the methods required for solving problems and building applications, it is also important to gain an insight into how everything works. […]
An Introduction to Recurrent Neural Networks and the Math That Powers Them
When it comes to sequential or time series data, traditional feedforward networks cannot be used for learning and prediction. A mechanism is required to retain past or historical information to forecast future values. Recurrent neural networks, or RNNs for short, are a variant of the conventional feedforward artificial neural networks that can deal with sequential […]
When to Use MLP, CNN, and RNN Neural Networks
What neural network is appropriate for your predictive modeling problem? It can be difficult for a beginner to the field of deep learning to know what type of network to use. There are so many types of networks to choose from and new methods being published and discussed every day. To make things worse, most […]
How to Choose Loss Functions When Training Deep Learning Neural Networks
Deep learning neural networks are trained using the stochastic gradient descent optimization algorithm. As part of the optimization algorithm, the error for the current state of the model must be estimated repeatedly. This requires the choice of an error function, conventionally called a loss function, that can be used to estimate the loss of the […]
How to Control the Stability of Training Neural Networks With the Batch Size
Neural networks are trained using gradient descent where the estimate of the error used to update the weights is calculated based on a subset of the training dataset. The number of examples from the training dataset used in the estimate of the error gradient is called the batch size and is an important hyperparameter that […]
A Gentle Introduction to Dropout for Regularizing Deep Neural Networks
Deep learning neural networks are likely to quickly overfit a training dataset with few examples. Ensembles of neural networks with different model configurations are known to reduce overfitting, but require the additional computational expense of training and maintaining multiple models. A single model can be used to simulate having a large number of different network […]
Convolutional Neural Networks for Multi-Step Time Series Forecasting
Given the rise of smart electricity meters and the wide adoption of electricity generation technology like solar panels, there is a wealth of electricity usage data available. This data represents a multivariate time series of power-related variables that in turn could be used to model and even forecast future electricity consumption. Unlike other machine learning […]
A Gentle Introduction to Exploding Gradients in Neural Networks
Exploding gradients are a problem where large error gradients accumulate and result in very large updates to neural network model weights during training. This has the effect of your model being unstable and unable to learn from your training data. In this post, you will discover the problem of exploding gradients with deep artificial neural […]