Archive | Long Short-Term Memory Networks

A Gentle Introduction to LSTM Autoencoders

By Jason Brownlee on August 27, 2020 in Long Short-Term Memory Networks

An LSTM Autoencoder is an implementation of an autoencoder for sequence data using an Encoder-Decoder LSTM architecture. Once fit, the encoder part of the model can be used to encode or compress sequence data that in turn may be used in data visualizations or as a feature vector input to a supervised learning model. In […]

A Gentle Introduction to Exploding Gradients in Neural Networks

By Jason Brownlee on August 14, 2019 in Long Short-Term Memory Networks

Exploding gradients are a problem where large error gradients accumulate and result in very large updates to neural network model weights during training. This makes your model unstable and unable to learn from your training data. In this post, you will discover the problem of exploding gradients with deep artificial neural […]
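As a rough illustration of how gradients can accumulate, the sketch below uses a toy scalar linear recurrence, which is an assumption made for this example and not the setup from the post. The function name and the weight values are hypothetical. Backpropagating through each time step multiplies the gradient by the recurrent weight, so a weight larger than 1 makes the gradient grow geometrically.

```python
def backprop_gradient_magnitudes(recurrent_weight, n_steps):
    """Gradient magnitude after each backward step through a toy linear scalar RNN."""
    grad = 1.0  # assumed gradient of the loss w.r.t. the final hidden state
    magnitudes = []
    for _ in range(n_steps):
        # Chain rule: each step back in time contributes one factor of the weight.
        grad *= recurrent_weight
        magnitudes.append(abs(grad))
    return magnitudes


# Hypothetical weights chosen only to contrast the two regimes.
exploding = backprop_gradient_magnitudes(recurrent_weight=1.5, n_steps=20)
stable = backprop_gradient_magnitudes(recurrent_weight=1.0, n_steps=20)

print("weight 1.5, after 20 steps:", exploding[-1])
print("weight 1.0, after 20 steps:", stable[-1])
```

Real networks multiply by Jacobian matrices rather than a single scalar, but the same compounding effect applies.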

What is Teacher Forcing for Recurrent Neural Networks?

By Jason Brownlee on April 8, 2021 in Long Short-Term Memory Networks

Teacher forcing is a method for quickly and efficiently training recurrent neural network models that use the ground truth from a prior time step as input. It is a network training method critical to the development of deep learning language models used in machine translation, text summarization, and image captioning, among many other applications. In […]
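A minimal sketch of the idea, assuming a simple token sequence: during training, the decoder input at each step is the ground-truth token from the prior step rather than the model's own previous prediction. The function name and the `<start>` token are hypothetical and used only for illustration.

```python
def teacher_forcing_pairs(target, start_token="<start>"):
    """Pair each ground-truth output with the ground-truth token that precedes it."""
    # Decoder inputs are the target sequence shifted right by one position.
    decoder_inputs = [start_token] + target[:-1]
    return list(zip(decoder_inputs, target))


pairs = teacher_forcing_pairs(["the", "cat", "sat"])
for decoder_input, expected_output in pairs:
    print(decoder_input, "->", expected_output)
```

At inference time no ground truth is available, so the model's own previous output would be fed back in instead.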

How to Develop an Encoder-Decoder Model for Sequence-to-Sequence Prediction in Keras

By Jason Brownlee on August 27, 2020 in Long Short-Term Memory Networks

The encoder-decoder model provides a pattern for using recurrent neural networks to address challenging sequence-to-sequence prediction problems such as machine translation. Encoder-decoder models can be developed in the Keras Python deep learning library and an example of a neural machine translation system developed with this model has been described on the Keras blog, with sample […]

Gentle Introduction to Global Attention for Encoder-Decoder Recurrent Neural Networks

By Jason Brownlee on August 14, 2019 in Long Short-Term Memory Networks

The encoder-decoder model provides a pattern for using recurrent neural networks to address challenging sequence-to-sequence prediction problems such as machine translation. Attention is an extension to the encoder-decoder model that improves the performance of the approach on longer sequences. Global attention is a simplification of attention that may be easier to implement in declarative deep […]

Difference Between Return Sequences and Return States for LSTMs in Keras

By Jason Brownlee on August 14, 2019 in Long Short-Term Memory Networks

The Keras deep learning library provides an implementation of the Long Short-Term Memory, or LSTM, recurrent neural network. As part of this implementation, the Keras API provides access to both return sequences and return state. The use of these outputs, and the difference between them, can be confusing when designing sophisticated recurrent neural network models, such as the […]

Implementation Patterns for the Encoder-Decoder RNN Architecture with Attention

By Jason Brownlee on August 14, 2019 in Long Short-Term Memory Networks

The encoder-decoder architecture for recurrent neural networks is proving to be powerful on a host of sequence-to-sequence prediction problems in the field of natural language processing. Attention is a mechanism that addresses a limitation of the encoder-decoder architecture on long sequences, and that in general speeds up the learning and lifts the skill of the […]

How to Develop an Encoder-Decoder Model with Attention in Keras

By Jason Brownlee on August 27, 2020 in Long Short-Term Memory Networks

The encoder-decoder architecture for recurrent neural networks is proving to be powerful on a host of sequence-to-sequence prediction problems in the field of natural language processing such as machine translation and caption generation. Attention is a mechanism that addresses a limitation of the encoder-decoder architecture on long sequences, and that in general speeds up the […]

A Gentle Introduction to RNN Unrolling

By Jason Brownlee on August 14, 2019 in Long Short-Term Memory Networks

Recurrent neural networks are a type of neural network where the outputs from previous time steps are fed as input to the current time step. This creates a network graph or circuit diagram with cycles, which can make it difficult to understand how information moves through the network. In this post, you will discover the […]
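To make the cycle concrete, here is a small sketch that runs the same toy recurrent step two ways: once as a loop that feeds the previous output back in, and once written out step by step for three time steps. The scalar `tanh` cell, the weights, and all function names are assumptions for illustration, not the notation used in the post.

```python
import math


def rnn_step(x, h_prev, w_x=0.5, w_h=0.8):
    """One step of a toy scalar RNN cell with hypothetical fixed weights."""
    return math.tanh(w_x * x + w_h * h_prev)


def run_recurrent(inputs, h0=0.0):
    """Cyclic view: a single cell applied repeatedly, feeding its output back in."""
    h = h0
    outputs = []
    for x in inputs:
        h = rnn_step(x, h)
        outputs.append(h)
    return outputs


def run_unrolled_3(inputs, h0=0.0):
    """Unrolled view: one explicit copy of the cell per time step, with no cycle."""
    x1, x2, x3 = inputs
    h1 = rnn_step(x1, h0)
    h2 = rnn_step(x2, h1)
    h3 = rnn_step(x3, h2)
    return [h1, h2, h3]


inputs = [1.0, 2.0, 3.0]
looped = run_recurrent(inputs)
unrolled = run_unrolled_3(inputs)
print(looped == unrolled)
```

Both versions compute the same values; the unrolled form simply writes the graph without a cycle, which is the view typically used when reasoning about the forward and backward passes.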

Making Predictions with Sequences

By Jason Brownlee on August 14, 2019 in Long Short-Term Memory Networks

Sequence prediction is different from other types of supervised learning problems. The sequence imposes an order on the observations that must be preserved when training models and making predictions. Generally, prediction problems that involve sequence data are referred to as sequence prediction problems, although there is a suite of problems that differ based on the […]
