Long Short-Term Memory Networks

What is Teacher Forcing for Recurrent Neural Networks?

Teacher forcing is a method for quickly and efficiently training recurrent neural network models that use the output from a prior time step as input. It is a network training method critical to the development of deep learning language models used in machine translation, text summarization, and image captioning, among many other applications. In this […]
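
To give a flavour of the idea, here is a minimal sketch of how decoder inputs can be prepared for teacher forcing: the target sequence is shifted right by one step so that, at each time step, the model is fed the ground-truth previous token rather than its own prediction. The tokens and start/end markers below are illustrative, not taken from the post.

```python
# A minimal sketch of preparing decoder inputs for teacher forcing.
# The target sequence is shifted right by one step so that, at each
# step, the decoder receives the ground-truth previous token instead
# of its own prediction. Tokens and markers are illustrative.

target = ["hello", "world", "<eos>"]

# Decoder input: start-of-sequence marker, then all but the last target token.
decoder_input = ["<sos>"] + target[:-1]   # ['<sos>', 'hello', 'world']
# Decoder output: the token the model must predict at each step.
decoder_output = target                   # ['hello', 'world', '<eos>']

for x, y in zip(decoder_input, decoder_output):
    print(f"input: {x:>7}  ->  expected output: {y}")
```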

How to Prepare Univariate Time Series Data for Long Short-Term Memory Networks

It can be hard to prepare data when you’re just getting started with deep learning. Long Short-Term Memory, or LSTM, recurrent neural networks expect three-dimensional input in the Keras Python deep learning library. If you have a long sequence of thousands of observations in your time series data, you must split your time series into […]
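
As a rough illustration of that reshaping, the sketch below (with an illustrative window length) splits a short series into overlapping samples and reshapes them to the [samples, timesteps, features] form that Keras LSTMs expect.

```python
import numpy as np

# A minimal sketch: split a univariate series into overlapping
# input/output samples, then reshape to the 3D form expected by
# Keras LSTMs: [samples, timesteps, features]. Window length is illustrative.

series = np.arange(1, 11)      # 10 observations: 1, 2, ..., 10
n_steps = 3                    # timesteps per input sample

X, y = [], []
for i in range(len(series) - n_steps):
    X.append(series[i:i + n_steps])   # input window
    y.append(series[i + n_steps])     # value to predict

X = np.array(X).reshape((len(X), n_steps, 1))  # [samples, timesteps, features]
y = np.array(y)

print(X.shape)  # (7, 3, 1)
print(y.shape)  # (7,)
```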

How to Develop an Encoder-Decoder Model for Sequence-to-Sequence Prediction in Keras

The encoder-decoder model provides a pattern for using recurrent neural networks to address challenging sequence-to-sequence prediction problems such as machine translation. Encoder-decoder models can be developed in the Keras Python deep learning library, and an example of a neural machine translation system developed with this model has been described on the Keras blog, with sample […]
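
As a taste of what the post covers, here is a minimal sketch of the training-time wiring of such a model in Keras, in the spirit of the Keras blog example. The layer sizes and feature counts are illustrative placeholders, not values from the post.

```python
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

# Illustrative sizes (placeholders, not values from the post).
n_input_features = 50    # e.g. size of a one-hot source vocabulary
n_output_features = 60   # e.g. size of a one-hot target vocabulary
n_units = 128            # LSTM memory cells

# Encoder: discard the per-step outputs, keep only the final states.
encoder_inputs = Input(shape=(None, n_input_features))
_, state_h, state_c = LSTM(n_units, return_state=True)(encoder_inputs)

# Decoder: seeded with the encoder states, trained with teacher forcing.
decoder_inputs = Input(shape=(None, n_output_features))
decoder_outputs = LSTM(n_units, return_sequences=True)(
    decoder_inputs, initial_state=[state_h, state_c])
decoder_outputs = Dense(n_output_features, activation="softmax")(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()
```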

Gentle Introduction to Global Attention for Encoder-Decoder Recurrent Neural Networks

The encoder-decoder model provides a pattern for using recurrent neural networks to address challenging sequence-to-sequence prediction problems such as machine translation. Attention is an extension to the encoder-decoder model that improves the performance of the approach on longer sequences. Global attention is a simplification of attention that may be easier to implement in declarative deep […]
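
To make the idea concrete, here is a small NumPy sketch of global attention with one common scoring scheme (the dot product): every encoder hidden state is scored against the current decoder state, the scores are normalised with a softmax, and the context vector is the weighted sum of the encoder states. Shapes and values are illustrative.

```python
import numpy as np

np.random.seed(1)
encoder_states = np.random.randn(5, 16)   # [source timesteps, units]
decoder_state = np.random.randn(16)       # current decoder hidden state

# Dot-product score of every encoder state against the decoder state.
scores = encoder_states @ decoder_state              # shape (5,)

# Softmax over the source positions gives the attention weights.
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Context vector: attention-weighted sum of the encoder states.
context = weights @ encoder_states                   # shape (16,)

print(weights.round(3))
print(context.shape)
```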

Understand the Difference Between Return Sequences and Return States for LSTMs in Keras

The Keras deep learning library provides an implementation of the Long Short-Term Memory, or LSTM, recurrent neural network. As part of this implementation, the Keras API provides access to both return sequences and return state. The use of, and difference between, these two outputs can be confusing when designing sophisticated recurrent neural network models, such as the […]
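
As a quick preview, the sketch below builds a tiny LSTM with both flags enabled so you can see the three outputs side by side: the hidden state at every time step, the final hidden state, and the final cell state. The input shape and number of units are illustrative.

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model

# With return_sequences=True and return_state=True, the LSTM layer returns
# the hidden state at every timestep plus the final hidden and cell states.
inputs = Input(shape=(3, 1))   # 3 timesteps, 1 feature (illustrative)
all_h, last_h, last_c = LSTM(2, return_sequences=True, return_state=True)(inputs)
model = Model(inputs, [all_h, last_h, last_c])

x = np.random.rand(1, 3, 1)
seq, h, c = model.predict(x, verbose=0)
print(seq.shape)  # (1, 3, 2): one hidden state per timestep
print(h.shape)    # (1, 2): final hidden state (equals seq[:, -1, :])
print(c.shape)    # (1, 2): final cell state
```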

Implementation Patterns for the Encoder-Decoder RNN Architecture with Attention

The encoder-decoder architecture for recurrent neural networks is proving to be powerful on a host of sequence-to-sequence prediction problems in the field of natural language processing. Attention is a mechanism that addresses a limitation of the encoder-decoder architecture on long sequences, and that in general speeds up learning and lifts the skill of the […]

How to Develop an Encoder-Decoder Model with Attention for Sequence-to-Sequence Prediction in Keras

The encoder-decoder architecture for recurrent neural networks is proving to be powerful on a host of sequence-to-sequence prediction problems in the field of natural language processing such as machine translation and caption generation. Attention is a mechanism that addresses a limitation of the encoder-decoder architecture on long sequences, and that in general speeds up learning and lifts the skill of the […]

Gentle Introduction to Making Predictions with Sequences

Sequence prediction is different from other types of supervised learning problems. The sequence imposes an order on the observations that must be preserved when training models and making predictions. Generally, prediction problems that involve sequence data are referred to as sequence prediction problems, although there is a suite of problems that differ based on the […]
