Search results for "Recurrent Neural Network"

An Introduction to Recurrent Neural Networks and the Math That Powers Them

Traditional feedforward networks are poorly suited to learning and prediction on sequential or time-series data, because forecasting future values requires a mechanism that retains past information. Recurrent neural networks, or RNNs for short, are a variant of the conventional feedforward artificial neural networks that can deal with sequential […]

Adding a Custom Attention Layer to a Recurrent Neural Network in Keras

Deep learning networks have gained immense popularity in the past few years. The “attention mechanism” is integrated with deep learning networks to improve their performance. Adding an attention component to the network has shown significant improvement in tasks such as machine translation, image recognition, text summarization, and similar applications. This tutorial shows how to add […]

Encoder-Decoder Recurrent Neural Network Models for Neural Machine Translation

The encoder-decoder architecture for recurrent neural networks is the standard neural machine translation method, one that rivals and in some cases outperforms classical statistical machine translation methods. This architecture is very new, having been pioneered only in 2014, yet it has already been adopted as the core technology inside Google’s translate service. In this post, you will discover […]

What is Teacher Forcing for Recurrent Neural Networks?

Teacher forcing is a method for quickly and efficiently training recurrent neural network models by using the ground truth from a prior time step as input. It is a network training method critical to the development of deep learning language models used in machine translation, text summarization, and image captioning, among many other applications. In […]

Gentle Introduction to Global Attention for Encoder-Decoder Recurrent Neural Networks

The encoder-decoder model provides a pattern for using recurrent neural networks to address challenging sequence-to-sequence prediction problems such as machine translation. Attention is an extension to the encoder-decoder model that improves the performance of the approach on longer sequences. Global attention is a simplification of attention that may be easier to implement in declarative deep […]

How Does Attention Work in Encoder-Decoder Recurrent Neural Networks

Attention is a mechanism that was developed to improve the performance of the Encoder-Decoder RNN on machine translation. In this tutorial, you will discover the attention mechanism for the Encoder-Decoder model. After completing this tutorial, you will know: About the Encoder-Decoder model and attention mechanism for machine translation. How to implement the attention mechanism step-by-step. […]

Mini-Course on Long Short-Term Memory Recurrent Neural Networks with Keras

Long Short-Term Memory (LSTM) recurrent neural networks are one of the most interesting types of deep learning at the moment. They have been used to demonstrate world-class results in complex problem domains such as language translation, automatic image captioning, and text generation. LSTMs differ from multilayer perceptrons and convolutional neural networks in that they […]

Attention in Long Short-Term Memory Recurrent Neural Networks

The Encoder-Decoder architecture is popular because it has demonstrated state-of-the-art results across a range of domains. A limitation of the architecture is that it encodes the input sequence to a fixed-length internal representation. This limits the length of input sequences that can be reasonably learned and results in worse performance for very […]
