Archive | Attention

transformer_cover

The Transformer Model

We have already familiarized ourselves with the concept of self-attention as implemented by the Transformer attention mechanism for neural machine translation. We will now be shifting our focus to the details of the Transformer architecture itself to discover how self-attention can be implemented without relying on the use of recurrence and convolutions. In this tutorial, […]

Continue Reading 16
transformer_cover

The Transformer Attention Mechanism

Before the introduction of the Transformer model, the use of attention for neural machine translation was implemented by RNN-based encoder-decoder architectures. The Transformer model revolutionized the implementation of attention by dispensing with recurrence and convolutions and, alternatively, relying solely on a self-attention mechanism.  We will first focus on the Transformer attention mechanism in this tutorial […]

Continue Reading 11
IMG_9527

An Introduction to Recurrent Neural Networks and the Math That Powers Them

When it comes to sequential or time series data, traditional feedforward networks cannot be used for learning and prediction. A mechanism is required to retain past or historical information to forecast future values. Recurrent neural networks, or RNNs for short, are a variant of the conventional feedforward artificial neural networks that can deal with sequential […]

Continue Reading 7
luong_cover

The Luong Attention Mechanism

The Luong attention sought to introduce several improvements over the Bahdanau model for neural machine translation, notably by introducing two new classes of attentional mechanisms: a global approach that attends to all source words and a local approach that only attends to a selected subset of words in predicting the target sentence.  In this tutorial, […]

Continue Reading 7
bahdanau_cover

The Bahdanau Attention Mechanism

Conventional encoder-decoder architectures for machine translation encoded every source sentence into a fixed-length vector, regardless of its length, from which the decoder would then generate a translation. This made it difficult for the neural network to cope with long sentences, essentially resulting in a performance bottleneck.  The Bahdanau attention was proposed to address the performance […]

Continue Reading 2
yahya-ehsan-L895sqROaGw-unsplash

Adding a Custom Attention Layer to a Recurrent Neural Network in Keras

Deep learning networks have gained immense popularity in the past few years. The “attention mechanism” is integrated with deep learning networks to improve their performance. Adding an attention component to the network has shown significant improvement in tasks such as machine translation, image recognition, text summarization, and similar applications. This tutorial shows how to add […]

Continue Reading 43
tour_cover

A Tour of Attention-Based Architectures

As the popularity of attention in machine learning grows, so does the list of neural architectures that incorporate an attention mechanism. In this tutorial, you will discover the salient neural architectures that have been used in conjunction with attention. After completing this tutorial, you will better understand how the attention mechanism is incorporated into different […]

Continue Reading 4