
Implementing the Transformer Encoder From Scratch in TensorFlow and Keras

Having seen how to implement the scaled dot-product attention and integrate it within the multi-head attention of the Transformer model, we can progress one step further toward a complete Transformer model by implementing its encoder. Our end goal remains applying the complete model to Natural Language Processing (NLP). In this tutorial, you […]
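As a rough sketch of where that tutorial is headed, a single encoder layer chains a self-attention sublayer and a position-wise feed-forward sublayer, each wrapped in a residual connection and layer normalization. The snippet below is illustrative only, leaning on Keras's built-in MultiHeadAttention rather than the from-scratch version the tutorial develops, with hypothetical dimensions:

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout, LayerNormalization, MultiHeadAttention

class EncoderLayer(tf.keras.layers.Layer):
    def __init__(self, d_model, num_heads, d_ff, dropout_rate=0.1):
        super().__init__()
        # Keras's built-in multi-head attention stands in for the from-scratch one
        self.attention = MultiHeadAttention(num_heads=num_heads, key_dim=d_model // num_heads)
        self.ffn = tf.keras.Sequential([
            Dense(d_ff, activation="relu"),  # position-wise feed-forward expansion
            Dense(d_model),                  # projection back to the model dimension
        ])
        self.norm1 = LayerNormalization()
        self.norm2 = LayerNormalization()
        self.dropout1 = Dropout(dropout_rate)
        self.dropout2 = Dropout(dropout_rate)

    def call(self, x, training=False):
        # Self-attention sublayer: residual connection + layer normalization
        attn_out = self.attention(query=x, value=x, key=x)
        x = self.norm1(x + self.dropout1(attn_out, training=training))
        # Feed-forward sublayer: residual connection + layer normalization
        ffn_out = self.ffn(x)
        return self.norm2(x + self.dropout2(ffn_out, training=training))

# Hypothetical shapes: a batch of 2 sequences, 5 tokens each, model dimension 64
layer = EncoderLayer(d_model=64, num_heads=8, d_ff=256)
print(layer(tf.random.uniform((2, 5, 64))).shape)  # (2, 5, 64)
```

A full encoder stacks several such layers on top of an embedding and positional-encoding step, which is what the tutorial walks through in detail.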


The Vision Transformer Model

With the Transformer architecture revolutionizing the implementation of attention and achieving very promising results in the natural language processing domain, it was only a matter of time before we saw its application in the computer vision domain too. This was eventually achieved with the Vision Transformer (ViT). In this tutorial, you […]
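The key move in ViT is to slice an image into fixed-size patches and linearly embed each one, so that a standard Transformer encoder can treat the patches as a token sequence. A minimal sketch, assuming an illustrative 224x224 input and 16x16 patches rather than any particular published configuration:

```python
import tensorflow as tf

images = tf.random.uniform((1, 224, 224, 3))  # a single 224x224 RGB image
patch_size = 16

# Slice the image into non-overlapping 16x16 patches
patches = tf.image.extract_patches(
    images=images,
    sizes=[1, patch_size, patch_size, 1],
    strides=[1, patch_size, patch_size, 1],
    rates=[1, 1, 1, 1],
    padding="VALID",
)
# Flatten the 14x14 patch grid into a sequence of 196 patch vectors
patches = tf.reshape(patches, (1, -1, patch_size * patch_size * 3))
# Linearly project each flattened patch to the model dimension
embeddings = tf.keras.layers.Dense(768)(patches)
print(embeddings.shape)  # (1, 196, 768)
```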


How to Implement Multi-Head Attention From Scratch in TensorFlow and Keras

We have already familiarized ourselves with the theory behind the Transformer model and its attention mechanism, and we have started our journey of implementing a complete model by seeing how to implement the scaled dot-product attention. We shall now progress one step further into our journey by encapsulating the scaled dot-product attention into a […]
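In outline, multi-head attention projects the queries, keys, and values h times with different learned weights, runs scaled dot-product attention on each head in parallel, then concatenates and projects the results. The sketch below is illustrative, with the attention computed inline rather than through a separate class, and with hypothetical layer names:

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Layer

class MultiHeadAttention(Layer):
    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.depth = d_model // num_heads
        self.wq = Dense(d_model)  # learned projections for queries, keys, values
        self.wk = Dense(d_model)
        self.wv = Dense(d_model)
        self.wo = Dense(d_model)  # final output projection

    def split_heads(self, x, batch_size):
        # (batch, seq_len, d_model) -> (batch, num_heads, seq_len, depth)
        x = tf.reshape(x, (batch_size, -1, self.num_heads, self.depth))
        return tf.transpose(x, perm=[0, 2, 1, 3])

    def call(self, queries, keys, values):
        batch_size = tf.shape(queries)[0]
        q = self.split_heads(self.wq(queries), batch_size)
        k = self.split_heads(self.wk(keys), batch_size)
        v = self.split_heads(self.wv(values), batch_size)
        # Scaled dot-product attention applied to every head in parallel
        scores = tf.matmul(q, k, transpose_b=True) / tf.math.sqrt(tf.cast(self.depth, tf.float32))
        heads = tf.matmul(tf.nn.softmax(scores, axis=-1), v)
        # Concatenate the heads and project back to d_model
        heads = tf.transpose(heads, perm=[0, 2, 1, 3])
        concat = tf.reshape(heads, (batch_size, -1, self.num_heads * self.depth))
        return self.wo(concat)

x = tf.random.uniform((2, 5, 64))
print(MultiHeadAttention(d_model=64, num_heads=8)(x, x, x).shape)  # (2, 5, 64)
```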


How to Implement Scaled Dot-Product Attention From Scratch in TensorFlow and Keras

Having familiarized ourselves with the theory behind the Transformer model and its attention mechanism, we'll start our journey of implementing a complete Transformer model by first seeing how to implement the scaled dot-product attention. The scaled dot-product attention is an integral part of the multi-head attention, which, in turn, is an important component of […]
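The operation itself is compact: score each query against every key, scale by the square root of the key dimensionality, softmax the scores into weights, and take a weighted sum of the values. A minimal sketch, assuming an additive mask convention in which masked positions are marked with 1:

```python
import tensorflow as tf

def scaled_dot_product_attention(queries, keys, values, mask=None):
    d_k = tf.cast(tf.shape(keys)[-1], tf.float32)
    # Score each query against every key, scaled by sqrt(d_k)
    scores = tf.matmul(queries, keys, transpose_b=True) / tf.math.sqrt(d_k)
    if mask is not None:
        # Push masked positions toward -inf so softmax assigns them ~0 weight
        scores += -1e9 * mask
    weights = tf.nn.softmax(scores, axis=-1)
    # Weighted sum of the values
    return tf.matmul(weights, values)

q = k = v = tf.random.uniform((2, 5, 64))
print(scaled_dot_product_attention(q, k, v).shape)  # (2, 5, 64)
```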



The Transformer Model

We have already familiarized ourselves with the concept of self-attention as implemented by the Transformer attention mechanism for neural machine translation. We will now shift our focus to the details of the Transformer architecture itself, to discover how self-attention can be implemented without relying on recurrence and convolutions. In this tutorial, […]


The Transformer Attention Mechanism

Before the introduction of the Transformer model, attention for neural machine translation was implemented with RNN-based encoder-decoder architectures. The Transformer model revolutionized the implementation of attention by dispensing with recurrence and convolutions and, instead, relying solely on a self-attention mechanism. We will first be focusing on the Transformer attention mechanism in […]
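At the core of that mechanism is the scaled dot-product attention defined in the original "Attention Is All You Need" paper:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right)V$$

where $Q$, $K$, and $V$ are the query, key, and value matrices and $d_k$ is the dimensionality of the keys; the $\sqrt{d_k}$ scaling keeps the dot products from growing so large that the softmax saturates.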
