Search results for "language model"

Inferencing the Transformer Model
We have seen how to train the Transformer model on a dataset of English and German sentence pairs, and how to plot the training and validation loss curves to diagnose the model’s learning performance and decide at which epoch to run inference. We are now ready to run inference on the […]
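To make the idea concrete, here is a minimal sketch of greedy decoding with a trained encoder-decoder Transformer. The names transformer, enc_tokenizer, dec_tokenizer, and the <start>/<eos> tokens are placeholders assumed for illustration, not the tutorial's actual objects.

```python
# Minimal greedy-decoding sketch for a trained encoder-decoder Transformer.
# `transformer`, `enc_tokenizer`, and `dec_tokenizer` are assumed placeholders.
import numpy as np

def greedy_translate(sentence, transformer, enc_tokenizer, dec_tokenizer, max_len=50):
    # Encode the source sentence into token ids, shape (1, src_len)
    encoder_input = np.array([enc_tokenizer.texts_to_sequences([sentence])[0]])
    start_id = dec_tokenizer.word_index["<start>"]  # assumed special tokens
    eos_id = dec_tokenizer.word_index["<eos>"]
    output_ids = [start_id]
    for _ in range(max_len):
        decoder_input = np.array([output_ids])
        # Logits over the target vocabulary for every decoder position
        logits = transformer.predict([encoder_input, decoder_input], verbose=0)
        next_id = int(np.argmax(logits[0, -1, :]))  # most probable next token
        if next_id == eos_id:
            break
        output_ids.append(next_id)
    # Map ids back to words, dropping the start token
    index_word = {i: w for w, i in dec_tokenizer.word_index.items()}
    return " ".join(index_word.get(i, "?") for i in output_ids[1:])
```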

Plotting the Training and Validation Loss Curves for the Transformer Model
We have previously seen how to train the Transformer model for neural machine translation. Before moving on to inference with the trained model, let us first explore how to modify the training code slightly so that we can plot the training and validation loss curves generated during the learning process. The training and […]
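As a preview of the idea, here is a minimal sketch using the History object that Keras model.fit returns; model, train_ds, and val_ds are assumed placeholders, and a custom training loop can collect the same two per-epoch lists by hand.

```python
# Plot training vs. validation loss from a Keras History object.
# `model`, `train_ds`, and `val_ds` are assumed to exist already.
import matplotlib.pyplot as plt

history = model.fit(train_ds, validation_data=val_ds, epochs=20)

plt.plot(history.history["loss"], label="training loss")
plt.plot(history.history["val_loss"], label="validation loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.show()
```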

Training the Transformer Model
We have put together the complete Transformer model, and now we are ready to train it for neural machine translation. For this purpose, we shall use a training dataset of short English and German sentence pairs. We will also revisit the role of masking in computing the accuracy and loss metrics during the training […]
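One detail worth previewing is the masked loss: padded positions should not contribute to the training signal. Below is a hedged sketch in TensorFlow, assuming padding uses token id 0; it is not necessarily the tutorial's exact code.

```python
# Padding-masked cross-entropy loss for sequence-to-sequence training.
# Assumes token id 0 marks padding positions.
import tensorflow as tf

def masked_loss(y_true, y_pred):
    # Per-token cross-entropy over the target vocabulary (no reduction yet)
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction="none")
    loss = loss_fn(y_true, y_pred)
    # Zero out the contribution of padding positions
    mask = tf.cast(tf.not_equal(y_true, 0), loss.dtype)
    loss *= mask
    # Average only over the real (non-padding) tokens
    return tf.reduce_sum(loss) / tf.reduce_sum(mask)
```

A masked accuracy metric follows the same pattern: compute per-token correctness, multiply by the mask, and divide by the number of unmasked tokens.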

The Vision Transformer Model
With the Transformer architecture revolutionizing the implementation of attention and achieving very promising results in the natural language processing domain, it was only a matter of time before we would see its application in the computer vision domain too. This was eventually achieved with the Vision Transformer (ViT). In this tutorial, you […]

A Gentle Introduction to Positional Encoding in Transformer Models, Part 1
In language, the order of the words and their position in a sentence really matter. The meaning of the entire sentence can change if the words are re-ordered. When implementing NLP solutions, recurrent neural networks have an inbuilt mechanism that deals with the order of sequences. The transformer model, however, does not use recurrence or […]
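As a preview, the fixed sinusoidal encoding from "Attention Is All You Need" can be computed in a few lines of NumPy; the base n=10000 follows the paper.

```python
# Sinusoidal positional encoding:
#   PE(pos, 2i)   = sin(pos / n^(2i/d_model))
#   PE(pos, 2i+1) = cos(pos / n^(2i/d_model))
import numpy as np

def positional_encoding(seq_len, d_model, n=10000):
    pe = np.zeros((seq_len, d_model))
    positions = np.arange(seq_len)[:, np.newaxis]      # (seq_len, 1)
    div = n ** (np.arange(0, d_model, 2) / d_model)    # one term per sin/cos pair
    pe[:, 0::2] = np.sin(positions / div)              # even dimensions
    pe[:, 1::2] = np.cos(positions / div)              # odd dimensions
    return pe

print(positional_encoding(4, 6).round(3))  # one row per position
```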

Some Language Features in Python
The Python language syntax is quite powerful and expressive, so algorithms can be expressed concisely in Python. Perhaps this is why it is popular in machine learning, where we need to experiment a lot while developing a model. If you’re new to Python but have experience in another programming […]
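As a quick taste of that expressiveness (illustrative snippets, not drawn from the article), comprehensions, zip, and generator expressions let a single line replace an explicit loop:

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

# Dot product with zip and a generator expression
dot = sum(x * y for x, y in zip(xs, ys))

# Squares of the even numbers with a list comprehension
even_squares = [x * x for x in xs if x % 2 == 0]

print(dot)           # 110
print(even_squares)  # [4, 16]
```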

Difference Between Algorithm and Model in Machine Learning
Machine learning involves the use of machine learning algorithms and models. For beginners, this is very confusing because “machine learning algorithm” is often used interchangeably with “machine learning model.” Are they the same thing or something different? As a developer, your intuition with “algorithms” like sort algorithms and search algorithms will help clear up […]
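A one-glance illustration of the distinction, using scikit-learn (my example, not necessarily the article's): the algorithm is the fitting procedure, and the model is the learned artifact it outputs.

```python
# Algorithm vs. model: the algorithm (ordinary least squares) is run on data;
# the model (the fitted coefficients) is what you keep and use to predict.
from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4]]
y = [3, 5, 7, 9]  # generated by y = 2x + 1

algorithm = LinearRegression()  # the algorithm, before seeing any data
model = algorithm.fit(X, y)     # running the algorithm yields a model

print(model.coef_, model.intercept_)  # learned parameters: ~[2.0] and ~1.0
print(model.predict([[5]]))           # using the model: ~[11.0]
```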

Predictive Model for the Phoneme Imbalanced Classification Dataset
Many binary classification tasks do not have an equal number of examples from each class, i.e., the class distribution is skewed or imbalanced. Nevertheless, classification accuracy is equally important in both classes. An example is the classification of vowel sounds from European languages as either nasal or oral in speech recognition, where there are many more […]
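As a hedged sketch of one common remedy, cost-sensitive learning: here a synthetic skewed dataset stands in for the phoneme data, and the model and metric are illustrative choices rather than the article's final recipe.

```python
# Cost-sensitive logistic regression on a synthetic imbalanced dataset,
# evaluated with ROC AUC under repeated stratified cross-validation.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Roughly 90%/10% class split to mimic an imbalanced binary task
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=1)

# class_weight='balanced' penalizes minority-class mistakes more heavily
model = LogisticRegression(class_weight="balanced", max_iter=1000)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
scores = cross_val_score(model, X, y, scoring="roc_auc", cv=cv)
print("Mean ROC AUC: %.3f" % scores.mean())
```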

How to Develop LSTM Models for Time Series Forecasting
Long Short-Term Memory networks, or LSTMs for short, can be applied to time series forecasting. There are many types of LSTM models that can be used for each specific type of time series forecasting problem. In this tutorial, you will discover how to develop a suite of LSTM models for a range of standard time […]
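For orientation, here is a minimal vanilla LSTM for univariate one-step forecasting in Keras; the tiny series and hyperparameters are illustrative only.

```python
# Vanilla LSTM: learn to predict the next value of a univariate series.
import numpy as np
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

def split_sequence(seq, n_steps):
    # Turn [10, 20, 30, ...] into (windows of n_steps inputs, next value)
    X, y = [], []
    for i in range(len(seq) - n_steps):
        X.append(seq[i:i + n_steps])
        y.append(seq[i + n_steps])
    return np.array(X), np.array(y)

series = [10, 20, 30, 40, 50, 60, 70, 80, 90]
n_steps = 3
X, y = split_sequence(series, n_steps)
X = X.reshape((X.shape[0], n_steps, 1))  # (samples, timesteps, features)

model = Sequential([LSTM(50, activation="relu", input_shape=(n_steps, 1)),
                    Dense(1)])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=200, verbose=0)

x_input = np.array([70, 80, 90]).reshape((1, n_steps, 1))
print(model.predict(x_input, verbose=0))  # expect a value near 100
```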