ChatGPT is fun to play with. Chances are, you also want to have your own copy running privately. Realistically, that’s impossible because ChatGPT is not downloadable software, and it needs tremendous computing power to run. But you can build a trimmed-down version that can run on commodity hardware. In this post, you will […]
Search results for "language models"
What Are Zero-Shot Prompting and Few-Shot Prompting
In the literature on language models, you will often encounter the terms “zero-shot prompting” and “few-shot prompting.” It is important to understand how a large language model generates an output. In this post, you will learn what zero-shot and few-shot prompting are and how to experiment with them in GPT4All. Let’s get started. Overview: This post […]
Get a Taste of LLMs from GPT4All
Large language models have become popular recently. ChatGPT is fashionable. Trying out ChatGPT to understand what LLMs are about is easy, but sometimes, you may want an offline alternative that can run on your computer. In this post, you will learn about GPT4All as an LLM that you can install on your computer. In particular, […]
Practical Deep Learning for Coders (Review)
Practical deep learning is a challenging subject in which to get started. It is often taught in a bottom-up manner, requiring that you first get familiar with linear algebra, calculus, and mathematical optimization before eventually learning the neural network techniques. This can take years, and most of the background theory will not help you to […]
A Gentle Introduction to Transfer Learning for Deep Learning
Transfer learning is a machine learning method where a model developed for a task is reused as the starting point for a model on a second task. It is a popular approach in deep learning where pre-trained models are used as the starting point on computer vision and natural language processing tasks given the vast […]
What is Teacher Forcing for Recurrent Neural Networks?
Teacher forcing is a method for quickly and efficiently training recurrent neural network models that use the ground truth from a prior time step as input. It is a network training method critical to the development of deep learning language models used in machine translation, text summarization, and image captioning, among many other applications. In […]
Implementation Patterns for the Encoder-Decoder RNN Architecture with Attention
The encoder-decoder architecture for recurrent neural networks is proving to be powerful on a host of sequence-to-sequence prediction problems in the field of natural language processing. Attention is a mechanism that addresses a limitation of the encoder-decoder architecture on long sequences, and that in general speeds up the learning and lifts the skill of the […]
What Are Word Embeddings for Text?
Word embeddings are a type of word representation that allows words with similar meaning to have a similar representation. They are a distributed representation for text that is perhaps one of the key breakthroughs for the impressive performance of deep learning methods on challenging natural language processing problems. In this post, you will discover the […]
Gentle Introduction to Generative Long Short-Term Memory Networks
The Long Short-Term Memory recurrent neural network was developed for sequence prediction. In addition to sequence prediction problems, LSTMs can also be used as generative models. In this post, you will discover how LSTMs can be used as generative models. After completing this post, you will know: About generative models, with a focus on […]
CNN Long Short-Term Memory Networks
Gentle introduction to CNN LSTM recurrent neural networks with example Python code. Input with spatial structure, like images, cannot be modeled easily with the standard Vanilla LSTM. The CNN Long Short-Term Memory Network or CNN LSTM for short is an LSTM architecture specifically designed for sequence prediction problems with spatial inputs, like images or videos. […]