Search results for "language models"

Building Your mini-ChatGPT at Home

By Adrian Tam on July 24, 2023 in ChatGPT 7

ChatGPT is fun to play with. Chances are, you also want to have your own copy running privately. Realistically, that’s impossible because ChatGPT is not downloadable software, and it needs tremendous computing power to run. But you can build a trimmed-down version that can run on commodity hardware. In this post, you will […]

What Are Zero-Shot Prompting and Few-Shot Prompting

By Adrian Tam on July 20, 2023 in ChatGPT 3

In the literature on language models, you will often encounter the terms “zero-shot prompting” and “few-shot prompting.” It is important to understand how a large language model generates an output. In this post, you will learn: what zero-shot and few-shot prompting are, and how to experiment with them in GPT4All. Let’s get started. Overview This post […]

Get a Taste of LLMs from GPT4All

By Adrian Tam on October 11, 2023 in ChatGPT 31

Large language models have become popular recently. ChatGPT is fashionable. Trying out ChatGPT to understand what LLMs are about is easy, but sometimes, you may want an offline alternative that can run on your computer. In this post, you will learn about GPT4All as an LLM that you can install on your computer. In particular, […]

Practical Deep Learning for Coders (Review)

By Jason Brownlee on November 1, 2019 in Deep Learning 26

Practical deep learning is a challenging subject in which to get started. It is often taught in a bottom-up manner, requiring that you first get familiar with linear algebra, calculus, and mathematical optimization before eventually learning the neural network techniques. This can take years, and most of the background theory will not help you to […]

A Gentle Introduction to Transfer Learning for Deep Learning

By Jason Brownlee on September 16, 2019 in Deep Learning for Computer Vision 164

Transfer learning is a machine learning method where a model developed for a task is reused as the starting point for a model on a second task. It is a popular approach in deep learning where pre-trained models are used as the starting point on computer vision and natural language processing tasks given the vast […]

What is Teacher Forcing for Recurrent Neural Networks?

By Jason Brownlee on April 8, 2021 in Long Short-Term Memory Networks 51

Teacher forcing is a method for quickly and efficiently training recurrent neural network models that use the ground truth from a prior time step as input. It is a network training method critical to the development of deep learning language models used in machine translation, text summarization, and image captioning, among many other applications. In […]

Implementation Patterns for the Encoder-Decoder RNN Architecture with Attention

By Jason Brownlee on August 14, 2019 in Long Short-Term Memory Networks 6

The encoder-decoder architecture for recurrent neural networks is proving to be powerful on a host of sequence-to-sequence prediction problems in the field of natural language processing. Attention is a mechanism that addresses a limitation of the encoder-decoder architecture on long sequences, and that in general speeds up the learning and lifts the skill of the […]

What Are Word Embeddings for Text?

By Jason Brownlee on August 7, 2019 in Deep Learning for Natural Language Processing 91

Word embeddings are a type of word representation that allows words with similar meaning to have a similar representation. They are a distributed representation for text that is perhaps one of the key breakthroughs for the impressive performance of deep learning methods on challenging natural language processing problems. In this post, you will discover the […]

Gentle Introduction to Generative Long Short-Term Memory Networks

By Jason Brownlee on August 14, 2019 in Long Short-Term Memory Networks 0

The Long Short-Term Memory recurrent neural network was developed for sequence prediction. In addition to sequence prediction problems, LSTMs can also be used as generative models. In this post, you will discover how LSTMs can be used as generative models. After completing this post, you will know: About generative models, with a focus on […]

CNN Long Short-Term Memory Networks

By Jason Brownlee on August 14, 2019 in Long Short-Term Memory Networks 260

Gentle introduction to CNN LSTM recurrent neural networks with example Python code. Input with spatial structure, like images, cannot be modeled easily with the standard Vanilla LSTM. The CNN Long Short-Term Memory Network or CNN LSTM for short is an LSTM architecture specifically designed for sequence prediction problems with spatial inputs, like images or videos. […]
