Deep learning neural networks are challenging to configure and train. There are decades of tips and tricks spread across hundreds of research papers, source code, and in the heads of academics and practitioners. The book “Neural Networks: Tricks of the Trade” originally published in 1998 and updated in 2012 at the cusp of the deep […]

# Search results for "embedding"

## How to Use Greedy Layer-Wise Pretraining in Deep Learning Neural Networks

Training deep neural networks was traditionally challenging as the vanishing gradient meant that weights in layers close to the input layer were not updated in response to errors calculated on the training dataset. An innovation and important milestone in the field of deep learning was greedy layer-wise pretraining that allowed very deep neural networks to […]

## Practical Deep Learning for Coders (Review)

Practical deep learning is a challenging subject in which to get started. It is often taught in a bottom-up manner, requiring that you first get familiar with linear algebra, calculus, and mathematical optimization before eventually learning the neural network techniques. This can take years, and most of the background theory will not help you to […]

## How to Develop a Multichannel CNN Model for Text Classification

A standard deep learning model for text classification and sentiment analysis uses a word embedding layer and one-dimensional convolutional neural network. The model can be expanded by using multiple parallel convolutional neural networks that read the source document using different kernel sizes. This, in effect, creates a multichannel convolutional neural network for text that reads […]

## How to Develop a Neural Machine Translation System from Scratch

Develop a Deep Learning Model to Automatically Translate from German to English in Python with Keras, Step-by-Step. Machine translation is a challenging task that traditionally involves large statistical models developed using highly sophisticated linguistic knowledge. Neural machine translation is the use of deep neural networks for the problem of machine translation. In this tutorial, you […]

## How to Configure an Encoder-Decoder Model for Neural Machine Translation

The encoder-decoder architecture for recurrent neural networks is achieving state-of-the-art results on standard machine translation benchmarks and is being used in the heart of industrial translation services. The model is simple, but given the large amount of data required to train it, tuning the myriad of design decisions in the model in order get top […]

## Encoder-Decoder Recurrent Neural Network Models for Neural Machine Translation

The encoder-decoder architecture for recurrent neural networks is the standard neural machine translation method that rivals and in some cases outperforms classical statistical machine translation methods. This architecture is very new, having only been pioneered in 2014, although, has been adopted as the core technology inside Google’s translate service. In this post, you will discover […]

## A Gentle Introduction to Transfer Learning for Deep Learning

Transfer learning is a machine learning method where a model developed for a task is reused as the starting point for a model on a second task. It is a popular approach in deep learning where pre-trained models are used as the starting point on computer vision and natural language processing tasks given the vast […]

## Encoder-Decoder Models for Text Summarization in Keras

Text summarization is a problem in natural language processing of creating a short, accurate, and fluent summary of a source document. The Encoder-Decoder recurrent neural network architecture developed for machine translation has proven effective when applied to the problem of text summarization. It can be difficult to apply this architecture in the Keras deep learning […]

## Encoder-Decoder Deep Learning Models for Text Summarization

Text summarization is the task of creating short, accurate, and fluent summaries from larger text documents. Recently deep learning methods have proven effective at the abstractive approach to text summarization. In this post, you will discover three different models that build on top of the effective Encoder-Decoder architecture developed for sequence-to-sequence prediction in machine translation. […]