Archive | Deep Learning for Computer Vision

Architecture of the Residual Network for Object Photo Classification

Convolutional Neural Network Model Innovations for Image Classification

A Gentle Introduction to the Innovations in LeNet, AlexNet, VGG, Inception, and ResNet Convolutional Neural Networks. Convolutional neural networks are composed of two very simple elements, namely convolutional layers and pooling layers. Although simple, there are near-infinite ways to arrange these layers for a given computer vision problem. Fortunately, there are both common patterns for […]

A Gentle Introduction to Padding and Stride for Convolutional Neural Networks

The convolutional layer in convolutional neural networks systematically applies filters to an input and creates output feature maps. Although the convolutional layer is very simple, it is capable of achieving sophisticated and impressive results. Nevertheless, it can be challenging to develop an intuition for how the shape of the filters impacts the shape of the […]
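
As a rough illustration of how filter size, padding, and stride affect the output shape, the standard output-size arithmetic for one spatial dimension can be sketched as follows. The function name and the example sizes are assumptions made for illustration, not taken from the tutorial.

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Output length along one spatial dimension of a convolutional layer.

    padding is the number of pixels added to EACH side of the input.
    """
    return (input_size + 2 * padding - filter_size) // stride + 1


# An 8-pixel-wide input with a 3-pixel-wide filter (illustrative sizes).
no_padding = conv_output_size(8, 3)               # border pixels are lost
same_padding = conv_output_size(8, 3, padding=1)  # output width matches input
strided = conv_output_size(8, 3, stride=2)        # filter moves 2 pixels per step

print(no_padding, same_padding, strided)
```

With no padding the feature map shrinks, one pixel of padding per side preserves the width for a 3-pixel filter, and a stride of 2 roughly halves it.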

A Gentle Introduction to Convolutional Layers for Deep Learning Neural Networks

How Do Convolutional Layers Work in Deep Learning Neural Networks?

Convolutional layers are the major building blocks used in convolutional neural networks. A convolution is the simple application of a filter to an input that results in an activation. Repeated application of the same filter to an input results in a map of activations called a feature map, indicating the locations and strength of a […]
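
A minimal NumPy sketch of this idea is shown below: one filter is slid across a small image, and each position yields one activation in the feature map. The helper name `apply_filter`, the 6x6 image, and the vertical-line filter are assumptions chosen for illustration. As in most deep learning libraries, the filter is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np


def apply_filter(image, kernel):
    """Slide a 2D filter over a 2D image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # One activation: elementwise product of the patch and the filter, summed.
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map


# Illustrative input: a 6x6 image containing a single vertical line in column 3.
image = np.zeros((6, 6))
image[:, 3] = 1.0

# Illustrative filter that responds to vertical lines.
kernel = np.array([[0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0]])

feature_map = apply_filter(image, kernel)
print(feature_map)
```

The feature map is strongest where the filter's center column lines up with the line in the image, and zero elsewhere.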

How to Use Test-Time Augmentation to Improve Model Performance for Image Classification

How to Use Test-Time Augmentation to Make Better Predictions

Data augmentation is a technique often used to improve performance and reduce generalization error when training neural network models for computer vision problems. The image data augmentation technique can also be applied when making predictions with a fit model in order to allow the model to make predictions for multiple different versions of each image […]
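
The general idea can be sketched with NumPy alone. Here `toy_model` is a hypothetical stand-in for a fit model (a random linear layer followed by a softmax), and the only augmentation used is a horizontal flip; both are assumptions for illustration rather than the approach taken in the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a fit model: a random linear layer plus softmax
# over 3 classes, for 4x4 RGB images in channels-last format.
weights = rng.normal(size=(4 * 4 * 3, 3))


def toy_model(batch):
    """Return class probabilities with shape (n_images, n_classes)."""
    logits = batch.reshape(len(batch), -1) @ weights
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


def tta_predict(model_fn, image):
    """Average predictions over the original image and its horizontal flip."""
    versions = np.stack([image, np.fliplr(image)])
    probabilities = model_fn(versions)
    return probabilities.mean(axis=0)


image = rng.random((4, 4, 3))
averaged = tta_predict(toy_model, image)
predicted_class = int(np.argmax(averaged))
print(averaged, predicted_class)
```

In practice the set of augmentations would be chosen to match those that make sense for the dataset, and more than two versions of each image may be used.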

Red Car, by Dennis Jarvis

How to Load Large Datasets From Directories for Deep Learning in Keras

There are conventions for storing and structuring your image dataset on disk in order to make it fast and efficient to load when training and evaluating deep learning models. Once structured, you can use tools like the ImageDataGenerator class in the Keras deep learning library to automatically load your train, test, and validation datasets. […]

How to Get Started With Deep Learning for Computer Vision (7-Day Mini-Course)

Deep Learning for Computer Vision Crash Course. Bring Deep Learning Methods to Your Computer Vision Project in 7 Days. We are awash in digital images from photos, videos, Instagram, YouTube, and increasingly live video streams. Working with image data is hard as it requires drawing upon knowledge from diverse domains such as digital signal processing, […]

Plot of a Subset of Images From the Fashion-MNIST Dataset

How to Load and Visualize Standard Computer Vision Datasets With Keras

It can be convenient to use a standard computer vision dataset when getting started with deep learning methods for computer vision. Standard datasets are often well understood, small, and easy to load. They can provide the basis for testing techniques and reproducing results in order to build confidence with libraries and methods. In this tutorial, […]

Phillip Island Penguin Parade

A Gentle Introduction to Channels-First and Channels-Last Image Formats

Color images have height, width, and color channel dimensions. When represented as three-dimensional arrays, the channel dimension for the image data is last by default, but may be moved to be the first dimension, often for performance-tuning reasons. The use of these two “channel ordering formats” and preparing data to meet a specific preferred channel […]
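
As a small sketch of the two orderings, the channel axis of a NumPy array can be moved with `np.moveaxis`. The 32x48 image size is an arbitrary assumption for illustration.

```python
import numpy as np

# Illustrative color image in channels-last format: (height, width, channels).
channels_last = np.zeros((32, 48, 3))

# Move the channel axis to the front: (channels, height, width).
channels_first = np.moveaxis(channels_last, -1, 0)

# Move it back to recover the original layout.
restored = np.moveaxis(channels_first, 0, -1)

print(channels_last.shape, channels_first.shape, restored.shape)
```

Only the axis order changes; the pixel values themselves are untouched.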
