How to Load and Visualize Standard Computer Vision Datasets With Keras

By Jason Brownlee on July 5, 2019 in Deep Learning for Computer Vision

It can be convenient to use a standard computer vision dataset when getting started with deep learning methods for computer vision.

Standard datasets are often well understood, small, and easy to load. They can provide the basis for testing techniques and reproducing results in order to build confidence with libraries and methods.

In this tutorial, you will discover the standard computer vision datasets provided with the Keras deep learning library.

After completing this tutorial, you will know:

The API and idioms for downloading standard computer vision datasets using Keras.
The structure, nature, and top results for the MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 computer vision datasets.
How to load and visualize standard computer vision datasets using the Keras API.


Let’s get started.


Tutorial Overview

This tutorial is divided into five parts; they are:

Keras Computer Vision Datasets
MNIST Dataset
Fashion-MNIST Dataset
CIFAR-10 Dataset
CIFAR-100 Dataset

Keras Computer Vision Datasets

The Keras deep learning library provides access to four standard computer vision datasets.

This is particularly helpful as it allows you to rapidly start testing model architectures and configurations for computer vision.

Four specific multi-class image classification datasets are provided; they are:

MNIST: Classify photos of handwritten digits (10 classes).
Fashion-MNIST: Classify photos of items of clothing (10 classes).
CIFAR-10: Classify small photos of objects (10 classes).
CIFAR-100: Classify small photos of common objects (100 classes).

The datasets are available under the keras.datasets module via dataset-specific load functions.

After a call to the load function, the dataset is downloaded to your workstation and stored in the ~/.keras directory under a “datasets” subdirectory. The datasets are stored in a compressed format, but may also include additional metadata.

The dataset is only downloaded on the first call to a dataset-specific load function. Subsequent calls load the dataset directly from disk.
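
To see what has already been cached, the directory can simply be listed. The sketch below is a minimal illustration that assumes the default cache location described above (~/.keras/datasets); the helper name list_cached_datasets is hypothetical and is not part of Keras.

```python
# list files that Keras has cached locally (sketch; helper name is hypothetical)
from pathlib import Path

def list_cached_datasets(cache_dir=None):
    # assumption: the default cache location is ~/.keras/datasets
    if cache_dir is None:
        cache_dir = Path.home() / '.keras' / 'datasets'
    cache_dir = Path(cache_dir)
    # nothing has been downloaded yet if the directory does not exist
    if not cache_dir.is_dir():
        return []
    # return the sorted names of the cached files and folders
    return sorted(p.name for p in cache_dir.iterdir())

# summarize the cache contents
print(list_cached_datasets())
```

Deleting a file from this directory forces a fresh download on the next call to the matching load function, which may help if a download was interrupted or corrupted.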

The load functions return two tuples, the first containing the input and output elements for samples in the training dataset, and the second containing the input and output elements for samples in the test dataset. The splits between train and test datasets often follow a standard split, used when benchmarking algorithms on the dataset.

The standard idiom for loading the datasets is as follows:

...
# load dataset
(trainX, trainy), (testX, testy) = load_data()

Each of the train and test X and y elements are NumPy arrays of pixel or class values respectively.
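
The nested return structure is easy to get wrong when unpacking. The sketch below uses a hypothetical stand-in function, fake_load_data(), that returns small zero-filled arrays in the same two-tuple layout so the idiom can be tried without a download; it is not a Keras function and the array sizes are made up.

```python
# illustrate the two-tuple return structure with a stand-in loader (not a Keras function)
import numpy as np

def fake_load_data(n_train=6, n_test=2):
    # assumption: 28x28 grayscale images and integer labels, as for MNIST
    trainX = np.zeros((n_train, 28, 28), dtype='uint8')
    trainy = np.zeros((n_train,), dtype='uint8')
    testX = np.zeros((n_test, 28, 28), dtype='uint8')
    testy = np.zeros((n_test,), dtype='uint8')
    return (trainX, trainy), (testX, testy)

# unpack both tuples in a single statement
(trainX, trainy), (testX, testy) = fake_load_data()
print(trainX.shape, trainy.shape, trainX.dtype)
print(testX.shape, testy.shape)

# unpacking into four flat names fails because only two tuples are returned
try:
    a, b, c, d = fake_load_data()
except ValueError as e:
    print('ValueError:', e)
```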

Two of the datasets contain grayscale images and two contain color images. The shape of the grayscale images must be converted from two-dimensional to three-dimensional arrays to match the preferred channel ordering of Keras. For example:

# reshape grayscale images to have a single channel
width, height, channels = trainX.shape[1], trainX.shape[2], 1
trainX = trainX.reshape((trainX.shape[0], width, height, channels))
testX = testX.reshape((testX.shape[0], width, height, channels))
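
The same reshape can also be written with the NumPy expand_dims() function, which appends the channel axis without restating the image dimensions. The sketch below is an illustration on a zero-filled stand-in array (an assumption made so that it runs without a download), not on the real dataset.

```python
# add a trailing channel axis to grayscale images (sketch on stand-in data)
import numpy as np

# assumption: stand-in for trainX with 5 grayscale images of 28x28 pixels
trainX = np.zeros((5, 28, 28), dtype='uint8')
print(trainX.shape)
# append a channels dimension of size 1 as the last axis
trainX = np.expand_dims(trainX, axis=-1)
print(trainX.shape)
```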

Both grayscale and color image pixel data are stored as unsigned integer values with values between 0 and 255.

Before modeling, the image data will need to be rescaled, such as by normalizing pixel values to the range 0-1, and perhaps further standardized. For example:

# normalize pixel values
trainX = trainX.astype('float32') / 255
testX = testX.astype('float32') / 255
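
Standardization goes one step further by centering the pixel values and scaling them to unit variance. A minimal sketch is shown below, assuming a single global mean and standard deviation calculated from the training images only; the random stand-in arrays are an assumption so that the snippet runs without a download, and per-channel statistics are another common choice.

```python
# normalize then standardize pixel values (sketch on random stand-in data)
import numpy as np

rng = np.random.default_rng(1)
# assumption: stand-ins for trainX and testX with values in [0, 255]
trainX = rng.integers(0, 256, size=(100, 28, 28, 1), dtype=np.uint8)
testX = rng.integers(0, 256, size=(20, 28, 28, 1), dtype=np.uint8)
# normalize to the range 0-1
trainX = trainX.astype('float32') / 255
testX = testX.astype('float32') / 255
# calculate statistics on the training set only
mean, std = trainX.mean(), trainX.std()
# apply the same statistics to both sets
trainX = (trainX - mean) / std
testX = (testX - mean) / std
print('Train mean=%.3f, std=%.3f' % (trainX.mean(), trainX.std()))
```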

The output elements of each sample (y) are stored as class integer values. Each problem is a multi-class classification problem (more than two classes); as such, it is common practice to one hot encode the class values prior to modeling. This can be achieved using the to_categorical() function provided by Keras; for example:

...
# one hot encode target values
trainy = to_categorical(trainy)
testy = to_categorical(testy)
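
To make the encoding concrete, the sketch below reproduces the idea with plain NumPy rather than Keras: each integer label selects a row of an identity matrix, and argmax() reverses the mapping. The label values are made up for illustration and the helper name one_hot is hypothetical.

```python
# one hot encode and decode integer labels with NumPy (sketch, not the Keras function)
import numpy as np

def one_hot(y, num_classes):
    # flatten so that labels shaped (n, 1) and (n,) are treated alike
    y = np.asarray(y).ravel()
    # row i of the identity matrix is the one hot vector for class i
    return np.eye(num_classes, dtype='float32')[y]

# assumption: made-up class values for three samples
y = np.array([3, 0, 9])
encoded = one_hot(y, num_classes=10)
print(encoded.shape)
print(encoded[0])
# recover the integer class values
decoded = np.argmax(encoded, axis=1)
print(decoded)
```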

Now that we are familiar with the idioms for working with the standard computer vision datasets provided by Keras, let’s take a closer look at each dataset in turn.

Note, the examples in this tutorial assume that you have internet access, as each dataset may be downloaded the first time its example is run on your system. The download speed will depend on the speed of your internet connection, and it is recommended that you run the examples from the command line.

MNIST Dataset

MNIST is an acronym that stands for the Modified National Institute of Standards and Technology dataset.

It is a dataset of 70,000 small square 28×28 pixel grayscale images of handwritten single digits between 0 and 9, split into 60,000 training images and 10,000 test images.

The task is to classify a given image of a handwritten digit into one of 10 classes representing integer values from 0 to 9, inclusively.

It is a widely used and deeply understood dataset, and for the most part, is “solved.” Top-performing models are deep learning convolutional neural networks that achieve a classification accuracy of above 99%, with an error rate between 0.4% and 0.2% on the holdout test dataset.

For a step-by-step tutorial on developing a model for MNIST, see:

How to Develop a Deep CNN for MNIST Digit Classification

The example below loads the MNIST dataset using the Keras API and creates a plot of the first 9 images in the training dataset.

# example of loading the mnist dataset
from keras.datasets import mnist
from matplotlib import pyplot
# load dataset
(trainX, trainy), (testX, testy) = mnist.load_data()
# summarize loaded dataset
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))
# plot first few images
for i in range(9):
	# define subplot
	pyplot.subplot(330 + 1 + i)
	# plot raw pixel data
	pyplot.imshow(trainX[i], cmap=pyplot.get_cmap('gray'))
# show the figure
pyplot.show()

Running the example loads the MNIST train and test dataset and prints their shape.

We can see that there are 60,000 examples in the training dataset and 10,000 in the test dataset and that images are indeed square with 28×28 pixels.

Train: X=(60000, 28, 28), y=(60000,)
Test: X=(10000, 28, 28), y=(10000,)

A plot of the first nine images in the dataset is also created, showing the handwritten nature of the images to be classified.

Plot of a Subset of Images From the MNIST Dataset
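
The integer passed to subplot() in the plotting loop packs three values into one number: the digits of 331 are read as 3 rows, 3 columns, and position 1, so 330 + 1 + i selects positions 1 through 9 of a 3×3 grid. The sketch below only decodes that integer to make the arithmetic visible; the helper name decode_subplot_code is hypothetical and is not part of Matplotlib.

```python
# decode the three-digit integer form of a subplot position (illustrative helper)
def decode_subplot_code(code):
    # split a three-digit subplot integer into (rows, columns, index)
    rows = code // 100
    cols = (code // 10) % 10
    index = code % 10
    return rows, cols, index

# show the grid position used for each of the nine images
for i in range(9):
    print(i, decode_subplot_code(330 + 1 + i))
```

The equivalent explicit call is pyplot.subplot(3, 3, i + 1), which also works for grids with more than nine cells.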

Fashion-MNIST Dataset

Fashion-MNIST was proposed as a more challenging replacement for the MNIST dataset.

It is a dataset comprised of 70,000 small square 28×28 pixel grayscale images of items of 10 types of clothing, such as shoes, t-shirts, dresses, and more, split into 60,000 training images and 10,000 test images.

It is a more challenging classification problem than MNIST and top results are achieved by deep learning convolutional networks with a classification accuracy of about 95% to 96% on the holdout test dataset.

For a step-by-step tutorial on developing a model for Fashion-MNIST, see:

How to Develop a Deep CNN for Fashion MNIST Clothing Classification

The example below loads the Fashion-MNIST dataset using the Keras API and creates a plot of the first nine images in the training dataset.

# example of loading the fashion mnist dataset
from matplotlib import pyplot
from keras.datasets import fashion_mnist
# load dataset
(trainX, trainy), (testX, testy) = fashion_mnist.load_data()
# summarize loaded dataset
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))
# plot first few images
for i in range(9):
	# define subplot
	pyplot.subplot(330 + 1 + i)
	# plot raw pixel data
	pyplot.imshow(trainX[i], cmap=pyplot.get_cmap('gray'))
# show the figure
pyplot.show()

Running the example loads the Fashion-MNIST train and test dataset and prints their shape.

We can see that there are 60,000 examples in the training dataset and 10,000 in the test dataset and that images are indeed square with 28×28 pixels.

Train: X=(60000, 28, 28), y=(60000,)
Test: X=(10000, 28, 28), y=(10000,)

A plot of the first nine images in the dataset is also created, showing that indeed the images are grayscale photographs of items of clothing.

Plot of a Subset of Images From the Fashion-MNIST Dataset
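
The integer labels returned by load_data() do not come with names. The sketch below maps them to the ten class names listed in the Fashion-MNIST documentation; the list is typed in by hand (it is not returned by Keras) and the label values are a made-up stand-in for trainy.

```python
# map fashion mnist class integers to names (sketch on stand-in labels)
import numpy as np

# class names in label order (0-9), as listed in the Fashion-MNIST documentation
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# assumption: stand-in for the first few values of trainy
trainy = np.array([9, 0, 0, 3], dtype='uint8')
# report the name for each label
for label in trainy:
    print(label, class_names[label])
```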

CIFAR-10 Dataset

CIFAR is an acronym that stands for the Canadian Institute For Advanced Research and the CIFAR-10 dataset was developed along with the CIFAR-100 dataset (covered in the next section) by researchers at the CIFAR institute.

The dataset is comprised of 60,000 32×32 pixel color photographs of objects from 10 classes, such as frogs, birds, cats, ships, etc.

These are very small images, much smaller than a typical photograph, and the dataset is intended for computer vision research.

CIFAR-10 has been widely used for benchmarking computer vision algorithms in the field of machine learning. The problem is “solved.” Top performance on the problem is achieved by deep learning convolutional neural networks with a classification accuracy above 96% on the test dataset.

The example below loads the CIFAR-10 dataset using the Keras API and creates a plot of the first nine images in the training dataset.

# example of loading the cifar10 dataset
from matplotlib import pyplot
from keras.datasets import cifar10
# load dataset
(trainX, trainy), (testX, testy) = cifar10.load_data()
# summarize loaded dataset
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))
# plot first few images
for i in range(9):
	# define subplot
	pyplot.subplot(330 + 1 + i)
	# plot raw pixel data
	pyplot.imshow(trainX[i])
# show the figure
pyplot.show()

Running the example loads the CIFAR-10 train and test dataset and prints their shape.

We can see that there are 50,000 examples in the training dataset and 10,000 in the test dataset and that images are indeed square with 32×32 pixels and color, with three channels.

Train: X=(50000, 32, 32, 3), y=(50000, 1)
Test: X=(10000, 32, 32, 3), y=(10000, 1)

A plot of the first nine images in the dataset is also created. It is clear that the images are indeed very small compared to modern photographs; it can be challenging to see what exactly is represented in some of the images given the extremely low resolution.

This low resolution is likely the cause of the limited performance that top-of-the-line algorithms are able to achieve on the dataset.

Plot of a Subset of Images From the CIFAR-10 Dataset
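
Note that the CIFAR label arrays have the shape (n, 1) rather than (n,) as in MNIST. The sketch below flattens such an array and maps the integers to the ten class names listed in the CIFAR-10 documentation; the names are typed in by hand (they are not returned by Keras) and the label values are a made-up stand-in for trainy.

```python
# flatten cifar10 labels and map them to class names (sketch on stand-in labels)
import numpy as np

# class names in label order (0-9), as listed in the CIFAR-10 documentation
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
    'dog', 'frog', 'horse', 'ship', 'truck']
# assumption: stand-in for the first few values of trainy, shaped (n, 1)
trainy = np.array([[6], [9], [9], [4]], dtype='uint8')
print(trainy.shape)
# flatten to a one-dimensional array of class integers
trainy_flat = trainy.ravel()
print(trainy_flat.shape)
# report the name for each label
for label in trainy_flat:
    print(label, class_names[label])
```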

CIFAR-100 Dataset

The CIFAR-100 dataset was prepared along with the CIFAR-10 dataset by academics at the Canadian Institute For Advanced Research (CIFAR).

The dataset is comprised of 60,000 32×32 pixel color photographs of objects from 100 classes, such as fish, flowers, insects, and much more.

Like CIFAR-10, the images are intentionally small and unrealistic photographs and the dataset is intended for computer vision research.

The example below loads the CIFAR-100 dataset using the Keras API and creates a plot of the first nine images in the training dataset.

# example of loading the cifar100 dataset
from matplotlib import pyplot
from keras.datasets import cifar100
# load dataset
(trainX, trainy), (testX, testy) = cifar100.load_data()
# summarize loaded dataset
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))
# plot first few images
for i in range(9):
	# define subplot
	pyplot.subplot(330 + 1 + i)
	# plot raw pixel data
	pyplot.imshow(trainX[i])
# show the figure
pyplot.show()

Running the example loads the CIFAR-100 train and test dataset and prints their shape.

We can see that there are 50,000 examples in the training dataset and 10,000 in the test dataset and that images are indeed square with 32×32 pixels and color, with three channels.

Train: X=(50000, 32, 32, 3), y=(50000, 1)
Test: X=(10000, 32, 32, 3), y=(10000, 1)

A plot of the first nine images in the dataset is also created, and like CIFAR-10, the low resolution of the images can make it challenging to clearly see what is present in some photos.

Plot of a Subset of Images From the CIFAR-100 Dataset

Although the images are labeled with 100 classes, these classes are themselves grouped into 20 super-classes, i.e. groups of related classes.

Keras will return labels for the 100 classes by default, although the super-class labels can be retrieved by setting the “label_mode” argument to “coarse” (instead of the default “fine”) when calling the load_data() function. For example:

# load coarse labels
(trainX, trainy), (testX, testy) = cifar100.load_data(label_mode='coarse')

The difference is made clear when the labels are one hot encoded using the to_categorical() function, where instead of each output vector having 100 dimensions, it will only have 20. The example below demonstrates this by loading the dataset with coarse labels and one hot encoding the class labels.

# example of loading the cifar100 dataset with coarse labels
from keras.datasets import cifar100
from keras.utils import to_categorical
# load coarse labels
(trainX, trainy), (testX, testy) = cifar100.load_data(label_mode='coarse')
# one hot encode target values
trainy = to_categorical(trainy)
testy = to_categorical(testy)
# summarize loaded dataset
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))

Running the example loads the CIFAR-100 dataset as before, but images are now classified as belonging to one of the twenty super-classes.

The class labels are one hot encoded and we can see that each label is represented by a twenty element vector instead of a 100 element vector we would expect for the fine class labels.

Train: X=(50000, 32, 32, 3), y=(50000, 20)
Test: X=(10000, 32, 32, 3), y=(10000, 20)
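
The width of the encoded vectors follows from the number of classes. The sketch below illustrates this with plain NumPy on made-up labels: when the number of classes is not given, it is inferred here as the largest label plus one, so a sample of labels that happens to miss the highest class would be encoded too narrowly. The helper one_hot is hypothetical; to_categorical() accepts a num_classes argument for the same purpose.

```python
# show how the one hot vector width depends on the number of classes (sketch)
import numpy as np

def one_hot(y, num_classes=None):
    # flatten labels shaped (n, 1) to (n,)
    y = np.asarray(y).ravel()
    # infer the number of classes from the largest label when not given
    if num_classes is None:
        num_classes = int(y.max()) + 1
    return np.eye(num_classes, dtype='float32')[y]

# assumption: made-up coarse labels in the range 0-19
coarse = np.array([[11], [15], [4], [19]])
print(one_hot(coarse).shape)
# a subset that lacks class 19 is encoded with fewer columns
print(one_hot(coarse[:3]).shape)
# fixing the number of classes keeps the width consistent
print(one_hot(coarse[:3], num_classes=20).shape)
```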

Summary

In this tutorial, you discovered the standard computer vision datasets provided with the Keras deep learning library.

Specifically, you learned:

The API and idioms for downloading standard computer vision datasets using Keras.
The structure, nature, and top results for the MNIST, Fashion-MNIST, CIFAR-10 and CIFAR-100 computer vision datasets.
How to load and visualize standard computer vision datasets using the Keras API.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

14 Responses to How to Load and Visualize Standard Computer Vision Datasets With Keras

Tanvi Singh May 17, 2020 at 4:14 pm #

While loading the fashion_mnist I get the following error
in
1 import tensorflow as tf
—-> 2 x_train, y_train, x_test, y_test = tf.keras.datasets.fashion_mnist.load_data()

c:\users\dell\appdata\local\programs\python\python38\lib\site-packages\tensorflow\python\keras\datasets\fashion_mnist.py in load_data()
76
77 with gzip.open(paths[0], ‘rb’) as lbpath:
—> 78 y_train = np.frombuffer(lbpath.read(), np.uint8, offset=8)
79
80 with gzip.open(paths[1], ‘rb’) as imgpath:

c:\users\dell\appdata\local\programs\python\python38\lib\gzip.py in read(self, size)
290 import errno
291 raise OSError(errno.EBADF, “read() on write-only GzipFile object”)
–> 292 return self._buffer.read(size)
293
294 def read1(self, size=-1):

c:\users\dell\appdata\local\programs\python\python38\lib\gzip.py in read(self, size)
485 buf = self._fp.read(io.DEFAULT_BUFFER_SIZE)
486
–> 487 uncompress = self._decompressor.decompress(buf, size)
488 if self._decompressor.unconsumed_tail != b””:
489 self._fp.prepend(self._decompressor.unconsumed_tail)

error: Error -3 while decompressing data: invalid distance too far back
Please help how to resolve it.

- Jason Brownlee May 18, 2020 at 6:09 am #
  
  Sorry to hear that, I have not seen this problem before.
  
  Perhaps try re-installing the library?
  Perhaps try deleting the ~/.keras/datasets directory and try again?
  
Hemanth January 7, 2021 at 12:25 am #

hi Jason,
how to know the class names(categorical names like fish, horse, etc) of built-in datasets?

- Jason Brownlee January 7, 2021 at 6:19 am #
  
  Good question, keras may provide a function to interpret the labels – it does for the cifar datasets. Otherwise you can check the literature where the dataset was first used.
  
radhika sharma February 3, 2021 at 3:37 am #

Hello Jason. Thank you for write up. I have a question. After training the model, when I evaluated the model, I got 83.82% percent of Accuracy. How to find out for what images the model did not evaluate accurately. I am very new to this and trying to do some learning. any inputs will help. So what I mean is that I trained my model on half of the data and using other half for testing. How do I know for what images the model did not correctly categorized the images?
Also, how do I test the model with other image that I am providing?

Please advise.

- Jason Brownlee February 3, 2021 at 6:24 am #
  
  You can call the predict() function, evaluate the predictions, then inspect each input with each output to see what the image was, what prediction was made, and what prediction was expected.
  
  If you are new to making predictions, start here:
  https://machinelearningmastery.com/how-to-make-classification-and-regression-predictions-for-deep-learning-models-in-keras/
  
  If you are new to relating inputs to outputs, this will help:
  https://machinelearningmastery.com/how-to-connect-model-input-data-with-predictions-for-machine-learning/
  
Daniel Z February 20, 2021 at 9:51 pm #

Hi Jason, I am trying to visualize fashion mnist dataset. This line gives error “Invalid shape (784,) for image data” Could you advise where I went wrong?

pyplot.imshow(trainX[i], cmap=pyplot.get_cmap(‘gray’))

- Jason Brownlee February 21, 2021 at 6:11 am #
  
  Perhaps check that you copied the code exactly, this will help:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-copy-code-from-a-tutorial
  
Anna July 26, 2021 at 4:48 am #

And when we are dealing with time series Data . What do you suggest please

- Jason Brownlee July 26, 2021 at 5:32 am #
  
  See this:
  https://machinelearningmastery.com/time-series-datasets-for-machine-learning/
  
Manoj Kumar December 26, 2021 at 7:00 am #

Hi Jason, I want to load collected image data and train CNN on that data. How to do it?

- James Carmichael January 10, 2022 at 11:29 am #
  
  Hello Manoj…the following may be of interest to you:
  
  https://machinelearningmastery.com/best-practices-for-preparing-and-augmenting-image-data-for-convolutional-neural-networks/
  
Hermite Dorvil March 24, 2023 at 3:56 am #

Hi all,

I would like to know where the 33+1+i come from in order to get the plot — [pyplot.subplot(330 + 1 + i)] ?

- James Carmichael March 24, 2023 at 6:13 am #
  
  Hi Hermite…The following resource may be of interest to you:
  
  https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.subplot.html
  
