Object Classification with CNNs Using the Keras Deep Learning Library

By Jason Brownlee on August 7, 2022 in Deep Learning 188

Keras is a Python library for deep learning that wraps the powerful numerical libraries Theano and TensorFlow.

A difficult problem where traditional neural networks fall down is called object recognition. It is where a model is able to identify the objects in images.

In this post, you will discover how to develop and evaluate deep learning models for object recognition in Keras. After completing this tutorial, you will know:

About the CIFAR-10 object classification dataset and how to load and use it in Keras
How to create a simple Convolutional Neural Network for object recognition
How to lift performance by creating deeper Convolutional Neural Networks

Kick-start your project with my new book Deep Learning With Python, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.

Jul/2016: First published
Update Oct/2016: Updated for Keras 1.1.0 and TensorFlow 0.10.0.
Update Mar/2017: Updated for Keras 2.0.2, TensorFlow 1.0.1 and Theano 0.9.0.
Update Sep/2019: Updated or Keras 2.2.5 API.
Update Jul/2022: Updated for TensorFlow 2.x API

For an extended tutorial on developing a CNN for CIFAR-10, see the post:

How to Develop a CNN From Scratch for CIFAR-10 Photo Classification

The CIFAR-10 Problem Description

The problem of automatically classifying photographs of objects is difficult because of the nearly infinite number of permutations of objects, positions, lighting, and so on. It’s a tough problem.

This is a well-studied problem in computer vision and, more recently, an important demonstration of the capability of deep learning. A standard computer vision and deep learning dataset for this problem was developed by the Canadian Institute for Advanced Research (CIFAR).

The CIFAR-10 dataset consists of 60,000 photos divided into 10 classes (hence the name CIFAR-10). Classes include common objects such as airplanes, automobiles, birds, cats, and so on. The dataset is split in a standard way, where 50,000 images are used for training a model and the remaining 10,000 for evaluating its performance.

The photos are in color with red, green, and blue components but are small, measuring 32 by 32 pixel squares.

State-of-the-art results are achieved using very large convolutional neural networks. You can learn about state-of-the-art results on CIFAR-10 on Rodrigo Benenson’s webpage. Model performance is reported in classification accuracy, with very good performance above 90%, with human performance on the problem at 94% and state-of-the-art results at 96% at the time of writing.

There is a Kaggle competition that makes use of the CIFAR-10 dataset. It is a good place to join the discussion of developing new models for the problem and picking up models and scripts as a starting point.

Need help with Deep Learning in Python?

Take my free 2-week email course and discover MLPs, CNNs and LSTMs (with code).

Click to sign-up now and also get a free PDF Ebook version of the course.

Loading The CIFAR-10 Dataset in Keras

The CIFAR-10 dataset can easily be loaded in Keras.

Keras has the facility to automatically download standard datasets like CIFAR-10 and store them in the ~/.keras/datasets directory using the cifar10.load_data() function. This dataset is large at 163 megabytes, so it may take a few minutes to download.

Once downloaded, subsequent calls to the function will load the dataset ready for use.

The dataset is stored as pickled training and test sets, ready for use in Keras. Each image is represented as a three-dimensional matrix, with dimensions for red, green, blue, width, and height. We can plot images directly using matplotlib.

# Plot ad hoc CIFAR10 instances
from tensorflow.keras.datasets import cifar10
import matplotlib.pyplot as plt
# load data
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
# create a grid of 3x3 images
for i in range(0, 9):
	plt.subplot(330 + 1 + i)
	plt.imshow(X_train[i])
# show the plot
plt.show()

# Plot ad hoc CIFAR10 instances

from tensorflow.keras.datasets import cifar10

import matplotlib.pyplot as plt

# load data

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# create a grid of 3x3 images

for i in range(0, 9):

plt.subplot(330 + 1 + i)

plt.imshow(X_train[i])

# show the plot

plt.show()

Running the code creates a 3×3 plot of photographs. The images have been scaled up from their small 32×32 size, but you can clearly see trucks, horses, and cars. You can also see some distortion in some images that have been forced to the square aspect ratio.

Small sample of CIFAR-10 images

Simple Convolutional Neural Network for CIFAR-10

The CIFAR-10 problem is best solved using a convolutional neural network (CNN).

You can quickly start by defining all the classes and functions you will need in this example.

# Simple CNN model for CIFAR-10
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
from tensorflow.keras.constraints import MaxNorm
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.utils import to_categorical
...

# Simple CNN model for CIFAR-10

from tensorflow.keras.datasets import cifar10

from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Dense

from tensorflow.keras.layers import Dropout

from tensorflow.keras.layers import Flatten

from tensorflow.keras.constraints import MaxNorm

from tensorflow.keras.optimizers import SGD

from tensorflow.keras.layers import Conv2D

from tensorflow.keras.layers import MaxPooling2D

from tensorflow.keras.utils import to_categorical

...

Next, you can load the CIFAR-10 dataset.

...
# load data
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

...

# load data

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

The pixel values range from 0 to 255 for each of the red, green, and blue channels.

It is good practice to work with normalized data. Because the input values are well understood, you can easily normalize to the range 0 to 1 by dividing each value by the maximum observation, which is 255.

Note that the data is loaded as integers, so you must cast it to floating point values in order to perform the division.

...
# normalize inputs from 0-255 to 0.0-1.0
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train = X_train / 255.0
X_test = X_test / 255.0

...

# normalize inputs from 0-255 to 0.0-1.0

X_train = X_train.astype('float32')

X_test = X_test.astype('float32')

X_train = X_train / 255.0

X_test = X_test / 255.0

The output variables are defined as a vector of integers from 0 to 1 for each class.

You can use a one-hot encoding to transform them into a binary matrix to best model the classification problem. There are ten classes for this problem, so you can expect the binary matrix to have a width of 10.

...
# one hot encode outputs
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
num_classes = y_test.shape[1]

...

# one hot encode outputs

y_train = to_categorical(y_train)

y_test = to_categorical(y_test)

num_classes = y_test.shape[1]

Let’s start by defining a simple CNN structure as a baseline and evaluate how well it performs on the problem.

You will use a structure with two convolutional layers followed by max pooling and a flattening out of the network to fully connected layers to make predictions.

The baseline network structure can be summarized as follows:

Convolutional input layer, 32 feature maps with a size of 3×3, a rectifier activation function, and a weight constraint of max norm set to 3
Dropout set to 20%
Convolutional layer, 32 feature maps with a size of 3×3, a rectifier activation function, and a weight constraint of max norm set to 3
Max Pool layer with size 2×2
Flatten layer
Fully connected layer with 512 units and a rectifier activation function
Dropout set to 50%
Fully connected output layer with 10 units and a softmax activation function

A logarithmic loss function is used with the stochastic gradient descent optimization algorithm configured with a large momentum and weight decay start with a learning rate of 0.01.

...
# Create the model
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), padding='same', activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', kernel_constraint=MaxNorm(3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
# Compile model
epochs = 25
lrate = 0.01
decay = lrate/epochs
sgd = SGD(learning_rate=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
print(model.summary())

...

# Create the model

model = Sequential()

model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), padding='same', activation='relu', kernel_constraint=MaxNorm(3)))

model.add(Dropout(0.2))

model.add(Conv2D(32, (3, 3), activation='relu', padding='same', kernel_constraint=MaxNorm(3)))

model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())

model.add(Dense(512, activation='relu', kernel_constraint=MaxNorm(3)))

model.add(Dropout(0.5))

model.add(Dense(num_classes, activation='softmax'))

# Compile model

epochs = 25

lrate = 0.01

decay = lrate/epochs

sgd = SGD(learning_rate=lrate, momentum=0.9, decay=decay, nesterov=False)

model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

print(model.summary())

You can fit this model with 25 epochs and a batch size of 32.

A small number of epochs was chosen to help keep this tutorial moving. Usually, the number of epochs would be one or two orders of magnitude larger for this problem.

Once the model is fit, you evaluate it on the test dataset and print out the classification accuracy.

...
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, batch_size=32)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

...

# Fit the model

model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, batch_size=32)

# Final evaluation of the model

scores = model.evaluate(X_test, y_test, verbose=0)

print("Accuracy: %.2f%%" % (scores[1]*100))

Tying this all together, the complete example is listed below.

# Simple CNN model for the CIFAR-10 Dataset
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
from tensorflow.keras.constraints import MaxNorm
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.utils import to_categorical
# load data
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
# normalize inputs from 0-255 to 0.0-1.0
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train = X_train / 255.0
X_test = X_test / 255.0
# one hot encode outputs
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
num_classes = y_test.shape[1]
# Create the model
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), padding='same', activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', kernel_constraint=MaxNorm(3)))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(512, activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
# Compile model
epochs = 25
lrate = 0.01
decay = lrate/epochs
sgd = SGD(learning_rate=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.summary()
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, batch_size=32)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

# Simple CNN model for the CIFAR-10 Dataset

from tensorflow.keras.datasets import cifar10

from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Dense

from tensorflow.keras.layers import Dropout

from tensorflow.keras.layers import Flatten

from tensorflow.keras.constraints import MaxNorm

from tensorflow.keras.optimizers import SGD

from tensorflow.keras.layers import Conv2D

from tensorflow.keras.layers import MaxPooling2D

from tensorflow.keras.utils import to_categorical

# load data

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# normalize inputs from 0-255 to 0.0-1.0

X_train = X_train.astype('float32')

X_test = X_test.astype('float32')

X_train = X_train / 255.0

X_test = X_test / 255.0

# one hot encode outputs

y_train = to_categorical(y_train)

y_test = to_categorical(y_test)

num_classes = y_test.shape[1]

# Create the model

model = Sequential()

model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), padding='same', activation='relu', kernel_constraint=MaxNorm(3)))

model.add(Dropout(0.2))

model.add(Conv2D(32, (3, 3), activation='relu', padding='same', kernel_constraint=MaxNorm(3)))

model.add(MaxPooling2D())

model.add(Flatten())

model.add(Dense(512, activation='relu', kernel_constraint=MaxNorm(3)))

model.add(Dropout(0.5))

model.add(Dense(num_classes, activation='softmax'))

# Compile model

epochs = 25

lrate = 0.01

decay = lrate/epochs

sgd = SGD(learning_rate=lrate, momentum=0.9, decay=decay, nesterov=False)

model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

model.summary()

# Fit the model

model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, batch_size=32)

# Final evaluation of the model

scores = model.evaluate(X_test, y_test, verbose=0)

print("Accuracy: %.2f%%" % (scores[1]*100))

Running this example provides the results below. First, the network structure is summarized, which confirms the design was implemented correctly.

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 conv2d (Conv2D)             (None, 32, 32, 32)        896

 dropout (Dropout)           (None, 32, 32, 32)        0

 conv2d_1 (Conv2D)           (None, 32, 32, 32)        9248

 max_pooling2d (MaxPooling2D  (None, 16, 16, 32)       0
 )

 flatten (Flatten)           (None, 8192)              0

 dense (Dense)               (None, 512)               4194816

 dropout_1 (Dropout)         (None, 512)               0

 dense_1 (Dense)             (None, 10)                5130

=================================================================
Total params: 4,210,090
Trainable params: 4,210,090
Non-trainable params: 0
_________________________________________________________________

Model: "sequential"

_________________________________________________________________

Layer (type) Output Shape Param #

=================================================================

conv2d (Conv2D) (None, 32, 32, 32) 896

dropout (Dropout) (None, 32, 32, 32) 0

conv2d_1 (Conv2D) (None, 32, 32, 32) 9248

max_pooling2d (MaxPooling2D (None, 16, 16, 32) 0

)

flatten (Flatten) (None, 8192) 0

dense (Dense) (None, 512) 4194816

dropout_1 (Dropout) (None, 512) 0

dense_1 (Dense) (None, 10) 5130

=================================================================

Total params: 4,210,090

Trainable params: 4,210,090

Non-trainable params: 0

_________________________________________________________________

The classification accuracy and loss are printed after each epoch on both the training and test datasets.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

The model is evaluated on the test set and achieves an accuracy of 70.5%, which is not excellent.

...
Epoch 20/25
1563/1563 [==============================] - 34s 22ms/step - loss: 0.3001 - accuracy: 0.8944 - val_loss: 1.0160 - val_accuracy: 0.6984
Epoch 21/25
1563/1563 [==============================] - 35s 23ms/step - loss: 0.2783 - accuracy: 0.9021 - val_loss: 1.0339 - val_accuracy: 0.6980
Epoch 22/25
1563/1563 [==============================] - 35s 22ms/step - loss: 0.2623 - accuracy: 0.9084 - val_loss: 1.0271 - val_accuracy: 0.7014
Epoch 23/25
1563/1563 [==============================] - 33s 21ms/step - loss: 0.2536 - accuracy: 0.9104 - val_loss: 1.0441 - val_accuracy: 0.7011
Epoch 24/25
1563/1563 [==============================] - 34s 22ms/step - loss: 0.2383 - accuracy: 0.9180 - val_loss: 1.0576 - val_accuracy: 0.7012
Epoch 25/25
1563/1563 [==============================] - 37s 24ms/step - loss: 0.2245 - accuracy: 0.9219 - val_loss: 1.0544 - val_accuracy: 0.7050
Accuracy: 70.50%

...

Epoch 20/25

1563/1563 [==============================] - 34s 22ms/step - loss: 0.3001 - accuracy: 0.8944 - val_loss: 1.0160 - val_accuracy: 0.6984

Epoch 21/25

1563/1563 [==============================] - 35s 23ms/step - loss: 0.2783 - accuracy: 0.9021 - val_loss: 1.0339 - val_accuracy: 0.6980

Epoch 22/25

1563/1563 [==============================] - 35s 22ms/step - loss: 0.2623 - accuracy: 0.9084 - val_loss: 1.0271 - val_accuracy: 0.7014

Epoch 23/25

1563/1563 [==============================] - 33s 21ms/step - loss: 0.2536 - accuracy: 0.9104 - val_loss: 1.0441 - val_accuracy: 0.7011

Epoch 24/25

1563/1563 [==============================] - 34s 22ms/step - loss: 0.2383 - accuracy: 0.9180 - val_loss: 1.0576 - val_accuracy: 0.7012

Epoch 25/25

1563/1563 [==============================] - 37s 24ms/step - loss: 0.2245 - accuracy: 0.9219 - val_loss: 1.0544 - val_accuracy: 0.7050

Accuracy: 70.50%

You can improve the accuracy significantly by creating a much deeper network. This is what you will look at in the next section.

Larger Convolutional Neural Network for CIFAR-10

You have seen that a simple CNN performs poorly on this complex problem. In this section, you will look at scaling up the size and complexity of your model.

Let’s design a deep version of the simple CNN above. You can introduce an additional round of convolutions with many more feature maps. You will use the same pattern of Convolutional, Dropout, Convolutional, and Max Pooling layers.

This pattern will be repeated three times with 32, 64, and 128 feature maps. The effect is an increasing number of feature maps with a smaller and smaller size given the max pooling layers. Finally, an additional and larger Dense layer will be used at the output end of the network in an attempt to better translate the large number of feature maps to class values.

A summary of the new network architecture is as follows:

Convolutional input layer, 32 feature maps with a size of 3×3, and a rectifier activation function
Dropout layer at 20%
Convolutional layer, 32 feature maps with a size of 3×3, and a rectifier activation function
Max Pool layer with size 2×2
Convolutional layer, 64 feature maps with a size of 3×3, and a rectifier activation function
Dropout layer at 20%.
Convolutional layer, 64 feature maps with a size of 3×3, and a rectifier activation function
Max Pool layer with size 2×2
Convolutional layer, 128 feature maps with a size of 3×3, and a rectifier activation function
Dropout layer at 20%
Convolutional layer,128 feature maps with a size of 3×3, and a rectifier activation function
Max Pool layer with size 2×2
Flatten layer
Dropout layer at 20%
Fully connected layer with 1024 units and a rectifier activation function
Dropout layer at 20%
Fully connected layer with 512 units and a rectifier activation function
Dropout layer at 20%
Fully connected output layer with 10 units and a softmax activation function

You can very easily define this network topology in Keras as follows:

...
# Create the model
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D())
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D())
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(1024, activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))
# Compile model
epochs = 25
lrate = 0.01
decay = lrate/epochs
sgd = SGD(learning_rate=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.summary()
...

...

# Create the model

model = Sequential()

model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), activation='relu', padding='same'))

model.add(Dropout(0.2))

model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))

model.add(MaxPooling2D())

model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))

model.add(Dropout(0.2))

model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))

model.add(MaxPooling2D())

model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))

model.add(Dropout(0.2))

model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))

model.add(MaxPooling2D())

model.add(Flatten())

model.add(Dropout(0.2))

model.add(Dense(1024, activation='relu', kernel_constraint=MaxNorm(3)))

model.add(Dropout(0.2))

model.add(Dense(512, activation='relu', kernel_constraint=MaxNorm(3)))

model.add(Dropout(0.2))

model.add(Dense(num_classes, activation='softmax'))

# Compile model

epochs = 25

lrate = 0.01

decay = lrate/epochs

sgd = SGD(learning_rate=lrate, momentum=0.9, decay=decay, nesterov=False)

model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

model.summary()

...

You can fit and evaluate this model using the same procedure from above and the same number of epochs but a larger batch size of 64, found through some minor experimentation.

...
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

...

model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, batch_size=64)

# Final evaluation of the model

scores = model.evaluate(X_test, y_test, verbose=0)

print("Accuracy: %.2f%%" % (scores[1]*100))

Tying this all together, the complete example is listed below.

# Large CNN model for the CIFAR-10 Dataset
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
from tensorflow.keras.constraints import MaxNorm
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.utils import to_categorical
# load data
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
# normalize inputs from 0-255 to 0.0-1.0
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train = X_train / 255.0
X_test = X_test / 255.0
# one hot encode outputs
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
num_classes = y_test.shape[1]
# Create the model
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D())
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D())
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(1024, activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))
# Compile model
epochs = 25
lrate = 0.01
decay = lrate/epochs
sgd = SGD(learning_rate=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.summary()
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

# Large CNN model for the CIFAR-10 Dataset

from tensorflow.keras.datasets import cifar10

from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Dense

from tensorflow.keras.layers import Dropout

from tensorflow.keras.layers import Flatten

from tensorflow.keras.constraints import MaxNorm

from tensorflow.keras.optimizers import SGD

from tensorflow.keras.layers import Conv2D

from tensorflow.keras.layers import MaxPooling2D

from tensorflow.keras.utils import to_categorical

# load data

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# normalize inputs from 0-255 to 0.0-1.0

X_train = X_train.astype('float32')

X_test = X_test.astype('float32')

X_train = X_train / 255.0

X_test = X_test / 255.0

# one hot encode outputs

y_train = to_categorical(y_train)

y_test = to_categorical(y_test)

num_classes = y_test.shape[1]

# Create the model

model = Sequential()

model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), activation='relu', padding='same'))

model.add(Dropout(0.2))

model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))

model.add(MaxPooling2D())

model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))

model.add(Dropout(0.2))

model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))

model.add(MaxPooling2D())

model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))

model.add(Dropout(0.2))

model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))

model.add(MaxPooling2D())

model.add(Flatten())

model.add(Dropout(0.2))

model.add(Dense(1024, activation='relu', kernel_constraint=MaxNorm(3)))

model.add(Dropout(0.2))

model.add(Dense(512, activation='relu', kernel_constraint=MaxNorm(3)))

model.add(Dropout(0.2))

model.add(Dense(num_classes, activation='softmax'))

# Compile model

epochs = 25

lrate = 0.01

decay = lrate/epochs

sgd = SGD(learning_rate=lrate, momentum=0.9, decay=decay, nesterov=False)

model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

model.summary()

# Fit the model

model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, batch_size=64)

# Final evaluation of the model

scores = model.evaluate(X_test, y_test, verbose=0)

print("Accuracy: %.2f%%" % (scores[1]*100))

Running this example prints the classification accuracy and loss on the training and test datasets for each epoch.

The estimate of classification accuracy for the final model is 79.5% which is nine points better than our simpler model.

...
Epoch 20/25
782/782 [==============================] - 50s 64ms/step - loss: 0.4949 - accuracy: 0.8237 - val_loss: 0.6161 - val_accuracy: 0.7864
Epoch 21/25
782/782 [==============================] - 51s 65ms/step - loss: 0.4794 - accuracy: 0.8308 - val_loss: 0.6184 - val_accuracy: 0.7866
Epoch 22/25
782/782 [==============================] - 50s 64ms/step - loss: 0.4660 - accuracy: 0.8347 - val_loss: 0.6158 - val_accuracy: 0.7901
Epoch 23/25
782/782 [==============================] - 50s 64ms/step - loss: 0.4523 - accuracy: 0.8395 - val_loss: 0.6112 - val_accuracy: 0.7919
Epoch 24/25
782/782 [==============================] - 50s 64ms/step - loss: 0.4344 - accuracy: 0.8454 - val_loss: 0.6080 - val_accuracy: 0.7886
Epoch 25/25
782/782 [==============================] - 50s 64ms/step - loss: 0.4231 - accuracy: 0.8487 - val_loss: 0.6076 - val_accuracy: 0.7950
Accuracy: 79.50%

...

Epoch 20/25

782/782 [==============================] - 50s 64ms/step - loss: 0.4949 - accuracy: 0.8237 - val_loss: 0.6161 - val_accuracy: 0.7864

Epoch 21/25

782/782 [==============================] - 51s 65ms/step - loss: 0.4794 - accuracy: 0.8308 - val_loss: 0.6184 - val_accuracy: 0.7866

Epoch 22/25

782/782 [==============================] - 50s 64ms/step - loss: 0.4660 - accuracy: 0.8347 - val_loss: 0.6158 - val_accuracy: 0.7901

Epoch 23/25

782/782 [==============================] - 50s 64ms/step - loss: 0.4523 - accuracy: 0.8395 - val_loss: 0.6112 - val_accuracy: 0.7919

Epoch 24/25

782/782 [==============================] - 50s 64ms/step - loss: 0.4344 - accuracy: 0.8454 - val_loss: 0.6080 - val_accuracy: 0.7886

Epoch 25/25

782/782 [==============================] - 50s 64ms/step - loss: 0.4231 - accuracy: 0.8487 - val_loss: 0.6076 - val_accuracy: 0.7950

Accuracy: 79.50%

Extensions to Improve Model Performance

You have achieved good results on this very difficult problem, but you are still a good way from achieving world-class results.

Below are some ideas that you can try to extend upon the models and improve model performance.

Train for More Epochs. Each model was trained for a very small number of epochs, 25. It is common to train large convolutional neural networks for hundreds or thousands of epochs. You should expect performance gains can be achieved by significantly raising the number of training epochs.
Image Data Augmentation. The objects in the image vary in their position. Another boost in model performance can likely be achieved by using some data augmentation. Methods such as standardization, random shifts, or horizontal image flips may be beneficial.
Deeper Network Topology. The larger network presented is deep, but larger networks could be designed for the problem. This may involve more feature maps closer to the input and perhaps less aggressive pooling. Additionally, standard convolutional network topologies that have been shown useful may be adopted and evaluated on the problem.

Summary

In this post, you discovered how to create deep learning models in Keras for object recognition in photographs.

After working through this tutorial, you learned:

About the CIFAR-10 dataset and how to load it in Keras and plot ad hoc examples from the dataset
How to train and evaluate a simple Convolutional Neural Network on the problem
How to expand a simple Convolutional Neural Network into a deep Convolutional Neural Network in order to boost performance on the difficult problem
How to use data augmentation to get a further boost on the difficult object recognition problem

Do you have any questions about object recognition or this post? Ask your question in the comments, and I will do my best to answer.

188 Responses to Object Classification with CNNs Using the Keras Deep Learning Library

Aakash Nain July 24, 2016 at 9:53 pm #

Hello Jason,
What is the use of maxnorm in context of deep learning ?

Reply
- Jason Brownlee July 25, 2016 at 7:59 am #
  
  Hi Aakash,
  
  maxnorm is a weight constraint and was found to be useful when using dropout. Learn more about maxnorm and dropout here:
  https://machinelearningmastery.com/dropout-regularization-deep-learning-models-keras/
  
  You can learn more about weight constraints in Keras here:
  http://keras.io/constraints/
  
  Reply
Aqsa July 31, 2016 at 4:46 pm #

Hi Jason
I am doing detection of road signs in real time . The size of my images is 800*1360. The size of the road sign varies from 16*16 to 256*256. How can I use convolutional neural network for this puppose to get good detection accuracy in real time

Reply
- Jason Brownlee August 1, 2016 at 6:28 am #
  
  Consider how you frame the problem Aqsa. Two options are:
  
  1) You could rescale all images to the same size.
  2) You could zero-pad all images.
  
  As for the specifics of the network for this problem, you will have to design and test different structures. Perhaps you can leverage an existing well performing structure like VGG or inception.
  
  Reply
Jack September 1, 2016 at 12:54 pm #

What should be the input dimensions for 3D dataset of pcd format or off format.?
BdwI I find your tutorials very helpful.:-)

Reply
- Jason Brownlee September 2, 2016 at 8:04 am #
  
  Sorry, I don’t know what those formats are Jack.
  
  Reply
  - Jack September 2, 2016 at 2:48 pm #
    
    Point cloud data(PCD) contains the x,y and z coordinates of the object….. I want to build a neural network for 3D object classification… The problem I am facing is I don’t know what shd be the input to my network… For a neural network that classifies images you pass the pixel values (0-255), but a pcd file just has the coordinates…Is it wise to pass the coordinates as the inputs?..
    I can extract some features of the object ( from pcd file)… Can I pass those features as input ??
    I am new to this field, so I having difficulty understanding things…
    
    Reply
    - Jason Brownlee September 3, 2016 at 6:55 am #
      
      I wonder if you can rescale the coordinates to all have the range 0-to-1. Then provide them directly.
      
      From there, you will have a baseline and can start to explore other transforms of your coords, such as perhaps projections into 2D.
      
      Reply
    - Sanketh February 18, 2018 at 4:41 am #
      
      Did you figure out a way to do that. I am currently facing a similar situation. Can you tell me how you went on to solve that problem
      
      Reply
shudhan September 2, 2016 at 4:49 pm #

May i know how to extract features from the images?

Reply
- Jason Brownlee September 3, 2016 at 6:58 am #
  
  We no longer need to extract features when using deep learning methods as we are performing automatic feature learning. A great benefit of the approach.
  
  Reply
Walid Ahmed September 10, 2016 at 4:52 am #

Thanks a lot.

I have one question

In Keras, How can I extract the exact location of the detected object (or objects) within image that includes a background?
I assume it uses sliding window for object detection

Reply
- Jason Brownlee September 10, 2016 at 7:12 am #
  
  Great question Walid.
  
  This is called object identification in an image. I do not have an example at the moment, but I will prepare one in the future.
  
  Reply
Vinay September 12, 2016 at 5:04 am #

Hi…could you please give same example for prima diabetes or airline passenger data set. My question is to apply CNN for direct numeric features. You could give any simple example

Reply
- Jason Brownlee September 12, 2016 at 8:35 am #
  
  Sorry, I don’t have such an example.
  
  Reply
- Punit Singh June 9, 2017 at 3:58 pm #
  
  Here you can learn about the Pima Diabetes example developed by Jason. https://machinelearningmastery.com/tutorial-first-neural-network-python-keras/
  
  After studying from this example, I have implemented this example in iPython notebook, you can see here. https://github.com/thepunitsingh/keras_tutorials/blob/master/pima-indians-diabetes-categorization.ipynb
  
  Reply
Walid Ahmed September 14, 2016 at 3:22 am #

Thanks Jason, I will look forward to your example.

Reply
NotMikeJones September 29, 2016 at 11:31 am #

Where in Keras are you specifying the input dimension for your first convolution layer? I’d like to try a convolution NN with time series for event detection, and am having issues with keras 1d convolution working. Let’s say each of my samples are a timeseries represented by a 1 x 100 vector, and within the vector, I expect three types of events to occur somewhere in those time frames (unclear what the length of the event would be, but lets say roughly 10 time points across). Would I use three feature maps, and then use a ‘window’ of, say 10 time points, that map onto a single neuron in the convolution later? So I would have a convolution layer of 3 x 10?

Thanks!

Reply
- Jason Brownlee September 30, 2016 at 7:47 am #
  
  LSTM is the network for dealing with sequences rather than CNN.
  
  CNN is good at spatial structure such as in images or text.
  
  This tutorial on time series with LSTMs might be what you’re looking for:
  https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/
  
  Reply
  - NotMikeJones October 1, 2016 at 12:12 am #
    
    Thanks for that! I actually think I misspoke – I don’t want to forecast values at future time points, but instead identify if a current set of timepoints fit a pattern that indicate a certain event.
    
    For example, let’s say we have fitness tracker data with various types of sensors (heart rate, pedometer, accelerometer), and we know a person does three types of activity: yoga, running, cooking. I want to train a model to identify these activities based on sensor data, and then be able to pull real-time data and classify what they are currently doing.
    
    I was thinking a CNN with a windowed-approach would be the best bet, but I might be completely off-base.
    
    Reply
    - Jason Brownlee October 1, 2016 at 8:02 am #
      
      It does sound like an anomaly detection or change detection problem, you may have benefit in framing the problem this way.
      
      Reply
Rafi October 18, 2016 at 6:34 pm #

Hi Jason,

I’m running the exact same code in this page which produced a 71.82% accuracy on test data.

only difference is I’m using a validation data set split by 70-30%. I’m getting only less than 11% validation and test accuracy. What could be my mistake? Have you faced such
results? please help.

Thanks
Rafi

35000/35000 [==============================] – 94s – loss: 2.2974 – acc: 0.1158 – val_loss: 2.3033 – val_acc: 0.0991
Epoch 24/25
35000/35000 [==============================] – 94s – loss: 2.2961 – acc: 0.1170 – val_loss: 2.3035 – val_acc: 0.1022
Epoch 25/25
35000/35000 [==============================] – 93s – loss: 2.2954 – acc: 0.1213 – val_loss: 2.3036 – val_acc: 0.0987

Reply
Walid Ahmed November 2, 2016 at 2:30 am #

Dear Jason

I appreciate if you can illustrate In Keras : How can I extract the exact location of the detected object (or objects) within image that includes a background?

Reply
- Jason Brownlee November 2, 2016 at 9:08 am #
  
  Great question Walid,
  
  Sorry, I don’t have an example of object localization with Keras yet. It is on the TODO list though.
  
  Reply
Augusto Aguirre November 2, 2016 at 2:28 pm #

Hello! Very good explanation!
I am using your model to classify images containing either 0 or 1.
To resolve this, I am able to pleanteas using initially with a variant in the last hidden layer:

model.add (Dense (1, activation = ‘sigmoid’))

For me return a result between 0 or 1.

When I want to adjust the model gives me the following error:
‘Error when checking input model: convolution2d_input_20 expected to have 4 dimensions, but got array With shape (8000, 3072)’
My dataset are 32×32 RGB images. Therefore, contains 3072 columns, but one with a 0 or 1.
Regards!

Reply
- Jason Brownlee November 3, 2016 at 7:50 am #
  
  Hi Augusto, sorry to hear about the error on your own data.
  
  It is not clear what the cause could be, sorry. Perhaps you are able to experiment and discover the root cause. Try simplifying your example to the minimum required and see if that helps to flush it out.
  
  Reply
Walid Ahmed November 11, 2016 at 12:49 am #

Why would I need to apply a dropout layer before a convolutional layer?

It just make sense for me when applied to input layer or any other layers in the fully connected layers.
Request.

Reply
- Jason Brownlee November 11, 2016 at 10:03 am #
  
  Hi Walid,
  
  It’s all about putting pressure on the network to force it to generalize. It may or may not be a good pressure point to force this type of learning. Try and see on your problem.
  
  Reply
William Amador November 11, 2016 at 3:51 am #

Hi,Jason I have a question, how is the procedure for training if in my initial layer I do not handle a single image but a sequence of 30 images. For a specific case a sequence of movements of a person; Do you have any examples of training for a CNN for this case

Thank you

Reply
- Jason Brownlee November 11, 2016 at 10:05 am #
  
  Hi William, great question. Sorry, I don’t have any worked examples of working with sequences of images at the moment.
  
  Reply
Walid Ahmed November 15, 2016 at 2:47 am #

Hi Jason

When I removed the dropout layers before any convolutional layer, my results improved.
especially when the size of dataset is large.

Reply
- Jason Brownlee November 15, 2016 at 7:57 am #
  
  Thanks for the feedback Walid. Nice work!
  
  Reply
Walid Ahmed November 16, 2016 at 6:06 am #

Hi Jason.
Another question , I know that keras comes with different optimizers, in your code you used sgd,others may use another optimizer like adam.
https://keras.io/optimizers/
any advice or recommendation about yupt of optimizer?

Reply
- Jason Brownlee November 16, 2016 at 9:34 am #
  
  Hi Walid,
  
  I find optimizers generally make minor differences to the results – move the needle less than the network topology.
  
  SGD is well understood and a great place to start. ADAM is fast and gives good results and I often use it in practice. I don’t really have much opinions beyond that.
  
  Reply
Bharath Paturi November 16, 2016 at 5:54 pm #

Hi Jason,

I have very large images to analyze. Probably each image size will be around 500 MB to 1 GB.
I wanted to apply segmentation to the images. Can we use convolution NN to do unsupervised learning.

Reply
- Jason Brownlee November 17, 2016 at 9:52 am #
  
  Ouch Bharath, they are massive images.
  
  It is possible, but you’re going to run out of memory really fast!
  
  I don’t have good advice, sorry. I have not researched this specific problem.
  
  Reply
Michael November 17, 2016 at 8:40 am #

Hi,

I saw the line where you add a layer has a typo. It should read

model.add(Convolution2D(32, 3, 3, input_shape=(32, 32, 3), border_mode=’same’, activation=’relu’,

Reply
- Jason Brownlee November 17, 2016 at 9:58 am #
  
  Are you sure Michael? It all looks good to me and the example runs with Theano and TensorFlow backends.
  
  Maybe I’m missing something?
  
  Reply
  - Sean July 4, 2017 at 8:35 am #
    
    I think Michael is right. It throws me an error with input_shape=(3,32,32). input_shape=(32,32,3) should be the correct one
    
    Reply
- Maxim December 19, 2016 at 1:44 am #
  
  Michael, for solving your problem put the line: K.set_image_dim_ordering(‘th’)
  
  ABOVE the
  
  (X_train, y_train), (X_test, y_test)= cifar10.load_data()
  
  Reply
Shristi Baral November 21, 2016 at 10:15 pm #

How do I deal with the error? Got this while running script under “Loading The CIFAR-10 Dataset in Keras”. I tried altering the script of cifar10.py to figure out what the error is. But I couldnot.
—————————————————————————
UnicodeDecodeError Traceback (most recent call last)
in ()
4 from scipy.misc import toimage
5 # load data
—-> 6 (X_train, y_train), (X_test, y_test) = cifar10.load_data()
7 # create a grid of 3×3 images
8 for i in range(0, 9):

/home/kdc/anaconda3/lib/python3.5/site-packages/keras/datasets/cifar10.py in load_data()
18 for i in range(1, 6):
19 fpath = os.path.join(path, ‘data_batch_’ + str(i))
—> 20 data, labels = load_batch(fpath)
21 X_train[(i-1)*10000:i*10000, :, :, :] = data
22 y_train[(i-1)*10000:i*10000] = labels

/home/kdc/anaconda3/lib/python3.5/site-packages/keras/datasets/cifar.py in load_batch(fpath, label_key)
10 d = cPickle.load(f)
11 else:
—> 12 d = cPickle.load(f, encoding=”bytes”)
13 # decode utf8
14 for k, v in d.items():

UnicodeDecodeError: ‘ascii’ codec can’t decode byte 0x80 in position 3031: ordinal not in range(128)

Reply
- Jason Brownlee November 22, 2016 at 7:05 am #
  
  I have not seen this error before, perhaps post to stack overflow or the Keras list?
  
  Reply
Aquib Javed Khan November 22, 2016 at 8:35 pm #

Hi, Thanks for the awesome tutorial, you’r codes and explanations helps me lot in understanding the classification task.

I actually want to feed new images want to get the return the label matches to it, How can I do it like I am doing like this:

import keras
from keras.models import load_model
from keras.models import Sequential
import cv2
import numpy as np
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
model = Sequential()

model =load_model(‘firstmodel.h5′)
model.compile(loss=’binary_crossentropy’,
optimizer=’rmsprop’,
metrics=[‘accuracy’])

img = cv2.imread(‘cat.jpg’,0)
img = cv2.resize(img,(150,150))

classes = model.predict_classes(img, batch_size=32)
print classes

I’m getting error:
Exception: Error when checking : expected convolution2d_input_1 to have 4 dimensions, but got array with shape (150, 150)

How to fix it?

Reply
- Jason Brownlee November 23, 2016 at 8:58 am #
  
  My suggestion would be to ensure that your loaded data matches the expected input dimensions exactly. You may have to resize or pad new data to make it match.
  
  Reply
s December 1, 2016 at 1:23 pm #

This is NOT objection detection, it’s classification

Reply
- Jason Brownlee December 2, 2016 at 8:11 am #
  
  Thanks for the feedback.
  
  Reply
Walid December 2, 2016 at 4:40 am #

Dear Jason

I am so eager for an example of object localization with Keras yet, I hope you can come with one soon

Reply
- Jason Brownlee December 2, 2016 at 8:18 am #
  
  I would like to prepare one soon Walid, hopefully in the new year.
  
  Reply
Ahmed Desoky December 7, 2016 at 4:55 am #

Hello Jason,

Your tutorials are so helpful and awesome.

I am working on a classification problem using Keras on kitti dataset. I found that kitti is not supported yet in https://keras.io/datasets/ .

Is there any advice or an introductory point to work out with my problem ?

Thanks for help

Reply
- Jason Brownlee December 7, 2016 at 8:56 am #
  
  Sorry, I have not looked at that dataset yet.
  
  Reply
Thanawin December 7, 2016 at 1:31 pm #

HI Jason

I have seen many tutorials, but this is the best.
I am wondering if “scores = model.evaluate(X_test, y_test, verbose=0)” will take the last model from “model.fit()”. I would really appreciate if you can suggest me how can I evaluate my test data using the best model from “model.fit()”

Reply
TSchecker December 14, 2016 at 7:35 pm #

Hi Jason,

just in case facing the error

“AttributeError: ‘module’ object has no attribute ‘control_flow_ops'”

found here:
https://github.com/fchollet/keras/issues/3857

import tensorflow as tf
tf.python.control_flow_ops = tf

Reply
- Jason Brownlee December 15, 2016 at 8:28 am #
  
  Thanks for sharing TSchecker.
  
  Reply
yask December 15, 2016 at 3:04 am #

Hi Jason
I really like your article
I would like to extract objects such as ice, water etc. from photograph.
is this library useful for that? How can I define training areas? Do I need to provide sample image of ice, water as a training area to the classifier? Which classifier servers best here? how about Artificial Neural Network ?

Reply
pranoy December 19, 2016 at 7:58 pm #

which function is used for predicting..

if i give an image of a horse and i want to predict the output. which function i have to use

Reply
- Jason Brownlee December 20, 2016 at 7:30 am #
  
  You can use model.predict() to make a prediction on new data.
  
  The new data must have the same shape as the data used to train the network.
  
  Reply
  - Philip L. December 3, 2019 at 8:30 am #
    
    Hello, I’m a pure newbie to all this but I’m learning all thanks to your amazing tutorials. But I still don’t understand how to use model.predict() especially on this example. Do you have an example of this project with model.predict()?? I really don’t have any idea yet how am I able to code that will load my own dataset to be predicted, rescale it to 32×32 dimension (is this the size of data?? not sure though) and then use that image as X_test. there’s so much I don’t know yet..
    Any response will be of so much significance to me.. Thank you
    
    Reply
    - Jason Brownlee December 3, 2019 at 1:32 pm #
      
      Thanks!
      
      You can predict for a single image by calling model.predict(), it will take an array with the shape [samples, rows, cols, channels], where you probably have 1 sample, 32×32 rows and cols and 3 channels.
      
      Does that help?
      
      Reply
Milos December 29, 2016 at 10:34 pm #

Hi Jason,

great article. I have a question. In Dense part you have specified 512 neurons. Can you tell me how did you determine number of neurons?

tnx

Reply
- Jason Brownlee December 30, 2016 at 5:50 am #
  
  Hi Milos, I used trial and error. Selecting the number and size of layer is an art – test a lot of configs.
  
  Reply
  - Milos December 30, 2016 at 7:07 pm #
    
    Thanks for answer. I have assumed so , but I had to ask :).
    Great and really useful articles I have found on your site :).
    
    Reply
    - Jason Brownlee December 31, 2016 at 7:02 am #
      
      Thanks Milos.
      
      Reply
Deepak January 30, 2017 at 3:56 pm #

When I use the model.predict, the following error is seen. Please help me

TypeError Traceback (most recent call last)
in ()
—-> 1 model.predict(tmp)

/usr/local/lib/python2.7/dist-packages/keras/models.pyc in predict(self, x, batch_size, verbose)
722 if self.model is None:
723 self.build()
–> 724 return self.model.predict(x, batch_size=batch_size, verbose=verbose)
725
726 def predict_on_batch(self, x):

/usr/local/lib/python2.7/dist-packages/keras/engine/training.pyc in predict(self, x, batch_size, verbose)
1266 f = self.predict_function
1267 return self._predict_loop(f, ins,
-> 1268 batch_size=batch_size, verbose=verbose)
1269
1270 def train_on_batch(self, x, y,

/usr/local/lib/python2.7/dist-packages/keras/engine/training.pyc in _predict_loop(self, f, ins, batch_size, verbose)
944 ins_batch = slice_X(ins, batch_ids)
945
–> 946 batch_outs = f(ins_batch)
947 if not isinstance(batch_outs, list):
948 batch_outs = [batch_outs]

/usr/local/lib/python2.7/dist-packages/keras/backend/theano_backend.pyc in __call__(self, inputs)
957 def __call__(self, inputs):
958 assert isinstance(inputs, (list, tuple))
–> 959 return self.function(*inputs)
960
961

/usr/local/lib/python2.7/dist-packages/Theano-0.9.0.dev5-py2.7.egg/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
786 s.storage[0] = s.type.filter(
787 arg, strict=s.strict,
–> 788 allow_downcast=s.allow_downcast)
789
790 except Exception as e:

/usr/local/lib/python2.7/dist-packages/Theano-0.9.0.dev5-py2.7.egg/theano/tensor/type.pyc in filter(self, data, strict, allow_downcast)
115 if allow_downcast:
116 # Convert to self.dtype, regardless of the type of data
–> 117 data = theano._asarray(data, dtype=self.dtype)
118 # TODO: consider to pad shape with ones to make it consistent
119 # with self.broadcastable… like vector->row type thing

/usr/local/lib/python2.7/dist-packages/Theano-0.9.0.dev5-py2.7.egg/theano/misc/safe_asarray.pyc in _asarray(a, dtype, order)
32 dtype = theano.config.floatX
33 dtype = numpy.dtype(dtype) # Convert into dtype object.
—> 34 rval = numpy.asarray(a, dtype=dtype, order=order)
35 # Note that dtype comparison must be done by comparing their num
36 # attribute. One cannot assume that two identical data types are pointers

/home/yashwanth/.local/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
529
530 “””
–> 531 return array(a, dtype, copy=False, order=order)
532
533

TypeError: Bad input argument to theano function with name “/usr/local/lib/python2.7/dist-packages/keras/backend/theano_backend.py:955” at index 0 (0-based).
Backtrace when that variable is created:

File “/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py”, line 2821, in run_ast_nodes
if self.run_code(code, result):
File “/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py”, line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File “”, line 2, in
model.add(Convolution1D(64, 2, input_shape=[1,4], border_mode=’same’, activation=’relu’, W_constraint=maxnorm(3)))
File “/usr/local/lib/python2.7/dist-packages/keras/models.py”, line 299, in add
layer.create_input_layer(batch_input_shape, input_dtype)
File “/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py”, line 397, in create_input_layer
dtype=input_dtype, name=name)
File “/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py”, line 1198, in Input
input_tensor=tensor)
File “/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py”, line 1116, in __init__
name=self.name)
File “/usr/local/lib/python2.7/dist-packages/keras/backend/theano_backend.py”, line 110, in placeholder
x = T.TensorType(dtype, broadcast)(name)
float() argument must be a string or a number

Reply
- Jason Brownlee February 1, 2017 at 10:21 am #
  
  I’m sorry to hear that.
  
  The cause is not obvious to me, the stack trace is hard to read.
  
  Perhaps you could try posting to stackoverflow or the Keras google group?
  
  Reply
Rajesh February 6, 2017 at 4:08 am #

Hi Jason,

Any idea about the following error? Everything looked good till the model.summary point. But when I tried to fit the model, I am seeing the following error.

ValueError Traceback (most recent call last)
in ()
1 # Fit the model
—-> 2 model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=epochs, batch_size=32)
3 # Final evaluation of the model
4 scores = model.evaluate(X_test, y_test, verbose=0)
5 print(“Accuracy: %.2f%%” % (scores[1]*100))

/home/rajesh/anaconda2/lib/python2.7/site-packages/keras/models.pyc in fit(self, x, y, batch_size, nb_epoch, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
670 class_weight=class_weight,
671 sample_weight=sample_weight,
–> 672 initial_epoch=initial_epoch)
673
674 def evaluate(self, x, y, batch_size=32, verbose=1,

/home/rajesh/anaconda2/lib/python2.7/site-packages/keras/engine/training.pyc in fit(self, x, y, batch_size, nb_epoch, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch)
1115 class_weight=class_weight,
1116 check_batch_axis=False,
-> 1117 batch_size=batch_size)
1118 # prepare validation data
1119 if validation_data:

/home/rajesh/anaconda2/lib/python2.7/site-packages/keras/engine/training.pyc in _standardize_user_data(self, x, y, sample_weight, class_weight, check_batch_axis, batch_size)
1028 self.internal_input_shapes,
1029 check_batch_axis=False,
-> 1030 exception_prefix=’model input’)
1031 y = standardize_input_data(y, self.output_names,
1032 output_shapes,

/home/rajesh/anaconda2/lib/python2.7/site-packages/keras/engine/training.pyc in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
122 ‘ to have shape ‘ + str(shapes[i]) +
123 ‘ but got array with shape ‘ +
–> 124 str(array.shape))
125 return arrays
126

ValueError: Error when checking model input: expected convolution2d_input_6 to have shape (None, 3, 32, 32) but got array with shape (50000, 32, 32, 3)

Reply
- Rajesh February 6, 2017 at 4:22 am #
  
  I understood the issue. I was mistaking at the input_shape step.
  
  It is a very nice tutorial. very well written. Thanks
  
  Reply
  - Rajesh February 6, 2017 at 8:59 am #
    
    Sorry to spam..but I am still seeing the error 🙁
    
    Reply
    - Jason Brownlee February 6, 2017 at 9:45 am #
      
      Hi Rajesh, you may want to confirm that you are not missing any lines of code.
      
      Also, confirm your version of Keras, TensorFlow/Theano and Python.
      
      Reply
Sam February 6, 2017 at 9:11 pm #

Hello,
Thank you for your example. I tried to run your code. But our network server is blocked and the code could not get data from the below url.

http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz

Can you let me know if there is another way to make training data set without using download wrapper method ?

Reply
- Sam February 6, 2017 at 9:37 pm #
  
  Oh, I found a solution through googling. Thanks anyway.
  
  Reply
  - Jason Brownlee February 7, 2017 at 10:14 am #
    
    Glad to hear it Sam.
    
    Reply
John February 7, 2017 at 2:10 pm #

Hello,

How I can recognize bike in video. It would be great if you can give example.

Reply
- Jason Brownlee February 8, 2017 at 9:32 am #
  
  Great question John, it is an area I’d like to cover in the future.
  
  Reply
Ikhsan March 17, 2017 at 2:22 pm #

Hi Jason,

Thanks for the tutorial. I have a question about the random.seed(seed). Why do we need to seed it first and where is the random number generator used in the rest of the code?

Thanks.

Reply
- Jason Brownlee March 18, 2017 at 7:46 am #
  
  You do not need to seed the random number generator. I was attempting to make the code 100% reproducible.
  
  Random numbers are used to seed neural networks and during training to shuffle training data.
  
  You can learn more about random numbers for machine learning here:
  https://machinelearningmastery.com/randomness-in-machine-learning/
  
  Reply
Ali April 21, 2017 at 5:43 pm #

Hi jason

Thanks for this great tutorial, I am trying to run the code but I got the following error any suggestion to fix

I am using keras with theona 2.7 back end

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(3, 32, 32), padding=’same’, activation=’relu’, kernel_constraint=maxnorm(3)))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation=’relu’, padding=’same’, kernel_constraint=maxnorm(3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation=’relu’, kernel_constraint=maxnorm(3)))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation=’softmax’))
# Compile model
epochs = 25
lrate = 0.01
decay = lrate/epochs
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss=’categorical_crossentropy’, optimizer=sgd, metrics=[‘accuracy’])
print(model.summary())

Traceback (most recent call last):

File “”, line 2, in
model.add(Conv2D(32, (3, 3), input_shape=(3, 32, 32), padding=’same’, activation=’relu’, kernel_constraint=maxnorm(3)))

TypeError: __init__() takes at least 4 arguments (4 given)

Reply
- Jason Brownlee April 22, 2017 at 9:24 am #
  
  I’m not sure Ali, I have not seen this error before.
  
  Perhaps confirm that you have copied the code exactly?
  
  Consider removing arguments to help zoom in on the cause of the fault.
  
  Reply
AIZEN May 1, 2017 at 12:12 am #

Hi, i already manage to detect object in the still images. What am i supposed to do to detect objects in a video input? i intend to use the same Keras CNN.

Reply
- Jason Brownlee May 1, 2017 at 5:57 am #
  
  You could process each frame of the video as an image with a CNN and use an LSTM to handle sequences of data from the CNN.
  
  Reply
Supriya May 2, 2017 at 2:24 am #

Hi, in this example we have created CNN model , but how to test it..

Reply
- Jason Brownlee May 2, 2017 at 6:02 am #
  
  You can use a train/test split or k-fold cross validation.
  
  Reply
Chao May 18, 2017 at 3:55 pm #

What does “kernel_constraint=maxnorm(3)” mean?

Thanks a lot!

Reply
- Jason Brownlee May 19, 2017 at 8:13 am #
  
  Great question. It is a weight constraint. See this post on why we want to use them:
  https://machinelearningmastery.com/dropout-regularization-deep-learning-models-keras/
  
  Reply
Mohamed Mnete June 1, 2017 at 5:22 pm #

Hi, say I had an image I would like the model you just made to predict what it has on it. I save it in the same directory as the python file. How can I load this to put into he prediction function and how can I write the prediction function. I would also like to try and use my own images on your model. I was also stuck on the technique of changing the images to numpy.arrays. I would also like the output to be either -1 or 1 . How can I code this up. Please help….

Reply
- Jason Brownlee June 2, 2017 at 12:55 pm #
  
  This post will help as a start:
  https://machinelearningmastery.com/save-load-keras-deep-learning-models/
  
  Reply
Natthaphon June 8, 2017 at 1:25 pm #

So can I insert annotation in each images

Reply
- Jason Brownlee June 9, 2017 at 6:17 am #
  
  Sure.
  
  Reply
Daniel June 18, 2017 at 5:33 am #

Hi,

I’m trying to do an image classifier that determines if something should be given a specific hashtag or not. My problem is that the accuracy of the classifier after each epoch remains constant, and is essentially assigning the same class to all images. This makes no sense as the image classes are fairly distinct (#gym and #foraging). I basically copied the smaller CNN you used:

model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(3, 100, 100), activation=’relu’))
model.add(Dropout(0.2))
model.add(Convolution2D(32, 3, 3, activation=’relu’))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation=’relu’))
model.add(Dropout(0.5))
model.add(Dense(1, activation=’softmax’))

epochs = 10
lrate = 0.001
decay = lrate/epochs
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=False)
print(‘compiling model’)
model.compile(loss=’binary_crossentropy’, optimizer=sgd, metrics=[‘accuracy’])

print(“fitting model”)
model.fit(X_training, Y_training, nb_epoch=epochs, batch_size=100)

But every time, I get these results:

fitting model
Train on 908 samples, validate on 908 samples
Epoch 1/10
908/908 [==============================] – 104s – loss: 7.9712 – acc: 0.5000
Epoch 2/10
908/908 [==============================] – 104s – loss: 7.9712 – acc: 0.5000
Epoch 3/10
908/908 [==============================] – 104s – loss: 7.9712 – acc: 0.5000
Epoch 4/10
908/908 [==============================] – 104s – loss: 7.9712 – acc: 0.5000
Epoch 5/10
908/908 [==============================] – 104s – loss: 7.9712 – acc: 0.5000
Epoch 6/10
908/908 [==============================] – 104s – loss: 7.9712 – acc: 0.5000
Epoch 7/10
908/908 [==============================] – 104s – loss: 7.9712 – acc: 0.5000
Epoch 8/10
908/908 [==============================] – 104s – loss: 7.9712 – acc: 0.5000
Epoch 9/10
908/908 [==============================] – 104s – loss: 7.9712 – acc: 0.5000
Epoch 10/10
908/908 [==============================] – 104s – loss: 7.9712 – acc: 0.5000

What am I doing wrong?

Reply
- Jason Brownlee June 18, 2017 at 6:33 am #
  
  I have a list of ideas you can try here:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply

Darlington Akogo June 28, 2017 at 10:42 am #

Hello Jason, thanks for your great tutorials! I need some help please, I’m working on an a medical image recognition(diagnostics) project based on your tutorials. Over here, you import the Cifar10 dataset provided via Keras with the load_data() function, but since am downloading the dataset from a diff source, that wouldn’t be possible. So from some research, I came across Keras’ flow_from_directory() function for Image data processing, which is amazing, I could just separate the images into folders and it’d consider them as classes. However, medical images are in the “DICOM” image format, and Keras image functions don’t seem to support it, so with further research, I came across pydicom module, for processing DICOM images in python, however, now, I can’t use flow_from_directory(), can you PLEASE offer some help as to how I can train my ConvNet model with DICOM images and be able to use it to classify(with predict() function) any new DICOM image?

Thanks in advanced.

Jason Brownlee June 29, 2017 at 6:28 am #

I don’t know about that format.

Perhaps you can covert the images?
Perhaps you can put together your own DICOM-comapitable flow from dir function?

Reply

shila mosammami December 10, 2017 at 9:21 am #

Dear Friend,
try this:

def load_itk(filename):
    # Reads the image using SimpleITK
    itkimage = sitk.ReadImage(filename)

    # Convert the image to a  numpy array first and then shuffle the dimensions to get axis in the order z,y,x
    ct_scan = sitk.GetArrayFromImage(itkimage)

    # Read the origin of the ct_scan, will be used to convert the coordinates from world to voxel and vice versa.
    origin = np.array(list(reversed(itkimage.GetOrigin())))

    # Read the spacing along each dimension
    spacing = np.array(list(reversed(itkimage.GetSpacing())))


    return ct_scan, origin, spacing

def load_itk(filename):

# Reads the image using SimpleITK

itkimage = sitk.ReadImage(filename)

# Convert the image to a numpy array first and then shuffle the dimensions to get axis in the order z,y,x

ct_scan = sitk.GetArrayFromImage(itkimage)

# Read the origin of the ct_scan, will be used to convert the coordinates from world to voxel and vice versa.

origin = np.array(list(reversed(itkimage.GetOrigin())))

# Read the spacing along each dimension

spacing = np.array(list(reversed(itkimage.GetSpacing())))

return ct_scan, origin, spacing

Bruce Wind June 29, 2017 at 11:30 am #

Hi, Jason, thanks for sharing. I test the code you provided, but my machine does not support CUDA, so it runs very slowly( half an hour per epoch). Since you have such a powerful computer, could you please show the results after hundreds or thousands epoches later? Thanks.

Reply
- Jason Brownlee June 30, 2017 at 8:07 am #
  
  I ran the example on AWS. I recommend you try the same:
  https://machinelearningmastery.com/develop-evaluate-large-deep-learning-models-keras-amazon-web-services/
  
  Reply
  - Sean July 4, 2017 at 9:14 am #
    
    It really helps!
    
    Reply
Nunu July 23, 2017 at 10:00 pm #

Dear Jason,
Really it is a very nice tutorial :). If you will plot acc vs acc_val there is a gap between the graph of acc and the graph of acc_val does this mean an overfitting ?! and also what I ( correct me if iam wrong) the accuracy graph should become after a certain number of epochs asymptotic( that is the acuracy will not increase anymore) !!
Thanks in advance
Noura

Reply
- Jason Brownlee July 24, 2017 at 6:54 am #
  
  If acc is less than val_acc than it may mean that the model is underfitting and that perhaps a larger model or a model fit for longer would do better on the validation set.
  
  Reply
  - Nunu July 24, 2017 at 6:55 pm #
    
    yes it is true but also if the acc_val is more than the acc then there is overfitting and I noticed this in your both results above ! what could be the reason of the overfitting ? and what we can do to get rid form it.
    
    regards
    Nunu
    
    Reply
    - Jason Brownlee July 25, 2017 at 9:40 am #
      
      Perhaps train less, perhaps train a smaller model, perhaps add some regularization like dropout.
      
      I hope that helps as a start.
      
      Reply
Nunu July 25, 2017 at 7:40 pm #

Yes I added dropouts and I added one more fully connected layer and i guess it worked.

Thanks a lot Jason 🙂
Best regards,
Nunu

Reply
- Jason Brownlee July 26, 2017 at 7:51 am #
  
  Nice work!
  
  Reply
Luna August 27, 2017 at 6:46 pm #

Hi Jason,

Normally CNN for image classification will result in a vector contented probabilities of possible class. Is there a function in this library to extract the vector? Thank you!

Reply
- Jason Brownlee August 28, 2017 at 6:49 am #
  
  You can use a softmax activation function on the output layer to get probabilities for multiple classes or sigmoid activation for binary class probabilities.
  
  Reply
Walid Ahmed September 2, 2017 at 4:55 am #

Hello,
is there is a way to apply the keras.predict function in a vectorized manner so that it can handle more than one input(image) simultaneously?

Thanks

Reply
- Jason Brownlee September 2, 2017 at 6:17 am #
  
  Yes, I believe predict() can take a list of samples (X) and return a list of predictions (yhat).
  
  Reply
zoda September 5, 2017 at 3:32 am #

hello,

what can i do if i have one image which consistent of 6 objects in the image how can i detect and measure the size of the object in the image. it would be helpful if you gave me some tips
thank you .

Reply
- Jason Brownlee September 7, 2017 at 12:37 pm #
  
  You can use object localization.
  
  Sorry, I don’t have an example.
  
  Reply
Sunil A Patel September 24, 2017 at 4:10 pm #

Sir,

I want to detect hand in a image. my problem is the size of hand is tiny in a image and its vary image to image(my image size is 1920*1080),. An image contain hand, cup ,face, table, etc…..

My problem is
1. should we train CNN for various hand shape and fixed size for example 50X50 or else
2. how to find localization of hand.

Reply
- Jason Brownlee September 25, 2017 at 5:37 am #
  
  Sorry, I don’t have examples of object localization. I hope to develop examples in the future.
  
  Reply
vinay October 17, 2017 at 12:00 am #

How to run this code on GPU enabled machine?

Reply
- Jason Brownlee October 17, 2017 at 5:48 am #
  
  You will need to configure your TensorFlow or Theano library to use the GPU. I do not have instructions for this, sorry.
  
  Generally, I would recommend using a pre-configured system, such as Amazon for about $1/hour:
  https://machinelearningmastery.com/develop-evaluate-large-deep-learning-models-keras-amazon-web-services/
  
  Reply
Miles October 18, 2017 at 1:28 am #

Hi Jason,

In the first example network, you use a maxnorm kernel constraint in all hidden layers (conv and fully connected). I understand the advantages of this, particularly when used in combination with dropout. I’m wondering if there was a reason you removed the kernel constraint from the conv. layers in the second iteration of your model? If so, would you mind explaining the motivation for doing so? Thanks!

Reply
- Jason Brownlee October 18, 2017 at 5:39 am #
  
  It was few years ago now, I don’t remember. Perhaps the model achieved better results without the constraint?
  
  Reply
Tin Tran October 18, 2017 at 6:44 pm #

Hi Jason,

When we completed training the model, how do we use that model with a new picture?

Reply
- Jason Brownlee October 19, 2017 at 5:35 am #
  
  Great question, see this post on how to finalize a model:
  https://machinelearningmastery.com/train-final-machine-learning-model/
  
  Once you have a final model, you can call it:
  yhat = model.predict(newX)
  
  Reply
Gabriele Minniti October 28, 2017 at 8:10 pm #

Hi Jason! My name is Gabriele, and I’m a young data scientis! I’m current working with the CNNs and i’ve few question for you.

1. I’ve seen that you are using images 32 x 32, my question is about which is the best shape for an high-resolution image of 1070 – 720? It’s necessary that the images are squared?

2. How significatly could be the improvement of the number of epoch in ‘accuracy metrics’? Maybe 100 epoch? 1000? Could them return an high perfomarce model?

Thanks a lot!
PS. Sorry for my “not perfect english” i hope that you understand my questions!
GM

Reply
- Jason Brownlee October 29, 2017 at 5:53 am #
  
  You can try different shaped images or normalize to one shape. Smaller images require smaller models which in turn train faster.
  
  This post will give you ideas on how to improve model performance for neural nets:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
Kumar November 7, 2017 at 6:44 pm #

Dear Jason,

I am trying to identify multiple objects in an image and count the number of objects for each class in each image. Can you help me on this.

Reply
- Jason Brownlee November 8, 2017 at 9:21 am #
  
  Sounds like a great problem, but I don’t have material on this type of problem, sorry.
  
  Reply
Arun November 15, 2017 at 12:10 am #

hey

Traceback (most recent call last):

File “”, line 1, in
b=model.fit(X, Y,validation_split=0.33, epochs=150, batch_size=10,verbose=0)

File “/Users/arun/anaconda/lib/python3.6/site-packages/keras/models.py”, line 893, in fit
initial_epoch=initial_epoch)

File “/Users/arun/anaconda/lib/python3.6/site-packages/keras/engine/training.py”, line 1555, in fit
batch_size=batch_size)

File “/Users/arun/anaconda/lib/python3.6/site-packages/keras/engine/training.py”, line 1409, in _standardize_user_data
exception_prefix=’input’)

File “/Users/arun/anaconda/lib/python3.6/site-packages/keras/engine/training.py”, line 126, in _standardize_input_data
array = data[i]

UnboundLocalError: local variable ‘arrays’ referenced before assignment

I am getting this error when trying to fit the model. Looks like its an error in Keras.

Reply
- Jason Brownlee November 15, 2017 at 9:53 am #
  
  Perhaps double check you have the latest version of all libraries installed:
  https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
  
  Also confirm that you have copied all of the code in the example.
  
  Reply
Himmat Ram Bairwa November 16, 2017 at 6:37 pm #

hi
i have give my own .jpg image then i want to predict the class but i am getting wrong result.

img = cv2.imread(‘C:/Users/8himmat/Desktop/ML/data/train/cats/dog2.jpg’)
img = cv2.resize(img,(32,32))
img = numpy.array(img)
img = numpy.reshape(img,[1,3,32,32])
preds = model.predict_classes(img)

Reply
- Jason Brownlee November 17, 2017 at 9:23 am #
  
  Perhaps the model is not good enough for your example?
  
  Try more examples, try training the model with more examples like the example you tried, try augmenting during training, and so on.
  
  Reply
Sheila M December 10, 2017 at 9:38 am #

Dear Jason,
First thanks a million for the tutorial you have provided.
I am trying to work on liver segmentation with ConvNN, and my data set is Sliver07 which includes meta images. there are two folders named scan which includes the oroginal meta images of CTscan, 20 on total and the second folder named segment which includes 20 meta images of segmented livers. Each instance has 2 formats .mhd and .raw, they can be easily viewed by the itksnap. I can load data and get the numpy array but the numpy array i get from the output for the original images and segmented images are as below:
[[[-1007 -997 -1007 …, -973 -969 -1007]
[-1002 -1000 -1007 …, -979 -999 -1015]
[-1000 -993 -1003 …, -999 -1009 -1007]
…,
[ -894 -877 -883 …, -893 -895 -906]
[ -872 -878 -894 …, -896 -892 -894]
[ -873 -882 -893 …, -898 -909 -896]]

[[-1005 -999 -1006 …, -972 -964 -1008]
[-1008 -995 -1004 …, -966 -995 -1022]
[-1004 -991 -1001 …, -990 -1015 -1008]
…,
[ -905 -882 -897 …, -889 -878 -895]
[ -879 -884 -910 …, -885 -880 -891]
[ -874 -893 -907 …, -882 -887 -897]]

[[-1001 -1010 -1009 …, -991 -967 -1000]
[-1000 -1004 -1006 …, -977 -989 -1019]
[ -993 -993 -1001 …, -988 -1013 -1001]
…,
[ -902 -918 -911 …, -899 -888 -892]
[ -905 -909 -911 …, -899 -887 -888]
[ -909 -911 -908 …, -900 -901 -896]]

…,
[[-1001 -1002 -1002 …, -1007 -998 -1004]
[-1004 -1004 -1005 …, -1003 -1009 -1000]
[ -996 -1002 -1005 …, -996 -1000 -988]
…,
[ -888 -896 -897 …, -883 -888 -898]
[ -879 -875 -866 …, -886 -881 -894]
[ -878 -866 -867 …, -895 -891 -896]]

[[ -986 -990 -997 …, -1010 -1004 -1008]
[ -999 -994 -995 …, -1003 -1007 -1002]
[-1006 -1011 -1006 …, -1000 -996 -980]
…,
[ -887 -894 -900 …, -888 -893 -903]
[ -882 -885 -879 …, -891 -883 -892]
[ -880 -876 -876 …, -896 -884 -882]]

[[-1000 -1004 -1003 …, -1002 -1003 -997]
[-1007 -1002 -1004 …, -1002 -997 -998]
[-1013 -1011 -997 …, -998 -995 -993]
…,
[ -887 -894 -891 …, -886 -891 -900]
[ -892 -892 -885 …, -892 -885 -893]
[ -895 -893 -885 …, -902 -893 -889]]]
segmented numpy:
[[[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
…,
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]]

[[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
…,
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]]

[[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
…,
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]]

…,
[[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
…,
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]]

[[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
…,
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]]

[[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
…,
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]
[0 0 0 …, 0 0 0]]]
1) I have no idea how i can build a model for this.
2)The numbers seem to be so strange for me, why all negative in original and all zero in segmented ones?
3) I expected to have 20 array for each folder, which I cannot make sense with these results.
Best Regards

Reply
- Jason Brownlee December 11, 2017 at 5:20 am #
  
  Perhaps contact the owners of the dataset to ask for more information about it?
  
  Reply
Fabrizio January 18, 2018 at 11:29 pm #

Hi Jason,
I’m currently working on a real-time object detection. I have some constrains I need to keep in mind: this program should work under an ordinary PC desktop and the main idea is that it should be able to detect objects taking a video as input. So it shouldn’t be expensive when it comes to computational cost.
Is a CNN a good approach to solve this problem? Googling, I’ve seen a pretrained DNN on OpenCV (I should use OpenCV as much as possible) but it runs under Caffe and I have no idea about Caffe.
That’s why I am here and I think a CNN would be a nice approach. What do you think? Do you have some extra advices/tips?

Reply
- Jason Brownlee January 19, 2018 at 6:29 am #
  
  I think running a CNN is straight forward, training is very slow.
  
  Reply
ali February 3, 2018 at 6:37 am #

hi.i did not see any image identification!

Reply
Pratip March 11, 2018 at 12:35 am #

Hello Sir ,

Can you please tell me what is the error , i have just copied the above code but it displays some error .

ValueError: Error when checking input: expected conv2d_26_input to have shape (3, 32, 32) but got array with shape (32, 32, 3)

I have tried changing the input shape but error still exists .

Reply
- Jason Brownlee March 11, 2018 at 6:29 am #
  
  It looks like the model expects the data to be one size and the data is another. You can change the model or change the data.
  
  Reply
Satendra Ramesh Varma March 26, 2018 at 7:56 am #

It takes too long for CIFAR dataset to download (EST 4 HRS+). Is there a way we can load the data directly using the zip file from their website ?

Reply
- Satendra Ramesh Varma March 26, 2018 at 8:29 am #
  
  Never mind. I found the solution.
  Download the zip file from the CIFAR dataset website.
  Place the zip file in keras–>datasets folder.
  The zip file contains the batches folder.
  Now use 7zip on windows to compress the file first as tar and then as zip using Add to Archive option in right click menu
  And run the code again.
  Saves a lot of time and no code changes.
  
  Reply
  - Jason Brownlee March 26, 2018 at 10:05 am #
    
    Glad to hear it.
    
    Reply
- Jason Brownlee March 26, 2018 at 10:04 am #
  
  I’m not sure off hand, sorry.
  
  Reply
Meenakshi Choudhary April 7, 2018 at 4:15 pm #

Could You please share the code for fusion of multiple convolutional neural networks in python/R/Matlab.

Reply
- Jason Brownlee April 8, 2018 at 6:16 am #
  
  Sure, see here:
  https://machinelearningmastery.com/develop-n-gram-multichannel-convolutional-neural-network-sentiment-analysis/
  
  Reply
Satyajit Pattnaik April 9, 2018 at 3:15 pm #

Need a direction on how to proceed with Crack detection, I have certain images of some machines out of which some images have cracks on it, need to identify the images having cracks, how this problem can be solved?

Tried using Canny edge detection, but after finding the edges, a bit stuck how to proceed, if there are any other ways to achieve, please reply, i shall try working on it..

Reply
- Jason Brownlee April 10, 2018 at 6:13 am #
  
  Sounds like a great problem. Perhaps you can use transfer learning with a pre-trained model like the VGG to get started with model development?
  
  Reply
Raj April 12, 2018 at 11:14 pm #

Dear Jason,

I have some images of scatterplots (matplotlib) and idl file containing the coordinates for the bounding boxes of the tick marks, tick values and points, and I want to do the object detection task now can you please suggest me some pretrained models with which I can do object detection for the tick marks, tick values and points?

Reply
- Jason Brownlee April 13, 2018 at 6:41 am #
  
  That sounds like a great project and using a pre-trained model is a great way to accelerate progress.
  
  Perhaps start with one of these:
  https://keras.io/applications/
  
  Reply
  - Raj April 15, 2018 at 2:03 am #
    
    Dear Jason,
    Thank you for your reply . Actually I have one more question, in object detection task if I am giving the images and the (Xmin,Ymin,Xmax,Ymax) i.e. bounding boxes in the training of the model then what will I give in the testing of the model only the images ?
    
    Reply
    - Jason Brownlee April 15, 2018 at 6:35 am #
      
      I’m not sure I follow, what do you mean exactly?
      
      Reply
Raj April 15, 2018 at 8:26 am #

I mean if I have training and testing images and I have the bounding boxes for the training as well as testing images and I am giving the images and the bounding boxes as the labelled data to my model for training then what will I give during the testing of my model only the testing images or images with the bounding boxes (labelled data ) ?

Reply
Kartik May 10, 2018 at 4:23 pm #

Hi, I want to detect an object as small as 30×40 pixels in an image that is 6000×4000 pixels. Would this approach be correct ?

Reply
- Jason Brownlee May 11, 2018 at 6:32 am #
  
  Wow, it might be challenging.
  
  Try it and see.
  
  Reply
Ekaterina May 17, 2018 at 10:49 am #

Hi everyone, I have a problem with importing cifar10 dataset:

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

AttributeError Traceback (most recent call last)
in ()
3
4 # load the pre-shuffled train and test data
—-> 5 (x_train, y_train), (x_test, y_test) = cifar10.load_data()

~/anaconda3/lib/python3.6/site-packages/keras/datasets/cifar10.py in load_data()
18 for i in range(1, 6):
19 fpath = os.path.join(path, ‘data_batch_’ + str(i))
—> 20 data, labels = load_batch(fpath)
21 X_train[(i-1)*10000:i*10000, :, :, :] = data
22 y_train[(i-1)*10000:i*10000] = labels

~/anaconda3/lib/python3.6/site-packages/keras/datasets/cifar.py in load_batch(fpath, label_key)
14 for k, v in d.items():
15 del(d[k])
—> 16 d[k.decode(“utf8”)] = v
17 f.close()
18 data = d[“data”]

AttributeError: ‘str’ object has no attribute ‘decode’

Did anyone have this problem? Any solution?

Reply
- Ekaterina May 17, 2018 at 1:06 pm #
  
  Solved by reinstalling Keras
  
  Reply
  - Jason Brownlee May 17, 2018 at 3:12 pm #
    
    Glad to hear it.
    
    Reply
- Jason Brownlee May 17, 2018 at 3:11 pm #
  
  I’m sorry to hear that. Perhaps try posting your code and error to stackoverflow?
  
  Reply
Julia June 7, 2018 at 1:48 am #

Hi, thanks for this tutorial, it is very helpful! I used the same architecture as your deep CNN but it is taking over 10 minutes to do one epoch, but on yours it looks like it only took about 30 seconds. As far as I can tell, my code is the same as yours and my computer has never been this slow before. Any ideas on what might be causing this?

Reply
- Jason Brownlee June 7, 2018 at 6:32 am #
  
  Perhaps try training on EC2:
  https://machinelearningmastery.com/develop-evaluate-large-deep-learning-models-keras-amazon-web-services/
  
  Reply
Anindya July 19, 2018 at 10:36 pm #

Hi Jason,

Great tutorial, very helpful for my project. 🙂
It will be very helpful if you can publish any post on real time object detection and localization, like YOLO or SSD. If you already have any post on that can you please share the link here. Thanks 🙂

Reply
- Jason Brownlee July 20, 2018 at 5:58 am #
  
  Thanks for the suggestion.
  
  Reply
Inder August 5, 2018 at 4:53 pm #

Jason, what if our testing image have other object which is not from this dataset. or simply plain image without class object.
This will still classify and match something from our class set, how we can handle this problem.
One solution i found is create a class which have only negative images (not related to existing classes), then in this world there are two many different negative images which is hard to collect. please explain how we can make this more perfect.

Reply
- Jason Brownlee August 6, 2018 at 6:25 am #
  
  Try it and see.
  
  Reply
Vishwas August 7, 2018 at 9:02 pm #

Hi Jason,

How do we decide on the number of Convolutional layers for an image,Is it Trial-error method or is there a rule behind it

Reply
- Jason Brownlee August 8, 2018 at 6:18 am #
  
  Yes, trial and error. I explain more here:
  https://machinelearningmastery.com/how-to-configure-the-number-of-layers-and-nodes-in-a-neural-network/
  
  Reply
Suprasad Kamath August 9, 2018 at 4:26 pm #

Thanks a lot, Jason, for all your posts. It has given me the confidence to write simple deep learning models using keras. This post really helped me create my first CNN model, something which I have been banging my head around for a few days. I just have few questions.

1) How did you derive the number of neurons for each hidden layer? I know we can do hyperparameter tuning using GridSearchCV from scikit-learn as detailed out in your other blog post. I am able to tune the neurons for the input layer but I get an error when I try to tune the subsequent layers.

2) How can I check if my model is overfitting or underfitting? I know you have used Dropout to hep reduce overfitting. But is there a way I can be sure if my model is overfitting or underfitting?

3) Why have you added two Conv2D layers followed by just one MaxPooling2D layer? I was under the impression that max pooling has to be done after every convolution layer. Can you please put some more light on this?

4) I am using tensorflow-gpu as my backend. In order for my gpu to take the processing load I have to increase the batch_size. I am just curious to know how batch_size affects accuracy while there is a significant boost to processing speed.

Reply
- Jason Brownlee August 10, 2018 at 6:10 am #
  
  Well done!
  
  Model configuration is trial and error, learn more here:
  https://machinelearningmastery.com/faq/single-faq/how-many-layers-and-nodes-do-i-need-in-my-neural-network
  
  Learning curves can diagnose overfitting, learn more here:
  https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-overfitting-and-underfitting
  
  There are few rules in model design, use what works best for your problem.
  
  Batch size can speed up or slow down the rate at which the model learns, you learn here:
  https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-a-batch-and-an-epoch
  
  Reply
Suprasad Kamath August 11, 2018 at 1:35 pm #

Thank you, sir. I will go through these links too.

I do have one small suggestion. The example you have used did not have many steps to prepare the training and test datasets. What if someone wanted to use their own image files to create a CNN model instead of the standard CIFAR-10? This is where I had a lot of problems, something which is not as straightforward as other supervised ML codes. I finally figured it out after a lot of trial and error. It would really help beginners like me who visit your blog posts if an example was provided (if you have not already written a post on this) where own data is used to build a model. This is just my personal opinion as only I know how hard it was for me to write an end to end program. I am assuming there would be many more like me who are new to deep learning.

Thanks again for all your blog posts.

Reply
- Jason Brownlee August 12, 2018 at 6:30 am #
  
  Great suggestion, thanks.
  
  Reply
Deepanshu Agarwal August 13, 2018 at 3:05 pm #

Hi Jason,

I am working on a similar project, in which i have to determine whether handwritten signature is present or not in a scanned image.

Could you please suggest some approach solve this problem, to get yes or no as output for signature presence.

Thanks

Reply
- Jason Brownlee August 14, 2018 at 6:14 am #
  
  You might be able to solve it with classical computer vision techniques>, maybe you don’t need a learned model?
  
  E.g. if white space on document is not empty, then there is a signature.
  
  Reply
Dhiraj August 27, 2018 at 7:47 pm #

Hi Jason,

Your article helped me a lot to understand the implementation of the convolutional neural net.
Thank you.

I want to know can I implement the same concept to differentiate between different types of coin using CNN ?

Reply
- Jason Brownlee August 28, 2018 at 5:59 am #
  
  Probably.
  
  Reply
YASH MEHTA October 19, 2018 at 8:44 pm #

Hi Jason,

I am doing detecting the water leakage on raod using CNN with kears library. How to decide the no of convolution layer to be applied and the values that are there in the conv2d function. From where i can get to understang this basics of this things. The dataset of water leakage is not available so i myself downloading the images creating the dataset so how much data i need to get better result. Can i train model using 400 images because i m not able to find more images

Reply
- Jason Brownlee October 20, 2018 at 5:55 am #
  
  I recommend testing a suite of different model configurations in order to discover what works best for your specific problem.
  
  Reply
Arjun November 10, 2018 at 1:54 pm #

Hi Jason…this is a great post on CNN and thanks for this awsome post.
I would be glad if u could tell me how can we implement object localisation on this…..Thanks in advance????

Reply
- Jason Brownlee November 11, 2018 at 5:56 am #
  
  Thanks for the suggestion.
  
  Reply
prisilla January 8, 2019 at 11:16 pm #

Hi Jason,

I trying evaluate the total accuracy

my piece of code is as follows
training_set = train_datagen.flow_from_directory(‘dataset/training_set’,
target_size = (64, 64),
batch_size = batch_size,
class_mode = ‘binary’)
test_set = test_datagen.flow_from_directory(‘dataset/test_set’,
target_size = (64, 64),
batch_size = batch_size,
class_mode = ‘binary’)
H = classifier.fit_generator(training_set,
steps_per_epoch = 2000,
epochs = 2,
validation_data = None)

# Making Predictions
import numpy as np
from keras.preprocessing import image
test_image = image.load_img(r’dataset\dog.4028.jpg’, target_size = (64,64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis = 0)
acc = classifier.evaluate(training_set,test_image, verbose = 0)
print(‘Test accuracy:’, acc)

Two statements:

scores = model.evaluate(X_test, y_test, verbose=0)
print(“Accuracy: %.2f%%” % (scores[1]*100))

are replaced by following two statement

acc = classifier.evaluate(training_set,test_image, verbose = 0)
print(‘Test accuracy:’, acc)

but i am not getting the accuracy it is giving error. what is wrong in the syntax

Reply
- Jason Brownlee January 9, 2019 at 8:46 am #
  
  Perhaps try posting your code and error to stackoverflow?
  
  Reply
ajith February 6, 2019 at 3:11 pm #

Hy, strated with deeplearing
Cats and dogs
But my accuracy is stuck @50%
I tried droupout, regulations and data ammune

Reply
- Jason Brownlee February 7, 2019 at 6:35 am #
  
  Here are some suggestions:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
Amit February 8, 2019 at 3:19 am #

Hi Jason,
I am new to Machine Learning & wanted to understand the first layer weight dimension calculation for 3 channels. Modified the first layer like
—————–
model.add(Conv2D(8, (5, 5), input_shape=(32, 32, 3), padding=’same’, activation=’relu’, kernel_constraint=maxnorm(3)))
—————–
and displayed below weight matrix dimension.
—————–
len(model.layers[0].get_weights()[0]),
len(model.layers[0].get_weights()[0][0]),
len(model.layers[0].get_weights()[0][0][0]),
len(model.layers[0].get_weights()[0][0][0][0])
—————–

The result is
—————
(5, 5, 3, 8)
—————

what have learned so far theoritically , 5*5 filter will convolve to 32*32 having stride 1*1 & padding same, it will result into 28*28 for each channel.

addition of such result matrix(28*28) will happen for all three channels, still the result matrix will be 28*28.

finally for all kernels (8) total weight dimension will be 8*28*28

But how it is getting calculated as (5, 5, 3, 8)?

Regards

Reply
- Jason Brownlee February 8, 2019 at 7:56 am #
  
  The filters are the weights, e.g. 8 5×5 filters with 3 channels.
  
  Reply
Vatsal March 8, 2019 at 9:39 pm #

Hi Json,
can you tell or send me a link on how to save the model and use it to predict some new images,?

Reply
- Jason Brownlee March 9, 2019 at 6:27 am #
  
  Yes, see this post on saving the model:
  https://machinelearningmastery.com/save-load-keras-deep-learning-models/
  
  See this post on making a prediction:
  https://machinelearningmastery.com/how-to-make-classification-and-regression-predictions-for-deep-learning-models-in-keras/
  
  Reply
Sakthi Dasan Sekar May 14, 2019 at 8:40 pm #

This is an image classification problem and not object recognition/detection.

Reply
- Jason Brownlee May 15, 2019 at 8:12 am #
  
  It is the classification of photos of objects.
  
  Indeed, it is not object recognition.
  
  Reply

Mbunga Blaise July 8, 2019 at 7:50 pm #

Hi Jason, It ‘s a pleasure to read all your tutorials concerning Machine Learning and deep learning. I would like to thank you for all your tutorials and the responses you bring to us. I have created two blocks of CNN using Keras functionnal API to extract best representations of Smiles (drugs sequence) and Proteines sequences, then I concanated the two outputs from the 2 blocs, dropped out and gave it to dense layers. When I printed the summary of the model. it was fine, however when I try to fit the model, the screen displayed the following error message:

ValueError: Error when checking input: expected input_8 to have shape (103, 64) but got array with shape (31824, 72)
Below is my short code

# Defining inputs 
XDinput = Input(shape=(max_smi_len, charsmiset_size))
XTinput = Input(shape=(max_seq_len, charseqset_size))

smi = Conv1D(filters=32, kernel_size=4,  activation='relu', padding='valid',  strides=1)(XDinput)
out1 = GlobalMaxPooling1D()(smi)
#out_f1 = Flatten(input_shape=(103, 64))(out1)
# Defining input 2
    
spro = Conv1D(filters=32, kernel_size=4,  activation='relu', padding='valid',  strides=1)(XTinput)
out2 = GlobalMaxPooling1D()(spro)
#out_f2 = Flatten(input_shape=(2549, 24))(out2)

# Merging subnetworks
interaction = keras.layers.concatenate([out1, out2], axis=1)

# Final Dense layer and compilation
# Fully connected 
#flatten_layer = tf.keras.layers.Flatten(input_shape=(28, 28))
FC1 = Dense(1024, activation='relu')(interaction)
FC2 = Dropout(0.1)(FC1)
predictions = Dense(1, kernel_initializer='normal')(FC2)

interactionModel = Model(inputs=[XDinput, XTinput], outputs=[predictions])

interactionModel.compile(loss='mean_squared_error', optimizer='adam',  metrics=['accuracy'])

print(interactionModel.summary())

interactionModel.fit(([np.array(train_),np.array(train_xp)]), np.array(train_Y), epochs=10, batch_size=64, verbose=0)

# Defining inputs

XDinput = Input(shape=(max_smi_len, charsmiset_size))

XTinput = Input(shape=(max_seq_len, charseqset_size))

smi = Conv1D(filters=32, kernel_size=4, activation='relu', padding='valid', strides=1)(XDinput)

out1 = GlobalMaxPooling1D()(smi)

#out_f1 = Flatten(input_shape=(103, 64))(out1)

# Defining input 2

spro = Conv1D(filters=32, kernel_size=4, activation='relu', padding='valid', strides=1)(XTinput)

out2 = GlobalMaxPooling1D()(spro)

#out_f2 = Flatten(input_shape=(2549, 24))(out2)

# Merging subnetworks

interaction = keras.layers.concatenate([out1, out2], axis=1)

# Final Dense layer and compilation

# Fully connected

#flatten_layer = tf.keras.layers.Flatten(input_shape=(28, 28))

FC1 = Dense(1024, activation='relu')(interaction)

FC2 = Dropout(0.1)(FC1)

predictions = Dense(1, kernel_initializer='normal')(FC2)

interactionModel = Model(inputs=[XDinput, XTinput], outputs=[predictions])

interactionModel.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

print(interactionModel.summary())

interactionModel.fit(([np.array(train_),np.array(train_xp)]), np.array(train_Y), epochs=10, batch_size=64, verbose=0)

Regarding the error message, could you be so kind to tell me what is wrong

Jason Brownlee July 9, 2019 at 8:08 am #

This is a common question that I answer here:
https://machinelearningmastery.com/faq/single-faq/can-you-read-review-or-debug-my-code

Reply

Peter August 30, 2019 at 2:16 am #

Hello Jason
I was wondering if loading all the images would at some point make the memory collapse.
So .. is there any way in Keras for reading the images direcly from a folder not loading all of them at once into memory?
Regards

Reply
- Jason Brownlee August 30, 2019 at 6:28 am #
  
  If you don’t have enough RAM, you can use progress loading with a data generator described here:
  https://machinelearningmastery.com/how-to-load-large-datasets-from-directories-for-deep-learning-with-keras/
  
  Reply
tiennguyen February 23, 2021 at 12:14 pm #

Thank for your sharing, Jason Brownlee.
I have a question: Why this tutorial, we do not use Reshape() function for data like example https://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/. More detail:
# flatten 28*28 images to a 784 vector for each image
num_pixels = X_train.shape[1] * X_train.shape[2]
X_train = X_train.reshape((X_train.shape[0], num_pixels)).astype(‘float32’)
X_test = X_test.reshape((X_test.shape[0], num_pixels)).astype(‘float32’)

Reply
- Jason Brownlee February 23, 2021 at 1:25 pm #
  
  Because the input data already has color channels and is ready to model directly.
  
  Reply
tiennguyen February 23, 2021 at 12:17 pm #

Sorry, I post wrong details. Correctly detail which I want to show with you:

# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][width][height][channels]
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype(‘float32’)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype(‘float32’)

Reply
Arpit Gupta December 5, 2021 at 8:52 pm #

Hi In this example you mentioned Data augmentation but not used this blog . if we use augmentation in current exmple is it increase the performnce?

Reply
- Adrian Tam December 8, 2021 at 7:36 am #
  
  Can be, especially if your augmentation can reduce the noise to the network so it learn better.
  
  Reply

Navigation

Object Classification with CNNs Using the Keras Deep Learning Library

The CIFAR-10 Problem Description

Need help with Deep Learning in Python?

Loading The CIFAR-10 Dataset in Keras

Simple Convolutional Neural Network for CIFAR-10

Larger Convolutional Neural Network for CIFAR-10

Extensions to Improve Model Performance

Summary

More On This Topic

188 Responses to Object Classification with CNNs Using the Keras Deep Learning Library

Leave a Reply Click here to cancel reply.