How to Get Started With Deep Learning for Computer Vision (7-Day Mini-Course)

By Jason Brownlee on April 2, 2020 in Deep Learning for Computer Vision

Deep Learning for Computer Vision Crash Course.
Bring Deep Learning Methods to Your Computer Vision Project in 7 Days.

We are awash in digital images from photos, videos, Instagram, YouTube, and increasingly live video streams.

Working with image data is hard as it requires drawing upon knowledge from diverse domains such as digital signal processing, machine learning, statistical methods, and these days, deep learning.

Deep learning methods are outperforming classical and statistical methods on some challenging computer vision problems, often with single, simpler models.

In this crash course, you will discover how you can get started and confidently develop deep learning for computer vision problems using Python in seven days.

Note: This is a big and important post. You might want to bookmark it.

Let’s get started.

Update Nov/2019: Updated for TensorFlow v2.0 and MTCNN v0.1.0.

Photo by oliver.dodd, some rights reserved.

Who Is This Crash-Course For?

Before we get started, let’s make sure you are in the right place.

The list below provides some general guidelines as to who this course was designed for.

Don’t panic if you don’t match these points exactly; you might just need to brush up in one area or another to keep up.

You need to know:

You need to know your way around basic Python, NumPy, and Keras for deep learning.

You do NOT need to be:

You do not need to be a math wiz!
You do not need to be a deep learning expert!
You do not need to be a computer vision researcher!

This crash course will take you from a developer that knows a little machine learning to a developer who can bring deep learning methods to your own computer vision project.

Note: This crash course assumes you have a working Python 2 or 3 SciPy environment with at least NumPy, Pandas, scikit-learn, and Keras 2 installed. If you need help with your environment, you can follow the step-by-step tutorial here:

How to Setup a Python Environment for Machine Learning and Deep Learning

Crash-Course Overview

This crash course is broken down into seven lessons.

You could complete one lesson per day (recommended) or complete all of the lessons in one day (hardcore). It really depends on the time you have available and your level of enthusiasm.

Below are the seven lessons that will get you started and productive with deep learning for computer vision in Python:

Lesson 01: Deep Learning and Computer Vision
Lesson 02: Preparing Image Data
Lesson 03: Convolutional Neural Networks
Lesson 04: Image Classification
Lesson 05: Train Image Classification Model
Lesson 06: Image Augmentation
Lesson 07: Face Detection

Each lesson could take you anywhere from 60 seconds up to 30 minutes. Take your time and complete the lessons at your own pace. Ask questions and even post results in the comments below.

The lessons might expect you to go off and find out how to do things. I will give you hints, but part of the point of each lesson is to force you to learn where to look for help on deep learning, computer vision, and the best-of-breed tools in Python (hint: I have all of the answers on this blog, just use the search box).

Post your results in the comments; I’ll cheer you on!

Hang in there; don’t give up.

Note: This is just a crash course. For a lot more detail and fleshed out tutorials, see my book on the topic titled “Deep Learning for Computer Vision.”

Lesson 01: Deep Learning and Computer Vision

In this lesson, you will discover the promise of deep learning methods for computer vision.

Computer Vision

Computer Vision, or CV for short, is broadly defined as helping computers to “see” or extract meaning from digital images such as photographs and videos.

Researchers have been working on the problem of helping computers see for more than 50 years, and some great successes have been achieved, such as the face detection available in modern cameras and smartphones.

The problem of understanding images is not solved, and may never be. This is primarily because the world is complex and messy. There are few rules. And yet we can easily and effortlessly recognize objects, people, and context.

Deep Learning

Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks.

A property of deep learning is that the performance of this type of model improves by training it with more examples and by increasing its depth or representational capacity.

In addition to scalability, another often-cited benefit of deep learning models is their ability to perform automatic feature extraction from raw data, also called feature learning.

Promise of Deep Learning for Computer Vision

Deep learning methods are popular for computer vision, primarily because they are delivering on their promise.

Some of the first large demonstrations of the power of deep learning were in computer vision, specifically image classification, and more recently in object detection and face recognition.

The three key promises of deep learning for computer vision are as follows:

The Promise of Feature Learning. That is, that deep learning methods can automatically learn the features from image data required by the model, rather than requiring that the feature detectors be handcrafted and specified by an expert.
The Promise of Continued Improvement. That is, that the performance of deep learning in computer vision is based on real results and that the improvements appear to be continuing and perhaps speeding up.
The Promise of End-to-End Models. That is, that large end-to-end deep learning models can be fit on large datasets of images or video offering a more general and better-performing approach.

Computer vision is not “solved” but deep learning is required to get you to the state-of-the-art on many challenging problems in the field.

Your Task

For this lesson, you must research and list five impressive applications of deep learning methods in the field of computer vision. Bonus points if you can link to a research paper that demonstrates the example.

Post your answer in the comments below. I would love to see what you discover.

In the next lesson, you will discover how to prepare image data for modeling.

Lesson 02: Preparing Image Data

In this lesson, you will discover how to prepare image data for modeling.

Images are comprised of matrices of pixel values.

Pixel values are often unsigned integers in the range between 0 and 255. Although these pixel values can be presented directly to neural network models in their raw format, this can result in challenges during modeling, such as slower than expected training of the model.

Instead, there can be great benefit in preparing the image pixel values prior to modeling, from simply scaling pixel values to the range 0-1 to centering and even standardizing the values.

Scaling pixel values to the range 0-1 is called normalization and can be performed directly on a loaded image. The example below uses the Pillow library (the maintained fork of PIL and the de facto standard image handling library in Python) to load an image and normalize its pixel values.

First, confirm that you have the Pillow library installed; it is installed with most SciPy environments, but you can learn more here:

PIL/Pillow Installation Instructions

Next, download a photograph of Bondi Beach in Sydney, Australia, taken by Isabell Schulz and released under a permissive license. Save the image in your current working directory with the filename ‘bondi_beach.jpg’.

Download a Photograph of Bondi Beach (bondi_beach.jpg)

Next, we can use the Pillow library to load the photo, confirm the min and max pixel values, normalize the values, and confirm the normalization was performed.

# example of pixel normalization
from numpy import asarray
from PIL import Image
# load image
image = Image.open('bondi_beach.jpg')
pixels = asarray(image)
# confirm pixel range is 0-255
print('Data Type: %s' % pixels.dtype)
print('Min: %.3f, Max: %.3f' % (pixels.min(), pixels.max()))
# convert from integers to floats
pixels = pixels.astype('float32')
# normalize to the range 0-1
pixels /= 255.0
# confirm the normalization
print('Min: %.3f, Max: %.3f' % (pixels.min(), pixels.max()))
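
Beyond scaling to the range 0-1, centering and standardizing follow the same pattern. The sketch below is a minimal illustration on a small made-up list of pixel values rather than the photograph, and it uses only the Python standard library so that it runs on its own. With the NumPy array from the example above, the same steps would use pixels.mean() and pixels.std().

```python
# sketch: centering and standardizing pixel values (hypothetical data)
from statistics import fmean, pstdev

# a tiny made-up "image" flattened to a list of 0-255 pixel values
pixels = [0.0, 64.0, 128.0, 191.0, 255.0]

# global mean and (population) standard deviation across all pixels
mean, std = fmean(pixels), pstdev(pixels)

# centering: subtract the mean so the values have a mean of zero
centered = [p - mean for p in pixels]

# standardizing: also divide by the standard deviation for unit variance
standardized = [(p - mean) / std for p in pixels]

print('Mean: %.3f, Std: %.3f' % (mean, std))
print('Centered mean: %.3f' % fmean(centered))
print('Standardized mean: %.3f, std: %.3f' % (fmean(standardized), pstdev(standardized)))
```

Statistics can also be calculated per color channel instead of globally; which choice works better depends on the dataset.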

Your Task

Your task in this lesson is to run the example code on the provided photograph and report the min and max pixel values before and after the normalization.

For bonus points, you can update the example to standardize the pixel values.

Post your findings in the comments below. I would love to see what you discover.

In the next lesson, you will discover information about convolutional neural network models.

Lesson 03: Convolutional Neural Networks

In this lesson, you will discover how to construct a convolutional neural network using a convolutional layer, pooling layer, and fully connected output layer.

Convolutional Layers

A convolution is the simple application of a filter to an input that results in an activation. Repeated application of the same filter to an input results in a map of activations called a feature map, indicating the locations and strength of a detected feature in an input, such as an image.

A convolutional layer can be created by specifying both the number of filters to learn and the fixed size of each filter, often called the kernel shape.
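
To make the idea of a filter and a feature map concrete, the sketch below applies one hand-specified 3×3 filter to a small made-up 4×4 image using plain Python. It is an illustration only: in a real convolutional layer the filter weights are learned during training, and the image, filter values, and function name here are assumptions chosen for the example. As in most deep learning libraries, the operation is technically a cross-correlation (the filter is not flipped).

```python
# sketch: applying one 3x3 filter to a small image (no padding, stride of 1)

# hypothetical 4x4 grayscale image with a vertical edge down the middle
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# hand-specified vertical edge detector (a learned filter would be fit from data)
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

def apply_filter(image, kernel):
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # multiply the image patch by the filter element-wise and sum
            total = sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(k) for j in range(k))
            row.append(total)
        feature_map.append(row)
    return feature_map

feature_map = apply_filter(image, kernel)
for row in feature_map:
    print(row)
```

The 4×4 input produces a 2×2 feature map, because a 3×3 filter fits in only two positions along each dimension when no padding is used.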

Pooling Layers

Pooling layers provide an approach to downsampling feature maps by summarizing the presence of features in patches of the feature map.

Maximum pooling, or max pooling, is a pooling operation that calculates the maximum, or largest, value in each patch of each feature map.
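
Max pooling can be sketched in a few lines of plain Python. The example below assumes a 2×2 pool with a stride of 2 (the default for the Keras MaxPooling2D layer) and a made-up 4×4 feature map; the function name is hypothetical.

```python
# sketch: 2x2 max pooling with a stride of 2 on a small feature map
def max_pool(feature_map, size=2):
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    pooled = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # collect the values in this non-overlapping patch
            patch = [feature_map[r * size + i][c * size + j]
                     for i in range(size) for j in range(size)]
            # keep only the largest activation in the patch
            row.append(max(patch))
        pooled.append(row)
    return pooled

# hypothetical 4x4 feature map
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]

pooled = max_pool(feature_map)
for row in pooled:
    print(row)
```

Each dimension of the feature map is halved, and only the strongest activation in each patch is kept.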

Classifier Layer

Once the features have been extracted, they can be interpreted and used to make a prediction, such as classifying the type of object in a photograph.

This can be achieved by first flattening the two-dimensional feature maps, and then adding a fully connected output layer. For a binary classification problem, the output layer would have one node that would predict a value between 0 and 1 for the two classes.

Convolutional Neural Network

The example below creates a convolutional neural network that expects grayscale images with the square size of 256×256 pixels, with one convolutional layer with 32 filters, each with the size of 3×3 pixels, a max pooling layer, and a binary classification output layer.

# cnn with single convolutional, pooling and output layer
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
# create model
model = Sequential()
# add convolutional layer
model.add(Conv2D(32, (3,3), input_shape=(256, 256, 1)))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.summary()
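
The output shapes and parameter counts reported by the summary can be checked by hand. The sketch below works through the arithmetic for this model, assuming the Keras defaults of no padding and a stride of 1 for the convolutional layer, and a 2×2 pool for the pooling layer; the helper function names are hypothetical.

```python
# sketch: shape and parameter arithmetic for the single-layer CNN above
def conv_output(size, kernel):
    # 'valid' convolution (no padding) with a stride of 1
    return size - kernel + 1

def pool_output(size, pool=2):
    # non-overlapping pooling; any leftover border is dropped
    return size // pool

side, channels, filters, kernel = 256, 1, 32, 3

after_conv = conv_output(side, kernel)
after_pool = pool_output(after_conv)
flattened = after_pool * after_pool * filters

# each filter has kernel*kernel*channels weights plus one bias
conv_params = (kernel * kernel * channels + 1) * filters
# the single output node has one weight per flattened input plus one bias
dense_params = flattened + 1

print('After conv: %dx%dx%d' % (after_conv, after_conv, filters))
print('After pool: %dx%dx%d' % (after_pool, after_pool, filters))
print('Flattened: %d' % flattened)
print('Conv params: %d, Dense params: %d' % (conv_params, dense_params))
```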

Your Task

Your task in this lesson is to run the example and describe how the shape of an input image would be changed by the convolutional and pooling layers.

For extra points, you could try adding more convolutional or pooling layers and describe the effect it has on the image as it flows through the model.

Post your findings in the comments below. I would love to see what you discover.

In the next lesson, you will learn how to use a deep convolutional neural network to classify photographs of objects.

Lesson 04: Image Classification

In this lesson, you will discover how to use a pre-trained model to classify photographs of objects.

Deep convolutional neural network models may take days, or even weeks, to train on very large datasets.

A way to short-cut this process is to re-use the model weights from pre-trained models that were developed for standard computer vision benchmark datasets, such as the ImageNet image recognition tasks.

The example below uses the VGG-16 pre-trained model to classify photographs of objects into one of 1,000 known classes.

Download this photograph of a dog taken by Justin Morgan and released under a permissive license. Save it in your current working directory with the filename ‘dog.jpg’.

Download a Photograph of a Dog (dog.jpg)

The example below will load the photograph and output a prediction, classifying the object in the photograph.

Note: The first time you run the example, the pre-trained model will have to be downloaded, which is a few hundred megabytes and may take a few minutes depending on the speed of your internet connection.

# example of using a pre-trained model as a classifier
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input
from keras.applications.vgg16 import decode_predictions
from keras.applications.vgg16 import VGG16
# load an image from file
image = load_img('dog.jpg', target_size=(224, 224))
# convert the image pixels to a numpy array
image = img_to_array(image)
# reshape data for the model
image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
# prepare the image for the VGG model
image = preprocess_input(image)
# load the model
model = VGG16()
# predict the probability across all output classes
yhat = model.predict(image)
# convert the probabilities to class labels
label = decode_predictions(yhat)
# retrieve the most likely result, e.g. highest probability
label = label[0][0]
# print the classification
print('%s (%.2f%%)' % (label[1], label[2]*100))
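
It can help to know what the preprocess_input() step does to the pixels. To my understanding, for VGG16 Keras uses a "caffe"-style preparation: the channels are reordered from RGB to BGR and the ImageNet mean of each channel is subtracted, with no scaling. The sketch below illustrates this on a single made-up pixel in plain Python; the function name is hypothetical and the mean values are those I believe Keras uses, so treat it as an illustration rather than the library's implementation.

```python
# sketch: VGG-style pixel preparation on a single pixel (illustration only)

# ImageNet per-channel means in B, G, R order (assumed to match Keras 'caffe' mode)
IMAGENET_MEAN_BGR = (103.939, 116.779, 123.68)

def preprocess_pixel(rgb):
    r, g, b = rgb
    # reorder the channels from RGB to BGR
    bgr = (b, g, r)
    # center each channel by subtracting its dataset mean
    return tuple(value - mean for value, mean in zip(bgr, IMAGENET_MEAN_BGR))

# hypothetical orange pixel in RGB order
pixel = (255.0, 128.0, 0.0)
result = preprocess_pixel(pixel)
print(result)
```

This is why the same preparation used to train a pre-trained model should also be applied to new images before making a prediction.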

Your Task

Your task in this lesson is to run the example and report the result.

For bonus points, try running the example on another photograph of a common object.

Post your findings in the comments below. I would love to see what you discover.

In the next lesson, you will discover how to fit and evaluate a model for image classification.

Lesson 05: Train Image Classification Model

In this lesson, you will discover how to train and evaluate a convolutional neural network for image classification.

The Fashion-MNIST clothing classification problem is a new standard dataset used in computer vision and deep learning.

It is a dataset composed of 60,000 training images and 10,000 test images, each a small square 28×28 pixel grayscale image of one of 10 types of clothing, such as shoes, t-shirts, dresses, and more.

The example below loads the dataset, scales the pixel values, then fits a convolutional neural network on the training dataset and evaluates the performance of the network on the test dataset.

The example will run in just a few minutes on a modern CPU; no GPU is required.

# fit a cnn on the fashion mnist dataset
from keras.datasets import fashion_mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
# load dataset
(trainX, trainY), (testX, testY) = fashion_mnist.load_data()
# reshape dataset to have a single channel
trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
testX = testX.reshape((testX.shape[0], 28, 28, 1))
# convert from integers to floats
trainX, testX = trainX.astype('float32'), testX.astype('float32')
# normalize to range 0-1
trainX, testX = trainX / 255.0, testX / 255.0
# one hot encode target values
trainY, testY = to_categorical(trainY), to_categorical(testY)
# define model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# fit model
model.fit(trainX, trainY, epochs=10, batch_size=32, verbose=2)
# evaluate model
loss, acc = model.evaluate(testX, testY, verbose=0)
print(loss, acc)
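
Two steps in the example are worth seeing by hand: the one hot encoding performed by to_categorical() and the categorical cross-entropy loss reported during training. The sketch below is a plain Python illustration for a single example with 10 classes; the function names and the predicted probabilities are made up for the example.

```python
# sketch: one hot encoding and cross-entropy loss for a single example
from math import log

def one_hot(label, num_classes):
    # a vector of zeros with a 1 at the index of the class label
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_crossentropy(target, predicted):
    # only the term for the true class contributes to the sum
    return -sum(t * log(p) for t, p in zip(target, predicted) if t > 0)

# class label 3 out of 10 classes
target = one_hot(3, 10)

# hypothetical softmax output: most of the probability on class 3
predicted = [0.02] * 10
predicted[3] = 0.82

loss = categorical_crossentropy(target, predicted)
print(target)
print('Loss: %.3f' % loss)
```

The more probability the model assigns to the correct class, the closer the loss gets to zero.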

Your Task

Your task in this lesson is to run the example and report the performance of the model on the test dataset.

For bonus points, try varying the configuration of the model, or try saving the model and later loading it and using it to make a prediction on new grayscale photographs of clothing.

Post your findings in the comments below. I would love to see what you discover.

In the next lesson, you will discover how to use image augmentation on training data.

Lesson 06: Image Augmentation

In this lesson, you will discover how to use image augmentation.

Image data augmentation is a technique that can be used to artificially expand the size of a training dataset by creating modified versions of images in the dataset.

Training deep learning neural network models on more data can result in more skillful models, and the augmentation techniques can create variations of the images that can improve the ability of the fit models to generalize what they have learned to new images.

The Keras deep learning neural network library provides the capability to fit models using image data augmentation via the ImageDataGenerator class.

Download a photograph of a bird by AndYaDontStop, released under a permissive license. Save it into your current working directory with the name ‘bird.jpg’.

Download a photograph of a Bird (bird.jpg)

The example below will load the photograph as a dataset and use image augmentation to create flipped and rotated versions of the image that can be used to train a convolutional neural network model.

# example using image augmentation
from numpy import expand_dims
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot
# load the image
img = load_img('bird.jpg')
# convert to numpy array
data = img_to_array(img)
# expand dimension to one sample
samples = expand_dims(data, 0)
# create image data augmentation generator
datagen = ImageDataGenerator(horizontal_flip=True, vertical_flip=True, rotation_range=90)
# prepare iterator
it = datagen.flow(samples, batch_size=1)
# generate samples and plot
for i in range(9):
    # define subplot
    pyplot.subplot(330 + 1 + i)
    # generate batch of images
    batch = next(it)
    # convert to unsigned 8-bit integers for viewing
    image = batch[0].astype('uint8')
    # plot raw pixel data
    pyplot.imshow(image)
# show the figure
pyplot.show()
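
The flips themselves are simple operations on the grid of pixels. The sketch below shows a horizontal and a vertical flip on a small made-up 2×3 image in plain Python; it illustrates the idea only and is not how the ImageDataGenerator class is implemented. The function names are hypothetical.

```python
# sketch: horizontal and vertical flips on a tiny image (illustration only)
def horizontal_flip(image):
    # reverse the order of the pixels within each row
    return [row[::-1] for row in image]

def vertical_flip(image):
    # reverse the order of the rows, copying each row
    return [row[:] for row in image[::-1]]

# hypothetical 2x3 grayscale image
image = [
    [1, 2, 3],
    [4, 5, 6],
]

print(horizontal_flip(image))
print(vertical_flip(image))
```

Applying the same flip twice returns the original image, which is an easy way to check an augmentation of this kind.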

Your Task

Your task in this lesson is to run the example and report the effect that the image augmentation has had on the original image.

For bonus points, try additional types of image augmentation, supported by the ImageDataGenerator class.

Post your findings in the comments below. I would love to see what you find.

In the next lesson, you will discover how to use a deep convolutional network to detect faces in photographs.

Lesson 07: Face Detection

In this lesson, you will discover how to use a convolutional neural network for face detection.

Face detection is a trivial problem for humans to solve and has been solved reasonably well by classical feature-based techniques, such as the cascade classifier.

More recently, deep learning methods have achieved state-of-the-art results on standard face detection datasets. One example is the Multi-task Cascade Convolutional Neural Network, or MTCNN for short.

The ipazc/MTCNN project provides an open source implementation of the MTCNN that can be installed easily as follows:

sudo pip install mtcnn

Download a photograph of a person on the street taken by Holland and released under a permissive license. Save it into your current working directory with the name ‘street.jpg’.

Download a Photograph of a Person on the Street (street.jpg)

The example below will load the photograph and use the MTCNN model to detect faces and will plot the photo and draw a box around the first detected face.

# face detection with mtcnn on a photograph
from matplotlib import pyplot
from matplotlib.patches import Rectangle
from mtcnn.mtcnn import MTCNN
# load image from file
pixels = pyplot.imread('street.jpg')
# create the detector, using default weights
detector = MTCNN()
# detect faces in the image
faces = detector.detect_faces(pixels)
# plot the image
pyplot.imshow(pixels)
# get the context for drawing boxes
ax = pyplot.gca()
# get coordinates from the first face
x, y, width, height = faces[0]['box']
# create the shape
rect = Rectangle((x, y), width, height, fill=False, color='red')
# draw the box
ax.add_patch(rect)
# show the plot
pyplot.show()
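
As far as I know, detect_faces() returns a list with one dictionary per detected face, each with a 'box' entry (x, y, width, height), a 'confidence' score, and a 'keypoints' entry for facial landmarks. The sketch below works with made-up detections in that format (keypoints omitted for brevity) to show how low-confidence results might be filtered out and how a box can be converted to corner coordinates. The values, the 0.9 threshold, and the helper name are assumptions, as is the clamping of negative coordinates, which I include as a precaution.

```python
# sketch: working with detection results in the MTCNN-style format

# hypothetical detections (not real model output)
faces = [
    {'box': [186, 71, 87, 115], 'confidence': 0.9994},
    {'box': [-4, 40, 30, 38], 'confidence': 0.62},
]

def to_corners(box):
    # convert (x, y, width, height) to (x1, y1, x2, y2)
    x, y, width, height = box
    # clamp the top-left corner so it does not fall outside the image
    x1, y1 = max(x, 0), max(y, 0)
    return x1, y1, x + width, y + height

# keep only detections the model is confident about (threshold is a guess)
confident = [face for face in faces if face['confidence'] >= 0.9]

for face in confident:
    print(to_corners(face['box']), face['confidence'])
```

Corner coordinates are convenient when cropping a detected face out of the pixel array.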

Your Task

Your task in this lesson is to run the example and describe the result.

For bonus points, try the model on another photograph with multiple faces and update the code example to draw a box around each detected face.

Post your findings in the comments below. I would love to see what you discover.

The End!
(Look How Far You Have Come)

You made it. Well done!

Take a moment and look back at how far you have come.

You discovered:

What computer vision is and the promise and impact that deep learning is having on the field.
How to scale the pixel values of image data in order to make them ready for modeling.
How to develop a convolutional neural network model from scratch.
How to use a pre-trained model to classify photographs of objects.
How to train a model from scratch to classify photographs of clothing.
How to use image augmentation to create modified copies of photographs in your training dataset.
How to use a pre-trained deep learning model to detect people’s faces in photographs.

This is just the beginning of your journey with deep learning for computer vision. Keep practicing and developing your skills.

Take the next step and check out my book on deep learning for computer vision.

Summary

How Did You Do With The Mini-Course?
Did you enjoy this crash course?

Do you have any questions? Were there any sticking points?
Let me know. Leave a comment below.

299 Responses to How to Get Started With Deep Learning for Computer Vision (7-Day Mini-Course)

Abid Rizvi April 11, 2019 at 1:53 pm #

Lesson 02: Preparing Image Data
================================

Before Normalization:

Data Type: uint8
Min: 0.000, Max: 255.000
Min: 0.000, Max: 1.000

After Normalization:
Data Type: uint8
Min: 0.000, Max: 255.000
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee April 11, 2019 at 2:21 pm #
  
  Well done.
  
  Reply
- Claudio Lombardi September 25, 2020 at 11:41 am #
  
  After Normalization
  Data Type: uint8
  Min: 0.000, Max: 255.000
  Min: 0.000, Max: 1.000
  Mean: 0.610, Std: 0.203
  
  Reply
  - Jason Brownlee September 25, 2020 at 2:46 pm #
    
    Well done!
    
    Reply
Abid Rizvi April 11, 2019 at 1:57 pm #

Lesson 02: Preparing Image Data
================================

Before Normalization:
Min: 0.000, Max: 255.000

After Normalization:
Min: 0.000, Max: 1.000

Max value 255 converts to 1.000

Reply
- Jason Brownlee April 11, 2019 at 2:21 pm #
  
  Nice work.
  
  Reply
Abid Rizvi April 11, 2019 at 1:59 pm #

Lesson 02: Preparing Image Data
===============================
For bonus points, you can update the example to standardize the pixel values.

What do you mean by standardize the pixel values? Please elaborate.

Reply
- Jason Brownlee April 11, 2019 at 2:21 pm #
  
  This post explains more:
  https://machinelearningmastery.com/how-to-normalize-center-and-standardize-images-with-the-imagedatagenerator-in-keras/
  
  Reply
Abid Rizvi April 11, 2019 at 2:18 pm #

Lesson 03: Convolutional Neural Networks
=========================================

input_shape=(256, 256, 1)

Convolutional Layer 1 (filter size 3×3)
————————————–
model.add(Conv2D(32, (3,3), input_shape=(256, 256, 1)))

Output shape: (None, 254, 254, 32)
Max Pooling: (None, 127, 127, 32)

Convolutional Layer 2 (filter size 3×3)
————————————–
model.add(Conv2D(32, (3,3)))

Output shape: (None, 252, 252, 32)
Max Pooling: (None, 126, 126, 32)

Convolutional Layer 3 (filter size 7×7)
————————————–
model.add(Conv2D(32, (7,7)))

Output shape: (None, 246, 246, 32)
Max Pooling: (None, 125, 125, 32)

Reply
- Bort June 14, 2019 at 10:46 pm #
  
  This seems pretty wrong to me as the maxPooling shape is used as input for the next layer.
  So you would go from 256 -> 254 -> 127 -> 125 -> 62 -> 56 -> 28
  
  Furthermore, as far as I understand it, the number of filters usually increases.
  32 -> 64 -> 128
  
  Reply
  - Jason Brownlee June 15, 2019 at 6:34 am #
    
    Nice tip.
    
    Reply
- sari June 13, 2020 at 4:04 pm #
  
  lesson 3:
  
  how to add more no.of Constitutional Layer & how to modify pooling?
  
  Reply
  - sari June 13, 2020 at 4:05 pm #
    
    how to add more no.of Convolution Layer & how to modify pooling?
    
    Reply
    - Jason Brownlee June 14, 2020 at 6:31 am #
      
      See the tutorials here:
      https://machinelearningmastery.com/start-here/#dlfcv
      
      Reply
  - Jason Brownlee June 14, 2020 at 6:31 am #
    
    More on pooling here:
    https://machinelearningmastery.com/pooling-layers-for-convolutional-neural-networks/
    
    Reply
Abid Rizvi April 11, 2019 at 5:01 pm #

Lesson 04: Lesson 04: Image Classification
=========================================
Doberman (33.59%)

some other image (I downloaded two image one for a dog and another for human)

Dog result:
German_shepherd (87.66%)

Human result:
swimming_trunks (15.77%)

Reply
- Jason Brownlee April 12, 2019 at 7:39 am #
  
  Well done!
  
  Reply
- sari June 13, 2020 at 4:01 pm #
  
  lesson 4:
  
  result Doberman (33.59%)
  
  i have given other dog image i got result as Labrador retriever.i didn’t get any output when i gave human image
  
  Reply
  - Jason Brownlee June 14, 2020 at 6:30 am #
    
    Nice work!
    
    Reply
Abid Rizvi April 12, 2019 at 4:39 pm #

Lesson 05: Train Image Classification Model
============================================
(Yesterday after loading running the example in first go)
Loss: 0.318009318998456
Accuracy: 0.912

(Today after loading loading model from saved model/weights)
Loss: 0.30647955482006073
Accuracy: 0.9113

Why there is some light changes in the third decimals of Loss & Accuracy?

Reply
- Jason Brownlee April 13, 2019 at 6:22 am #
  
  Well done!
  
  Differences are to be expected, see this:
  https://machinelearningmastery.com/faq/single-faq/why-do-i-get-different-results-each-time-i-run-the-code
  
  Reply
Turyamusiima Dismas April 13, 2019 at 2:49 am #

before normalisation

Min: 0.000, Max: 255.000

after normalisation

Min: 0.000, Max: 1.000

Reply
- Jason Brownlee April 13, 2019 at 6:38 am #
  
  Very nice!
  
  Reply
Jamuna Prakash April 13, 2019 at 9:59 am #

Lesson#03’s output:

Data Type: uint8
Min: 0.000, Max: 255.000
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee April 13, 2019 at 1:48 pm #
  
  Nice work.
  
  Reply
Nitin May 6, 2019 at 6:55 pm #

Lesson 02: Preparing Image Data
================================
Data Type: uint8
Min: 0.000, Max: 255.000
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee May 7, 2019 at 6:15 am #
  
  Nice work!
  
  Reply
jk (jayakumar) May 25, 2019 at 8:57 pm #

import tensorflow as tf
print(tf.__version__)

2.0.0-alpha0
With this version of TensorFlow, the Lesson 3 code

from keras.models import Sequential
model = Sequential()

leads to the following error:
AttributeError: module ‘tensorflow’ has no attribute ‘get_default_graph’

So the imports need to change:
#from keras.models import Sequential

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense

Reply
- Jason Brownlee May 26, 2019 at 6:44 am #
  
  I recommend using the Keras library directly, not the keras interface in tensorflow.
  
  Reply
Marcello October 15, 2019 at 4:12 am #

Image Classification https://arxiv.org/abs/1512.03385
Image Classification With Localization https://arxiv.org/abs/1311.2524
Object Detection https://arxiv.org/abs/1506.02640
Object Segmentation https://ieeexplore.ieee.org/document/7803544
Image Style Transfer https://ieeexplore.ieee.org/document/7780634
Image Colorization
Image Reconstruction
Image Super-Resolution
Image Synthesis

Reply
- Jason Brownlee October 15, 2019 at 6:24 am #
  
  Nice work!
  
  Reply
Saurabh December 12, 2019 at 11:11 pm #

Hello Jason,

Thanks for sharing mini course.

I am trying to run MTCNN on TensorFlow 2.0 and it throws the error: module ‘tensorflow’ has no attribute ‘get_default_graph’

I double-checked my versions: opencv-python 4.1.2 and MTCNN 0.1.0.

Could you please guide me?

Thanking you,
Saurabh

Reply
- Jason Brownlee December 13, 2019 at 6:03 am #
  
  You must use TF1.15 or TF1.14 with Mask RCNN.
  
  Reply
  - Saurabh December 13, 2019 at 7:51 pm #
    
    Thank you! It means MTCNN is not supported by TF2.0? Right?
    
    Reply
    - Jason Brownlee December 14, 2019 at 6:15 am #
      
      Yes, I recommend TF2. MTCNN uses TF2.
      
      Reply
      - Saurabh December 16, 2019 at 8:00 pm #
        
        Thank you!
sara ahmed January 20, 2020 at 9:40 pm #

1- MNIST dataset.
2- detecting Alzheimer’s disease using CNN
3- image segmentation using semantic segmentation
4- image classification using 3D-CNN and autoencoder

Reply
- Jason Brownlee January 21, 2020 at 7:12 am #
  
  Very nice!
  
  Reply
Vasudha February 24, 2020 at 11:52 pm #

Lesson 1: Deep Learning and computer vision
______________________________________

1. Object Detection
(W. Ouyang et al., “DeepID-Net: Object Detection with Deformable Part Based Convolutional Neural Networks,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 7, pp. 1320-1334, 1 July 2017.)

2. Face detection and recognition
(https://www.researchgate.net/publication/255653401)

3. Action and Activity recognition (http://yann.lecun.com/exdb/publis/pdf/lecun-90c.pdf)

4. Human Pose estimation ( 3D Human Pose Estimation Using Convolutional Neural Networks with 2D Pose Information – https://link.springer.com/chapter/10.1007/978-3-319-49409-8_15)

5. Datasets / Images (https://www.researchgate.net/publication/275257620_Image_Classification_Using_Convolutional_Neural_Networks)

Reply
- Jason Brownlee February 25, 2020 at 7:47 am #
  
  Well done!
  
  Reply
Vasudha February 27, 2020 at 10:39 pm #

Lesson 02: Preparing Image Data
———————————————–

Data Type: uint8
Min: 0.000, Max: 255.000
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee February 28, 2020 at 6:07 am #
  
  Well done!
  
  Reply
Vasudha February 27, 2020 at 10:47 pm #

Lesson 3 : Convolutional Neural Networks
———————————————————-
Model: “sequential_1”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 127, 127, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 516128) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0

Reply
- Jason Brownlee February 28, 2020 at 6:07 am #
  
  Excellent.
  
  Reply
Vasudha February 28, 2020 at 12:17 am #

Lesson 3: Convolutional neural networks / For extra points
———————————————————–
Model: “sequential_1”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
conv2d_2 (Conv2D) (None, 252, 252, 32) 9248
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 126, 126, 32) 0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 63, 63, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 127008) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 127009
=================================================================
Total params: 136,577
Trainable params: 136,577
Non-trainable params: 0

Reply
- Jason Brownlee February 28, 2020 at 6:12 am #
  
  Well done!
  
  Reply
Vasudha February 28, 2020 at 1:00 am #

Lesson 4 : Image Classification
—————————————–
Doberman (33.59%)

I downloaded 2 other images. One of a flower and the other one of a cat. Below are the results.

1. vase (44.59%)

2. Egyptian_cat (56.30%)

Reply
- Jason Brownlee February 28, 2020 at 6:14 am #
  
  Well done.
  
  Reply
Vasudha February 28, 2020 at 1:18 am #

Lesson 05: Train Image Classification Model
———————————————————–

Epoch 1/10
– 27s – loss: 0.4170 – accuracy: 0.8525
Epoch 2/10
– 25s – loss: 0.2761 – accuracy: 0.8993
Epoch 3/10
– 27s – loss: 0.2326 – accuracy: 0.9144
Epoch 4/10
– 25s – loss: 0.1991 – accuracy: 0.9274
Epoch 5/10
– 24s – loss: 0.1747 – accuracy: 0.9350
Epoch 6/10
– 24s – loss: 0.1501 – accuracy: 0.9447
Epoch 7/10
– 24s – loss: 0.1308 – accuracy: 0.9520
Epoch 8/10
– 24s – loss: 0.1120 – accuracy: 0.9587
Epoch 9/10
– 24s – loss: 0.0982 – accuracy: 0.9636
Epoch 10/10
– 24s – loss: 0.0839 – accuracy: 0.9696
0.3199548319131136 0.9136000275611877

Reply
VICENTE CASTILLO GUILLÉN March 23, 2020 at 1:10 am #

Classification for Architectural Design through the Eye of Artificial Intelligence:
https://arxiv.org/ftp/arxiv/papers/1812/1812.01714.pdf

Measuring human perceptions of a large-scale urban region using machine learning:
https://www.researchgate.net/publication/327720319_Measuring_human_perceptions_of_a_large-scale_urban_region_using_machine_learning

Classification of Mexican heritage buildings’ architectural styles: https://dl.acm.org/doi/abs/10.1145/3095713.3095730

A deep convolutional network for fine-art paintings classification:
http://www.cs-chan.com/doc/ICIP2016_Poster.pdf

Architectural Style Classification of Building Facade Windows: https://link.springer.com/chapter/10.1007/978-3-642-24031-7_28

Reply
- Jason Brownlee March 23, 2020 at 6:14 am #
  
  Well done!
  
  Reply
Elifuraha Gerard March 24, 2020 at 7:43 am #

Thanks Jason for the very clear instructions.
For the Lesson 2 quiz I used the NumPy library as follows:

import numpy as np

I then used np.array() to convert the image into numpy array and employed the short cut below to standardize the image as follows:

from PIL import Image
from numpy import asarray
image = Image.open('bondi_beach.jpg')
pixels = asarray(image)
pixels = pixels.astype('float32')
# Convert to numpy array data type
pixels_np = np.array(pixels)
print('Min: %.3f, Max: %.3f' % (pixels_np.min(), pixels_np.max()))

>Min: 0.000, Max: 1.000000

# standardize the image

standardized_pixels_np = (pixels_np-pixels_np.mean())/pixels_np.std()

# confirm the standardization
print('Min: %.3f, Max: %.3f' % (standardized_pixels_np.min(), standardized_pixels_np.max()))
> Min: -3.003, Max: 1.920

Gerard

Reply
- Jason Brownlee March 24, 2020 at 8:01 am #
  
  Well done.
  
  Reply
VICENTE CASTILLO GUILLÉN March 28, 2020 at 10:32 pm #

Lesson 03: This is what I get

Model: “sequential_1”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 127, 127, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 516128) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0
_________________________________________________________________

Reply
- Jason Brownlee March 29, 2020 at 5:54 am #
  
  Well done.
  
  Reply
VICENTE CASTILLO GUILLÉN March 28, 2020 at 11:04 pm #

Lesson 03: Extra

Model: “sequential_1”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
conv2d_2 (Conv2D) (None, 252, 252, 32) 9248
_________________________________________________________________
conv2d_3 (Conv2D) (None, 250, 250, 32) 9248
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 125, 125, 32) 0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 62, 62, 32) 0
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 31, 31, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 30752) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 30753
=================================================================
Total params: 49,569
Trainable params: 49,569
Non-trainable params: 0
_________________________________________________________________

Reply
VICENTE CASTILLO GUILLÉN March 28, 2020 at 11:11 pm #

Lesson 03:

Is this correct?

In the basic code you provide the 1st convolution uses a 3×3 kernel to transform the image from 256×256 and 1 channel, to 254×254 and 32 channels.

The 2nd convolution transforms the image to a size of 127×127 pixels.

The 3rd one, flatten, is the sum of all the parameters of the matrix.

The “dense” convolution is the one which classifies (0 or 1).

I have a question: what does the 320 in the Param # column mean? Why does it go to 0 and then change to 516129 in the final convolution?

Thanks Jason for your help!

This link helped me understand what Keras is and how it works:
https://www.pyimagesearch.com/2018/12/31/keras-conv2d-and-convolutional-layers/

Reply
- Jason Brownlee March 29, 2020 at 5:55 am #
  
  “Param” is parameters and is the number of weights in the layer.
  
  Perhaps this will help for conv layers:
  https://machinelearningmastery.com/convolutional-layers-for-deep-learning-neural-networks/
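As a rough worked example of where those counts come from, here is a minimal sketch. It assumes the model in this thread uses 3×3 kernels, 32 filters and a single-channel 256×256 input, as the pasted summaries suggest. The helper names are made up for illustration and are not part of Keras.

```python
# Hypothetical helpers (not Keras functions): count the weights in a layer.

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # each filter has kernel_h * kernel_w * in_channels weights plus one bias
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(n_inputs, n_units):
    # each unit has one weight per input plus one bias
    return (n_inputs + 1) * n_units

# first conv layer: 3x3 kernel, 1 input channel, 32 filters
conv1 = conv2d_params(3, 3, 1, 32)
# pooling and flatten layers have no weights, which is why they report 0
flat = 127 * 127 * 32
# final dense layer: one output unit fed by the flattened feature maps
dense = dense_params(flat, 1)
print(conv1, flat, dense, conv1 + dense)
```

These values line up with the 320, 516128, 516129 and 516,449 entries in the summary above. The same formula with 32 input channels gives the 9248 reported for a second convolutional layer.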
  
  Reply
VICENTE CASTILLO GUILLÉN March 29, 2020 at 12:04 am #

Lesson 04: Doberman (33.59%)

Reply
- Jason Brownlee March 29, 2020 at 5:56 am #
  
  Nicely done.
  
  Reply
VICENTE CASTILLO GUILLÉN March 29, 2020 at 12:05 am #

Lesson 05:

Epoch 1/10
– 23s – loss: 0.3851 – accuracy: 0.8624
Epoch 2/10
– 24s – loss: 0.2594 – accuracy: 0.9060
Epoch 3/10
– 24s – loss: 0.2172 – accuracy: 0.9191
Epoch 4/10
– 24s – loss: 0.1847 – accuracy: 0.9325
Epoch 5/10
– 24s – loss: 0.1586 – accuracy: 0.9405
Epoch 6/10
– 25s – loss: 0.1368 – accuracy: 0.9495
Epoch 7/10
– 24s – loss: 0.1171 – accuracy: 0.9567
Epoch 8/10
– 24s – loss: 0.1029 – accuracy: 0.9619
Epoch 9/10
– 24s – loss: 0.0885 – accuracy: 0.9679
Epoch 10/10
– 25s – loss: 0.0759 – accuracy: 0.9729
0.3183996982872486 0.9110999703407288

Reply
VICENTE CASTILLO GUILLÉN March 29, 2020 at 2:49 am #

Lesson 5 extra:

Following the instructions included in this tutorial:

https://machinelearningmastery.com/how-to-develop-a-cnn-from-scratch-for-fashion-mnist-clothing-classification/

I was able to run the example and I got the right class: 2.

You need to add this line to the code provided in Lesson 5:
# save model
model.save('final_model.h5')

And you’ll have to save the image in the tutorial I mentioned before as: ‘sample_image.png’

Then open and run a new file with the following code:

# make a prediction for a new image.
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.models import load_model

# load and prepare the image
def load_image(filename):
    # load the image
    img = load_img(filename, grayscale=True, target_size=(28, 28))
    # convert to array
    img = img_to_array(img)
    # reshape into a single sample with 1 channel
    img = img.reshape(1, 28, 28, 1)
    # prepare pixel data
    img = img.astype('float32')
    img = img / 255.0
    return img

# load an image and predict the class
def run_example():
    # load the image
    img = load_image('sample_image.png')
    # load model
    model = load_model('final_model.h5')
    # predict the class
    result = model.predict_classes(img)
    print(result[0])

# entry point, run the example
run_example()

Thanks Jason!!

Reply
- Jason Brownlee March 29, 2020 at 6:03 am #
  
  Nice work!
  
  Reply
VICENTE CASTILLO GUILLÉN March 29, 2020 at 3:06 am #

Lesson 6:

After running the code, we see 9 images similar to the original one, but with several changes:
– It has been rotated and flipped (horizontally and vertically), the coloured areas in the background seem to have changed direction (rotation), and some areas around the perimeter have been filled with colours similar to the adjacent parts of the original picture, with a kind of “motion blur”.

Reply
- Jason Brownlee March 29, 2020 at 6:03 am #
  
  Yes, different augmentations each run of the code.
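To illustrate why the nine images change from run to run, here is a minimal NumPy sketch. It is a stand-in, not the Keras ImageDataGenerator itself: the random_augment function and the tiny 4×4 "image" are assumptions made for the example. Each augmentation is drawn from a random number generator, so an unseeded generator gives different images each run and a seeded one repeats the same sequence.

```python
import numpy as np

def random_augment(image, rng):
    # Illustrative stand-in for an augmentation pipeline: random flips
    # plus a rotation by a random multiple of 90 degrees.
    if rng.random() < 0.5:
        image = np.fliplr(image)
    if rng.random() < 0.5:
        image = np.flipud(image)
    return np.rot90(image, k=int(rng.integers(0, 4)))

# a tiny 4x4 array standing in for a photograph
image = np.arange(16).reshape(4, 4)

# no seed: the nine augmented copies differ from run to run
rng = np.random.default_rng()
batch = [random_augment(image, rng) for _ in range(9)]

# same seed: two generators produce the same sequence of augmentations
rng_a = np.random.default_rng(7)
rng_b = np.random.default_rng(7)
first = [random_augment(image, rng_a) for _ in range(9)]
second = [random_augment(image, rng_b) for _ in range(9)]
print(all(np.array_equal(x, y) for x, y in zip(first, second)))
```

If repeatable augmentations are needed with Keras, the flow() method of ImageDataGenerator takes a seed argument that should serve the same purpose, although that is not shown here.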
  
  Reply
VICENTE CASTILLO GUILLÉN March 29, 2020 at 8:42 pm #

Lesson 07: I had some trouble with the OpenCV installation, but I solved it via this link:

https://programarfacil.com/blog/vision-artificial/instalar-opencv-python-anaconda/

Reply
- Jason Brownlee March 30, 2020 at 5:32 am #
  
  Thanks for sharing.
  
  Reply
VICENTE CASTILLO GUILLÉN March 29, 2020 at 8:43 pm #

Lesson 07 extra:

This is the code:

# face detection with mtcnn on a photograph
from matplotlib import pyplot
from matplotlib.patches import Rectangle
from mtcnn.mtcnn import MTCNN
# load image from file
pixels = pyplot.imread('prueba.jpg')
# create the detector, using default weights
detector = MTCNN()

# detect faces in the image
faces = detector.detect_faces(pixels)
# plot the image
pyplot.imshow(pixels)
# get the context for drawing boxes
ax = pyplot.gca()
for face in faces:
    # get coordinates of the current face
    x, y, width, height = face['box']
    # create the shape
    rect = Rectangle((x, y), width, height, fill=False, color='red')
    # draw the box
    ax.add_patch(rect)
# show the plot
pyplot.show()

Reply
- Jason Brownlee March 30, 2020 at 5:32 am #
  
  Nice work!
  
  Reply
Anubrata April 15, 2020 at 5:11 am #

Hi,
Thanks for this course. I am a molecular biologist interested in Data Science and so all my examples are from biology !
1) using Denoising Autoencoders to detect Breast Cancer from gene expression data
Tan J, Ung M, Cheng C, Greene CS. Pac Symp Biocomput. 2015;20:132–143.

2) predicting protein structure without sequence info using multilayer residual neural network
Wang S, Sun S, Li Z, Zhang R, Xu J (2017) PLoS Comput Biol 13(1): e1005324.

3) Predicting drug-target interactions using restricted Boltzmann machines
Wang Y, Zeng J. 2013;29(13):i126–i134.

4) Deep learning based tissue analysis predicts outcome in colorectal cancer
Bychkov D, Linder N, Turkki R, et al. Sci Rep. 2018;8(1):3395.

5)Deep learning-based cancer survival prognosis from RNA-seq data
Huang Z, Johnson TS, Han Z, et al. BMC Med Genomics. 2020;13(Suppl 5):41.

Reply
- Jason Brownlee April 15, 2020 at 8:02 am #
  
  Nice work!
  
  Reply
Saúl Alquicira April 15, 2020 at 4:49 pm #

Lesson 1.- Deep Learning and Computer Vision

1.-      Autonomous vehic
https://arxiv.org/pdf/2001.10789.pdf

2.-     Autonomous
http://www.robots.ox.ac.uk/~mobile/Papers/ICRA19_chadwick.pdf

3.-Affective Computing
https://arxiv.org/pdf/1907.09929.pdf

4.-Improve Learning
https://www.media.mit.edu/publications/designing-neural-network-architectures-using-reinforcement-learning/

5.-health
https://dam-prod.media.mit.edu/x/2019/01/17/RudovicEtAl18-PML-Science.pdf

Reply
- Jason Brownlee April 16, 2020 at 5:58 am #
  
  Nice work!
  
  Reply
anubrata das April 16, 2020 at 2:12 am #

# prepare image data

Hi Jason,

i used your tutorial to standardize the data

Bondi Beach.jpg

format JPEG, mode RGB
Data Type: uint8
Min_pixel: 0.000, Max_pixel: 255.000
Min_normal_pixel: 0.000, Max_normal_pixel: 1.000
mean per channel =[0.51480323 0.60049796 0.7137792 ]
std.dev per channel=[0.22923872 0.15852204 0.16162618]
mean per channel stdz =[0.00101891 0.00115143 0.00204117]
std.dev per channel stdz =[1.0000744 1.0001055 0.9998136]

img2.jpg

format JPEG, mode RGB
Data Type: uint8
Min_pixel: 0.000, Max_pixel: 255.000
Min_normal_pixel: 0.000, Max_normal_pixel: 1.000
mean per channel =[0.5421985 0.49465644 0.48014694]
std.dev per channel=[0.21577911 0.21520971 0.24537063]
mean per channel stdz =[-0.03672315 0.06999601 0.0575752 ]
std.dev per channel stdz =[1.0012016 0.9973781 0.9994226]

img4.jpg

format JPEG, mode RGB
Data Type: uint8
Min_pixel: 0.000, Max_pixel: 255.000
Min_normal_pixel: 0.000, Max_normal_pixel: 1.000
mean per channel =[0.34658098 0.23875412 0.16520199]
std.dev per channel=[0.26760063 0.21685596 0.17311402]
mean per channel stdz =[0.03354708 0.04420278 0.03249924]
std.dev per channel stdz =[0.9971665 0.9965377 1.0025591]

Reply
- Jason Brownlee April 16, 2020 at 6:04 am #
  
  Well done!
  
  Reply
- samith April 23, 2020 at 5:08 am #
  
  How did you get the mean and std. dev?
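One possible way to compute per-channel statistics like those above is sketched here. This may not be how the original commenter did it. A random array stands in for the photograph (an assumption, so the example runs without an image file); with a real image the array would come from Pillow and numpy.asarray as in Lesson 02.

```python
import numpy as np

# Assumption: random values stand in for an RGB image with shape
# (height, width, channels) loaded via Pillow and numpy.asarray.
rng = np.random.default_rng(1)
pixels = rng.integers(0, 256, size=(64, 64, 3)).astype('float32')

# scale pixel values to the range 0-1 first
pixels /= 255.0

# mean and standard deviation per channel (average over height and width)
means = pixels.mean(axis=(0, 1))
stds = pixels.std(axis=(0, 1))
print('mean per channel =', means)
print('std.dev per channel =', stds)

# per-channel standardization; broadcasting applies each channel's statistics
standardized = (pixels - means) / stds
print('mean per channel stdz =', standardized.mean(axis=(0, 1)))
print('std.dev per channel stdz =', standardized.std(axis=(0, 1)))
```

Passing axis=(0, 1) averages over rows and columns and leaves one value per colour channel. After standardization each channel should have a mean close to 0 and a standard deviation close to 1.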
  
  Reply
Saul Alquicira April 19, 2020 at 2:30 pm #

Lesson 2 .-Preparing Image Data

# standardize with (x - x.mean()) / x.std() # values from ? to ?, but mean at 0
pixels = (pixels - pixels.mean()) / pixels.std()
print('NORMAL Min: %.3f, Max: %.3f' % (pixels.min(), pixels.max()))

BEFORE Min: 0.000, Max: 255.000
AFTER Min: 0.000, Max: 1.000
NORMAL Min: -3.003, Max: 1.920

Reply
- Jason Brownlee April 20, 2020 at 5:22 am #
  
  Well done!
  
  Reply
Saul Alquicira April 20, 2020 at 4:16 pm #

Lesson 3.- Convolutional Neural Networks

Model: “sequential_24”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_33 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
conv2d_34 (Conv2D) (None, 252, 252, 32) 9248
_________________________________________________________________
max_pooling2d_27 (MaxPooling (None, 126, 126, 32) 0
_________________________________________________________________
flatten_23 (Flatten) (None, 508032) 0
_________________________________________________________________
dense_23 (Dense) (None, 1) 508033
=================================================================
Total params: 517,601
Trainable params: 517,601
Non-trainable params: 0

When you add an extra pooling layer, the number of params drops drastically. I believe this is because pooling shrinks the rectified feature maps, so the flattened array passed to the dense layer is much smaller.

Adding an extra convolutional or classifier layer does not reduce the total or trainable params the way pooling does.

Model: “sequential_25”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_35 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
conv2d_36 (Conv2D) (None, 252, 252, 32) 9248
_________________________________________________________________
max_pooling2d_28 (MaxPooling (None, 126, 126, 32) 0
_________________________________________________________________
max_pooling2d_29 (MaxPooling (None, 63, 63, 32) 0
_________________________________________________________________
flatten_24 (Flatten) (None, 127008) 0
_________________________________________________________________
dense_24 (Dense) (None, 1) 127009
=================================================================
Total params: 136,577
Trainable params: 136,577
Non-trainable params: 0
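The effect of the extra pooling layer in the two summaries above can be traced with a little arithmetic. This is a minimal sketch that assumes 3×3 convolutions without padding, 2×2 pooling, 32 filters and a 256×256 input, which matches the output shapes shown. The helper names are hypothetical.

```python
# Hypothetical helpers tracing feature map sizes through the model.

def conv_out(size, kernel=3):
    # a convolution without padding and stride 1 trims kernel - 1 pixels
    return size - kernel + 1

def pool_out(size, pool=2):
    # 2x2 max pooling with stride 2 halves the size (rounding down)
    return size // pool

def dense_params_after(n_pools, size=256, filters=32):
    # two 3x3 conv layers, then n_pools pooling layers, then Dense(1)
    size = conv_out(conv_out(size))
    for _ in range(n_pools):
        size = pool_out(size)
    flat = size * size * filters
    # one weight per flattened value plus a single bias
    return flat + 1

for n in (1, 2):
    print(n, 'pooling layer(s):', dense_params_after(n), 'dense params')
```

With one pooling layer the dense layer needs 508,033 weights, and with two it needs 127,009, which is where most of the drop in total params comes from. The convolutional layers contribute the same 320 + 9,248 weights in both models.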

I am still working on understanding this better, so I will go through your other tutorial:

https://machinelearningmastery.com/convolutional-layers-for-deep-learning-neural-networks/

BR,

Reply
- Jason Brownlee April 21, 2020 at 5:45 am #
  
  Great work!
  
  Reply
Saul Alquicira April 22, 2020 at 5:35 am #

Lesson 4 .- Image Classification

Doberman (33.59%)
dingo (39.17%)
Mexican_hairless (29.25%)

Reply
- Jason Brownlee April 22, 2020 at 6:08 am #
  
  Great work!
  
  Reply
Saul Alquicira April 24, 2020 at 2:55 am #

Lesson 5.- Image Classification

Epoch 1/10
– 21s – loss: 0.3770 – accuracy: 0.8651
Epoch 2/10
– 25s – loss: 0.2539 – accuracy: 0.9082
Epoch 3/10
– 25s – loss: 0.2099 – accuracy: 0.9232
Epoch 4/10
– 24s – loss: 0.1782 – accuracy: 0.9349
Epoch 5/10
– 25s – loss: 0.1501 – accuracy: 0.9453
Epoch 6/10
– 28s – loss: 0.1257 – accuracy: 0.9540
Epoch 7/10
– 26s – loss: 0.1067 – accuracy: 0.9608
Epoch 8/10
– 26s – loss: 0.0908 – accuracy: 0.9667
Epoch 9/10
– 26s – loss: 0.0754 – accuracy: 0.9726
Epoch 10/10
– 27s – loss: 0.0660 – accuracy: 0.9765
0.3630220188647509 0.9085000157356262

I also adapted the code to reuse the model and predict the class:

def run_example():
    # load the image
    img = load_image('sample_image.png')
    # load model
    model = load_model('saul_modelh5')
    # predict the class
    result = model.predict_classes(img)
    print(result[0])

Reply
- Jason Brownlee April 24, 2020 at 5:50 am #
  
  Great work!
  
  Reply
Saul Aluicira April 24, 2020 at 11:48 am #

Lesson 6.- Image augmentation

I included zoom range and shear_range

datagen = ImageDataGenerator(shear_range=0.15, zoom_range=0.9, horizontal_flip=True, vertical_flip=True, rotation_range=30)
#datagen = ImageDataGenerator()

Reply
- Jason Brownlee April 24, 2020 at 1:20 pm #
  
  Nice work!
  
  Reply
D Vaishnavi May 1, 2020 at 3:22 am #

Covid-19 detection: https://arxiv.org/ftp/arxiv/papers/2003/2003.10849.pdf

Pulmonary Image Classification: https://ieeexplore.ieee.org/abstract/document/8861312

Smart Traffic Management: https://ieeexplore.ieee.org/document/8666539

Image forgery recognition: https://iopscience.iop.org/article/10.1088/1742-6596/1368/3/032028

Food and drink assessment using image recognizing. https://www.mdpi.com/2072-6643/9/7/657

Reply
- Jason Brownlee May 1, 2020 at 6:45 am #
  
  Well done!
  
  Reply
e2e4 May 4, 2020 at 8:49 pm #

Lesson 1
In satellite imaging:
Ship recognition with deep learning technique
https://appsilon.com/ship-recognition-in-satellite-imagery-part-i/

Vegetation management
https://www.20tree.ai

Forestry control
https://www.efi.int/sites/default/files/files/events/2018/innovation_workshop3-Liu.pdf

In medicine
Skin checks for cancer
https://www.skinvision.com

In urban planning and smart cities:
Deep learning for building occupancy estimation using environmental sensors
Chen, Z, Jiang, C, Masood, MK, Soh, YC, Wu, M & Li, X 2020, Deep learning for building occupancy estimation using environmental sensors. in W Pedrycz & S-M Chen (eds), Deep learning: algorithms and applications. Studies in Computational Intelligence, vol. 865, pp. 335-357. https://doi.org/10.1007/978-3-030-31760-7_11

Reply
- Jason Brownlee May 5, 2020 at 6:24 am #
  
  Well done!
  
  Reply
e2e4 May 5, 2020 at 9:07 pm #

Lesson 02
Data Type: uint8
Min: 0.000, Max: 255.000
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee May 6, 2020 at 6:25 am #
  
  Well done!
  
  Reply
e2e4 May 7, 2020 at 7:33 am #

Lesson 03
Model: “sequential_1”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
conv2d_2 (Conv2D) (None, 252, 252, 32) 9248
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 126, 126, 32) 0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 63, 63, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 127008) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 127009
=================================================================
Total params: 136,577
Trainable params: 136,577
Non-trainable params: 0

But I need to go deeper into understanding the process.

Reply
- Jason Brownlee May 7, 2020 at 11:48 am #
  
  Well done!
  
  Reply
e2e4 May 7, 2020 at 11:47 pm #

Lesson 04

Doberman (33.59%)

English_foxhound (69.26%)

German_shepherd (99.56%)

black-and-tan_coonhound (54.60%)

99.56 for German shepherd is impressive.

For picture of a horse the result was
sorrel (100.00%)
which is a plant. How would you comment on that?

Reply
- Jason Brownlee May 8, 2020 at 6:36 am #
  
  Well done.
  
  Yes, no model is perfect.
  
  Reply
e2e4 May 9, 2020 at 1:05 am #

Epoch 1/10
– 57s – loss: 0.3798 – accuracy: 0.8645
Epoch 2/10
– 56s – loss: 0.2550 – accuracy: 0.9067
Epoch 3/10
– 56s – loss: 0.2077 – accuracy: 0.9227
Epoch 4/10
– 56s – loss: 0.1761 – accuracy: 0.9343
Epoch 5/10
– 56s – loss: 0.1467 – accuracy: 0.9466
Epoch 6/10
– 56s – loss: 0.1252 – accuracy: 0.9535
Epoch 7/10
– 56s – loss: 0.1049 – accuracy: 0.9617
Epoch 8/10
– 56s – loss: 0.0898 – accuracy: 0.9670
Epoch 9/10
– 55s – loss: 0.0745 – accuracy: 0.9722
Epoch 10/10
– 55s – loss: 0.0641 – accuracy: 0.9765
0.3429171951398253 0.9118000268936157

Reply
- Jason Brownlee May 9, 2020 at 6:16 am #
  
  Nice work!
  
  Reply
e2e4 May 9, 2020 at 1:44 am #

Second run resulted in
0.32463194568455217 0.9093999862670898
3rd run
0.312660645493865 0.9160000085830688

Do the results of stochastic processes depend on the particular hardware?

Reply
- Jason Brownlee May 9, 2020 at 6:18 am #
  
  Well done!
  
  Yes and no – but at the numerical methods level.
  
  Yes, as in the implementations vary across machines because of differences in underlying libraries and eventually hardware. No as we are running the same general operations and minor rounding differences don’t matter much when averaged out.
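One common way to report results from a stochastic model is to repeat the run and summarize the scores. The sketch below does this with the three test accuracies posted in this thread; three runs is a very small sample, so the numbers are only indicative.

```python
from statistics import mean, stdev

# test accuracy from the three runs reported in this thread
scores = [0.9118000268936157, 0.9093999862670898, 0.9160000085830688]

# report the average and the spread instead of a single run
print('Accuracy: %.4f (+/- %.4f) over %d runs'
      % (mean(scores), stdev(scores), len(scores)))
```

The spread gives a sense of how much of the difference between two runs is just noise.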
  
  Reply
e2e4 May 10, 2020 at 9:02 pm #

After saving the trained model and reloading it:
Epoch 1/10
– 56s – loss: 0.0732 – accuracy: 0.9729
Epoch 2/10
– 55s – loss: 0.0624 – accuracy: 0.9773
Epoch 3/10
– 55s – loss: 0.0565 – accuracy: 0.9792
Epoch 4/10
– 56s – loss: 0.0485 – accuracy: 0.9816
Epoch 5/10
– 55s – loss: 0.0435 – accuracy: 0.9844
Epoch 6/10
– 55s – loss: 0.0386 – accuracy: 0.9861
Epoch 7/10
– 56s – loss: 0.0357 – accuracy: 0.9877
Epoch 8/10
– 57s – loss: 0.0321 – accuracy: 0.9888
Epoch 9/10
– 57s – loss: 0.0307 – accuracy: 0.9892
Epoch 10/10
– 56s – loss: 0.0269 – accuracy: 0.9903
0.5345134609982372 0.9111999869346619

Why didn’t it improve on test data?

Reply
- Jason Brownlee May 11, 2020 at 5:58 am #
  
  Well done.
  
  What do you mean exactly?
  
  Reply
  - e2e4 May 14, 2020 at 2:08 am #
    
    1) I ran the exercise and received
    0.3429171951398253 0.9118000268936157
    2) Then I modified the code, ran it again and saved the trained model
    3) Modified the code again to reload the model and train for 10 more epochs
    
    Epoch 1/10
    – 56s – loss: 0.0732 – accuracy: 0.9729
    …
    Epoch 10/10
    – 56s – loss: 0.0269 – accuracy: 0.9903
    
    This is the result achieved on the training data.
    
    My question is:
    Why, after evaluating the twice-trained model on the test data, are the loss and accuracy the same as after the first run? I would expect higher accuracy and lower loss.
    
    Thank you, Jason!
    
    Reply
    - Jason Brownlee May 14, 2020 at 5:55 am #
      
      Well done.
      
      I don’t understand your question, can you please rephrase it or elaborate?
      
      Reply
e2e4 May 10, 2020 at 9:05 pm #

Lesson 06
I varied flip and rotation angle. Also included zoom_range and brightness_range
#datagen = ImageDataGenerator(brightness_range=[0.2,1.0])
datagen = ImageDataGenerator(zoom_range=[0.9,1.9])

Reply
- Jason Brownlee May 11, 2020 at 5:58 am #
  
  Nice work.
  
  Reply
e2e4 May 17, 2020 at 6:59 pm #

Lesson 07
modified it for multiple faces as follows:

# get the context for drawing boxes
ax = pyplot.gca()
# the for loop advances on its own, so no manual counter is needed
for face in faces:
    # get coordinates of the current face
    x, y, width, height = face['box']
    # create the shape
    rect = Rectangle((x, y), width, height, fill=False, color='red')
    # draw the box
    ax.add_patch(rect)
# show the plot
pyplot.show()
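When a photo contains several faces it can also help to drop weak detections before drawing. This is a minimal sketch, assuming each detection is a dict with a 'box' of [x, y, width, height] and a 'confidence' score, which is the shape returned by detect_faces() in the mtcnn package. The sample values and the confident_corners helper are made up for illustration.

```python
# Hypothetical detections in the shape MTCNN's detect_faces() returns:
# one dict per face with a 'box' of [x, y, width, height] and a 'confidence'.
faces = [
    {'box': [186, 71, 87, 115], 'confidence': 0.9994},
    {'box': [368, 75, 108, 138], 'confidence': 0.9989},
    {'box': [12, 300, 20, 25], 'confidence': 0.61},
]

def confident_corners(faces, min_confidence=0.9):
    # keep confident detections and convert each box to corner coordinates
    corners = []
    for face in faces:
        if face['confidence'] < min_confidence:
            continue
        x, y, width, height = face['box']
        corners.append((x, y, x + width, y + height))
    return corners

print(confident_corners(faces))
```

The 0.9 threshold is an arbitrary choice for the example and would need tuning on real photographs.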

Thanks a lot for the course!! It’s very motivating to get results under your guidance, Jason!

Reply
- Jason Brownlee May 18, 2020 at 6:10 am #
  
  Well done on your progress!
  
  Reply
vkr May 18, 2020 at 2:45 pm #

day4: Image classification

Default result:
Doberman (33.59%)

Following are the results with different images:
Samoyed (98.46%): when an image of a dog is given

cocker_spaniel (25.23%): set of 9 different dogs

Yorkshire_terrier (10.21%): 2 different dogs

Reply
- Jason Brownlee May 19, 2020 at 5:54 am #
  
  Well done!
  
  Reply
Ronke Babatunde May 23, 2020 at 6:41 am #

• Five impressive applications of deep learning methods in the field of computer vision

1. Image Classification
Classification is the process of predicting a specific class, or label, for something that is defined by a set of data points. Machine learning systems build predictive models that have enormous, yet often unseen benefits for people.

2. Object Detection
Object Detection is image classification with localization, but in pictures that may contain multiple objects. This is an active and important area of research because the computer vision systems that will be used in robotics and self-driving vehicles will be subjected to very complex images. Locating and identifying every object will undoubtedly be a critical part of their autonomy.

3. Image Reconstruction
Image Reconstruction is the task of recreating the missing or corrupt parts of an image.

4. Object Tracking
Object Tracking aims to keep track of a specific object in a sequence of images, or a video. Object tracking is important for virtually every computer vision system that works with multiple images. In self-driving cars, for example, avoiding pedestrians and other vehicles is a very high priority. Tracking objects as they move not only helps to avoid collisions through split-second maneuvers, but also lets the model supply relevant information to other systems that attempt to predict their next move.

5. Facial Recognition
Facial recognition is a common feature in today’s smartphones and cameras. Modern facial recognition systems at large enterprises are powered by deep learning networks and algorithms. Facebook’s DeepFace identifies human faces in digital images using a nine-layer neural network. The system has 97 percent accuracy, which is famously better than the FBI’s facial recognition system. Google also developed its own highly accurate facial recognition system named FaceNet.

An example application can be found in the article titled “Deep Learning for Computer Vision: A Brief Review”. https://doi.org/10.1155/2018/7068349

Reply
- Jason Brownlee May 23, 2020 at 6:42 am #
  
  Well done!
  
  Reply
  - Claudio Lombardi September 25, 2020 at 11:58 am #
    
    1. Human Pose Estimation
    The following are some of the applications of Human Pose Estimation
    
    Activity recognition for real-time sports analysis or surveillance system.
    For Augmented reality experiences
    In training Robots
    Animation and gaming
    
    2. Image Transformation Using GANs:
    Images generated using GANs have many applications. The following are some of them:
    
    Image to image translation in style transfer and photo inpainting
    Image super-resolution
    Text to image generation
    Image editing
    Semantic image to photo translation
    
    3. Computer Vision for Developing Social Distancing Tools
    Computer vision technology can play a vital role in this crucial scenario. It can be used to track people on a premises or in a particular area to check whether they are following social distancing norms.
    
    4. Creating a 3D Model From 2D Images
    Now you must be thinking about the use cases of this technology. The following are its applications
    
    Animation and Gaming
    Robotics
    Self-driving cars
    Medical Diagnosis and surgical operations
    
    5. Computer Vision in Healthcare: Medical Image Analysis
    Recent developments in computer vision allow doctors to understand medical images better by converting them into interactive 3D models, which makes interpretation easier.
    
    Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation, by Xin Wang, Qiuyuan Huang, Asli Celikyilmaz, Jianfeng Gao, Dinghan Shen, Yuan-Fang Wang, William Yang Wang, Lei Zhang
    
    Reply
    - Jason Brownlee September 25, 2020 at 2:47 pm #
      
      Nice work!
      
      Reply
Ronke Babatunde May 24, 2020 at 7:58 pm #

I ran the code in lesson 2. The maximum and minimum pixel values of the blonde image before normalization are 255 and 0 respectively, and after normalization they are 1 and 0. I was able to display the image in my Python environment as well.
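
For anyone repeating this check, here is a minimal sketch of the same before/after comparison. It uses a tiny hypothetical array in place of the photo (an assumption, so it runs without any image file) and scales by 255, the largest value an 8-bit pixel can hold:

```python
import numpy as np

# Hypothetical stand-in for the loaded photo: a tiny 8-bit image.
pixels = np.array([[0, 64], [128, 255]], dtype='uint8')
print('Before: dtype=%s, Min: %.3f, Max: %.3f' % (pixels.dtype, pixels.min(), pixels.max()))

# Convert to floats first, then scale from [0, 255] down to [0, 1].
normalized = pixels.astype('float32') / 255.0
print('After: dtype=%s, Min: %.3f, Max: %.3f' % (normalized.dtype, normalized.min(), normalized.max()))
```

The conversion to float32 matters: values between 0 and 1 cannot be stored in an unsigned 8-bit integer array.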

Reply
- Jason Brownlee May 25, 2020 at 5:46 am #
  
  Well done!
  
  Reply
Ronke Babatunde May 24, 2020 at 9:54 pm #

Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0
The shape of the image changed from 256 x 256 to 127 x 127 at the output of the pooling layer.
I then used one conv layer with 64 filters and a max pooling size of 1, and got the output below:
Total params: 4,129,665
Trainable params: 4,129,665
Non-trainable params: 0

I then used one conv layer with 64 filters, a max pooling size of 2, and an image size of 512 x 512, and got the output below:
Total params: 16,267,457
Trainable params: 16,267,457
Non-trainable params: 0

However, I would appreciate more explanation on how to interpret these results, please.
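
For anyone else reading these summaries, the totals can be reproduced by hand. This is only a rough sketch: it assumes the lesson's model uses a 3 x 3 kernel, a single-channel input, no padding, a pooling stride equal to the pool size, and a one-unit Dense output. The helper names are made up for illustration:

```python
def conv2d_params(kernel, in_channels, filters):
    # Each filter has kernel*kernel*in_channels weights plus one bias.
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    # Each unit has one weight per input plus one bias.
    return (inputs + 1) * units


def count_params(size, filters, pool, kernel=3, channels=1):
    conv_out = size - kernel + 1          # no padding, stride 1 (assumed)
    pooled = conv_out // pool             # pooling stride equals pool size (assumed)
    flat = pooled * pooled * filters      # values left after Flatten
    return conv2d_params(kernel, channels, filters) + dense_params(flat, 1)


print(count_params(256, 32, 2))  # 516449
print(count_params(256, 64, 1))  # 4129665
```

The Dense layer dominates because every value left after flattening gets its own weight, so less pooling or more filters grows the total quickly.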

Reply
- Jason Brownlee May 25, 2020 at 5:52 am #
  
  Well done!
  
  Perhaps this will help:
  https://machinelearningmastery.com/convolutional-layers-for-deep-learning-neural-networks/
  
  Reply
Mayank goyal May 26, 2020 at 2:20 pm #

Model: “sequential_1”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 127, 127, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 516128) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0

Reply
- Jason Brownlee May 27, 2020 at 7:41 am #
  
  Well done!
  
  Reply
Mayank goyal May 26, 2020 at 2:21 pm #

The code in lesson 2 has been run and the maximum and minimum pixel value of the blonde image before normalization is 255 and 0 respectively, while after normalization is 1 and 0. I was able to display the image in my python environment as well

Reply
- Jason Brownlee May 27, 2020 at 7:42 am #
  
  Nice work!
  
  Reply
Mayank goyal May 26, 2020 at 2:28 pm #

Model: “sequential_6”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_10 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
conv2d_11 (Conv2D) (None, 252, 252, 64) 18496
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 126, 126, 64) 0
_________________________________________________________________
conv2d_12 (Conv2D) (None, 124, 124, 128) 73856
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 62, 62, 128) 0
_________________________________________________________________
flatten_4 (Flatten) (None, 492032) 0
_________________________________________________________________
dense_4 (Dense) (None, 1) 492033
=================================================================
Total params: 584,705
Trainable params: 584,705
Non-trainable params: 0
_________________________________________________________________

Reply
- Jason Brownlee May 27, 2020 at 7:42 am #
  
  Great progress!
  
  Reply
Mayank goyal May 26, 2020 at 2:59 pm #

Lesson: 4

Doberman (33.59%)
Egyptian_cat (32.42%)
Great_Dane (47.91%)

Reply
- Jason Brownlee May 27, 2020 at 7:42 am #
  
  Nice.
  
  Reply
Mayank goyal May 27, 2020 at 3:23 pm #

lesson 6:

datagen = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True)

Reply
- Jason Brownlee May 28, 2020 at 6:09 am #
  
  Nice work!
  
  Reply
Ronke Babatunde May 28, 2020 at 8:46 am #

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5
553467904/553467096 [==============================] – 1401s 3us/step

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
40960/35363 [==================================] – 1s 22us/step
Doberman (33.59%)

Reply
- Jason Brownlee May 28, 2020 at 1:23 pm #
  
  Nice work!
  
  Reply
Ronke Babatunde May 29, 2020 at 11:53 pm #

I got this – Doberman (33.59%) when I ran the code

I got this – cowboy_hat (10.05%) when I loaded a human image

Reply
- Jason Brownlee May 30, 2020 at 6:04 am #
  
  Well done!
  
  Reply
Ronke Babatunde May 30, 2020 at 12:23 am #

Day 5 task: this is the result I got. Training the CNN took a little longer, though.

Epoch 1/10
– 49s – loss: 0.3850 – accuracy: 0.8631
Epoch 2/10
– 44s – loss: 0.2564 – accuracy: 0.9057
Epoch 3/10
– 43s – loss: 0.2119 – accuracy: 0.9212
Epoch 4/10
– 43s – loss: 0.1804 – accuracy: 0.9326
Epoch 5/10
– 42s – loss: 0.1546 – accuracy: 0.9432
Epoch 6/10
– 42s – loss: 0.1321 – accuracy: 0.9505
Epoch 7/10
– 41s – loss: 0.1139 – accuracy: 0.9575
Epoch 8/10
– 41s – loss: 0.0991 – accuracy: 0.9635
Epoch 9/10
– 41s – loss: 0.0872 – accuracy: 0.9676
Epoch 10/10
– 41s – loss: 0.0716 – accuracy: 0.9731
0.3339003710135818 0.9138000011444092

Reply
- Jason Brownlee May 30, 2020 at 6:04 am #
  
  Nice work!
  
  Reply
Ronke Babatunde June 2, 2020 at 2:02 am #

Day 6 task. This is the result i got

Can you kindly give further explanation on interpreting the above result, since we are performing data augmentation. Thanks

Reply
- Jason Brownlee June 2, 2020 at 6:19 am #
  
  Yes, see this:
  https://machinelearningmastery.com/how-to-configure-image-data-augmentation-when-training-deep-learning-neural-networks/
  
  Reply
Ronke Babatunde June 2, 2020 at 2:26 am #

Day 7 task. After running the above code,

is all I got. The detected face was not shown and nothing was displayed. Kindly guide. Thanks

Reply
- Jason Brownlee June 2, 2020 at 6:20 am #
  
  Sorry to hear that, perhaps this will help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
Swati June 12, 2020 at 10:44 pm #

Day 1 Task: Applications of Deep Learning in the field of computer vision
1. Augmented reality
2. Virtual reality
3. Autonomous vehicles
4. Navigation system for the visually impaired
5. Optic disc detection from retina images

Reply
- Jason Brownlee June 13, 2020 at 6:03 am #
  
  Nice work!
  
  Reply
Priyanshi burad July 15, 2020 at 5:10 am #

DAY 2 : PREPARING IMAGE DATASET
Before Normalization
Min: 0.000, Max: 255.000

After Normalization
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee July 15, 2020 at 8:30 am #
  
  Well done!
  
  Reply
Martin July 19, 2020 at 9:48 pm #

Day 1 – Applications of deep learning methods in the field of computer vision
1. stores are presently utilizing facial recognition innovation to give a smoother payment experience to customers (at the cost of their security, however). Rather than utilizing credit cards or mobile payment apps, clients just need to demonstrate their face to a computer vision-equipped camera.
2. iPhone X introduced FaceID, a validation framework that utilizes an on-device neural network to open the telephone when it sees its owner’s face. During setup, FaceID trains its AI model on the face of the owner and works modestly under various lighting conditions, facial hair, hair styles, caps, and glasses.
3. Diabetic Foot Ulcers (DFU) that affect the lower extremities are a major complication of diabetes. Each year, more than 1 million diabetic patients undergo amputation due to failure to recognize DFU and get the proper treatment from clinicians. There is an urgent need to use a CAD system for the detection of DFU. The paper proposes using deep learning methods (EfficientDet architectures) for the detection of DFU: “Goyal, Manu. (2020). A Refined Deep Learning Architecture for Diabetic Foot Ulcers Detection.”
4. In deep end-to-end learning based autonomous car design, inferencing the signal by trained model is one of the critical issues, particularly, in case of embedded component. Researchers from both academia and industry have been putting their enormous efforts in making this critical autonomous driving more reliable and safer. As research on the real car is costly and poses safety issue, we have developed a small scale, low-cost, deep convolutional neural network powered self-driving car model. Its learning model adopted from NVIDIA’s DAVE-2 which is a real autonomous car and Kansas University’s small scale DeepPicar. Similar to DAVE-2, its neural architecture uses 5 convolution layer and 3 fully connected layers with 250,000 parameters. We have considered Raspberry Pi 3B+ as the processing platform with Quad-core 1.4 GHz CPU based on A53 architecture which is capable to support CNN learning model. – “Goyal, Manu & Yap, Moi Hoon & Hassanpour, Saeed. (2020). Multi-class Semantic Segmentation of Skin Lesions via Fully Convolutional Networks. 290-295. 10.5220/0009380302900295. ”
5. Image reconstruction which involves filling in missing portions of an image or correcting corrupted parts of an image. Much like image colorization, image reconstruction can be seen as a filter that is applied to the image.

Reply
- Jason Brownlee July 20, 2020 at 6:12 am #
  
  Well done, this is great work!
  
  Reply
Martin July 20, 2020 at 12:41 am #

Day 2 : Image Preparation

—–Before Normalisation—–
Data Type: uint8
Min: 0.000, Max: 255.000

—–After Normalisation—–
Data Type: uint8
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee July 20, 2020 at 6:15 am #
  
  Nice work!
  
  Reply
Sukanya G July 25, 2020 at 1:59 am #

Day 2:Image Preparation
————————————

Data Type: uint8

Min: 0.000, Max: 255.000
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee July 25, 2020 at 6:23 am #
  
  Well done.
  
  Reply
Sukanya G July 25, 2020 at 2:01 am #

Day 3:Creation of CNN
———————————
Using TensorFlow backend.
Model: “sequential_1”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 127, 127, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 516128) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0

Reply
- Jason Brownlee July 25, 2020 at 6:23 am #
  
  Great work!
  
  Reply
Sukanya G July 25, 2020 at 2:04 am #

Day 4 Image Classification
———————————–
Doberman (33.59%)

Reply
- Jason Brownlee July 25, 2020 at 6:23 am #
  
  Nice!
  
  Reply
Sukanya G July 25, 2020 at 11:50 pm #

Day 5 Train image Classification model
_________________________________
Downloading data from http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
32768/29515 [=================================] – 0s 3us/step
Downloading data from http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] – 2s 0us/step
Downloading data from http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
8192/5148 [===============================================] – 0s 0us/step
Downloading data from http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] – 1s 0us/step
Epoch 1/10
– 35s – loss: 0.3756 – accuracy: 0.8656
Epoch 2/10
– 34s – loss: 0.2463 – accuracy: 0.9099
Epoch 3/10
– 34s – loss: 0.2030 – accuracy: 0.9254
Epoch 4/10
– 34s – loss: 0.1680 – accuracy: 0.9382
Epoch 5/10
– 34s – loss: 0.1433 – accuracy: 0.9462
Epoch 6/10
– 34s – loss: 0.1201 – accuracy: 0.9549
Epoch 7/10
– 34s – loss: 0.0998 – accuracy: 0.9630
Epoch 8/10
– 34s – loss: 0.0843 – accuracy: 0.9696
Epoch 9/10
– 34s – loss: 0.0685 – accuracy: 0.9744
Epoch 10/10
– 34s – loss: 0.0589 – accuracy: 0.9778
0.3513805921599269 0.9124000072479248

Reply
- Jason Brownlee July 26, 2020 at 6:19 am #
  
  Well done!
  
  Reply
goona faramarzi August 13, 2020 at 5:54 pm #

Hello, thanks for your good explanation. I have two questions.
First: setting rotation_range=90 in the generator means it can rotate the image by any angle in [-90, 90], but I want to rotate by exactly 90. What should I do?
Second: if we want to rotate by exactly 90 or 360 in the generator, what should I do?
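
One possible way to get exact rotations, as a sketch only: a small custom generator built on NumPy's rot90, which turns an image by whole quarter turns. The function name and arguments are hypothetical, it assumes square images stored as (height, width, channels), and it yields images only (labels would need to be yielded alongside for training):

```python
import numpy as np


def rotate_exact(images, batch_size, turns=(1,), seed=None):
    """Yield batches forever, rotating each image by a whole number of quarter turns.

    turns=(1,) means always 90 degrees; turns=(1, 4) picks 90 or 360 at random.
    """
    rng = np.random.default_rng(seed)
    while True:
        # Pick random images for this batch.
        idx = rng.integers(0, len(images), size=batch_size)
        # np.rot90 rotates counterclockwise in the plane of the first two axes.
        batch = [np.rot90(images[i], k=int(rng.choice(turns))) for i in idx]
        yield np.stack(batch)


# Two tiny hypothetical 4x4 single-channel images.
images = np.arange(2 * 4 * 4, dtype='float32').reshape(2, 4, 4, 1)
gen = rotate_exact(images, batch_size=2, turns=(1,), seed=1)
batch = next(gen)
print(batch.shape)
```

For the second question, passing turns=(1, 4) would choose between 90 and 360 degrees for each image; a 360 degree turn leaves the image unchanged.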

Reply
- Jason Brownlee August 14, 2020 at 5:59 am #
  
  Good question, perhaps use a custom generator to control the augmentation.
  
  Reply
Yuhua August 18, 2020 at 9:02 pm #

5 applications of DL for CV: Image Classification, Object Detection, Image Reconstruction, Object Tracking, Information Retrieval

Reply
- Jason Brownlee August 19, 2020 at 5:59 am #
  
  Nice work!
  
  Reply
Ryan September 17, 2020 at 2:22 am #

Lesson 04: Image Classification

After running this code –> model = VGG16( ), I am getting the following error:

ResourceExhaustedError: OOM when allocating tensor with shape[3,3,64,128] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:Mul] name: block2_conv1_5/random_uniform/mul/

How do I fix this?

Reply
- Jason Brownlee September 17, 2020 at 6:50 am #
  
  It looks like you are out of memory.
  
  Perhaps try to run on an AWS EC2 instance with more memory?
  https://machinelearningmastery.com/develop-evaluate-large-deep-learning-models-keras-amazon-web-services/
  
  Reply
Jason September 28, 2020 at 12:28 am #

I tried to sign up for this free 7-day course, but when I enter my email address and hit “Download Now” the link does not seem to work. Was it only available for a limited amount of time? Thanks

Reply
- Jason Brownlee September 28, 2020 at 6:20 am #
  
  Sorry to hear that you’re having trouble, contact me directly and I will send you the PDF:
  https://machinelearningmastery.com/contact/
  
  Reply
Moushumi Biswas October 29, 2020 at 5:08 pm #

Lesson 02

Data Type: uint8
Min: 0.000, Max: 255.000
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee October 30, 2020 at 6:49 am #
  
  Great work!
  
  Reply
khin san myint November 24, 2020 at 8:02 pm #

lesson 4 Image Classification

Doberman (33.59%)

thank you

Reply
- Jason Brownlee November 25, 2020 at 6:42 am #
  
  Well done!
  
  Reply
khin san myint November 24, 2020 at 8:07 pm #

Day 3 :

Default Version
Model: “sequential”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
conv2d_1 (Conv2D) (None, 252, 252, 32) 9248
_________________________________________________________________
conv2d_2 (Conv2D) (None, 250, 250, 32) 9248
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 125, 125, 32) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 123, 123, 32) 9248
_________________________________________________________________
conv2d_4 (Conv2D) (None, 121, 121, 32) 9248
_________________________________________________________________
flatten (Flatten) (None, 468512) 0
_________________________________________________________________
dense (Dense) (None, 1) 468513
=================================================================
Total params: 505,825
Trainable params: 505,825
Non-trainable params: 0

Reply
- Jason Brownlee November 25, 2020 at 6:42 am #
  
  Great work!
  
  Reply
khin san myint November 24, 2020 at 8:27 pm #

lesson 5

Default Version
Epoch 1/10
1875/1875 – 23s – loss: 0.3870 – accuracy: 0.8618
Epoch 2/10
1875/1875 – 27s – loss: 0.2574 – accuracy: 0.9064
Epoch 3/10
1875/1875 – 25s – loss: 0.2147 – accuracy: 0.9211
Epoch 4/10
1875/1875 – 28s – loss: 0.1842 – accuracy: 0.9325
Epoch 5/10
1875/1875 – 28s – loss: 0.1605 – accuracy: 0.9408
Epoch 6/10
1875/1875 – 29s – loss: 0.1381 – accuracy: 0.9488
Epoch 7/10
1875/1875 – 21s – loss: 0.1194 – accuracy: 0.9572
Epoch 8/10
1875/1875 – 28s – loss: 0.1019 – accuracy: 0.9616
Epoch 9/10
1875/1875 – 28s – loss: 0.0889 – accuracy: 0.9676
Epoch 10/10
1875/1875 – 29s – loss: 0.0773 – accuracy: 0.9717
0.3053729832172394 0.9150000214576721

Reply
- Jason Brownlee November 25, 2020 at 6:42 am #
  
  Excellent.
  
  Reply
khin san myint November 26, 2020 at 6:26 pm #

Image augmentation

Testing with additional data

ImageDataGenerator(horizontal_flip=True, vertical_flip=True, rotation_range=45, fill_mode='nearest', rescale=1.5)

ImageDataGenerator(horizontal_flip=True, vertical_flip=True, rotation_range=45, fill_mode='nearest', rescale=0.5)

Reply
- Jason Brownlee November 27, 2020 at 6:35 am #
  
  Nice work!
  
  Reply
khin san myint November 27, 2020 at 3:17 pm #

I have finished face detection.
I tested with an image containing many faces.
In this image, only one person is detected. Why is that?

Reply
- Jason Brownlee November 28, 2020 at 6:35 am #
  
  Well done.
  
  Try other faces.
  
  Reply
Azerul Azlan December 5, 2020 at 4:34 pm #

Lesson1
5 impressive applications of deep learning methods:
1) Self-driving : Companies building these types of driver-assistance services, as well as full-blown self-driving cars like Google’s, need to teach a computer how to take over key parts (or all) of driving using digital sensor systems instead of a human’s senses. To do that companies generally start out by training algorithms using a large amount of data.

2) Voice Search & Voice-Activated Assistants: One of the most popular usage areas of deep learning is voice search & voice-activated intelligent assistants. With the big tech giants having already made significant investments in this area, voice-activated assistants can be found on nearly every smartphone. Apple’s Siri has been on the market since October 2011. Google Now, the voice-activated assistant for Android, was launched less than a year after Siri. The newest of the voice-activated intelligent assistants is Microsoft Cortana.

3) Automatic Machine Translation: Automatic machine translation has been around for a long time, but deep learning is achieving top results in two specific areas:
-Automatic Translation of Text
-Automatic Translation of Images
Text translation can be performed without any pre-processing of the sequence, allowing the algorithm to learn the dependencies between words and their mapping to a new language.

4) Image Recognition: It aims to recognize and identify people and objects in images as well as to understand the content and context. Image recognition is already being used in several sectors like gaming, social media, retail, tourism, etc.
This task requires the classification of objects within a photograph as one of a set of previously known objects. A more complex variation of this task called object detection involves specifically identifying one or more objects within the scene of the photograph and drawing a box around them.

5) Automatic Image Caption Generation: Automatic image captioning is the task where given an image the system must generate a caption that describes the contents of the image.

Reply
- Jason Brownlee December 6, 2020 at 6:58 am #
  
  Well done!
  
  Reply
Azerul Azlan December 8, 2020 at 12:17 am #

Lesson 2: Preparing Image Data

The result for the image given
Data Type: uint8
Min: 0.000, Max: 255.000
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee December 8, 2020 at 7:44 am #
  
  Nice work.
  
  Reply
Gautam Pradhan December 8, 2020 at 1:58 am #

Lesson 1.

1) Corn plant counting using deep learning and UAV images. DOI: 10.1109/LGRS.2019.2930549
2) Deep learning to count coconut plants.
3) Counting the number of semis, cars, and minibuses passing through an intersection.
4) Detecting the forehead temperature of moving people at airport checks.
5) Detection and analysis of wheat spikes using convolutional neural networks. DOI: https://doi.org/10.1186/s13007-018-0366-8

Reply
- Jason Brownlee December 8, 2020 at 7:45 am #
  
  Nice work!
  
  Reply
Gautam December 8, 2020 at 2:17 am #

Min and Max Values:

Before Normalization: Min: 0.000, Max: 255.000

After Normalization: Min: 0.000, Max: 1.000

Reply
- Jason Brownlee December 8, 2020 at 7:46 am #
  
  Well done.
  
  Reply
Azerul Azlan December 9, 2020 at 2:56 pm #

Lesson 3: CNN
Model: “sequential”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 127, 127, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 516128) 0
_________________________________________________________________
dense (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0
_________________________________________________________________

Reply
- Jason Brownlee December 10, 2020 at 6:17 am #
  
  Well done.
  
  Reply
Azerul Azlan December 9, 2020 at 3:06 pm #

Lesson 4: Image Classifier

The result is: Doberman (33.59%)

I tried a cat image with the given code and the result is: tiger_cat (30.62%)

Reply
- Jason Brownlee December 10, 2020 at 6:18 am #
  
  Well done!
  
  Reply
Moussa ABOUBAKAR December 29, 2020 at 12:07 am #

Lesson 01: Deep Learning and Computer Vision

List of five applications of deep learning methods in the field of computer vision.

1- 3D Object Retrieval and Recognition (https://dl.acm.org/doi/pdf/10.1145/3042064?casa_token=dYS5kQ5Q4gQAAAAA:R-_J0uUVm7oPLZY6kp9nV-8LXcI0gkR3HaXVSFzrVLl–CBG1_Rdwvs_HGgIuY5FPlXAX7kSaRw)
2- Plant Phenotyping with Limited Labeled Data (https://arxiv.org/pdf/2006.11391.pdf)

3- Real time object detection (https://paperswithcode.com/paper/faster-r-cnn-towards-real-time-object)

4- Image reconstruction (https://machinelearningmastery.com/applications-of-deep-learning-for-computer-vision/)

5- Pedestrian detection (https://www.sciencedirect.com/science/article/pii/S092523121830290X?casa_token=X00I0OMZ898AAAAA:xPMD-oIEIJdO4lwkkaoGzobjLjH73x9KZd8D498ej-x9oNSfMZ8Qaqp8djBFmIVqyu3soqKf#sec0012)

Reply
- Jason Brownlee December 29, 2020 at 5:14 am #
  
  Nice work.
  
  Reply
Moussa ABOUBAKAR December 29, 2020 at 3:32 am #

Lesson 04: Image Classification

I got this result after running the example: Doberman (33.59%)

I tried the example with an image of a car and I get this result: minibus (14.77%)

Reply
- Jason Brownlee December 29, 2020 at 5:18 am #
  
  Nice work.
  
  The model is not perfect.
  
  Reply
Mitchell December 31, 2020 at 2:51 am #

Day4:

Running the VGG16 model shows the dog is a Doberman.

label [[('n02107142', 'Doberman', 0.3359479), ('n02105412', 'kelpie', 0.21615942), ('n02106550', 'Rottweiler', 0.1769872), ('n02089078', 'black-and-tan_coonhound', 0.12776804), ('n02107312', 'miniature_pinscher', 0.03730356)]]
Doberman (33.59%)

When I changed to VGG19, the classification predicts it is a Kelpie

label [[('n02105412', 'kelpie', 0.35011458), ('n02107142', 'Doberman', 0.2983739), ('n02106550', 'Rottweiler', 0.22378054), ('n02089078', 'black-and-tan_coonhound', 0.04829501), ('n02099712', 'Labrador_retriever', 0.008893081)]]

kelpie (35.01%)

In both models, Kelpie and Doberman top the list of probabilities. Are the models using the same training data? Why is the output so different?
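
A rough way to compare the two outputs, using only the numbers posted above. Both Keras models load ImageNet weights by default, so the label sets match. The sketch below (variable names are illustrative) lines up the labels that the two top-5 lists share and measures how far ahead each model's first choice is:

```python
# Top-5 (label, probability) pairs copied from the two outputs above.
vgg16 = [('Doberman', 0.3359479), ('kelpie', 0.21615942), ('Rottweiler', 0.1769872),
         ('black-and-tan_coonhound', 0.12776804), ('miniature_pinscher', 0.03730356)]
vgg19 = [('kelpie', 0.35011458), ('Doberman', 0.2983739), ('Rottweiler', 0.22378054),
         ('black-and-tan_coonhound', 0.04829501), ('Labrador_retriever', 0.008893081)]

p16, p19 = dict(vgg16), dict(vgg19)

# Labels that appear in both top-5 lists, in VGG16's order.
shared = [label for label, _ in vgg16 if label in p19]
for label in shared:
    print('%-25s VGG16 %.2f%%  VGG19 %.2f%%' % (label, p16[label] * 100, p19[label] * 100))

# Gap between the first and second choice of each model.
margin16 = vgg16[0][1] - vgg16[1][1]
margin19 = vgg19[0][1] - vgg19[1][1]
print('Top-1 margin: VGG16 %.3f, VGG19 %.3f' % (margin16, margin19))
```

The top three breeds are the same in both lists and only the order of the first two changes, so the two models broadly agree even though the printed top label differs.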

Reply
- Jason Brownlee December 31, 2020 at 5:30 am #
  
  Nice work!
  
  Specifically, no idea. Generally, different models have different capabilities.
  
  Reply

Mitchell January 1, 2021 at 5:24 am #

Thank you for the class!

Day7. I modified the code to support face detection on multiple faces. The change is to add a ‘for loop’ for each face.

# face detection with mtcnn on a photograph
from matplotlib import pyplot
from matplotlib.patches import Rectangle
from mtcnn.mtcnn import MTCNN

import sys

# load image from file
#pixels = pyplot.imread('street.jpg')

pixels = pyplot.imread(sys.argv[1])
# create the detector, using default weights
detector = MTCNN()
# detect faces in the image
faces = detector.detect_faces(pixels)
# plot the image
pyplot.imshow(pixels)
# get the context for drawing boxes
ax = pyplot.gca()

for face in faces:
    # get coordinates from the current face
    x, y, width, height = face['box']
    # create the shape
    rect = Rectangle((x, y), width, height, fill=False, color='red')
    # draw the box
    ax.add_patch(rect)
# show the plot
pyplot.show()

If you save the code as mtCNN.py, you can run it from the command line:

python mtCNN.py picture.jpg

There should be 11 faces in this picture

https://www.google.com/search?q=people+on+street&sxsrf=ALeKk00zWelGEzoVT7Mtt3GORoIZcLke-w:1609437597048&tbm=isch&source=iu&ictx=1&fir=oD9-FR9LLlTuNM%252CM3i4Lf8ga0sePM%252C_&vet=1&usg=AI4_-kQlCK-9MOCohapSoZ_XMOAogDvHbQ&sa=X&ved=2ahUKEwj184mi5vjtAhURWqwKHTuRCEQQ9QF6BAgFEAE&biw=1212&bih=569#imgrc=oD9-FR9LLlTuNM

Jason Brownlee January 1, 2021 at 5:35 am #

Nicely done, thanks for sharing!

Reply

Tarun January 4, 2021 at 5:45 pm #

Lesson 01: Five impressive computer vision apps from my perspective are:
1. Detecting disease in human, plants, animals etc.
2. Face detection and recognition
3. Face Landmarks detection
4. Pose detection
5. Wardrobe selection using AI/AR/VR
6. Face Mask detection
7. Social distancing calculation
…and many more.

Reply
- Jason Brownlee January 5, 2021 at 6:16 am #
  
  Well done!
  
  Reply
Viraj Mehta January 6, 2021 at 9:46 pm #

5 applications of deep learning in the field of computer vision are as follows:

1) Image Classification
2) Object Detection
3) Image Reconstruction
4) Image Classification with Localization
5) Style Transfer

Reply
- Jason Brownlee January 7, 2021 at 6:17 am #
  
  Nice work!
  
  Reply
Viraj Mehta January 7, 2021 at 5:10 pm #

Findings of task 2:

Min : 0.000
Max: 1.000

Reply
- Jason Brownlee January 8, 2021 at 5:38 am #
  
  Nice work!
  
  Reply
Samira January 8, 2021 at 8:06 pm #

Lesson 01: Paper for impressive applications of deep learning methods for Computer vision

https://www.hindawi.com/journals/cin/2018/7068349/

Reply
- Jason Brownlee January 9, 2021 at 6:41 am #
  
  Nice work.
  
  Reply
Sudarshan January 19, 2021 at 3:42 pm #

lesson 1:
1.Fruit Quality Evaluation using Machine Learning
2.Faulty PCB detector using Machine Learning

Reply
- Jason Brownlee January 20, 2021 at 5:38 am #
  
  Well done!
  
  Reply
Deepa January 21, 2021 at 4:42 am #

Agriculture -Potato classification
Remote sensing – Soil classification
Paint Quality- Car assembly ,detection of paint issues
Natural disaster recovery – flood risk assessment
Sports, cricket – Umpire decision review system

Reply
- Jason Brownlee January 21, 2021 at 6:50 am #
  
  Nice work!
  
  Reply
Deepa January 21, 2021 at 5:20 pm #

Lesson2: Preparing Image data
Before Normalization
min – 0.000 , max – 255.000
After Normalization
min-0.000 max -1.000

Reply
- Jason Brownlee January 22, 2021 at 7:18 am #
  
  Well done!
  
  Reply
Deepa January 23, 2021 at 2:58 pm #

Lesson3:Convolutional Neural Network

used tensorflow.keras to import method
Model: “sequential_1”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 127, 127, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 516128) 0
_________________________________________________________________
dense (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0

Reply
- Jason Brownlee January 24, 2021 at 5:53 am #
  
  Well done!
  
  Reply
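The numbers in the model summary above can be checked by hand. The sketch below is only an illustration of the arithmetic, assuming a 256×256 grayscale input, a 3×3 convolution with no padding and stride 1, and 2×2 max pooling; `conv2d_output` is a hypothetical helper, not a Keras function.

```python
def conv2d_output(size, kernel, filters, channels):
    """Output size and parameter count for an unpadded, stride-1 convolution."""
    out = size - kernel + 1
    # each filter has kernel*kernel*channels weights plus one bias
    params = (kernel * kernel * channels + 1) * filters
    return out, params


# 256x256 grayscale input, 32 filters of 3x3
size, conv_params = conv2d_output(256, 3, 32, 1)
# 2x2 max pooling halves each spatial dimension and has no weights
pooled = size // 2
# flattening turns the 127x127x32 feature maps into one long vector
flat = pooled * pooled * 32
# a single output unit needs one weight per input plus a bias
dense_params = flat + 1

print(size, conv_params)           # 254 320
print(pooled, flat, dense_params)  # 127 516128 516129
print(conv_params + dense_params)  # 516449
```

Almost all of the parameters sit in the final dense layer, because it connects to every value of the flattened feature maps.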
Sudarshan Bhagwan Bhalkar January 25, 2021 at 3:39 pm #

Lesson4:

output — Doberman (33.59%)

Reply
- Jason Brownlee January 26, 2021 at 5:47 am #
  
  Well done!
  
  Reply
sudarshan January 25, 2021 at 4:24 pm #

I am done with all the tasks, and I really enjoyed this course.
Thanks, Jason Brownlee, for this amazing experience.

Reply
- Jason Brownlee January 26, 2021 at 5:47 am #
  
  Thanks, great work on your progress!
  
  Reply
Karthi January 26, 2021 at 12:21 am #

%s uint8
819840
Before Normalization
Min: 0.000 Max: 255.000 Mean: 155.53 Standard Deviation: 51.79
After Normalization
Min: 0.000 Max: 1.000 Mean: 0.61 Standard Deviation: 0.20

Reply
- Jason Brownlee January 26, 2021 at 5:57 am #
  
  Well done!
  
  Reply
Deepa January 28, 2021 at 4:28 am #

Lesson 4 : Image Classification

Due to a proxy issue, I downloaded the imagenet_class_index.json file and saved it locally.

I wanted to use it for decode_predictions.

How do I pass this local JSON file as input to decode_predictions?

Reply
- Jason Brownlee January 28, 2021 at 6:08 am #
  
  Sorry, I don’t know about the json file.
  
  Reply
  - Deepa January 29, 2021 at 1:00 am #
    
    It is fine. I downloaded the JSON file, placed it in the folder ~/keras/model, and decode_predictions worked.
    
    Reply
    - Jason Brownlee January 29, 2021 at 6:06 am #
      
      Well done.
      
      Reply
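For readers who hit the same proxy problem, another option is to decode the predictions by hand from a locally saved imagenet_class_index.json. This is a minimal sketch, assuming the file maps each class id (as a string) to a [WordNet id, label] pair. `decode_with_local_index` is a hypothetical helper and not part of Keras, and the tiny index and probabilities below are made-up stand-ins for the real file and the real model output.

```python
import json


def decode_with_local_index(probabilities, class_index, top=3):
    """Return the `top` (label, probability) pairs for one prediction vector."""
    # rank class ids by predicted probability, highest first
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    # each index entry maps a class id string to [wordnet_id, label]
    return [(class_index[str(i)][1], probabilities[i]) for i in ranked[:top]]


# made-up stand-in for the contents of imagenet_class_index.json
sample_json = ('{"0": ["n00000001", "tench"], '
               '"1": ["n00000002", "goldfish"], '
               '"2": ["n00000003", "Doberman"]}')
class_index = json.loads(sample_json)
# with the real file you would instead load it from disk, for example:
# with open('imagenet_class_index.json') as f:
#     class_index = json.load(f)

# made-up probabilities standing in for one row of model.predict() output
probabilities = [0.10, 0.25, 0.65]
label, prob = decode_with_local_index(probabilities, class_index, top=1)[0]
print('%s (%.2f%%)' % (label, prob * 100))  # Doberman (65.00%)
```

The printed label is the class with the highest predicted probability, and the percentage is the probability the model assigned to that class.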
Deepa January 29, 2021 at 12:58 am #

Lesson 4: Image Classification
Doberman (33.59%)

Reply
- Deepa January 29, 2021 at 1:02 am #
  
  Please guide me on how to interpret the output of the image classification.
  
  Reply
  - Jason Brownlee January 29, 2021 at 6:06 am #
    
    What do you mean exactly? Can you please elaborate?
    
    Reply
- Jason Brownlee January 29, 2021 at 6:06 am #
  
  Well done.
  
  Reply
Deepa February 1, 2021 at 11:34 pm #

Lesson 5: Train Image Classification Model
Train on 60000 samples
Epoch 1/10
60000/60000 – 68s – loss: 0.3970 – accuracy: 0.8601
Epoch 2/10
60000/60000 – 73s – loss: 0.2660 – accuracy: 0.9028
Epoch 3/10
60000/60000 – 71s – loss: 0.2177 – accuracy: 0.9199
Epoch 4/10
60000/60000 – 70s – loss: 0.1869 – accuracy: 0.9301
Epoch 5/10
60000/60000 – 80s – loss: 0.1605 – accuracy: 0.9406
Epoch 6/10
60000/60000 – 80s – loss: 0.1376 – accuracy: 0.9493
Epoch 7/10
60000/60000 – 108s – loss: 0.1179 – accuracy: 0.9562
Epoch 8/10
60000/60000 – 88s – loss: 0.0995 – accuracy: 0.9631
Epoch 9/10
60000/60000 – 89s – loss: 0.0855 – accuracy: 0.9688
Epoch 10/10
60000/60000 – 72s – loss: 0.0726 – accuracy: 0.9737

LOSS ACCURACY
0.33886396311819555 0.9118

Challenge & Learning
——————————–
Downloading the data from googleapi was restricted due to environment settings.

It was fixed by adding the following lines to the file '__init__.py' before loading the data.

You can find this file in the folder ~keras/Datasets/fashion_mnist/

XXXX – depends upon the user environment

import os
os.environ['NO_PROXY'] = 'http://XXXX'
os.environ['PROXY'] = 'http://XXXX'
os.environ['HTTPS_PROXY'] = 'http://XXXX'
os.environ['ALL_PROXY'] = 'http://XXXX'

Reply
- Jason Brownlee February 2, 2021 at 5:45 am #
  
  Excellent work!
  
  Reply
deepa February 6, 2021 at 7:27 pm #

Lesson 6: Image augmentation

Observation: the object in the image remains the same.
The original image is rotated or shifted in different directions, with rotations of up to 90 degrees.
It looks as if the photographer captured the image from different angles.

Reply
- Jason Brownlee February 7, 2021 at 5:17 am #
  
  Well done!
  
  Reply
Kuldeep March 9, 2021 at 1:45 am #

Some applications of deep learning methods in the field of computer vision:
1. Image classification
2. Facial recognition applications
3. Item and logistic classification
4. Computer Vision in Healthcare: Medical Image Analysis
5. Creating a 3D Model From 2D Images
6. Computer Vision for Developing Social Distancing Tools

Reply
- Jason Brownlee March 9, 2021 at 5:22 am #
  
  Well done!
  
  Reply
Kuldeep March 9, 2021 at 2:10 am #

Day 2: Preparing Image Data:
Result:

(base) C:\Users\226399\Kerasprojects>python imagedata.py
Data Type: uint8
Pixel range before Normalization
Min: 0.000, Max: 255.000
Pixel range after Normalization
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee March 9, 2021 at 5:22 am #
  
  Excellent!
  
  Reply
Kuldeep March 9, 2021 at 3:17 am #

Day 2: Preparing Image Data:
updated the example to standardize the pixel values. Result as follows

(base) C:\Users\226399\Kerasprojects>python imagedata.py
Data Type: uint8
Pixel range before Normalization
Min: 0.000, Max: 255.000
Pixel range after Normalization
Min: 0.000, Max: 1.000
Pixel mean is 155.55 and Pixel Std dev is 51.437077
Standardized Pixel mean is 0.061361298 and Standardized Pixel Std dev is 1.0245248

I did not get mean=0 and std dev =1 after standardization.

$$$$$$$$$$$$$$$ My code is as follows: $$$$$$$$$$$$$$$$$$$

# example of pixel normalization
from numpy import asarray
from PIL import Image
# load image
image = Image.open('bondi_beach.jpg')
pixels = asarray(image)
# confirm pixel range is 0-255
print('Data Type: %s' % pixels.dtype)
print("Pixel range before Normalization")
print('Min: %.3f, Max: %.3f' % (pixels.min(), pixels.max()))
# convert from integers to floats
pixels = pixels.astype('float32')
# normalize to the range 0-1
npixels = pixels / 255.0
# confirm the normalization
print("Pixel range after Normalization")
print('Min: %.3f, Max: %.3f' % (npixels.min(), npixels.max()))
# standardize the pixels
# calculate dataset mean and std
pixel_mean = pixels.mean()
pixel_std = pixels.std()
print("Pixel mean is %s and Pixel Stdev is %s" % (pixel_mean, pixel_std))
# calculate Z score
for i in pixels:
    spixel = (i - pixel_mean) / pixel_std
print("Standardized Pixel mean is %s and Standardized Pixel Std dev is %s" % (spixel.mean(), spixel.std()))
$$$$$$$$$$$$$$$$$$$$$$$$ CODE ENDS $$$$$$$$$$$$$$$$$$$$$$$$$$

Reply
- Jason Brownlee March 9, 2021 at 5:23 am #
  
  Well done!
  
  Reply
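On the standardization question in the comment above: a likely reason the result is not mean 0 and standard deviation 1 is that the `for` loop reassigns `spixel` on every pass, so after the loop it holds only the last row of the image. The sketch below reproduces this on a small synthetic array (an assumption, standing in for the real photograph) and compares it with standardizing the whole array in one step.

```python
import numpy as np

# synthetic stand-in for an image: 4 rows x 6 columns x 3 channels
rng = np.random.default_rng(1)
pixels = rng.integers(0, 256, size=(4, 6, 3)).astype('float32')

mean, std = pixels.mean(), pixels.std()

# looping over rows and reassigning keeps only the last row's result
for row in pixels:
    last_row_only = (row - mean) / std

# standardizing the whole array in one vectorized step
standardized = (pixels - mean) / std

print(last_row_only.shape, standardized.shape)  # (6, 3) (4, 6, 3)
print('Mean: %.3f, Std: %.3f' % (standardized.mean(), standardized.std()))
```

With the vectorized version the statistics are computed over every pixel, so the mean comes out at approximately zero and the standard deviation at approximately one.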
dingowhiz March 12, 2021 at 7:44 pm #

Lesson 4: Image Classification
I had a kelpie with goggles and it gave me ‘llama (85.72%)’

Reply
- Jason Brownlee March 13, 2021 at 5:28 am #
  
  Interesting!
  
  Reply
dingowhiz March 12, 2021 at 7:49 pm #

Day 5: Train Image Classification Model
>print(loss, acc)

0.3586573600769043 0.907800018787384

Reply
- Jason Brownlee March 13, 2021 at 5:28 am #
  
  Well done!
  
  Reply
Nafy Aidara March 23, 2021 at 7:10 am #

Five applications of deep learning in Computer vision:
1. Image classification/ recognition
2. Object Detection
3. Image reconstruction
4. Object Segmentation
5. Image colorization
One research paper that illustrates this is "Deep Residual Learning for Image Recognition" by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.

Reply
- Jason Brownlee March 24, 2021 at 5:45 am #
  
  Well done!
  
  Reply
Nafy Aidara March 23, 2021 at 8:30 am #

Lesson 2: Preparing image data
1. The first thing I did was display the properties of the loaded image, and I obtained the following:
JPEG
RGB
(640, 427)
2. Confirming the image pixels
Data Type: uint8
Min: 0.000, Max: 255.000
3. Normalize the data
Data Type: uint8
Min: 0.000, Max: 255.000
Min: 0.000, Max: 1.000
4. Global Standardize
Mean: 155.544, Standard Deviation: 51.411
Mean: 0.539, Standard Deviation: 0.377
Min: 0.000, Max: 1.000
5. Global Centering
Mean: 155.544
Min: 0.000, Max: 255.000
Mean: -0.000
Min: -155.544, Max: 99.456

Reply
- Jason Brownlee March 24, 2021 at 5:45 am #
  
  Great work!
  
  Reply

KC PARK March 25, 2021 at 3:40 pm #

Lesson 06: Image Augmentation

Image data augmentation is a useful technique.
I wrote a program that shows the result in the browser using the Streamlit framework.

Streamlit is available at the link below.
https://streamlit.io/

========================================================================

import streamlit as st
from PIL import Image
from matplotlib import pyplot
from numpy import expand_dims
from tensorflow.keras.preprocessing.image import img_to_array, ImageDataGenerator


MAX_WIDTH_IMG = 320
MAX_HEIGHT_IMG = 240


def augmentate_image(uploaded_file, file_names):
    if uploaded_file is not None:
        # create image data augmentation generator
        datagen = ImageDataGenerator(horizontal_flip=True, vertical_flip=True,
                                     rotation_range=45)

        st.write('Uploaded image')
        st.image(uploaded_file, caption=file_names)

        # convert to numpy array
        data = img_to_array(uploaded_file)

        # expand dimension to one sample
        samples = expand_dims(data, 0)

        # prepare iterator
        it = datagen.flow(samples, batch_size=1)
        py_fig, py_ax = pyplot.subplots(3, 3)

        for i in range(9):
            # define subplot
            pyplot.subplot(330 + 1 + i)

            # generate a batch of images
            batch = next(it)

            # convert to unsigned integers for viewing
            image = batch[0].astype('uint8')

            # plot raw pixel data
            pyplot.imshow(image)

        # show the figure
        st.pyplot(py_fig)

        return True
    else:
        return False


def upload_images():
    uploaded_file = st.file_uploader("Please select an image", type=["png", "jpg", "jpeg"], accept_multiple_files=False)
    if uploaded_file is not None:
        target_size = (MAX_WIDTH_IMG, MAX_HEIGHT_IMG)

        try:
            uploaded_image = Image.open(uploaded_file)
        except BaseException:
            raise
        else:
            uploaded_image = uploaded_image.resize(target_size)
        return uploaded_image, uploaded_file.name
    else:
        return None, None


def main():
    st.title('Image Augmentation')

    uploaded_image, image_name = upload_images()
    if uploaded_image is not None:
        augmentate_image(uploaded_image, image_name)
    else:
        st.write('Wait to upload a file.')


if __name__ == '__main__':
    main()

- Jason Brownlee March 26, 2021 at 6:20 am #
  
  Well done!
  
  Reply

Nafy Aidara March 30, 2021 at 10:28 am #

Lesson 7: Face Detection
After running the code, I obtained the picture with a red rectangle around the face.
I also ran it on another picture and the faces were detected.
Thanks

Reply
- Jason Brownlee March 31, 2021 at 5:57 am #
  
  Well done!
  
  Reply
Rizu Hoshin April 19, 2021 at 1:55 pm #

Lesson 01: Research Application with Deep Learning & Computer Vision

1. Realtime Object Tracking (https://www.iccs-meeting.org/archive/iccs2018/papers/108600033.pdf)

2. Facial Recognition (https://www.researchgate.net/publication/325071878_Deep_Learning_for_Facial_Recognition)

3. Iris Recognition
(https://www.researchgate.net/publication/314194215_IRIS_RECOGNITION_BY_USING_IMAGE_PROCESSING_TECHNIQUES)

4. Speech Recognition
(https://www.datasciencecentral.com/profiles/blogs/machine-learning-is-fun-part-6-how-to-do-speech-recognition-with)

5. Vehicle Speed Estimation
(http://cs229.stanford.edu/proj2017/final-reports/5244226.pdf)

Reply
- Jason Brownlee April 20, 2021 at 5:53 am #
  
  Well done.
  
  Reply
Rizu Hoshin April 21, 2021 at 12:22 pm #

Data Type: uint8

Before
Min: 0.000, Max: 255.000

After
Min: 0.000, Max: 1.000

Reply
- Jason Brownlee April 22, 2021 at 5:36 am #
  
  Well done.
  
  Reply
Rizu Hoshin April 21, 2021 at 2:48 pm #

lesson 03 – CNN

Model: “sequential”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 127, 127, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 516128) 0
_________________________________________________________________
dense (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0
_________________________________________________________________

Can you give me tips for building a CNN architecture? I'm confused because I don't know the right combination of filters and pooling for the image.

Reply
- Jason Brownlee April 22, 2021 at 5:36 am #
  
  Well done.
  
  Reply
Konstantin May 31, 2021 at 6:44 pm #

1-5. Medicine application (a lot of research papers 🙂 )
Classification, segmentation, diagnostic, 3D analysis, image restoration etc

Reply
- Jason Brownlee June 1, 2021 at 5:29 am #
  
  Nice work!
  
  Reply
Konstantin June 2, 2021 at 7:02 am #

At last :)))))

Model: “sequential”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 127, 127, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 516128) 0
_________________________________________________________________
dense (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0
_________________________________________________________________

Process finished with exit code 0

Reply
- Jason Brownlee June 3, 2021 at 5:26 am #
  
  Well done!
  
  Reply
Keerthesh Reddy June 3, 2021 at 8:01 pm #

Lesson 1: Applications of DL in CV

Object detection
Object localization
Object segmentation
Pose Estimation
Object/Body measurements

Reply
- Jason Brownlee June 4, 2021 at 6:47 am #
  
  Well done!
  
  Reply
Keerthesh Reddy June 4, 2021 at 12:44 am #

Lesson 2: Pixel Normalization and standardization

Data Type: uint8
Min: 0.000, Max: 255.000
Min: 0.000000, Max:1.000000
standardization
Mean: 155.544, Standard Deviation: 51.411
Mean: -0.000, Standard Deviation: 1.000

Reply
- Jason Brownlee June 4, 2021 at 7:04 am #
  
  Well done!
  
  Reply
Keerthesh Reddy June 5, 2021 at 12:41 am #

Lesson 3: Convolutional Neural Network

Model: “sequential_7”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_7 (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 127, 127, 32) 0
_________________________________________________________________
flatten_7 (Flatten) (None, 516128) 0
_________________________________________________________________
dense_7 (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0
_________________________________________________________________

Reply
- Jason Brownlee June 5, 2021 at 5:30 am #
  
  Well done!
  
  Reply
Konstantin June 5, 2021 at 1:07 am #

Nice birds! Thank you ))))))))))))))))
I can not send you the photos

Reply
- Jason Brownlee June 5, 2021 at 5:30 am #
  
  Thanks!
  
  Reply
Konstantin June 8, 2021 at 6:22 am #

from matplotlib import pyplot
from matplotlib.patches import Rectangle
from mtcnn.mtcnn import MTCNN
# load image from file
pixels = pyplot.imread('two.jpg')
# create the detector, using default weights
detector = MTCNN()
# detect faces in the image
faces = detector.detect_faces(pixels)
print(faces)
# plot the image
pyplot.imshow(pixels)
# get the context for drawing boxes
ax = pyplot.gca()
# get coordinates from the first face
x, y, width, height = faces[0]['box']
# create the shape
rect = Rectangle((x, y), width, height, fill=False, color='red')
# draw the box
ax.add_patch(rect)
# get coordinates from the second face
x, y, width, height = faces[1]['box']
# create the shape
rect = Rectangle((x, y), width, height, fill=False, color='blue')
# draw the box
ax.add_patch(rect)

# show the plot
pyplot.show()

Thank you!!!!!!1

Reply
- Jason Brownlee June 8, 2021 at 7:18 am #
  
  Well done!
  
  Reply
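The code in the comment above indexes the first two faces directly, so it only suits images with exactly two detections. One way to generalize it, shown here as a rough sketch, is to collect a box for every detection and loop over them. `boxes_from_detections` and the `min_confidence` threshold are hypothetical, and the detections below are made up to mimic the list of dictionaries that `detect_faces()` returns.

```python
def boxes_from_detections(detections, min_confidence=0.9):
    """Convert MTCNN-style detections into (x, y, width, height) tuples."""
    boxes = []
    for face in detections:
        # skip detections the model is not confident about
        if face.get('confidence', 1.0) < min_confidence:
            continue
        x, y, width, height = face['box']
        boxes.append((x, y, width, height))
    return boxes


# made-up detections in the same shape as the detect_faces() output
example_faces = [
    {'box': [186, 71, 87, 115], 'confidence': 0.999},
    {'box': [12, 40, 60, 80], 'confidence': 0.97},
    {'box': [300, 10, 20, 25], 'confidence': 0.42},
]

for x, y, width, height in boxes_from_detections(example_faces):
    print(x, y, width, height)
```

In the plotting code, each tuple would then become one `Rectangle((x, y), width, height, fill=False, color='red')` passed to `ax.add_patch()` inside the loop.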
Pingpony June 10, 2021 at 4:58 pm #

task lesson 1

Automatic Screening of Diabetic Retinopathy Images with Convolution Neural Network Based on Caffe Framework

‘ https://dl.acm.org/doi/abs/10.1145/3107514.3107523?casa_token=Q6Ulyrz5JVAAAAAA%3AD6FWQWcBsGr7-VPdZzz3X5Lq4HohPII2FdTqtyh5qGyQwFbMc0n7Ukb9njD8iifjHyKSL1_ZH-7bmQ ‘

Deep Convolution Neural Network for Malignancy Detection and Classification in Microscopic Uterine Cervix Cell Images
‘https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7062987/’

A Full Stage Data Augmentation Method in Deep Convolutional Neural Network for Natural Image Classification
‘https://www.hindawi.com/journals/ddns/2020/4706576/’

A multi-scale recurrent fully convolution neural network for laryngeal leukoplakia segmentation
‘https://www.sciencedirect.com/science/article/pii/S1746809420300690?casa_token=E0pQhtiK7cIAAAAA:OxVgfpuxshh3QYiqxWKowka2KfIxW5U0oovOuurlVc3WiT7v2v4dzlTcwilfFiYv4Ba2ctqX1OI’

A dataset of laryngeal endoscopic images with comparative study on convolution neural network-based semantic segmentation
‘https://link.springer.com/article/10.1007/s11548-018-01910-0’

Reply
- Jason Brownlee June 11, 2021 at 5:13 am #
  
  Well done!
  
  Reply
Anand June 15, 2021 at 5:25 pm #

5 interesting applications of Deep Learning for Computer Vision:

1) Human Pose estimation: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42237.pdf

2) Social Distancing tools: https://arxiv.org/abs/1703.06870

3) Digitalizing Images by reading the text and recognizing objects (OCR for texts and CNN for objects ): https://storage.googleapis.com/pub-tools-public-publication-data/pdf/33418.pdf

4) Computer Vision for autonomous vehicles: https://www.nowpublishers.com/article/Details/CGV-079

5) Computer Vision for Metrology:
https://www.ipf.kit.edu/english/1577.php

Reply
- Jason Brownlee June 16, 2021 at 6:17 am #
  
  Well done!
  
  Reply
Sam Arumugam August 25, 2021 at 3:55 pm #

Day 3: Convolutional Neural Networks
Creates a convolutional neural network that expects grayscale images with the square size of 256×256 pixels, with one convolutional layer with 32 filters, each with the size of 3×3 pixels, a max-pooling layer, and a binary classification output layer.

Program output:

Model: “sequential”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 127, 127, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 516128) 0
_________________________________________________________________
dense (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0

Reply
- Adrian Tam August 27, 2021 at 4:56 am #
  
  Good work!
  
  Reply
khushboo August 25, 2021 at 9:44 pm #

Day-2

2.3650445e-10
Min: 0.000, Max: 0.000

Reply
Sam Arumugam August 26, 2021 at 6:35 pm #

Day 4: Image Classification

I got output similar to the following on the first run only. The second run gave no result.

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58892288/58889256 [==============================] – 6s 0us/step
58900480/58889256 [==============================] – 6s 0us/step

Note: I did not get a percentage.

Reply
Sam Arumugam August 31, 2021 at 6:08 pm #

Day 4: Image Classification

Doberman (30.99%)

Reply
Rik Aulbers October 25, 2021 at 10:48 pm #

Lesson 01: Deep Learning and Computer Vision
================================

1. Image classification
2. Object detection
3. Object segmentation
4. Image colorization
5. Image reconstruction

Reply
Rik Aulbers October 25, 2021 at 11:14 pm #

Lesson 02: Preparing Image Data
================================

Before Normalization:
Data Type: uint8
Min: 0.000, Max: 255.000

After Normalization:
Data Type: float32
Min: 0.000, Max: 1.000

Reply
- Adrian Tam October 27, 2021 at 2:20 am #
  
  Good work!
  
  Reply
Kathrin Fl October 29, 2021 at 11:14 pm #

Lesson 3:

2021-10-29 14:02:42.961078: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library ‘cudart64_110.dll’; dlerror: cudart64_110.dll not found
2021-10-29 14:02:42.961531: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

2021-10-29 14:02:46.000958: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library ‘nvcuda.dll’; dlerror: nvcuda.dll not found
2021-10-29 14:02:46.001416: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2021-10-29 14:02:46.008546: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-EV70MUJ
2021-10-29 14:02:46.009171: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-EV70MUJ
2021-10-29 14:02:46.010344: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

Model: “sequential”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 127, 127, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 516128) 0
_________________________________________________________________
dense (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0
_________________________________________________________________

Hi, what do I do with the warnings?

Thanks!

Reply
- Adrian Tam October 30, 2021 at 12:40 pm #
  
  You don’t need to worry about these. They are just saying that TensorFlow is not fully utilizing your computer’s power.
  
  Reply
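If the informational and warning lines clutter the console, one common approach is to raise TensorFlow's native log threshold through the `TF_CPP_MIN_LOG_LEVEL` environment variable. This is a small sketch that only sets the variable; the TensorFlow import is left commented out so the snippet runs without TensorFlow installed.

```python
import os

# must be set before TensorFlow is imported to take effect
# '0' shows everything, '1' hides INFO, '2' also hides WARNING, '3' also hides ERROR
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# import tensorflow as tf  # uncomment in an environment with TensorFlow installed
print(os.environ['TF_CPP_MIN_LOG_LEVEL'])  # 2
```

This only hides the messages. It does not change whether a GPU is used.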
Serenina November 11, 2021 at 1:23 am #

Lesson 01:

– Understanding cartoon emotion (DOI: 10.1007/s00521-021-06003-9)

– Segmentation of plant species and communities (DOI: 10.1038/s41598-019-53797-9)

– Postnatal gestational age estimation of newborns (DOI: 10.1016/j.imavis.2018.09.003)

– RootNav 2.0: Navigation of complex plant root architectures (DOI: 10.1093/gigascience/giz123)

– Transfer of Learning from Vision to Touch (DOI: 10.3390/s21010113)

Reply
- Adrian Tam November 14, 2021 at 12:26 pm #
  
  That’s a great list!
  
  Reply
Jose Luis Ortiz Volcan December 2, 2021 at 4:27 am #

Lesson 1

1. Diagnosis of oil wells undergoing artificial lift operations
2. Identification of Bottlenecks in supply chains of particular industry processes
3. Early identification of gas or oil leaks in oil and gas fields from analysis of images captured with drones
4. Land subsidence monitoring and evaluation in areas undergoing mining operations using satellite images
5. Early risk identification from visual analysis of key parameters vs. time plots in high fluid pressure operations

Reply
Jose Luis Ortiz Volcan December 2, 2021 at 4:50 am #

Lesson 02

Before normalization
Data Type: uint8
Min: 0.000, Max: 255.000

After normalization
Data Type: float32
Min: 0.000, Max: 1.000

Reply
- Adrian Tam December 8, 2021 at 5:48 am #
  
  Good job, Jose.
  
  Reply
Jose Luis Ortiz Volcan December 3, 2021 at 2:49 pm #

Ref. Deep Learning for Computer Vision Crash Course – Lesson 03

Model: “sequential”
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 254, 254, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 127, 127, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 516128) 0
_________________________________________________________________
dense (Dense) (None, 1) 516129
=================================================================
Total params: 516,449
Trainable params: 516,449
Non-trainable params: 0

Reply
Jose Luis Ortiz Volcan December 3, 2021 at 2:50 pm #

Ref. Deep Learning for Computer Vision Crash Course – Lesson 04

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels.h5
553467904/553467096 [==============================] – 20s 0us/step
553476096/553467096 [==============================] – 20s 0us/step
Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
40960/35363 [==================================] – 0s 0us/step
49152/35363 [=========================================] – 0s 0us/step
Doberman (35.42%)

Reply
- Adrian Tam December 8, 2021 at 6:55 am #
  
  That looks very good, Jose.
  
  Reply
Jose Luis Ortiz Volcan December 4, 2021 at 6:37 am #

Comments:
1. After the message "cannot import name 'to_categorical' from 'keras.utils'", I imported it from tensorflow.keras.utils instead.

2. After running the example, the performance of the model on the test dataset is as follows:
Running time = 128.82744431495667 seconds
Test loss: 0.3041397035121918
Test accuracy: 0.9147999882698059

3. Varying the configuration of the model by adding another convolutional layer:

model2 = Sequential()
model2.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
model2.add(Conv2D(32, (5, 5), activation='relu'))
model2.add(MaxPooling2D())
model2.add(Flatten())
model2.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
model2.add(Dense(10, activation='softmax'))
model2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model2.summary()

Running time = 536.1874532699585 seconds
Test loss: 0.4339408576488495
Test accuracy: 0.9174000024795532

Conclusion: accuracy improved a little, but at the expense of a 316% increase in running time.

Reply
- Adrian Tam December 8, 2021 at 7:21 am #
  
  Hi Jose, for (1), it should be there; see the TensorFlow documentation: https://www.tensorflow.org/api_docs/python/tf/keras/utils/to_categorical
  
  Reply
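The trade-off reported in the comment above can be reproduced from the posted figures. This is just the arithmetic, using the running times and test accuracies exactly as reported, with variable names chosen for illustration.

```python
# figures reported in the comment above
base_time, base_acc = 128.82744431495667, 0.9147999882698059
deep_time, deep_acc = 536.1874532699585, 0.9174000024795532

# relative increase in running time, as a percentage
time_increase_pct = (deep_time - base_time) / base_time * 100
# absolute change in test accuracy, in percentage points
acc_gain_points = (deep_acc - base_acc) * 100

print('Running time increase: %.0f%%' % time_increase_pct)        # 316%
print('Accuracy gain: %.2f percentage points' % acc_gain_points)  # 0.26
```

The gain is about a quarter of a percentage point from a single run each, and the reported test loss rose from 0.304 to 0.434, so repeated runs would be needed before concluding that the deeper model is better.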
Jose Luis Ortiz Volcan December 9, 2021 at 6:12 am #

Thank you

Reply
Jose Luis Ortiz Volcan December 9, 2021 at 6:26 am #

Ref. Deep Learning for Computer Vision Crash Course – Lesson 06

After application of the code:

ImageDataGenerator(horizontal_flip=True, vertical_flip=True, rotation_range=90)

the result is a set of 9 randomly generated images with horizontal and vertical flips and rotations. The parameter 'rescale' is None by default.

I played with parameter ‘rescale’

ImageDataGenerator(horizontal_flip=True, vertical_flip=True, rescale=0.5, rotation_range=90)
ImageDataGenerator(horizontal_flip=True, vertical_flip=True, rescale=1.5, rotation_range=90)
ImageDataGenerator(horizontal_flip=True, vertical_flip=True, rescale=3, rotation_range=90)

With the code above I created 3 sets (9 images each). With the parameter 'rescale' set to 0.5, we get a darker background.

With the parameter 'rescale' set to 1.5 and 3, we get a lighter background and even some noise.

Image augmentation is a powerful technique for improving the training of deep neural networks such as CNNs.

Reply
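A possible explanation for the darker and noisier images described above: `rescale` multiplies every pixel value by the given factor, so 0.5 halves the intensities, while 1.5 and 3 push bright pixels past 255, which no longer fits in the `uint8` type used for display. The sketch below shows the arithmetic on a few made-up pixel values. Clipping before the cast is one way to avoid the artifacts, and it is an addition here, not part of the lesson code.

```python
import numpy as np

# a few sample pixel intensities on the usual 0-255 scale
pixels = np.array([0.0, 64.0, 128.0, 255.0], dtype='float32')

for factor in (0.5, 1.5, 3.0):
    # the multiplication that the rescale option applies
    scaled = pixels * factor
    # count values that no longer fit in an 8-bit image
    overflow = int((scaled > 255).sum())
    # clip into range before converting for display
    clipped = np.clip(scaled, 0, 255).astype('uint8')
    print(factor, scaled.tolist(), 'values above 255:', overflow, clipped.tolist())
```

The more common use of the parameter is `rescale=1.0/255`, which maps pixel values into the range 0 to 1 for training.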
Jose Luis Ortiz Volcan December 9, 2021 at 12:23 pm #

The code provided for face detection with mtcnn on the given photograph worked very well, as it drew a box around the face.

mtcnn also worked very well in a photograph with multiple faces.

Reply
- Adrian Tam December 10, 2021 at 4:18 am #
  
  Thanks for confirming!
  
  Reply
Sulochana February 4, 2022 at 6:36 am #

Computer Vision Applications
1. Image Classification: Labelling an image based on what it contains (e.g., cat/dog).
2. Image Classification With Localization: Identifying the location of the object in the frame and drawing a bounding box around it.
3. Object Detection: Detecting objects when multiple objects are present.
4. Object Tracking: Tracking a specific object across a sequence of images or a video.
5. Object Segmentation: Object segmentation, or image segmentation, assigns each pixel of an image to a particular class.
6. Style Transfer: Style transfer, or neural style transfer, is the task of learning style from one or more images and applying that style to a new image.

Reply
- James Carmichael February 4, 2022 at 10:19 am #
  
  Thank you for the feedback! Keep up the great work!
  
  Reply
Sulochana February 4, 2022 at 11:37 pm #

Lesson-2: Preparing Image Data

Data Type: uint8
Min: 0.000, Max: 255.000

After normalization
Min: -0.012, Max: 0.008

Mean: 0.610, Standard Deviation: 0.202
Mean: 0.000, Standard Deviation: 1.000

Reply
sam October 10, 2022 at 8:00 pm #

hi,

Why do we have to change the data to 'float32' before normalization?
=======================
# convert from integers to floats
trainX, testX = trainX.astype('float32'), testX.astype('float32')
# normalize to range 0-1
trainX, testX = trainX / 255.0, testX / 255.0
===================================

I think that without the change to float32, the data will become float64 after normalization. The code works fine without changing to float32. Is it to save memory?

Thanks

- James Carmichael October 11, 2022 at 6:53 am #
  
  Hi sam…The conversion is necessary due to the mathematical procedures required in normalization:
  
  https://iq.opengenus.org/normalization-in-detail/
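
The dtype behaviour raised in the question can be checked directly. The sketch below is a small NumPy experiment, not the tutorial's code: it uses a tiny hand-made uint8 array (an assumption standing in for trainX) and compares dividing with and without the explicit cast.

```python
import numpy as np

# Tiny hypothetical stand-in for image data loaded as 8-bit integers.
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# Without the cast, NumPy promotes uint8 / 255.0 to float64.
implicit = pixels / 255.0

# With the cast, the result stays float32 and uses half the memory.
explicit = pixels.astype('float32') / 255.0

print(implicit.dtype, implicit.nbytes)  # float64 24
print(explicit.dtype, explicit.nbytes)  # float32 12
```

Both versions give values in the range 0-1, so the division works either way. The explicit cast appears to matter mainly for memory, and because float32 is the default float type in Keras, it keeps the data in the dtype the model is likely to use anyway.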
  

Navigation

How to Get Started With Deep Learning for Computer Vision (7-Day Mini-Course)

Deep Learning for Computer Vision Crash Course.
Bring Deep Learning Methods to Your Computer Vision Project in 7 Days.

Who Is This Crash-Course For?

Crash-Course Overview

Want Results with Deep Learning for Computer Vision?

Lesson 01: Deep Learning and Computer Vision

Computer Vision

Deep Learning

Promise of Deep Learning for Computer vision

Your Task

Lesson 02: Preparing Image Data

Your Task

Lesson 03: Convolutional Neural Networks

Convolutional Layers

Pooling Layers

Classifier Layer

Convolutional Neural Network

Your Task

Lesson 04: Image Classification

Your Task

Lesson 05: Train Image Classification Model

Your Task

Lesson 06: Image Augmentation

Your Task

Lesson 07: Face Detection

Your Task

The End!
(Look How Far You Have Come)

Summary

Develop Deep Learning Models for Vision Today!

Develop Your Own Vision Models in Minutes

Finally Bring Deep Learning to your Vision Projects

More On This Topic

299 Responses to How to Get Started With Deep Learning for Computer Vision (7-Day Mini-Course)

