Your First Deep Learning Project in Python with Keras Step-by-Step

By Jason Brownlee on August 16, 2022 in Deep Learning

Keras is a powerful and easy-to-use free open source Python library for developing and evaluating deep learning models.

It is part of the TensorFlow library and allows you to define and train neural network models in just a few lines of code.

In this tutorial, you will discover how to create your first deep learning neural network model in Python using Keras.

Kick-start your project with my new book Deep Learning With Python, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.

Update Feb/2017: Updated prediction example, so rounding works in Python 2 and 3.
Update Mar/2017: Updated example for the latest versions of Keras and TensorFlow.
Update Mar/2018: Added alternate link to download the dataset.
Update Jul/2019: Expanded and added more useful resources.
Update Sep/2019: Updated for Keras v2.2.5 API.
Update Oct/2019: Updated for Keras v2.3.0 API and TensorFlow v2.0.0.
Update Aug/2020: Updated for Keras v2.4.3 and TensorFlow v2.3.
Update Oct/2021: Deprecated predict_class syntax
Update Jun/2022: Updated to modern TensorFlow syntax

Develop your first neural network in Python with Keras step-by-step
Photo by Phil Whitehouse, some rights reserved.

Keras Tutorial Overview

There is not a lot of code required, but we will go over it slowly so that you will know how to create your own models in the future.

The steps you will learn in this tutorial are as follows:

Load Data
Define Keras Model
Compile Keras Model
Fit Keras Model
Evaluate Keras Model
Tie It All Together
Make Predictions

This Keras tutorial makes a few assumptions. You will need to have:

Python 3 installed and configured
SciPy (including NumPy) installed and configured
TensorFlow 2, which bundles Keras, installed and configured

If you need help with your environment, see the tutorial:

How to Setup a Python Environment for Deep Learning

Create a new file called keras_first_network.py and type or copy-and-paste the code into the file as you go.

1. Load Data

The first step is to import the functions and classes you intend to use in this tutorial.

You will use the NumPy library to load your dataset and two classes from the Keras library to define your model.

The imports required are listed below.

# first neural network with keras tutorial
from numpy import loadtxt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
...

You can now load your dataset.

In this Keras tutorial, you will use the Pima Indians onset of diabetes dataset. This is a standard machine learning dataset from the UCI Machine Learning repository. It describes patient medical record data for Pima Indians and whether they had an onset of diabetes within five years.

As such, it is a binary classification problem (onset of diabetes as 1 or not as 0). All of the input variables that describe each patient are numerical. This makes it easy to use directly with neural networks that expect numerical input and output values and is an ideal choice for our first neural network in Keras.

The dataset is available here:

Download the dataset and place it in your local working directory, the same location as your Python file.

Save it with the filename:

pima-indians-diabetes.csv

Take a look inside the file; you should see rows of data like the following:

6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
1,89,66,23,94,28.1,0.167,21,0
0,137,40,35,168,43.1,2.288,33,1
...

You can now load the file as a matrix of numbers using the NumPy function loadtxt().

There are eight input variables and one output variable (the last column). You will be learning a model to map rows of input variables (X) to an output variable (y), which is often summarized as y = f(X).

The variables can be summarized as follows:

Input Variables (X):

Number of times pregnant
Plasma glucose concentration at 2 hours in an oral glucose tolerance test
Diastolic blood pressure (mm Hg)
Triceps skin fold thickness (mm)
2-hour serum insulin (mu U/ml)
Body mass index (weight in kg/(height in m)^2)
Diabetes pedigree function
Age (years)

Output Variables (y):

Class variable (0 or 1)

Once the CSV file is loaded into memory, you can split the columns of data into input and output variables.

The data will be stored in a 2D array where the first dimension is rows and the second dimension is columns, e.g., [rows, columns].

You can split the array into two arrays by selecting subsets of columns using the standard NumPy slice operator (“:”). You can select the first eight columns, from index 0 to index 7, via the slice 0:8. You can then select the output column (the 9th variable) via index 8.

...
# load the dataset
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
...

You are now ready to define your neural network model.

Note: The dataset has nine columns, and the range 0:8 will select columns from 0 to 7, stopping before index 8. If this is new to you, then you can learn more about array slicing and ranges in this post:

How to Index, Slice, and Reshape NumPy Arrays for Machine Learning in Python
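
To make the slice concrete, here is a small sketch on a made-up array with 3 rows and 9 columns. The values are placeholders, not rows from the diabetes dataset; the point is only to show that 0:8 keeps eight columns and that index 8 picks out the ninth.

```python
import numpy as np

# hypothetical stand-in for the loaded dataset: 3 rows, 9 columns
dataset = np.arange(27, dtype=float).reshape(3, 9)

# columns 0..7 are the inputs; the slice stops before index 8
X = dataset[:, 0:8]
# column 8 (the 9th column) is the output
y = dataset[:, 8]

print(X.shape)     # (3, 8)
print(y.shape)     # (3,)
print(y.tolist())  # [8.0, 17.0, 26.0]
```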

2. Define Keras Model

Models in Keras are defined as a sequence of layers.

We create a Sequential model and add layers one at a time until we are happy with our network architecture.

The first thing to get right is to ensure the input layer has the correct number of input features. This can be specified when creating the first layer with the input_shape argument and setting it to (8,) for presenting the eight input variables as a vector.

How do we know the number of layers and their types?

This is a tricky question. There are heuristics that you can use, and often the best network structure is found through a process of trial and error experimentation (I explain more about this here). Generally, you need a network large enough to capture the structure of the problem.

In this example, let’s use a fully-connected network structure with three layers.

Fully connected layers are defined using the Dense class. You can specify the number of neurons or nodes in the layer as the first argument and the activation function using the activation argument.

Also, you will use the rectified linear unit activation function referred to as ReLU on the first two layers and the Sigmoid function in the output layer.

It used to be the case that Sigmoid and Tanh activation functions were preferred for all layers. These days, better performance is achieved using the ReLU activation function. Using a sigmoid on the output layer ensures your network output is between 0 and 1 and is easy to map to either a probability of class 1 or snap to a hard classification of either class with a default threshold of 0.5.
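
If it helps to see what these two activation functions do to raw numbers, the following is a minimal NumPy sketch. The helper names relu and sigmoid and the input values are made up for illustration; this is not how Keras implements them internally.

```python
import numpy as np

def relu(z):
    # rectified linear unit: negative values become 0, positives pass through
    return np.maximum(0.0, z)

def sigmoid(z):
    # squashes any real value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 3.0])  # made-up pre-activation values

print(relu(z).tolist())  # [0.0, 0.0, 3.0]

probs = sigmoid(z)
# snap each probability to a hard class using the 0.5 threshold
classes = (probs > 0.5).astype(int)
print(classes.tolist())  # [0, 0, 1]
```

Note that an input of exactly 0 gives a sigmoid output of exactly 0.5, which the strict comparison maps to class 0.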

You can piece it all together by adding each layer:

The model expects rows of data with 8 variables (the input_shape=(8,) argument).
The first hidden layer has 12 nodes and uses the relu activation function.
The second hidden layer has 8 nodes and uses the relu activation function.
The output layer has one node and uses the sigmoid activation function.

...
# define the keras model
model = Sequential()
model.add(Dense(12, input_shape=(8,), activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
...

Note: The most confusing thing here is that the shape of the input to the model is defined as an argument on the first hidden layer. This means that the line of code that adds the first Dense layer is doing two things, defining the input or visible layer and the first hidden layer.
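
One way to sanity-check the architecture is to count the trainable parameters by hand. The sketch below assumes the standard fully connected layout, where each layer has one weight per input-to-node connection plus one bias per node; the totals should agree with what model.summary() reports for the model above.

```python
# (inputs, nodes) for each Dense layer in the 8 -> 12 -> 8 -> 1 network
layers = [(8, 12), (12, 8), (8, 1)]

# one weight per input-node pair, plus one bias per node
params = [n_in * n_out + n_out for n_in, n_out in layers]

print(params)       # [108, 104, 9]
print(sum(params))  # 221
```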

3. Compile Keras Model

Now that the model is defined, you can compile it.

Compiling the model uses the efficient numerical libraries under the covers (the so-called backend), which is TensorFlow in this tutorial. The backend automatically chooses the best way to represent the network for training and making predictions on your hardware, such as a CPU, a GPU, or even a distributed setup.

When compiling, you must specify some additional properties required when training the network. Remember, training a network means finding the best set of weights to map inputs to outputs in your dataset.

You must specify the loss function to use to evaluate a set of weights, the optimizer used to search through different weights for the network, and any optional metrics you want to collect and report during training.

In this case, use cross entropy as the loss argument. This loss is for binary classification problems and is defined in Keras as “binary_crossentropy”. You can learn more about choosing loss functions based on your problem here:

How to Choose Loss Functions When Training Deep Learning Neural Networks
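
To give a feel for what this loss measures, here is a rough NumPy sketch of mean binary cross entropy. The function name, the clipping value eps, and the sample predictions are assumptions made for illustration and are not taken from the Keras source.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # clip so that log() never sees exactly 0 or 1
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # average negative log-likelihood over all rows
    losses = y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred)
    return float(-np.mean(losses))

y_true = np.array([1.0, 0.0, 1.0])
confident = np.array([0.9, 0.1, 0.8])  # made-up predictions close to the labels
unsure = np.array([0.6, 0.4, 0.5])     # made-up predictions near 0.5

print(binary_crossentropy(y_true, confident))
print(binary_crossentropy(y_true, unsure))
```

The confident, correct predictions produce the smaller loss, which is the behavior the optimizer exploits during training.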

You will define the optimizer as the efficient stochastic gradient descent algorithm “adam”. This is a popular version of gradient descent because it automatically tunes itself and gives good results in a wide range of problems. To learn more about the Adam version of stochastic gradient descent, see the post:

Gentle Introduction to the Adam Optimization Algorithm for Deep Learning
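
For the curious, the update rule behind Adam can be sketched in a few lines of plain Python. This is a toy version on a one-variable problem; the function name adam_minimize, the objective, and the learning rate of 0.1 are assumptions chosen for illustration and differ from what Keras uses by default.

```python
import math

def adam_minimize(grad, w=0.0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    m = 0.0  # running average of gradients
    v = 0.0  # running average of squared gradients
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # bias correction for the zero-initialized averages
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        # step size is scaled per weight by the gradient history
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0))
print(round(w_final, 2))
```

The division by the square root of the squared-gradient average is what lets the step size adapt on its own.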

Finally, because it is a classification problem, you will collect and report the classification accuracy defined via the metrics argument.

...
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
...

4. Fit Keras Model

You have defined your model and compiled it to get ready for efficient computation.

Now it is time to execute the model on some data.

You can train or fit your model on your loaded data by calling the fit() function on the model.

Training occurs over epochs, and each epoch is split into batches.

Epoch: One pass through all of the rows in the training dataset
Batch: One or more samples considered by the model within an epoch before weights are updated

One epoch comprises one or more batches, based on the chosen batch size, and the model is fit for many epochs. For more on the difference between epochs and batches, see the post:

What is the Difference Between a Batch and an Epoch in a Neural Network?

The training process will run for a fixed number of epochs (iterations) through the dataset that you must specify using the epochs argument. You must also set the number of dataset rows that are considered before the model weights are updated within each epoch, called the batch size, and set using the batch_size argument.

This problem will run for a small number of epochs (150) and use a relatively small batch size of 10.
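
As a quick piece of arithmetic, you can work out how many weight updates this configuration implies. The row count of 768 is taken from the training log for this dataset; the variable names are only for illustration.

```python
import math

n_rows = 768      # rows in the dataset
batch_size = 10
epochs = 150

# the last batch of each epoch is smaller when the rows don't divide evenly
batches_per_epoch = math.ceil(n_rows / batch_size)
total_updates = batches_per_epoch * epochs

print(batches_per_epoch)  # 77
print(total_updates)      # 11550
```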

These configurations can be chosen experimentally by trial and error. You want to train the model enough so that it learns a good (or good enough) mapping of rows of input data to the output classification. The model will always have some error, but the amount of error will level out after some point for a given model configuration. This is called model convergence.

...
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10)
...

This is where the work happens on your CPU or GPU.

No GPU is required for this example, but if you’re interested in how to run large models on GPU hardware cheaply in the cloud, see this post:

How to Setup Amazon AWS EC2 GPUs to Train Keras Deep Learning Models

5. Evaluate Keras Model

You have trained your neural network on the entire dataset, and you can evaluate the performance of the network on the same dataset.

This will only give you an idea of how well you have modeled the dataset (e.g., train accuracy), but no idea of how well the algorithm might perform on new data. This was done for simplicity, but ideally, you would separate your data into train and test datasets for training and evaluation of your model.
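
A train/test split of that kind might look like the sketch below. It uses synthetic data with the same column layout so that it runs on its own; the seed, the two-thirds ratio, and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # seed chosen arbitrarily

# synthetic stand-in with the same layout: 8 input columns, 1 output column
dataset = rng.random((100, 9))
X, y = dataset[:, 0:8], dataset[:, 8]

# shuffle the row indices, then hold out the last third for testing
indices = rng.permutation(len(X))
n_train = (len(X) * 2) // 3
train_idx, test_idx = indices[:n_train], indices[n_train:]

X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

print(X_train.shape, X_test.shape)  # (66, 8) (34, 8)
```

With the real dataset, you would fit the model on X_train and y_train and then evaluate it on X_test and y_test.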

You can evaluate your model on your training dataset using the evaluate() function and pass it the same input and output used to train the model.

This will generate a prediction for each input and output pair and collect scores, including the average loss and any metrics you have configured, such as accuracy.

The evaluate() function will return a list with two values. The first will be the loss of the model on the dataset, and the second will be the accuracy of the model on the dataset. You are only interested in reporting the accuracy so ignore the loss value.

...
# evaluate the keras model
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))

6. Tie It All Together

You have just seen how you can easily create your first neural network model in Keras.

Let’s tie it all together into a complete code example.

# first neural network with keras tutorial
from numpy import loadtxt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# load the dataset
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
# define the keras model
model = Sequential()
model.add(Dense(12, input_shape=(8,), activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10)
# evaluate the keras model
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))

You can copy all the code into your Python file and save it as “keras_first_network.py” in the same directory as your data file “pima-indians-diabetes.csv”. You can then run the Python file as a script from your command line (command prompt) as follows:

python keras_first_network.py

Running this example, you should see a message for each of the 150 epochs, printing the loss and accuracy, followed by the final evaluation of the trained model on the training dataset.

It takes about 10 seconds to execute on my workstation running on the CPU.

Ideally, you would like the loss to go to zero and the accuracy to go to 1.0 (e.g., 100%). This is not possible for any but the most trivial machine learning problems. Instead, you will always have some error in your model. The goal is to choose a model configuration and training configuration that achieve the lowest loss and highest accuracy possible for a given dataset.

...
768/768 [==============================] - 0s 63us/step - loss: 0.4817 - acc: 0.7708
Epoch 147/150
768/768 [==============================] - 0s 63us/step - loss: 0.4764 - acc: 0.7747
Epoch 148/150
768/768 [==============================] - 0s 63us/step - loss: 0.4737 - acc: 0.7682
Epoch 149/150
768/768 [==============================] - 0s 64us/step - loss: 0.4730 - acc: 0.7747
Epoch 150/150
768/768 [==============================] - 0s 63us/step - loss: 0.4754 - acc: 0.7799
768/768 [==============================] - 0s 38us/step
Accuracy: 76.56

Note: If you try running this example in an IPython or Jupyter notebook, you may get an error.

The reason is the output progress bars during training. You can easily turn these off by setting verbose=0 in the call to the fit() and evaluate() functions; for example:

...
# fit the keras model on the dataset without progress bars
model.fit(X, y, epochs=150, batch_size=10, verbose=0)
# evaluate the keras model
_, accuracy = model.evaluate(X, y, verbose=0)
...

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

What score did you get?
Post your results in the comments below.

Neural networks are stochastic algorithms, meaning that the same algorithm on the same data can train a different model with different skill each time the code is run. This is a feature, not a bug. You can learn more about this in the post:

Embrace Randomness in Machine Learning

The variance in the performance of the model means that to get a reasonable approximation of how well your model is performing, you may need to fit it many times and calculate the average of the accuracy scores. For more on this approach to evaluating neural networks, see the post:

How to Evaluate the Skill of Deep Learning Models

For example, below are the accuracy scores from re-running the example five times:

Accuracy: 75.00
Accuracy: 77.73
Accuracy: 77.60
Accuracy: 78.12
Accuracy: 76.17

You can see that all accuracy scores are around 77%, and the average is 76.924%.
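
The average can be reproduced with a couple of lines of standard-library Python. The scores are the five values listed above; reporting the standard deviation alongside the mean is an optional extra that gives a sense of the spread between runs.

```python
from statistics import mean, stdev

# accuracy scores from the five runs listed above
scores = [75.00, 77.73, 77.60, 78.12, 76.17]

print('Mean: %.3f' % mean(scores))  # Mean: 76.924
print('Std:  %.3f' % stdev(scores))
```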

7. Make Predictions

The number one question I get asked is:

“After I train my model, how can I use it to make predictions on new data?”

Great question.

You can adapt the above example and use it to generate predictions on the training dataset, pretending it is a new dataset you have not seen before.

Making predictions is as easy as calling the predict() function on the model. You are using a sigmoid activation function on the output layer, so the predictions will be a probability in the range between 0 and 1. You can easily convert them into a crisp binary prediction for this classification task by rounding them.

For example:

...
# make probability predictions with the model
predictions = model.predict(X)
# round predictions 
rounded = [round(x[0]) for x in predictions]

Alternatively, you can convert the probability into 0 or 1 to predict crisp classes directly; for example:

...
# make class predictions with the model
predictions = (model.predict(X) > 0.5).astype(int)
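
The same thresholding can be tried without a trained model. The sketch below uses made-up probabilities shaped like the output of predict(), one column per row, and made-up expected classes, then computes the accuracy by hand.

```python
import numpy as np

# hypothetical sigmoid outputs, shaped (rows, 1) like the result of predict()
probabilities = np.array([[0.2], [0.8], [0.6], [0.4], [0.9]])
y = np.array([0, 1, 0, 0, 1])  # made-up expected classes

# threshold at 0.5, then flatten to one class label per row
classes = (probabilities > 0.5).astype(int).ravel()
accuracy = np.mean(classes == y)

print(classes.tolist())                     # [0, 1, 1, 0, 1]
print('Accuracy: %.2f' % (accuracy * 100))  # Accuracy: 80.00
```

Flattening with ravel() matters here: comparing a (rows, 1) array against a (rows,) array would broadcast to a square matrix instead of an element-wise comparison.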

The complete example below makes predictions for each example in the dataset, then prints the input data, predicted class, and expected class for the first five examples in the dataset.

# first neural network with keras make predictions
from numpy import loadtxt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# load the dataset
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
# define the keras model
model = Sequential()
model.add(Dense(12, input_shape=(8,), activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10, verbose=0)
# make class predictions with the model
predictions = (model.predict(X) > 0.5).astype(int)
# summarize the first 5 cases
for i in range(5):
	print('%s => %d (expected %d)' % (X[i].tolist(), predictions[i, 0], y[i]))

Running the example does not show the progress bar as before, as the verbose argument has been set to 0.

After the model is fit, predictions are made for all examples in the dataset, and the input rows and predicted class values for the first five examples are printed and compared to the expected class values.

You can see that most rows are correctly predicted. In fact, you can expect about 76.9% of the rows to be correctly predicted based on your estimated performance of the model in the previous section.

[6.0, 148.0, 72.0, 35.0, 0.0, 33.6, 0.627, 50.0] => 0 (expected 1)
[1.0, 85.0, 66.0, 29.0, 0.0, 26.6, 0.351, 31.0] => 0 (expected 0)
[8.0, 183.0, 64.0, 0.0, 0.0, 23.3, 0.672, 32.0] => 1 (expected 1)
[1.0, 89.0, 66.0, 23.0, 94.0, 28.1, 0.167, 21.0] => 0 (expected 0)
[0.0, 137.0, 40.0, 35.0, 168.0, 43.1, 2.288, 33.0] => 1 (expected 1)

If you would like to know more about how to make predictions with Keras models, see the post:

How to Make Predictions with Keras

Keras Tutorial Summary

In this post, you discovered how to create your first neural network model using the powerful Keras Python library for deep learning.

Specifically, you learned the six key steps in using Keras to create a neural network or deep learning model step-by-step, including:

How to load data
How to define a neural network in Keras
How to compile a Keras model using the efficient numerical backend
How to train a model on data
How to evaluate a model on data
How to make predictions with the model

Do you have any questions about Keras or about this tutorial?
Ask your question in the comments, and I will do my best to answer.

Keras Tutorial Extensions

Well done, you have successfully developed your first neural network using the Keras deep learning library in Python.

This section provides some extensions to this tutorial that you might want to explore.

Tune the Model. Change the configuration of the model or training process and see if you can improve the performance of the model, e.g., achieve better than 76% accuracy.
Save the Model. Update the tutorial to save the model to a file, then load it later and use it to make predictions (see this tutorial).
Summarize the Model. Update the tutorial to summarize the model and create a plot of model layers (see this tutorial).
Separate Train and Test Datasets. Split the loaded dataset into a training and test set (split based on rows) and use one set to train the model and the other set to estimate the performance of the model on new data.
Plot Learning Curves. The fit() function returns a history object that summarizes the loss and accuracy at the end of each epoch. Create line plots of this data, called learning curves (see this tutorial).
Learn a New Dataset. Update the tutorial to use a different tabular dataset, perhaps from the UCI Machine Learning Repository.
Use Functional API. Update the tutorial to use the Keras Functional API for defining the model (see this tutorial).

Saurav May 27, 2016 at 11:08 pm #

The input layer doesn’t have any activation function, but still activation=”relu” is mentioned in the first layer of the model. Why?

- Jason Brownlee May 28, 2016 at 6:32 am #
  
  Hi Saurav,
  
  The first layer in the network here is technically a hidden layer, hence it has an activation function.
  
  - sam Johnson December 21, 2016 at 2:44 am #
    
    Why have you made it a hidden layer though? the input layer is not usually represented as a hidden layer?
    
    - Jason Brownlee December 21, 2016 at 8:41 am #
      
      Hi sam,
      
      Note this line:
      
      model.add(Dense(12, input_dim=8, init='uniform', activation='relu'))
      
      It does a few things.
      
      It defines the input layer as having 8 inputs.
      
      It defines a hidden layer with 12 neurons, connected to the input layer that use relu activation function.
      
      It initializes all weights using a sample of uniform random numbers.
      
      Does that help?
      
      - Pavidevi May 17, 2017 at 2:31 am #
        
        Hi Jason,
        
        U have used two different activation functions so how can we know which activation function fit the model?
      - Jason Brownlee May 17, 2017 at 8:38 am #
        
        Sorry, I don’t understand the question.
      - Marco Cheung August 23, 2017 at 12:51 am #
        
        Hi Jason,
        
        I am interested in deep learning and machine learning. You mentioned “It defines a hidden layer with 12 neurons, connected to the input layer that use relu activation function.” I wonder how can we determine the number of neurons in order to achieve a high accuracy rate of the model?
        
        Thanks a lot!!!
      - Jason Brownlee August 23, 2017 at 6:55 am #
        
        Use trial and error. We cannot specify the “best” number of neurons analytically. We must test.
      - Ramzan Shahid November 10, 2017 at 4:32 am #
        
        Sir, thanks for your tutorial. Would you like to make tutorial on stock Data Prediction through Neural Network Model and training this on any stock data. If you have on this so please share the link. Thanks
      - Jason Brownlee November 10, 2017 at 10:39 am #
        
        I am reticent to post tutorials on stock market prediction given the random walk hypothesis of security prices:
        https://machinelearningmastery.com/gentle-introduction-random-walk-times-series-forecasting-python/
      - Dhara Bhavsar August 28, 2019 at 9:54 pm #
        
        Hi,
        
        I would like to know more about activation function. How it is working? How many activation functions? Using different activation function How much affect the output of the model?
        
        I would like to also know about the Hidden Layer. How the size of the hidden layer affect the model?
      - Jason Brownlee August 29, 2019 at 6:09 am #
        
        In this tutorial, we use relu in the hidden layers, learn more here:
        https://machinelearningmastery.com/rectified-linear-activation-function-for-deep-learning-neural-networks/
        
        The size of the layer impacts the capacity of the model, learn more here:
        https://machinelearningmastery.com/how-to-control-neural-network-model-capacity-with-nodes-and-layers/
  - dhani June 28, 2018 at 2:44 am #
    
    hi how use cnn for pixel classification on mhd images
    
    - Jason Brownlee June 28, 2018 at 6:22 am #
      
      What is pixel classification? What are mhd images?
      
      - Seth Hammock March 6, 2024 at 7:52 am #
        
        Are you talking about neural style transfer?
      - James Carmichael March 6, 2024 at 10:29 am #
        
        Hi Seth…That is an important topic. More can be found here:
        
        https://towardsdatascience.com/implementing-neural-style-transfer-using-pytorch-fd8d43fb7bfa
  - Tanmay Kulkarni February 11, 2020 at 5:50 am #
    
    Hello! I want to know if there’s a way to know the values of all weights after each updation?
    
    - Jason Brownlee February 11, 2020 at 5:53 am #
      
      Yes, you can save them to file or review them manually.
      
      Often saving is achieved using a checkpoint:
      https://machinelearningmastery.com/check-point-deep-learning-models-keras/
      
- BlackBookKeeper August 18, 2018 at 10:15 pm #
  
  runfile(‘C:/Users/Owner/Documents/untitled1.py’, wdir=’C:/Users/Owner/Documents’)
  Traceback (most recent call last):
  
  File “”, line 1, in
  runfile(‘C:/Users/Owner/Documents/untitled1.py’, wdir=’C:/Users/Owner/Documents’)
  
  File “C:\Users\Owner\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py”, line 705, in runfile
  execfile(filename, namespace)
  
  File “C:\Users\Owner\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py”, line 102, in execfile
  exec(compile(f.read(), filename, ‘exec’), namespace)
  
  File “C:/Users/Owner/Documents/untitled1.py”, line 13, in
  model.add(Dense(12, input_dim=8, activation=’relu’))
  
  File “C:\Users\Owner\Anaconda3\lib\site-packages\keras\engine\sequential.py”, line 160, in add
  name=layer.name + ‘_input’)
  
  File “C:\Users\Owner\Anaconda3\lib\site-packages\keras\engine\input_layer.py”, line 177, in Input
  input_tensor=tensor)
  
  File “C:\Users\Owner\Anaconda3\lib\site-packages\keras\legacy\interfaces.py”, line 91, in wrapper
  return func(*args, **kwargs)
  
  File “C:\Users\Owner\Anaconda3\lib\site-packages\keras\engine\input_layer.py”, line 86, in __init__
  name=self.name)
  
  File “C:\Users\Owner\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py”, line 515, in placeholder
  x = tf.placeholder(dtype, shape=shape, name=name)
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\array_ops.py”, line 1530, in placeholder
  return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_array_ops.py”, line 1954, in _placeholder
  name=name)
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py”, line 767, in apply_op
  op_def=op_def)
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py”, line 2508, in create_op
  set_shapes_for_outputs(ret)
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py”, line 1894, in set_shapes_for_outputs
  output.set_shape(s)
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py”, line 443, in set_shape
  self._shape = self._shape.merge_with(shape)
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\tensor_shape.py”, line 550, in merge_with
  stop = key.stop
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\tensor_shape.py”, line 798, in as_shape
  “””Returns this shape as a TensorShapeProto.”””
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\tensor_shape.py”, line 431, in __init__
  size for one or more dimension. e.g. TensorShape([None, 256])
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\tensor_shape.py”, line 376, in as_dimension
  other = as_dimension(other)
  
  File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\tensor_shape.py”, line 32, in __init__
  if value is None:
  
  TypeError: int() argument must be a string, a bytes-like object or a number, not ‘TensorShapeProto’
  
  this error occurs when {model.add(Dense(12, input_dim=8, activation=’relu’))} this command is run
  
  any help?
  
  Reply
  - Jason Brownlee August 19, 2018 at 6:20 am #
    
    Save all code into a file and run it as follows:
    https://machinelearningmastery.com/faq/single-faq/how-do-i-run-a-script-from-the-command-line
    
    Reply
- Penchalaiah December 8, 2019 at 6:24 pm #
  
  Fantastic tutorial. The explanation is simple and precise. Thanks a lot
  
  Reply
  - Jason Brownlee December 9, 2019 at 6:47 am #
    
    Thanks!
    
    Reply
- Loc June 29, 2022 at 1:00 pm #
  
  great artist
  
  Reply
Geoff May 29, 2016 at 6:18 am #

Can you explain how to implement weight regularization into the layers?

Reply
- Jason Brownlee June 15, 2016 at 5:50 am #
  
  Yep, see here:
  http://keras.io/regularizers/
  
  Reply
  - afthab October 5, 2018 at 8:32 pm #
    
    hey yo!!! how u r start coding in python
    
    Reply
    - Jason Brownlee October 6, 2018 at 5:43 am #
      
      Start here:
      https://machinelearningmastery.com/faq/single-faq/how-do-i-get-started-with-python-programming
      
      Reply
KWC June 14, 2016 at 12:08 pm #

Import statements if others need them:

from keras.models import Sequential
from keras.layers import Dense, Activation

Reply
- Jason Brownlee June 15, 2016 at 5:49 am #
  
  Thanks.
  
  I had them in Part 6, but I have also added them to Part 1.
  
  Reply
  - Shiran January 20, 2020 at 11:30 am #
    
    Great post!
    Is it possible to train a neural network that receives as input a vector x and tries to predict another vector y where both x and y are floats?
    
    Reply
    - Jason Brownlee January 20, 2020 at 2:07 pm #
      
      Yes, this is called regression:
      https://machinelearningmastery.com/regression-tutorial-keras-deep-learning-library-python/
      
      Reply
Aakash Nain June 29, 2016 at 6:00 pm #

If there are 8 inputs for the first layer, then why have we taken them as ’12’ in the following line:

model.add(Dense(12, input_dim=8, init=’uniform’, activation=’relu’))

Reply
- Jason Brownlee June 30, 2016 at 6:47 am #
  
  Hi Aakash.
  
  The input layer is defined by the input_dim parameter, here set to 8.
  
  The first hidden layer has 12 neurons.
  
  Reply
Joshua July 2, 2016 at 12:04 am #

I ran your program and i have an error:
ValueError: could not convert string to float:
what could be the reason for this, and how may I solve it.
thanks.
great post by the way.

Reply
- Jason Brownlee July 2, 2016 at 6:20 am #
  
  It might be a copy-paste error. Perhaps try to copy and run the whole example listed in section 6?
  
  Reply
  - Akash September 28, 2018 at 11:12 am #
    
    Hello sir, I am facing the same problem, ValueError: could not convert string to float: ‘”6’
    and I am also running the example from section 6.
    
    Reply
    - Jason Brownlee September 28, 2018 at 3:00 pm #
      
      I have some suggestions here:
      https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
      
      Reply
  - yashu October 5, 2018 at 8:28 pm #
    
    jason can u plzz help me how to code
    
    Reply
    - Jason Brownlee October 6, 2018 at 5:42 am #
      
      Sorry, I cannot help you to write code.
      
      Reply
- KeyChy July 3, 2019 at 5:45 pm #
  
  Maybe you have all the parameters in an extra column in your *.csv file. Then you should change the delimiter from , to ; like:
  dataset = numpy.loadtxt(“pima-indians-diabetes.csv”, delimiter=”;”)
  This solved the problem for me.
  
  Reply
  - Jason Brownlee July 4, 2019 at 7:40 am #
    
    Thanks for sharing.
    
    Reply
cheikh brahim July 5, 2016 at 7:40 pm #

thank you for your simple and useful example.

Reply
- Jason Brownlee July 6, 2016 at 6:22 am #
  
  You’re welcome cheikh.
  
  Reply
Nikhil Thakur July 6, 2016 at 6:39 pm #

Hello Sir, I am trying to use Keras for NLP, specifically sentence classification. I have given the model building part below. It’s taking quite a lot of time to execute. I am using the PyCharm IDE.

batch_size = 32
nb_filter = 250
filter_length = 3
nb_epoch = 2
pool_length = 2
output_dim = 5
hidden_dims = 250

# Build the model

model1 = Sequential()

model1.add(Convolution1D(nb_filter, filter_length ,activation=’relu’,border_mode=’valid’,
input_shape=(len(embb_weights),dim), weights=[embb_weights]))

model1.add(Dense(hidden_dims))
model1.add(Dropout(0.2))
model1.add(Activation(‘relu’))

model1.add(MaxPooling1D(pool_length=pool_length))

model1.add(Dense(output_dim, activation=’sigmoid’))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)

model1.compile(loss=’mean_squared_error’,
optimizer=sgd,
metrics=[‘accuracy’])

Reply
- Jason Brownlee July 7, 2016 at 7:31 am #
  
  You may want a larger network. You may also want to use a standard repeating structure like CNN->CNN->Pool->Dense.
  
  See this post on using a CNN:
  https://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/
  
  Later, you may also want to try some stacked LSTMs.
  
  Reply
Andre Norman July 15, 2016 at 10:40 am #

Hi Jason, thanks for the awesome example. Given that the accuracy of this model is 79.56%, what steps would you take from here to improve the accuracy?

Given my nascent understanding of Machine Learning, my initial approach would have been:

Implement forward propagation, then compute the cost function, then implement back propagation, use gradient checking to evaluate my network (disable after use), then use gradient descent.

However, this approach seems arduous compared to using Keras. Thanks for your response.

Reply
- Jason Brownlee July 15, 2016 at 10:52 am #
  
  Hi Andre, indeed Keras makes working with neural nets so much easier. Fun even!
  
  We may be maxing out on this problem, but here is some general advice for lifting performance.
  – data prep – try lots of different views of the problem and see which is best at exposing the structure of the problem to the learning algorithm (data transforms, feature engineering, etc.)
  – algorithm selection – try lots of algorithms and see which one or few are best on the problem (try on all views)
  – algorithm tuning – tune well performing algorithms to get the most out of them (grid search or random search hyperparameter tuning)
  – ensembles – combine predictions from multiple algorithms (stacking, boosting, bagging, etc.)
  
  For neural nets, there are a lot of things to tune. I think there are big gains in trying different network topologies (layers and number of neurons per layer) in concert with training epochs and learning rate (bigger nets need more training).
  
  I hope that helps as a start.
  
  Reply
  - Andre Norman July 18, 2016 at 7:19 am #
    
    Awesome! Thanks Jason =)
    
    Reply
    - Jason Brownlee July 18, 2016 at 8:03 am #
      
      You’re welcome Andre.
      
      Reply
- quentin August 7, 2017 at 8:41 pm #
  
  Some interesting stuff here
  https://youtu.be/vq2nnJ4g6N0
  
  Reply
  - Jason Brownlee August 8, 2017 at 7:49 am #
    
    Thanks for sharing. What did you like about it?
    
    Reply
Romilly Cocking July 21, 2016 at 12:31 am #

Hi Jason, it’s a great example but if anyone runs it in an IPython/Jupyter notebook they are likely to encounter an I/O error when running the fit step. This is due to a known bug in IPython.

The solution is to set verbose=0 like this

# Fit the model
model.fit(X, Y, nb_epoch=40, batch_size=10, verbose=0)

Reply
- Jason Brownlee July 21, 2016 at 5:36 am #
  
  Great, thanks for sharing Romilly.
  
  Reply

Anirban July 23, 2016 at 10:20 pm #

Great example. I have a query though. How do I now give an input and get the output (0 or 1)? Can you please give the command for that?
Thanks

Jason Brownlee July 24, 2016 at 6:53 am #

You can call model.predict() to get predictions and round each value to snap it to a binary value.

For example, below is a complete example showing you how to round the predictions and print them to the console.

# Create first network with Keras
from keras.models import Sequential
from keras.layers import Dense
import numpy
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load pima indians dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]
# create model
model = Sequential()
model.add(Dense(12, input_dim=8, init='uniform', activation='relu'))
model.add(Dense(8, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X, Y, nb_epoch=150, batch_size=10,  verbose=2)
# calculate predictions
predictions = model.predict(X)
# round predictions (each prediction is a one-element array, so index it first)
rounded = [round(x[0]) for x in predictions]
print(rounded)

Debanjan March 27, 2017 at 12:04 pm #

Hi, why are you not using any test set? You are predicting from the training set, I think.

Reply
- Jason Brownlee March 28, 2017 at 8:19 am #
  
  Correct, it is just an example to get you started with Keras.
  
  Reply
David June 26, 2017 at 12:24 am #

Jason, I’m not quite understanding how the predicted values ([1.0, 0.0, 1.0, 0.0, 1.0,…) map to the real world problem. For instance, what does that first “1.0” in the results indicate?

I get that it’s a prediction of ‘true’ for diabetes…but to which patient is it predicting that—the first in the list? So then the second result, “0.0,” is the prediction for the second patient/row in the dataset?

Reply
- Jason Brownlee June 26, 2017 at 6:08 am #
  
  Remember the original file has 0 and 1 values in the final class column where 0 is no onset of diabetes and 1 is an onset of diabetes.
  
  We are predicting new values in this column.
  
  We are making predictions for specific rows: we pass in their medical info and predict the onset of diabetes. We just happen to do this for a number of rows at a time.
  
  Reply
  - ami July 16, 2018 at 4:30 pm #
    
    hello jason
    
    i am getting this error while calculating the predictions.
    
    #calculate predictions
    
    predictions = model.predict(X)
    
    #round predictions
    
    rounded = [round(x) for x in predictions]
    
    print(rounded)
    
    —————————————————————————
    TypeError Traceback (most recent call last)
    in ()
    2 predictions = model.predict(X)
    3 #round predictions
    —-> 4 rounded = [round(x) for x in predictions]
    5 print(rounded)
    
    in (.0)
    2 predictions = model.predict(X)
    3 #round predictions
    —-> 4 rounded = [round(x) for x in predictions]
    5 print(rounded)
    
    TypeError: type numpy.ndarray doesn’t define __round__ method
  - Jason Brownlee July 17, 2018 at 6:09 am #
    
    Try removing the call to round().
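
The TypeError above arises because each element of the predictions array is itself a one-element array, which the built-in round() cannot handle in Python 3. A minimal sketch of two ways around it, assuming the predictions have shape (n_samples, 1) as a single sigmoid output would give; the values below are made up and stand in for the result of model.predict(X):

```python
import numpy

# Stand-in for model.predict(X): sigmoid outputs with shape (n_samples, 1).
# These values are made up for illustration.
predictions = numpy.array([[0.91], [0.12], [0.58], [0.49]], dtype="float32")

# Option 1: index into each one-element row before calling the built-in round()
rounded = [int(round(float(x[0]))) for x in predictions]

# Option 2: threshold the whole array at once
labels = (predictions > 0.5).astype(int).ravel()

print(rounded)
print(labels.tolist())
```

Thresholding the whole array is usually the simpler choice, since it avoids a Python-level loop and makes the 0.5 cut-off explicit.
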
Rachel June 28, 2017 at 8:28 pm #

Hi Jason,
Can I ask why you use the same data X that you used to fit the model to make the predictions?

# Fit the model
model.fit(X, Y, epochs = 150, batch_size = 10, verbose = 2)

# calculate predictions
predictions = model.predict(X)

Rachel

Reply
- Jason Brownlee June 29, 2017 at 6:34 am #
  
  It is all I have at hand. X means data matrix.
  
  Replace X in predict() with Xprime or whatever you like.
  
  Reply
jitendra March 27, 2018 at 7:20 pm #

Hi, how do I feed the input (8,125,96,0,0,0.0,0.232,54) to get the output?

predictions = model.predict(X)
I mean, instead of X I want to get the output for 8,125,96,0,0,0.0,0.232,54.

Reply
- Jason Brownlee March 28, 2018 at 6:24 am #
  
  Wrap your input in an array, n-columns with one row, then pass that to the model.
  
  Does that help?
  
  Reply
  - Roman October 5, 2018 at 11:22 pm #
    
    Hello, I am trying to make predictions with a similar neural network but keep getting errors that the input dimension has a different shape.
    
    Can you say how the array must look for the example neural network?
  - Jason Brownlee October 6, 2018 at 5:45 am #
    
    For an MLP, data must be organized into a 2d array of samples x features
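
The "samples x features" layout can be sketched with NumPy alone. This is a minimal illustration, assuming the eight-input model from the tutorial; the names X_new, flat and X_same are chosen here for illustration, and the row of values is the one from the question above:

```python
import numpy

# One patient's eight input values (taken from the question above)
row = [8, 125, 96, 0, 0, 0.0, 0.232, 54]

# Wrap in an outer list so the array has shape (1, 8): one sample, eight features
X_new = numpy.array([row], dtype="float32")
print(X_new.shape)

# A flat array can also be reshaped: -1 lets NumPy infer the number of samples
flat = numpy.array(row, dtype="float32")
X_same = flat.reshape(-1, 8)
print(X_same.shape)
```

An array shaped like X_new is what would then be passed to model.predict() in place of X.
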

Anirban July 23, 2016 at 10:52 pm #

I am not able to get to the last epoch. Getting error before that:
Epoch 11/150
390/768 [==============>……………]Traceback (most recent call last):.6921

ValueError: I/O operation on closed file

I could resolve this by varying the epoch and batch size.

Now to predict an unknown value, I loaded a new dataset and used the predict command as below:
dataset_test = numpy.loadtxt(“pima-indians-diabetes_test.csv”,delimiter=”,”) –has only one row

X = dataset_test[:,0:8]
model.predict(X)

But I am getting error :
X = dataset_test[:,0:8]

IndexError: too many indices for array

Can you help pls.

Thanks

Reply
- Jason Brownlee July 24, 2016 at 6:55 am #
  
  I see problems like this when you run from a notebook or from an IDE.
  
  Consider running examples from the console to ensure they work.
  
  Consider turning off verbose output (verbose=0 in the call to fit()) to disable the progress bar.
  
  Reply
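
One likely cause of the "too many indices for array" error above is that numpy.loadtxt() returns a 1-D array when the file holds a single row, so the 2-D slice [:,0:8] fails. A minimal sketch, assuming that is what happened; the in-memory string below is a stand-in for the one-row CSV file and its values are illustrative:

```python
import io
import numpy

# Stand-in for a CSV file that holds a single row (values are illustrative)
one_row_csv = io.StringIO("8,125,96,0,0,0.0,0.232,54,1\n")

# By default loadtxt squeezes a single row down to a 1-D array
flat = numpy.loadtxt(one_row_csv, delimiter=",")
print(flat.shape)

# ndmin=2 keeps the (rows, columns) layout, so 2-D slicing works
one_row_csv.seek(0)
dataset_test = numpy.loadtxt(one_row_csv, delimiter=",", ndmin=2)
X = dataset_test[:, 0:8]
print(X.shape)
```

With ndmin=2 the same slicing code works whether the file has one row or many.
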
David Kluszczynski July 28, 2016 at 12:42 am #

Hi Jason!
Loved the tutorial! I have a question however.
Is there a way to save the weights to a file after the model is trained, for uses such as Kaggle?
Thanks,
David

Reply
- Jason Brownlee July 28, 2016 at 5:47 am #
  
  Thanks David.
  
  You can save the network weights to file by calling model.save_weights(“model.h5”)
  
  You can learn more in this post:
  https://machinelearningmastery.com/save-load-keras-deep-learning-models/
  
  Reply
Alex Hopper July 29, 2016 at 5:45 am #

Hey, Jason! Thank you for the awesome tutorial! I’ve used your tutorial to learn about CNNs. I have one question for you… Supposing I want to use Keras to classify images and I have 3 or more classes to classify, how could my algorithm know about these classes? You know, I have to code what is a cat, a dog and a horse. Is there any way to code this? I’ve tried it:

target_names = [‘class 0(Cats)’, ‘class 1(Dogs)’, ‘class 2(Horse)’]
print(classification_report(np.argmax(Y_test,axis=1), y_pred,target_names=target_names))

But my results are not classifying correctly.

                precision    recall  f1-score   support

  class 0(Cat)       0.00      0.00      0.00        17
  class 1(Dog)       0.00      0.00      0.00        14
class 2(Horse)       0.99      1.00      0.99      2526

   avg / total       0.98      0.99      0.98      2557

Reply
- Jason Brownlee July 29, 2016 at 6:41 am #
  
  Great question Alex.
  
  This is an example of a multi-class classification problem. You must use a one hot encoding on the output variable to be able to model it with a neural network and specify the number of classes as the number of outputs on the final layer of your network.
  
  I provide a tutorial with the famous iris dataset that has 3 output classes here:
  https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
  
  Reply
  - Alex Hopper August 1, 2016 at 1:22 am #
    
    Thank you.
    I’ll check it.
    
    Reply
    - Jason Brownlee August 1, 2016 at 6:25 am #
      
      No problem Alex.
      
      Reply
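
The one hot encoding mentioned in the reply above can be sketched with NumPy alone. This is a minimal illustration, assuming the classes are already stored as integers 0, 1 and 2; the label values and the names one_hot and recovered are made up for the example:

```python
import numpy

# Integer class labels: 0 = cat, 1 = dog, 2 = horse (illustrative values)
labels = numpy.array([0, 2, 1, 2])
n_classes = 3

# Row i of the identity matrix is the one hot vector for class i
one_hot = numpy.eye(n_classes, dtype="float32")[labels]
print(one_hot)

# argmax recovers the integer label from a one hot (or softmax) row
recovered = one_hot.argmax(axis=1)
print(recovered)
```

The width of one_hot (three columns here) is also the number of neurons the final layer of the network would need.
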
Anonymouse August 2, 2016 at 11:28 pm #

This was really useful, thank you

I’m using keras (with CNNs) for sentiment classification of documents and I’d like to improve the performance, but I’m completely at a loss when it comes to tuning the parameters in a non-arbitrary way. Could you maybe point me somewhere that will help me go about this in a more systematic fashion? There must be some heuristics or rules-of-thumb that could guide me.

Reply
- Jason Brownlee August 3, 2016 at 8:09 am #
  
  I have a tutorial coming out soon (next week) that provides lots of examples of tuning the hyperparameters of a neural network in Keras, but limited to MLPs.
  
  For CNNs, I would advise tuning the number of repeating layers (conv + max pool), the number of filters in repeating block, and the number and size of dense layers at the predicting part of your network. Also consider using some fixed layers from pre-trained models as the start of your network (e.g. VGG) and try just training some input and output layers around it for your problem.
  
  I hope that helps as a start.
  
  Reply
Shopon August 14, 2016 at 5:04 pm #

Hello Jason, my accuracy is 0.0104, but yours is 0.7879, and my loss is -9.5414. Is there any problem with the dataset? I downloaded the dataset from a different site.

Reply
- Jason Brownlee August 15, 2016 at 12:36 pm #
  
  I think there might be something wrong with your implementation or your dataset. Your numbers are way out.
  
  Reply
mohamed August 15, 2016 at 9:30 am #

After training, how can I use the trained model on a new sample?

Reply
- Jason Brownlee August 15, 2016 at 12:36 pm #
  
  You can call model.predict()
  
  See an above comment for a specific code example.
  
  Reply
Omachi Okolo August 16, 2016 at 10:21 pm #

Hi Jason,
i’m a student conducting a research on how to use artificial neural network to predict the business viability of potential software projects.
I intend to use python as a programming language. The application of ANN fascinates me but i’m new to machine learning and python. Can you help suggest how to go about this.
Many thanks

Reply
- Jason Brownlee August 17, 2016 at 9:51 am #
  
  Consider getting a good grounding in how to work through a machine learning problem end to end in python first.
  
  Here is a good tutorial to get you started:
  https://machinelearningmastery.com/machine-learning-in-python-step-by-step/
  
  Reply
Agni August 17, 2016 at 6:23 am #

Dear Jason, this is a great tutorial for beginners. It will satisfy the need of many students who are looking for initial help. But I have a question. Could you please shed light on a few things: i) how to test the trained model using a test dataset (i.e., loading the test dataset and applying the model, supposing the test file name is test.csv) ii) how to print the accuracy obtained on the test dataset iii) what to do when the o/p has more than 2 classes (suppose a 4-class classification problem).
Please show the whole program to overcome any confusion.
Thanks a lot.

Reply
- Jason Brownlee August 17, 2016 at 10:03 am #
  
  I provide an example elsewhere in the comments, you can also see how to make predictions on new data in this post:
  https://machinelearningmastery.com/5-step-life-cycle-neural-network-models-keras/
  
  For an example of multi-class classification, you can see this tutorial:
  https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
  
  Reply
Doron Vetlzer August 17, 2016 at 9:29 am #

I am trying to build a Neural Network with some recursive connections but not a full recursive layer, how do I do this in Keras?

Reply
- Doron Vetlzer August 17, 2016 at 9:31 am #
  
  I could print a diagram of the network, but basically what I want is for each neuron in the current time frame to know only its own previous output and not the output of all the neurons in the output layer.
  
  Reply
- Jason Brownlee August 17, 2016 at 10:04 am #
  
  I don’t know off hand Doron.
  
  Reply
  - Doron Veltzer August 23, 2016 at 2:28 am #
    
    Thanks for replying though, have a good day.
    
    Reply
sairam August 30, 2016 at 8:49 am #

Hello Jason,

This is a great tutorial . Thanks for sharing.

I am having a dataset of 100 finger prints and i want to extract minutiae of 100 finger prints using python ( Keras). Can you please advise where to start? I am really confused.

Reply
- Jason Brownlee August 31, 2016 at 8:43 am #
  
  If your fingerprints are images, you may want to consider using convolutional neural networks (CNNs) that are much better at working with image data.
  
  See this tutorial on digit recognition for a start:
  https://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/
  
  Reply
  - padmashri July 6, 2017 at 10:12 pm #
    
    Hi Jason
    Thanks for this great tutorial. I am new to machine learning and went through your basic tutorial on Keras and also the one on handwritten digit recognition. I would like to understand how I can train on a set of image data, e.g. images of shapes such as a square, circle, or pyramid.
    Please let me know how the input data needs to be fed to the program and how we need to export the model.
    
    Reply
    - Jason Brownlee July 9, 2017 at 10:30 am #
      
      Start by preparing a high-quality dataset.
      
      Reply
CM September 1, 2016 at 4:23 pm #

Hi Jason,

Thanks for the great article. But I had 1 query.

Are there any inbuilt functions in keras that can give me the feature importance for the ANN model?

If not, can you suggest a technique I can use to extract variable importance from the loss function? I am considering an approach similar to that used in RF which involves permuting the values of the selected variable and calculating the relative increase in loss.

Regards,
CM

Reply
- Jason Brownlee September 2, 2016 at 8:07 am #
  
  I don’t believe so CM.
  
  I would suggest using a wrapper method and evaluate subsets of features to develop a feature importance/feature selection report.
  
  I talk a lot more about feature selection in this post:
  https://machinelearningmastery.com/an-introduction-to-feature-selection/
  
  I provide an example of feature selection in scikit-learn here:
  https://machinelearningmastery.com/feature-selection-machine-learning-python/
  
  I hope that helps as a start.
  
  Reply
- Minesh Jethva May 15, 2017 at 7:49 pm #
  
  Have you made any progress with this approach? I also have the same problem.
  
  Reply
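
The permutation approach CM describes can be sketched independently of Keras. This is a minimal illustration, not a built-in Keras feature: permutation_importance, mse and toy_predict are hypothetical helpers written for this example, and toy_predict stands in for a trained model. With a Keras model, the predict argument could be something like a small wrapper around model.predict() that flattens the output.

```python
import numpy

def permutation_importance(predict, X, y, loss, n_repeats=10, seed=0):
    """Mean increase in loss when each column of X is shuffled."""
    rng = numpy.random.default_rng(seed)
    baseline = loss(y, predict(X))
    importances = numpy.zeros(X.shape[1])
    for col in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle one column to break its link with the target
            X_perm[:, col] = rng.permutation(X_perm[:, col])
            increases.append(loss(y, predict(X_perm)) - baseline)
        importances[col] = numpy.mean(increases)
    return importances

def mse(y_true, y_pred):
    """Mean squared error between two 1-D arrays."""
    return float(numpy.mean((y_true - y_pred) ** 2))

def toy_predict(X):
    """Toy stand-in for a trained model: the output depends only on column 0."""
    return 3.0 * X[:, 0]

data_rng = numpy.random.default_rng(1)
X = data_rng.normal(size=(200, 2))
y = toy_predict(X)

scores = permutation_importance(toy_predict, X, y, mse)
print(scores)
```

In this toy setup, shuffling column 0 raises the loss sharply while shuffling column 1 leaves it unchanged, which is the relative ranking the method is meant to expose.
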
Kamal September 7, 2016 at 2:09 am #

Dear Jason, I am new to Deep learning. Being a novice, I am asking you a technical question which may seem silly. My question is that- can we use features (for example length of the sentence etc.) of a sentence while classifying a sentence ( suppose the o/p are +ve sentence and -ve sentence) using deep neural network?

Reply
- Jason Brownlee September 7, 2016 at 10:27 am #
  
  Great question Kamal, yes you can. I would encourage you to include all such features and see which give you a bump in performance.
  
  Reply
Saurabh September 11, 2016 at 12:42 pm #

Hi, How would I use this on a dataset that has multiple outputs? For example a dataset with output A and B where A could be 0 or 1 and B could be 3 or 4 ?

Reply
- Jason Brownlee September 12, 2016 at 8:30 am #
  
  You could use two neurons in the output layer and normalize the output variables to both be in the range of 0 to 1.
  
  This tutorial on multi-class classification might give you some ideas:
  https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
  
  Reply
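
The normalization suggested in the reply above can be sketched as a per-column min-max rescaling. This is a minimal illustration, assuming output A takes the values 0 or 1 and output B takes the values 3 or 4 as in the question; the array Y is made-up data:

```python
import numpy

# Two output columns: A in {0, 1} and B in {3, 4} (illustrative rows)
Y = numpy.array([[0, 3], [1, 4], [1, 3]], dtype="float32")

# Rescale each column to the range 0 to 1
mins = Y.min(axis=0)
maxs = Y.max(axis=0)
Y_scaled = (Y - mins) / (maxs - mins)
print(Y_scaled)

# Invert the scaling to map model outputs back to the original values
Y_back = Y_scaled * (maxs - mins) + mins
```

Y_scaled has two columns, which matches an output layer with two neurons; the stored mins and maxs are needed later to convert predictions back.
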
Tom_P September 17, 2016 at 1:47 pm #

Hi Jason,
The tutorial looks really good but unfortunately I keep getting an error when importing Dense from keras.layers, I get the error : AttributeError: module ‘theano’ has no attribute ‘gof’
I have tried reinstalling Theano but it has not fixed the issue.

Best wishes
Tom

Reply
- Jason Brownlee September 18, 2016 at 7:57 am #
  
  Hi Tom, sorry to hear that. I have not seen this problem before.
  
  Have you searched google? I can see a few posts and it might be related to your version of scipy or similar.
  
  Let me know how you go.
  
  Reply
shudhan September 21, 2016 at 5:54 pm #

Hey Jason,

Can you please make a tutorial on how to add additional train data into the already trained model? This will be helpful for the bigger data sets. I read that warm start is used for random forest. But not sure how to implement as algorithm. A generalised version of how to implement would be good. Thank You!

Reply
- Jason Brownlee September 22, 2016 at 8:08 am #
  
  Great question Shudhan!
  
  Yes, you could save your weights, load them later into a new network topology and start training on new data again.
  
  I’ll work out an example in coming weeks, time permitting.
  
  Reply
Joanna September 22, 2016 at 1:09 am #

Hi Jason,
first of all congratulations for this amazing work that you have done!
Here is my question:
What if my .csv file includes both nominal and numerical attributes?
Should I change my nominal values to numerical?

Thank you in advance

Reply
- Jason Brownlee September 22, 2016 at 8:19 am #
  
  Hi Joanna, yes.
  
  You can use a label encoder to convert nominal to integer, and then even convert the integer to one hot encoding.
  
  This post will give you code you can use:
  https://machinelearningmastery.com/data-preparation-gradient-boosting-xgboost-python/
  
  Reply
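
The two steps in the reply above (nominal value to integer, then integer to one hot) can be sketched as follows. This is a minimal illustration that uses numpy.unique() in place of a scikit-learn label encoder, so that it runs with NumPy alone; the colour values are made up:

```python
import numpy

# A nominal column as it might appear in a CSV file (illustrative values)
colors = numpy.array(["red", "green", "blue", "green"])

# unique() returns the sorted categories; return_inverse gives each row's integer code
categories, codes = numpy.unique(colors, return_inverse=True)
print(categories)
print(codes)

# Expand the integer codes into one hot columns
one_hot = numpy.eye(len(categories))[codes]
print(one_hot.shape)
```

The categories array should be kept so that the same mapping can be applied to new data later.
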
ATM October 2, 2016 at 5:47 am #

A small bug:-
Line 25 : rounded = [round(x) for x in predictions]

should have numpy.round instead, for the code to run!
Great tutorial, regardless. The best i’ve seen for intro to ANN in python. Thanks!

Reply
- Jason Brownlee October 2, 2016 at 8:20 am #
  
  Perhaps it’s your version of Python or environment?
  
  In Python 2.7 the round() function is built-in.
  
  Reply
  - AC January 14, 2017 at 2:11 am #
    
    It would be better if there were a comment for Python 3:
    #use numpy.round instead, if using python3
    
    Reply
    - Jason Brownlee January 15, 2017 at 5:24 am #
      
      Thanks for the note AC.
      
      Reply
Ash October 9, 2016 at 1:36 am #

This is simple to grasp! Great post! How can we perform dropout in keras?

Reply
- Jason Brownlee October 9, 2016 at 6:49 am #
  
  Thanks Ash.
  
  You can learn about drop out with Keras here:
  https://machinelearningmastery.com/dropout-regularization-deep-learning-models-keras/
  
  Reply
Homagni Saha October 14, 2016 at 4:15 am #

Hello Jason,
You are using model.predict in the end to predict the results. Is it possible to save the model somewhere in the harddisk and transfer it to another machine(turtlebot running on ROS for my instance) and then use the model directly on turtlebot to predict the results?
Please tell me how
Thanking you
Homagni Saha

Reply
- Jason Brownlee October 14, 2016 at 9:07 am #
  
  Hi Homagni, great question.
  
  Absolutely!
  
  Learn exactly how in this tutorial I wrote:
  https://machinelearningmastery.com/save-load-keras-deep-learning-models/
  
  Reply
Rimi October 16, 2016 at 8:21 pm #

Hi Jason,
I implemented your code to begin with. But I am getting an accuracy of 45.18% with the same parameters and everything.
Can’t figure out why.
Thanks

Reply
- Jason Brownlee October 17, 2016 at 10:29 am #
  
  That does sound like a problem, Rimi.
  
  Confirm the code and data match exactly.
  
  Reply
Ankit October 26, 2016 at 8:12 pm #

Hi Jason,
I am a little confused by the first layer parameters. You said that the first layer has 12 neurons and expects 8 input variables.

Why is there a difference between the number of neurons and input_dim for the first layer?

Regards,
Ankit

Reply
- Jason Brownlee October 27, 2016 at 7:45 am #
  
  Hi Ankit,
  
  The problem has 8 input variables and the first hidden layer has 12 neurons. Inputs are the columns of data, these are fixed. The Hidden layers in general are whatever we design based on whatever capacity we think we need to represent the complexity of the problem. In this case, we have chosen 12 neurons for the first hidden layer.
  
  I hope that is clearer.
  
  Reply
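
The relationship between the 8 inputs and the 12 hidden neurons can be made concrete with a small NumPy sketch of what a fully connected layer computes. This is an illustration of the arithmetic only, not Keras code; the weights and the input batch are random stand-ins:

```python
import numpy

rng = numpy.random.default_rng(7)

n_inputs = 8    # columns in the data (input_dim)
n_hidden = 12   # neurons in the first hidden layer

# Each of the 12 neurons has one weight per input plus one bias
W = rng.uniform(-0.05, 0.05, size=(n_inputs, n_hidden))
b = numpy.zeros(n_hidden)

# A batch of 5 samples with 8 features each (random stand-in data)
X = rng.normal(size=(5, n_inputs))

# Dense layer with relu activation: max(0, X.W + b)
hidden = numpy.maximum(0.0, X @ W + b)

n_params = W.size + b.size
print(hidden.shape, n_params)
```

The weight matrix has shape (8, 12), so the layer maps 8 values per sample to 12 values per sample and has 8 * 12 + 12 = 108 trainable parameters.
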
Tom October 27, 2016 at 3:04 am #

Hi,
I have a dataset like IRIS, but with more columns.
I want to use MLP and DBN/CNNClassifier (or any other deep learning classification algorithm) on my data to see how correctly it is classified into 6 groups.

I previously used Deeplearning4j; today is the first time I have seen Keras.
Does Keras have examples (code examples) of DL classification algorithms?

Kindly,
Tom

Reply
- Jason Brownlee October 27, 2016 at 7:48 am #
  
  Yes Tom, the example in this post is an example of a neural network (deep learning) applied to a classification problem.
  
  Reply
Rumesa October 30, 2016 at 1:57 am #

I have installed Theano, but it gives me a TensorFlow error. Is it mandatory to install both packages? TensorFlow is not supported on Windows; the only way to get it on Windows is to install a virtual machine.

Reply
- Jason Brownlee October 30, 2016 at 8:57 am #
  
  Keras will work just fine with Theano.
  
  Just install Theano, and configure Keras to use the Theano backend.
  
  More information about configuring the Keras backend here:
  https://machinelearningmastery.com/introduction-python-deep-learning-library-keras/
  
  Reply
  - Rumesa October 31, 2016 at 4:36 am #
    
    Hey Jason, I have run your code but got the following error, although I have already installed the Theano backend. Please help me out, I am stuck.
    
    Using TensorFlow backend.
    Traceback (most recent call last):
    File “C:\Users\pc\Desktop\first.py”, line 2, in
    from keras.models import Sequential
    File “C:\Users\pc\Anaconda3\lib\site-packages\keras\__init__.py”, line 2, in
    from . import backend
    File “C:\Users\pc\Anaconda3\lib\site-packages\keras\backend\__init__.py”, line 64, in
    from .tensorflow_backend import *
    File “C:\Users\pc\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py”, line 1, in
    import tensorflow as tf
    ImportError: No module named ‘tensorflow’
    >>>
    
    Reply
    - Jason Brownlee October 31, 2016 at 5:34 am #
      
      Change the backend used by Keras from TensorFlow to Theano.
      
      You can do this either by using the command line switch or changing the Keras config file.
      
      See the link I posted in the previous post for instructions.
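
As a rough sketch of the two options mentioned above: this assumes one of the older multi-backend Keras releases, which read the `KERAS_BACKEND` environment variable and a `keras.json` config file (assumed location: `~/.keras/keras.json`). The helper name `backend_config` is hypothetical; it only builds the JSON text and does not touch your real config file.

```python
import json
import os


def backend_config(backend="theano"):
    """Return keras.json-style text that selects the given backend (hypothetical helper)."""
    config = {"backend": backend, "floatx": "float32", "epsilon": 1e-07}
    return json.dumps(config, indent=4)


# Option 1: set the environment variable before Keras is imported.
os.environ["KERAS_BACKEND"] = "theano"

# Option 2: put text like this into the Keras config file
# (assumed path: ~/.keras/keras.json).
config_text = backend_config("theano")
print(config_text)
```

On the command line, the first option would usually look like `KERAS_BACKEND=theano python first.py` on Linux or macOS; on Windows the variable has to be set before Python starts.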
      
      Reply
- Maria January 6, 2017 at 1:05 pm #
  
  Hello Rumesa!
  Have you solved your problem? I have the same one. Everywhere I find the same answer about the keras.json file or the environment variable, but it doesn’t work. Can you tell me what worked for you?
  
  Reply
  - Jason Brownlee January 7, 2017 at 8:20 am #
    
    Interesting.
    
    Maybe there is an issue with the latest version and a tight coupling to tensorflow? I have not seen this myself.
    
    Perhaps it might be worth testing prior versions of Keras, such as 1.1.0?
    
    Try this:
    
    pip install --upgrade --no-deps keras==1.1.0
    
    
    Reply
Alexon November 1, 2016 at 6:54 am #

Hi Jason,

First off, thanks so much for creating these resources, I have been keeping an eye on your newsletter for a while now, and I finally have the free time to start learning more about it myself, so your work has been really appreciated.

My question is: How can I set/get the weights of each hidden node?

I am planning to create several arrays of randomized weights, then use a genetic algorithm to see which weight array performs best and improve it over generations. What would be the best way to go about this? And if I use a “relu” activation function, am I right in thinking these randomly generated weights should be between 0 and 0.05?

Many thanks for your help 🙂
Alexon

Reply
- Jason Brownlee November 1, 2016 at 8:05 am #
  
  Thanks Alexon,
  
  You can get and set the weights from a network.
  
  You can learn more about how to do this in the context of saving the weights to file here:
  https://machinelearningmastery.com/save-load-keras-deep-learning-models/
  
  I hope that helps as a start, I’d love to hear how you go.
  
  Reply
  - Alexon November 6, 2016 at 6:36 am #
    
    Thats great, thanks for pointing me in the right direction.
    I’d be happy to let you know how it goes, but might take a while as this is very much a “when I can find the time” project between jobs 🙂
    
    Cheers!
    
    Reply
Arnaldo Gunzi November 2, 2016 at 10:17 pm #

Nice introduction, thanks!

Reply
- Jason Brownlee November 3, 2016 at 7:59 am #
  
  I’m glad you found it useful Arnaldo.
  
  Reply
Abbey November 14, 2016 at 11:05 pm #

Good day

I have a question: how can I represent a character as a vector that can be used as input to a neural network, trained using an LSTM, to predict the meaning of a word?

For instance, I have “bf” to predict “boy friend” or “best friend”, and similarly “2mor” to predict “tomorrow”. I need to encode all the input characters as vectors so that an RNN/LSTM can be trained on them to predict the output.

Thank you.

Kind Regards

Reply
- Jason Brownlee November 15, 2016 at 7:54 am #
  
  Hi Abbey, You can map characters to integers to get integer vectors.
  
  Reply
  - Abbey November 15, 2016 at 6:17 pm #
    
    Thank you Jason. Suppose I map characters to integer values to get vectors, using English letters, numbers and special characters.
    
    The question is: how will the LSTM predict the character? Please explain in more detail for me.
    
    Regards
    
    Reply
    - Jason Brownlee November 16, 2016 at 9:27 am #
      
      Hi Abbey,
      
      If your output values are also characters, you can map them onto integers, and reverse the mapping to convert the predictions back to text.
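
A minimal sketch of that mapping, using the two informal words from the question as a toy vocabulary. The names `char_to_int`, `int_to_char`, `encode` and `decode` are illustrative only, and a real project would build the vocabulary from the full training text.

```python
# Build a character-to-integer mapping from the (toy) training text.
texts = ["bf", "2mor"]
chars = sorted(set("".join(texts)))
char_to_int = {c: i for i, c in enumerate(chars)}
int_to_char = {i: c for c, i in char_to_int.items()}


def encode(text):
    """Map each character to its integer code."""
    return [char_to_int[c] for c in text]


def decode(ints):
    """Reverse the mapping to turn integer predictions back into text."""
    return "".join(int_to_char[i] for i in ints)


encoded = encode("2mor")
print(encoded)          # integer vector for the input
print(decode(encoded))  # round trip back to the original text
```

The same pair of dictionaries can be used on the output side: the network predicts integers, and `decode` converts them back to characters.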
      
      Reply
      - Abbey November 16, 2016 at 8:39 pm #
        
        The output value of the characters encoding will be text
  - Abbey November 15, 2016 at 6:22 pm #
    
    Thank you, Jason. Suppose I map characters to integer values to get a vector representation of the informal text, using English letters, numbers and special characters.
    
    The question is: how will the LSTM predict the characters or words that are close in meaning to the input value? Please explain in more detail for me. I understand how RNNs/LSTMs work from your tutorial example, but the logic of designing the processing is what I am struggling with.
    
    Regards
    
    Reply
Ammar November 27, 2016 at 10:35 am #

Hi Jason,
I am trying to implement a one-dimensional CNN on my data, so I built my network.
The issue is:
def train_model(model, X_train, y_train, X_test, y_test):
    X_train = X_train.reshape(-1, 1, 41)
    X_test = X_test.reshape(-1, 1, 41)

    numpy.random.seed(seed)
    model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=100, batch_size=64)
    # Final evaluation of the model
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Accuracy: %.2f%%" % (scores[1] * 100))
The method above does not work and does not give me any error message.
Could you help me with this, please?

Reply
- Jason Brownlee November 28, 2016 at 8:40 am #
  
  Hi Ammar, I’m surprised that there is no error message.
  
  Perhaps run from the command line and add some print() statements to see exactly where it stops.
  
  Reply
KK November 28, 2016 at 6:55 pm #

Hi Jason
Great work. I have another question: how can we apply this to text mining? I have a CSV file containing review documents and labels. I want to classify the documents based on the available text. Could you do me this favor?

Reply
- Jason Brownlee November 29, 2016 at 8:48 am #
  
  I would recommend converting the chars to ints and then using an Embedding layer.
  
  Reply
Alex M November 30, 2016 at 10:52 pm #

Mr Jason, this is a great tutorial, but I am stuck with some errors.

First, I can’t load the dataset correctly. I tried to correct the error but couldn’t (FileNotFoundError: [Errno 2] No such file or directory: ‘pima-indians-diabetes.csv’).

Second, while trying to evaluate the model, it says X is not defined. Maybe this is because loading failed.

Thanks!

Reply
- Jason Brownlee December 1, 2016 at 7:29 am #
  
  You need to download the file and place it in your current working directory Alex.
  
  Does that help?
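
One quick way to check this, as a small sketch: print the current working directory and test whether the file is there. The helper name `dataset_available` is made up for this example.

```python
import os


def dataset_available(filename):
    """Report whether filename exists in the current working directory."""
    print("Working directory:", os.getcwd())
    found = os.path.exists(filename)
    if not found:
        print(filename, "is missing; download it and place it in the directory above")
    return found


dataset_available("pima-indians-diabetes.csv")
```

If the script is started from an IDE or notebook, the working directory may not be the folder that holds the script, which is a common cause of this error.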
  
  Reply
Alex M December 1, 2016 at 6:45 pm #

Sir, it is now successful….
Thanks!

Reply
- Jason Brownlee December 2, 2016 at 8:15 am #
  
  Glad to hear it Alex.
  
  Reply
Bappaditya December 2, 2016 at 7:35 pm #

Hi Jason,

First of all, special thanks to you for providing such a great tutorial. I am very new to machine learning and, truly speaking, I have no background in data science. The concept of ML overwhelmed me, and now I have a desire to become an expert in this field. I need your advice on starting from scratch. Also, I am a PhD student in Computer Engineering (computer hardware) and I want to apply it as a tool for fault detection and testing of ICs. Can you provide me with some references in this field?

Reply
- Jason Brownlee December 3, 2016 at 8:29 am #
  
  Hi Bappaditya,
  
  My best advice for getting started is here:
  https://machinelearningmastery.com/start-here/#getstarted
  
  I believe machine learning and deep learning are good tools for use on problems in fault detection. A good place to find references is here http://scholar.google.com
  
  Best of luck with your project.
  
  Reply
Alex M December 3, 2016 at 8:00 pm #

Well as usual in our daily coding life errors happen, now I have this error how can I correct it? Thanks!

” —————————————————————————
NoBackendError Traceback (most recent call last)
in ()
16 import librosa.display
17 audio_path = (‘/Users/MA/Python Notebook/OK.mp3’)
—> 18 y, sr = librosa.load(audio_path)

C:\Users\MA\Anaconda3\lib\site-packages\librosa\core\audio.py in load(path, sr, mono, offset, duration, dtype)
107
108 y = []
–> 109 with audioread.audio_open(os.path.realpath(path)) as input_file:
110 sr_native = input_file.samplerate
111 n_channels = input_file.channels

C:\Users\MA\Anaconda3\lib\site-packages\audioread\__init__.py in audio_open(path)
112
113 # All backends failed!
–> 114 raise NoBackendError()

NoBackendError:

”

That is the error I am getting just when trying to load a song into librosa…
Thanks!! @Jason Brownlee

Reply
- Jason Brownlee December 4, 2016 at 5:30 am #
  
  Sorry, this looks like an issue with your librosa library, not a machine learning issue. I can’t give you expert advice, sorry.
  
  Reply
Alex M December 4, 2016 at 10:30 pm #

Thanks I have managed to correct the error…

Happy Sunday to you all……

Reply
- Jason Brownlee December 5, 2016 at 6:49 am #
  
  Glad to hear it Alex.
  
  Reply
- ayush June 19, 2018 at 3:27 am #
  
  How did you solve the problem?
  
  Reply
Lei December 4, 2016 at 10:52 pm #

Hi, Jason, thank you for your amazing examples.
I run the same code on my laptop. But I did not get the same results. What could be the possible reasons?
I am using windows 8.1 64bit+eclipse+anaconda 4.2+theano 0.9.4+CUDA7.5
I got results like follows.

… …
Epoch 145/150
768/768 [==============================] – 0s – loss: 0.4614 – acc: 0.7799
Epoch 146/150
768/768 [==============================] – 0s – loss: 0.4636 – acc: 0.7734
Epoch 147/150
768/768 [==============================] – 0s – loss: 0.4561 – acc: 0.7812
Epoch 148/150
768/768 [==============================] – 0s – loss: 0.4734 – acc: 0.7669
Epoch 149/150
768/768 [==============================] – 0s – loss: 0.4625 – acc: 0.7826
Epoch 150/150
768/768 [==============================] – 0s – loss: 0.4638 – acc: 0.7773
acc: 79.69%

Reply
- Jason Brownlee December 5, 2016 at 6:50 am #
  
  There is randomness in the learning process that we cannot control for yet.
  
  See this post:
  https://machinelearningmastery.com/randomness-in-machine-learning/
  
  Reply
Nanya December 10, 2016 at 2:55 pm #

Hello Jason Brownlee, thanks for sharing.
I’m new to deep learning, and I am wondering whether what you discussed here, Keras, can be used to build a CNN in TensorFlow and train on some CSV files for classification. Maybe this is a stupid question, but I am waiting for your reply. I’m working on my graduation project on word sense disambiguation with a CNN and just can’t move on. I hope for your help. Best wishes!

Reply
- Jason Brownlee December 11, 2016 at 5:22 am #
  
  Sorry Nanya, I’m not sure I understand your question. Are you able to rephrase it?
  
  Reply
Anon December 16, 2016 at 12:51 am #

I’ve just installed Anaconda with Keras and am using python 3.5.
It seems there’s an error with the rounding using Py3 as opposed to Py2. I think it’s because of this change: https://github.com/numpy/numpy/issues/5700

I removed the rounding and just used print(predictions) and it seemed to work outputting floats instead.

Does this look correct?

…
Epoch 150/150
0s – loss: 0.4593 – acc: 0.7839
[[ 0.79361773]
[ 0.10443526]
[ 0.90862554]
…,
[ 0.33652252]
[ 0.63745886]
[ 0.11704451]]

Reply
- Jason Brownlee December 16, 2016 at 5:44 am #
  
  Nice, it does look good!
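
If class labels are wanted instead of raw probabilities, one option that behaves the same in Python 2 and 3 is to threshold the outputs with NumPy. This is a sketch using the first few values printed above as stand-ins for the result of `model.predict()`; the 0.5 threshold is the usual choice for a sigmoid output but is an assumption here.

```python
import numpy as np

# Example sigmoid outputs, shaped (n_rows, 1) like the result of model.predict().
predictions = np.array([[0.79361773], [0.10443526], [0.90862554], [0.33652252]])

# Threshold at 0.5 to get class labels, then flatten to a plain list of ints.
labels = (predictions > 0.5).astype(int).ravel().tolist()
print(labels)
```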
  
  Reply
Florin Claudiu Mihalache December 19, 2016 at 2:37 am #

Hi Jason Brownlee
I tried to modify your example for my problem (Letter Recognition, http://archive.ics.uci.edu/ml/datasets/Letter+Recognition).
My dataset looks like http://archive.ics.uci.edu/ml/machine-learning-databases/letter-recognition/letter-recognition.data (T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8). I tried to split the data into input and output like this:

X = dataset[:,1:17]
Y = dataset[:,0]
but I got an error (something about strings not being recognized).
I tried replacing each letter with its ASCII code (A became 65 and so on). The string error disappeared.
The program runs now, but the output looks like this:

20000/20000 [==============================] – 1s – loss: -1219.8594 – acc: 0.0000e+00
acc: 0.00%

I do not understand why. Can you please help me

Reply
- Anon December 26, 2016 at 6:44 am #
  
  What version of Python are you running?
  
  Reply
karishma sharma December 22, 2016 at 10:03 am #

Hi Jason,

Since the number of epochs is set to 150 and the batch size is 10, does the training algorithm pick 10 training examples at random in each iteration, given that we have only 768 in total in X? Or does it sample randomly after it has finished covering them all?

Thanks

Reply
- Jason Brownlee December 23, 2016 at 5:27 am #
  
  Good question,
  
  It iterates over the dataset 150 times and within one epoch it works through 10 rows at a time before doing an update to the weights. The patterns are shuffled before each epoch.
  
  I hope that helps.
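
The arithmetic can be sketched with NumPy alone. This only mirrors the description above (shuffle, then walk through the rows 10 at a time); it is not Keras's internal code, and the seed value 7 is just the one used in the tutorial.

```python
import math

import numpy as np

n_rows, batch_size, epochs = 768, 10, 150

# 76 full batches plus one final batch holding the 8 leftover rows.
updates_per_epoch = math.ceil(n_rows / batch_size)
print(updates_per_epoch, updates_per_epoch * epochs)

rng = np.random.default_rng(7)
for epoch in range(2):  # two epochs are enough to show the pattern
    order = rng.permutation(n_rows)  # reshuffle the row indices before each epoch
    batches = [order[i:i + batch_size] for i in range(0, n_rows, batch_size)]
    # Every row is used exactly once per epoch.
    assert sorted(np.concatenate(batches).tolist()) == list(range(n_rows))
```

So each epoch covers all 768 rows once, in a new random order, with one weight update per batch.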
  
  Reply
Kaustuv January 9, 2017 at 4:57 am #

Hi Jason
Thanks a lot for this blog. It really helps me to start learning deep learning, which I had been planning to do for the last few months. Your simple, rich blog posts are awesome. I have no questions until I complete all the tutorials.
One question regarding availability of your book. How can I buy those books from India ?

Reply
- Jason Brownlee January 9, 2017 at 7:53 am #
  
  All my books and training are digital, you can purchase them from here:
  https://machinelearningmastery.com/products
  
  Reply
Stephen Wilson January 15, 2017 at 4:00 pm #

Hi Jason, firstly your work here is a fantastic resource and I am very thankful for the effort you put in.
I am a slightly-better-than-beginner at python and an absolute novice at ML, I wonder if you could help me classify my problem and find an angle to work at it from.

My data is thus:
Column Names: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, Result
Values: 4, 4, 6, 6, 3, 2, 5, 5, 0, 0, 0, 0, 0, 0, 0, 4

I want to find the percentage chance of each Column Names category being the Result based off the configuration of all the values present from 1-15. Then if need be compare the configuration of Values with another row of values to find the same, Resulting in the total needed calculation as:

Column Names: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, Result
Values: 4, 4, 6, 6, 3, 2, 5, 5, 0, 0, 0, 0, 0, 0, 0, 4
Values2: 7, 3, 5, 1, 4, 8, 6, 2, 9, 9, 9, 9, 9, 9, 9

I apologize if my explanation is not clear, and appreciate any help you can give me thank you.

Reply
- Jason Brownlee January 16, 2017 at 10:39 am #
  
  Hi Stephen,
  
  This process might help you work through your problem:
  https://machinelearningmastery.com/start-here/#process
  
  Specifically the first step in defining your problem.
  
  Let me know how you go.
  
  Reply
Rohit January 16, 2017 at 10:37 pm #

Thanks Jason for such a nice and concise example.

Just wanted to ask if it is possible to save this model in a file and port it to may be an Android or iOS device? If so, what are the libraries available for the same?

Thanks

Rohit

Reply
- Jason Brownlee January 17, 2017 at 7:38 am #
  
  Thanks Rohit,
  
  Here’s an example of saving a Keras model to file:
  https://machinelearningmastery.com/save-load-keras-deep-learning-models/
  
  I don’t know about running Keras on an Android or iOS device. Let me know how you go.
  
  Reply
  - zaheer khan June 16, 2017 at 7:17 pm #
    
    Dear Jason, Thanks for sharing this article.
    I am a novice in deep learning, and my apologies if my question is not clear. My question is: could we call all these functions and the program from any .php, .aspx, or .html web page? I mean, I would load the variables and file selections from a user interface and then pass them as input to these functions.
    
    will be waiting for your kind reply.
    thanks in advance.
    zaheer
    
    Reply
    - Jason Brownlee June 17, 2017 at 7:25 am #
      
      Perhaps, this sounds like a systems design question, not really machine learning.
      
      I would suggest you gather requirements, assess risks like any software engineering project.
      
      Reply
Hsiang January 18, 2017 at 3:35 pm #

Hi, Jason

Thank you for your blog! It is wonderful!

I used tensorflow as backend, and implemented the procedures using Jupyter.
I did “source activate tensorflow” -> “ipython notebook”.
I can successfully use Keras and import tensorflow.

However, it seems that such environment doesn’t support pandas and sklearn.
Do you have any way to incorporate pandas, sklearn and keras?
(I wish to use sklearn to revisit the classification problem and compare the accuracy with the deep learning method. But I also wish to put the works together in the same interface.)

Thanks!

Reply
- Jason Brownlee January 19, 2017 at 7:24 am #
  
  Sorry, I do not use notebooks myself. I cannot offer you good advice.
  
  Reply
  - Hsiang January 19, 2017 at 12:53 pm #
    
    Thanks, Jason!
    Actually, the problem is not with notebooks. Even when I used the terminal, i.e. doing only “source activate tensorflow”, it failed to import sklearn. Does that mean the tensorflow library is not compatible with sklearn? Thanks again!
    
    Reply
    - Jason Brownlee January 20, 2017 at 10:17 am #
      
      Sorry Hsiang, I don’t have experience using sklearn and tensorflow with virtual environments.
      
      Reply
      - Hsiang January 21, 2017 at 12:46 am #
        
        Thank you!
      - Jason Brownlee January 21, 2017 at 10:34 am #
        
        You’re welcome Hsiang.
keshav bansal January 24, 2017 at 12:45 am #

hello sir,
A very informative post indeed. I know my question is a very trivial one, but can you please show me how to predict on an explicitly given data tuple, say v=[6,148,72,35,0,33.6,0.627,50]?
thanks for the tutorial anyway

Reply
- Jason Brownlee January 24, 2017 at 11:04 am #
  
  Hi keshav,
  
  You can make predictions by calling model.predict()
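
A small sketch of preparing that single tuple: `predict()` expects a 2D array with one row per sample, so the list is wrapped to give shape (1, 8). The commented lines assume a fitted model from the tutorial named `model`; they are not run here.

```python
import numpy as np

v = [6, 148, 72, 35, 0, 33.6, 0.627, 50]

# One row, eight input columns: the shape a single sample needs.
row = np.array([v], dtype=float)
print(row.shape)

# With a fitted model from the tutorial (assumed to be named `model`):
# probability = model.predict(row)
# label = int(probability[0][0] > 0.5)
```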
  
  Reply
CATRINA WEBB January 25, 2017 at 9:06 am #

When I rerun the file (without predictions) does it reset the model and weights?

Reply
Ericson January 30, 2017 at 8:04 pm #

Excuse me sir, I want to ask you a question about this line: “dataset = numpy.loadtxt(“pima-indians-diabetes.csv”,delimiter=’,’)”. I am using a Mac and downloaded the dataset, then I converted the text into a CSV file. Running the program,

then I got: {Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 12:39:47)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type “copyright”, “credits” or “license()” for more information.
>>>
============ RESTART: /Users/luowenbin/Documents/database_test.py ============
Using TensorFlow backend.

Traceback (most recent call last):
File “/Users/luowenbin/Documents/database_test.py”, line 9, in
dataset = numpy.loadtxt(“pima-indians-diabetes.csv”,delimiter=’,’)
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/npyio.py”, line 985, in loadtxt
items = [conv(val) for (conv, val) in zip(converters, vals)]
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/npyio.py”, line 687, in floatconv
return float(x)
ValueError: could not convert string to float: book
>>> }
How can I solve this problem? Please give me a hand. Thank you!

Reply
- Jason Brownlee February 1, 2017 at 10:22 am #
  
  Hi Ericson,
  
  Confirm that the contents of “pima-indians-diabetes.csv” meet your expectation of a list of CSV lines.
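
One way to do that check, as a sketch: scan the file and report any line that is not made up of 9 numeric values (8 inputs plus the class column). The helper name `find_bad_rows` is hypothetical, and the sample text below stands in for a damaged file.

```python
import csv
import io


def find_bad_rows(lines, n_columns=9):
    """Return (line_number, row) pairs that are not n_columns numeric values."""
    bad = []
    for number, row in enumerate(csv.reader(lines), start=1):
        try:
            values = [float(value) for value in row]
        except ValueError:
            # A non-numeric entry, such as the "book" in the error above.
            bad.append((number, row))
            continue
        if len(values) != n_columns:
            bad.append((number, row))
    return bad


sample = io.StringIO("6,148,72,35,0,33.6,0.627,50,1\nbook,1,2\n")
print(find_bad_rows(sample))
```

For the real file, the same function could be called as `find_bad_rows(open("pima-indians-diabetes.csv", newline=""))`; an empty list would mean every line parsed cleanly.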
  
  Reply
Sukhpal February 7, 2017 at 9:00 pm #

Excuse me sir, when I run this code on my dataset, I encounter this problem. Please help me find a solution to it.
runfile(‘C:/Users/sukhpal/.spyder/temp.py’, wdir=’C:/Users/sukhpal/.spyder’)
Using TensorFlow backend.
Traceback (most recent call last):

File “”, line 1, in
runfile(‘C:/Users/sukhpal/.spyder/temp.py’, wdir=’C:/Users/sukhpal/.spyder’)

File “C:\Users\sukhpal\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py”, line 866, in runfile
execfile(filename, namespace)

File “C:\Users\sukhpal\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py”, line 87, in execfile
exec(compile(scripttext, filename, ‘exec’), glob, loc)

File “C:/Users/sukhpal/.spyder/temp.py”, line 1, in
from keras.models import Sequential

File “C:\Users\sukhpal\Anaconda2\lib\site-packages\keras\__init__.py”, line 2, in
from . import backend

File “C:\Users\sukhpal\Anaconda2\lib\site-packages\keras\backend\__init__.py”, line 67, in
from .tensorflow_backend import *

File “C:\Users\sukhpal\Anaconda2\lib\site-packages\keras\backend\tensorflow_backend.py”, line 1, in
import tensorflow as tf

ImportError: No module named tensorflow

Reply
- Jason Brownlee February 8, 2017 at 9:34 am #
  
  This is a change with the most recent version of tensorflow, I will investigate and change the example.
  
  For now, consider installing and using an older version of tensorflow.
  
  Reply
Will February 14, 2017 at 5:33 am #

Great tutorial! Amazing amount of work you’ve put in and great marketing skills (I also have an email list, ebooks and sequence, etc). I ran this in Jupyter notebook… I noticed the 144th epoch (acc .7982) had more accuracy than at 150. Why is that?

P.S. i did this for the print: print(numpy.round(predictions))
It seems to avoid a list of arrays which when printing includes the dtype (messy)

Reply
- Jason Brownlee February 14, 2017 at 10:07 am #
  
  Thanks Will.
  
  The model will fluctuate in performance while learning. You can configure triggered check points to save the model if/when conditions like a decrease in train/validation performance is detected. Here’s an example:
  https://machinelearningmastery.com/check-point-deep-learning-models-keras/
  
  Reply
Sukhpal February 14, 2017 at 3:50 pm #

Please help me to find out this error
runfile(‘C:/Users/sukhpal/.spyder/temp.py’, wdir=’C:/Users/sukhpal/.spyder’)ERROR: execution aborted

Reply
- Jason Brownlee February 15, 2017 at 11:32 am #
  
  I’m not sure Sukhpal.
  
  Consider getting code working from the command line, I don’t use IDEs myself.
  
  Reply
Kamal February 14, 2017 at 5:15 pm #

Please help me to find this error:
Epoch 194/195
195/195 [==============================] – 0s – loss: 0.2692 – acc: 0.8667
Epoch 195/195
195/195 [==============================] – 0s – loss: 0.2586 – acc: 0.8667
195/195 [==============================] – 0s
Traceback (most recent call last):

Reply
- Jason Brownlee February 15, 2017 at 11:32 am #
  
  What was the error exactly Kamal?
  
  Reply
Kamal February 15, 2017 at 3:24 pm #

Sir, when I run the code on my dataset, it does not show the overall accuracy, although it shows the accuracy and loss for all the iterations.

Reply
- Jason Brownlee February 16, 2017 at 11:06 am #
  
  I’m not sure I understand your question Kamal, could you please restate it?
  
  Reply
Val February 15, 2017 at 9:00 pm #

Hi Jason, I’m just starting deep learning in Python using Keras and Theano. I followed the installation instructions without a hitch and tested some examples, but when I run this one line by line I get a lot of exceptions and errors once I run “model.fit(X,Y, nb_epochs=150, batch_size=10”.

Reply
- Jason Brownlee February 16, 2017 at 11:06 am #
  
  What errors are you getting?
  
  Reply
CrisH February 17, 2017 at 8:12 pm #

Hi, how do I know what number to use for random.seed() ? I mean you use 7, is there any reason for that? Also is it enough to use it only once, in the beginning of the code?

Reply
- Jason Brownlee February 18, 2017 at 8:38 am #
  
  You can use any number CrisH. The fixed random seed makes the example reproducible.
  
  You can learn more about randomness and random seeds in this post:
  https://machinelearningmastery.com/randomness-in-machine-learning/
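
A short illustration with NumPy only: the same seed gives the same random numbers, whichever number is chosen. The helper `draw` is made up for this example, and seeding NumPy may not remove every source of randomness in a given backend.

```python
import numpy as np


def draw(seed):
    """Seed NumPy's global generator, then draw three random numbers."""
    np.random.seed(seed)  # the tutorial uses 7, but any integer works
    return np.random.rand(3)


print(np.array_equal(draw(7), draw(7)))  # same seed, same numbers
print(np.array_equal(draw(7), draw(8)))  # different seed, different numbers
```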
  
  Reply
kk February 18, 2017 at 1:53 am #

am new to deep learning and found this great tutorial. keep it up and look forward!!

Reply
- Jason Brownlee February 18, 2017 at 8:41 am #
  
  Thanks!
  
  Reply
Iqra Ameer February 21, 2017 at 5:20 am #

Hi, I have a problem executing the above example as is. It seems that it’s not running properly and stops at “Using TensorFlow backend.”

Epoch 147/150
768/768 [==============================] – 0s – loss: 0.4709 – acc: 0.7878
Epoch 148/150
768/768 [==============================] – 0s – loss: 0.4690 – acc: 0.7812
Epoch 149/150
768/768 [==============================] – 0s – loss: 0.4711 – acc: 0.7721
Epoch 150/150
768/768 [==============================] – 0s – loss: 0.4731 – acc: 0.7747
acc: 76.43%

I am new in this field, could you please guide me about this error.
I also executed on another data set, it stops with the same behavior.

Reply
- Jason Brownlee February 21, 2017 at 9:39 am #
  
  What is the error exactly? The example hangs?
  
  Maybe try the Theano backend and see if that makes a difference. Also make sure all of your libraries are up to date.
  
  Reply
Iqra Ameer February 22, 2017 at 5:47 am #

Dear Jason,
Thank you so much for your valuable suggestions. I tried the Theano backend and also updated all my libraries, but again it hung at:

768/768 [==============================] – 0s – loss: 0.4656 – acc: 0.7799
Epoch 149/150
768/768 [==============================] – 0s – loss: 0.4589 – acc: 0.7826
Epoch 150/150
768/768 [==============================] – 0s – loss: 0.4611 – acc: 0.7773
acc: 78.91%

Reply
- Jason Brownlee February 22, 2017 at 10:05 am #
  
  I’m sorry to hear that, I have not seen this issue before.
  
  Perhaps a RAM issue or a CPU overheating issue? Are you able to try different hardware?
  
  Reply
- frd March 8, 2017 at 2:50 am #
  
  Hi!
  
  Were you able to find a solution for that?
  
  I’m having exactly the same problem
  
  ( … )
  Epoch 149/150
  768/768 [==============================] – 0s – loss: 0.4593 – acc: 0.7773
  Epoch 150/150
  768/768 [==============================] – 0s – loss: 0.4586 – acc: 0.7891
  32/768 [>………………………..] – ETA: 0sacc: 76.69%
  
  Reply
Bhanu February 23, 2017 at 1:51 pm #

Hello sir,
I want to ask whether we can convert this code to deep learning with an increasing number of layers.

Reply
- Jason Brownlee February 24, 2017 at 10:12 am #
  
  Sure you can increase the number of layers, try it and see.
  
  Reply
Ananya Mohapatra February 28, 2017 at 6:40 pm #

hello sir,
could you please tell me how I determine the number of neurons in each layer? I am using a different dataset and am unable to work out the number of neurons in each layer.

Reply
- Jason Brownlee March 1, 2017 at 8:33 am #
  
  Hi Ananya, great question.
  
  Sorry, there is no good theory on how to configure a neural net.
  
  You can configure the number of neurons in a layer by trial and error. Also consider tuning the number of epochs and batch size at the same time.
  
  Reply
  - Ananya Mohapatra March 1, 2017 at 4:42 pm #
    
    thank you so much sir. It worked ! 🙂
    
    Reply
    - Jason Brownlee March 2, 2017 at 8:11 am #
      
      Glad to hear it, Ananya.
      
      Reply
Jayant Sahewal February 28, 2017 at 8:11 pm #

Hi Jason,

really helpful blog. I have a question about how much time does it take to converge?

I have a dataset with around 4000 records, 3 input columns and 1 output column. I came up with the following model

def create_model(dropout_rate=0.0, weight_constraint=0, learning_rate=0.001, activation='linear'):
    # create model
    model = Sequential()
    model.add(Dense(6, input_dim=3, init='uniform', activation=activation, W_constraint=maxnorm(weight_constraint)))
    model.add(Dropout(dropout_rate))
    model.add(Dense(1, init='uniform', activation='sigmoid'))
    # Optimizer
    optimizer = Adam(lr=learning_rate)
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# create model
model = KerasRegressor(build_fn=create_model, verbose=0)
# define the grid search parameters
batch_size = [10]
epochs = [100]
weight_constraint = [3]
dropout_rate = [0.9]
learning_rate = [0.01]
activation = ['linear']
param_grid = dict(batch_size=batch_size, nb_epoch=epochs, dropout_rate=dropout_rate,
                  weight_constraint=weight_constraint, learning_rate=learning_rate, activation=activation)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=5)
grid_result = grid.fit(X_train, Y_train)

I have a 32 core machine with 64 GB RAM and it does not converge even in more than an hour. I can see all the cores busy, so it is using all the cores for training. However, if I change the input neurons to 3 then it converges in around 2 minutes.

Keras version: 1.1.1
Tensorflow version: 0.10.0rc0
theano version: 0.8.2.dev-901275534cbfe3fbbe290ce85d1abf8bb9a5b203

It’s using Tensorflow backend. Can you help me understand what is going on or point me in the right direction? Do you think switching to theano will help?

Best,
Jayant

Reply
- Jason Brownlee March 1, 2017 at 8:36 am #
  
  This post might help you tune your deep learning model:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  I hope that helps as a start.
  
  Reply
Animesh Mohanty March 1, 2017 at 9:21 pm #

hello sir,
could you please tell me how can i plot the results of the code on a graph . I made a few adjustments to the code so as to run it on a different dataset.

Reply
- Jason Brownlee March 2, 2017 at 8:16 am #
  
  What do you want to plot exactly Animesh?
  
  Reply
  - Animesh Mohanty March 2, 2017 at 4:56 pm #
    
    Accuracy vs no.of neurons in the input layer and the no.of neurons in the hidden layer
    
    Reply
param March 2, 2017 at 12:15 am #

sir can u plz explain
the different attributes used in this statement
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

Reply
- param March 2, 2017 at 12:16 am #
  
  precisely,what is model.metrics_names
  
  Reply
  - Jason Brownlee March 2, 2017 at 8:22 am #
    
    model.metrics_names is a list of names of the metrics collected during training.
    
    More details here:
    https://keras.io/models/sequential/
    
    Reply
- Jason Brownlee March 2, 2017 at 8:20 am #
  
  Hi param,
  
  It is using string formatting. %s formats a string, %.2f formats a floating point value with 2 decimal places, %% includes a percent symbol.
  
  You can learn more about the print function here:
  https://docs.python.org/3/library/functions.html#print
  
  More info on string formatting here:
  https://pyformat.info/
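
  As a small illustration of the formatting codes described above, here is a sketch that uses made-up stand-ins for model.metrics_names and scores. The names and numbers are assumptions for the example, not output from a real run:

```python
# Hypothetical stand-ins for model.metrics_names and the list returned by
# model.evaluate(); the numbers are illustrative only.
metrics_names = ["loss", "acc"]
scores = [0.4731, 0.7747]

# %s inserts the metric name, %.2f a float with two decimals, %% a literal '%'.
line = "%s: %.2f%%" % (metrics_names[1], scores[1] * 100)
print(line)  # acc: 77.47%
```

  Index 1 is used because, in this assumed layout, index 0 holds the loss and index 1 holds the accuracy.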
  
  Reply
Vijin K P March 2, 2017 at 4:01 am #

Hi Jason,

It was an awesome post. Could you please tell me how to we decide the following in a DNN 1. number of neurons in the hidden layers
2. number of hidden layers

Thanks.
Vijin

Reply
- Jason Brownlee March 2, 2017 at 8:22 am #
  
  Great question Vijin.
  
  Generally, trial and error. There are no good theories on how to configure a neural network.
  
  Reply
  - Vijin K P March 3, 2017 at 5:23 am #
    
    We do cross validation, grid search etc to find the hyper parameters in machine algorithms. Similarly can we do anything to identify the above parameters??
    
    Reply
    - Jason Brownlee March 3, 2017 at 7:46 am #
      
      Yes, we can use grid search and tuning for neural nets.
      
      The stochastic nature of neural nets means that each experiment (set of configs) will have to be run many times (30? 100?) so that you can take the mean performance.
      
      More general info on tuning neural nets here:
      https://machinelearningmastery.com/improve-deep-learning-performance/
      
      More on randomness and stochastic algorithms here:
      https://machinelearningmastery.com/randomness-in-machine-learning/
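
      The idea of repeating each configuration and comparing mean performance can be sketched as follows. The evaluate_config function is a hypothetical stand-in for building, fitting and scoring a Keras model; it only returns a noisy made-up accuracy so that the loop structure is runnable on its own:

```python
import random
import statistics

def evaluate_config(n_neurons, rng):
    """Hypothetical stand-in for 'build, fit and score a model'.

    Returns a noisy accuracy so that repeated runs differ, as real runs do.
    """
    base = 0.70 + 0.005 * min(n_neurons, 12)
    return base + rng.gauss(0, 0.02)

rng = random.Random(1)
repeats = 30  # number of runs per configuration
results = {}
for n_neurons in (4, 8, 12):
    scores = [evaluate_config(n_neurons, rng) for _ in range(repeats)]
    # Summarise each configuration by its mean and spread across runs.
    results[n_neurons] = (statistics.mean(scores), statistics.stdev(scores))

for n_neurons, (mean, stdev) in results.items():
    print("neurons=%d mean=%.3f stdev=%.3f" % (n_neurons, mean, stdev))
```

      Reporting the standard deviation alongside the mean helps show whether a difference between two configurations is larger than the run-to-run noise.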
      
      Reply
Bogdan March 2, 2017 at 11:48 pm #

Jason, Please tell me about these lines in your code:

seed = 7
numpy.random.seed(seed)

What do they do? And why do they do it?

One more question is why do you call the last section Bonus:Make a prediction?
I thought this what ANN was created for. What the point if your network’s output is just what you have already know?

Reply
- Jason Brownlee March 3, 2017 at 7:44 am #
  
  They seed the random number generator so that it produces the same sequence of random numbers each time the code is run. This is to ensure you get the same result as me.
  
  I’m not convinced it works with Keras though.
  
  More on randomness in machine learning here:
  https://machinelearningmastery.com/randomness-in-machine-learning/
  
  I was showing how to build and evaluate the model in this tutorial. The part about standalone prediction was an add-on.
  
  Reply
Sounak sahoo March 3, 2017 at 7:39 pm #

what exactly is the work of “seed” in the neural network code? what does it do?

Reply
- Jason Brownlee March 6, 2017 at 10:44 am #
  
  Seed refers to seeding the random number generator so that the same sequence of random numbers is generated each time the example is run.
  
  The aim is to make the examples 100% reproducible, but this is hard with symbolic math libs like Theano and TensorFlow backends.
  
  For more on randomness in machine learning, see this post:
  https://machinelearningmastery.com/randomness-in-machine-learning/
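
  A minimal sketch of what seeding does, using NumPy only. It shows that re-seeding reproduces the same sequence; it does not control any separate random number generators inside the backend, which is one reason full reproducibility is hard:

```python
import numpy

# Seed the generator, draw three numbers, then re-seed and draw again.
numpy.random.seed(7)
first = numpy.random.rand(3)

numpy.random.seed(7)
second = numpy.random.rand(3)

# Same seed, same sequence.
print(numpy.array_equal(first, second))  # True
```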
  
  Reply
Priya Sundari March 3, 2017 at 10:19 pm #

hello sir
could you plz tell me what is the role of optimizer and binary_crossentropy exactly? it is written that optimizer is used to search through the weights of the network which weights are we talking about exactly?

Reply
- Jason Brownlee March 6, 2017 at 10:48 am #
  
  Hi Priya,
  
  You can learn more about the fundamentals of neural nets here:
  https://machinelearningmastery.com/neural-networks-crash-course/
  
  Reply
Bogdan March 3, 2017 at 10:23 pm #

If I am not mistaken, those lines I commented about used when we write

init = ‘uniform’

?

Reply
Bogdan March 3, 2017 at 10:44 pm #

Could you explain in more details what is the batch size?

Reply
- Jason Brownlee March 6, 2017 at 10:50 am #
  
  Hi Bogdan,
  
  Batch size is how many patterns to show to the network before the weights are updated with the accumulated errors. The smaller the batch, the faster the learning, but also the more noisy the learning (higher variance).
  
  Try exploring different batch sizes and see the effect on the train and test performance over each epoch.
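
  To make the relationship concrete, this sketch counts how many weight updates happen in one epoch for a few batch sizes, assuming the 768 rows shown in the training logs earlier in this thread:

```python
import math

n_samples = 768  # rows in the dataset, as shown in the training logs above

def updates_per_epoch(n_samples, batch_size):
    """Number of weight updates in one pass over the data."""
    return math.ceil(n_samples / batch_size)

for batch_size in (1, 10, 32, 768):
    print(batch_size, updates_per_epoch(n_samples, batch_size))
```

  With a batch size of 10 the weights are updated 77 times per epoch (the last batch holds only 8 rows), while a batch size of 768 gives a single update per epoch.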
  
  Reply
Mohammad March 7, 2017 at 6:50 am #

Dear Jason
Firstly, thanks for your great tutorials.
I am trying to classify computer networks packets using first 500 bytes of every packet to identify its protocol. I am trying to use 1d convolution. for simpler task,I just want to do binary classification and then tackle multilabel classification for 10 protocols. Here is my code but the accuracy which is like .63. how can I improve the performance? should I Use RNNs?
########
model=Sequential()
model.add(Convolution1D(64,10,border_mode='valid',
    activation='relu',subsample_length=1, input_shape=(500, 1)))
#model.add(Convolution2D(32,5,5,border_mode='valid',input_shape=(1,28,28),))
model.add(MaxPooling1D(2))
model.add(Flatten())
model.add(Dense(200,activation='relu'))
model.add(Dense(1,activation='sigmoid'))
model.compile(loss='binary_crossentropy',
    optimizer='adam',metrics=['accuracy'])
model.fit(train_set, y_train,
    batch_size=250,
    nb_epoch=30,
    show_accuracy=True)
#x2= get_activations(model, 0,xprim )
#score = model.evaluate(t, y_test, show_accuracy = True, verbose = 0)
#print(score[0])

Reply
- Jason Brownlee March 7, 2017 at 9:37 am #
  
  This post lists some ideas to try and lift performance:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
Damiano March 7, 2017 at 10:13 pm #

Hi Jason, thank you so much for this awesome tutorial. I have just started with python and machine learning.
I am joking with the code doing few changes, for example i have changed..

this:

# create model
model = Sequential()
model.add(Dense(250, input_dim=8, init='uniform', activation='relu'))
model.add(Dense(200, init='uniform', activation='relu'))
model.add(Dense(200, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))

and this:

model.fit(X, Y, nb_epoch=250, batch_size=10)

then i would like to pass some arrays for prediction so…

new_input = numpy.array([[3,88,58,11,54,24.8,267,22],[6,92,92,0,0,19.9,188,28], [10,101,76,48,180,32.9,171,63], [2,122,70,27,0,36.8,0.34,27], [5,121,72,23,112,26.2,245,30]])

predictions = model.predict(new_input)
print predictions # [1.0, 1.0, 1.0, 0.0, 1.0]

is this correct? In this example i used the same series of training (that have 0 class), but i am getting wrong results. Only one array is correctly predicted.

Thank you so much!

Reply
- Jason Brownlee March 8, 2017 at 9:41 am #
  
  Looks good. Perhaps you could try changing the configuration of your model to make it more skillful?
  
  See this post:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
ANJI March 13, 2017 at 8:48 pm #

hello sir,
could you please tell me to rectify my error below it is raised while model is training:

str(array.shape))
ValueError: Error when checking model input: expected convolution2d_input_1 to have 4 dimensions, but got array with shape (68, 28, 28).

Reply
- Jason Brownlee March 14, 2017 at 8:17 am #
  
  It looks like you are working with CNN, not related to this tutorial.
  
  Consider trying this tutorial to get familiar with CNNs:
  https://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/
  
  Reply
Rimjhim March 14, 2017 at 8:21 pm #

I want a neural that can predict sin values. Further from a given data set i need to determine the function(for example if the data is of tan or cos, then how to determine that data is of tan only or cos only)

Thanks in advance

Reply
Sudarshan March 15, 2017 at 11:19 pm #

Keras just updated to Keras 2.0. I have an updated version of this code here: https://github.com/sudarshan85/keras-projects/tree/master/mlm/pima_indians

Reply
- Jason Brownlee March 16, 2017 at 7:59 am #
  
  Nice work.
  
  Reply
subhasish March 16, 2017 at 5:09 pm #

hello sir,
can we use PSO (particle swarm optimisation) in this? if so can you tell how?

Reply
- Jason Brownlee March 17, 2017 at 8:25 am #
  
  Sorry, I don’t have an example of PSO for fitting neural network weights.
  
  Reply
Ananya Mohapatra March 16, 2017 at 10:03 pm #

hello sir,
what type of neural network is used in this code? As far as I know there are 3 types of neural network: feedforward, radial basis function and recurrent neural network.

Reply
- Jason Brownlee March 17, 2017 at 8:28 am #
  
  A multilayer perceptron (MLP) neural network. A classic type from the 1980s.
  
  Reply
Diego March 17, 2017 at 3:58 am #

got this error while compiling..

sigmoid_cross_entropy_with_logits() got an unexpected keyword argument ‘labels’

Reply
- Jason Brownlee March 17, 2017 at 8:30 am #
  
  Perhaps confirm that your libraries are all up to date (Keras, Theano or TensorFlow)?
  
  Reply
Rohan March 20, 2017 at 5:20 am #

Hi Jason!

I am trying to use two odd frames of a video to predict the even one. Thus I need to give two images as input to the network and get one image as output. Can you help me with the syntax for the first model.add()? I have X_train of dimension (190, 2, 240, 320, 3) where 190 are the number of odd pairs, 2 are the two odd images, and (240,320,3) are the (height, width, depth) of each image.

Reply
Herli Menezes March 21, 2017 at 8:33 am #

Hello, Jason,
Thanks for your good tutorial. However i found some issues:
Warnings like these:

1 – Warning (from warnings module):
File "/usr/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 86
' call to the Keras 2 API: ' + signature)
UserWarning: Update your Dense call to the Keras 2 API: Dense(12, activation="relu", kernel_initializer="uniform", input_dim=8)

2 – Warning (from warnings module):
File "/usr/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 86
' call to the Keras 2 API: ' + signature)
UserWarning: Update your Dense call to the Keras 2 API: Dense(8, activation="relu", kernel_initializer="uniform")

3 – Warning (from warnings module):
File "/usr/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 86
' call to the Keras 2 API: ' + signature)
UserWarning: Update your Dense call to the Keras 2 API: Dense(1, activation="sigmoid", kernel_initializer="uniform")

4 – Warning (from warnings module):
File "/usr/lib/python2.7/site-packages/keras/models.py", line 826
warnings.warn('The nb_epoch argument in fit'
UserWarning: The nb_epoch argument in fit has been renamed epochs.

I think these are due to some package update..

But, the output of predictions was an array of zeros…
such as: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ….0.0]

I am running in a Linux Machine, Fedora 24,
Python 2.7.13 (default, Jan 12 2017, 17:59:37)
[GCC 6.3.1 20161221 (Red Hat 6.3.1-1)] on linux2

Why?

Thank you!

Reply
- Jason Brownlee March 21, 2017 at 8:45 am #
  
  These look like warnings related to the recent Keras 2.0 release.
  
  They appear to be warnings only, so you can still run the example.
  
  I do not know why you are getting all zeros. I will investigate.
  
  Reply
Ananya Mohapatra March 21, 2017 at 6:21 pm #

hello sir,
can you please help me build a recurrent neural network with the above given dataset. i am having a bit trouble in building the layers…

Reply
- Jason Brownlee March 22, 2017 at 7:56 am #
  
  Hi Ananya ,
  
  The Pima Indian diabetes dataset is a binary classification problem. It is not appropriate for a Recurrent Neural Network as there is no sequence information to learn.
  
  Reply
  - Ananya Mohapatra March 22, 2017 at 8:04 pm #
    
    sir so could you tell on which type of dataset would the recurrent neural network accurately work? i have the dataset of EEG signals of epileptic patients…will recurrent network work on this?
    
    Reply
    - Jason Brownlee March 23, 2017 at 8:49 am #
      
      It may if it is regular enough.
      
      LSTMs are excellent at sequence problems that have regularity or clear signals to detect.
      
      Reply
Shane March 22, 2017 at 5:18 am #

Hi Jason, I have a quick question related to an error I am receiving when running the code in the tutorial…

When I run

# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

Python returns the following error:

sigmoid_cross_entropy_with_logits() got an unexpected keyword argument ‘labels’

Reply
- Jason Brownlee March 22, 2017 at 8:09 am #
  
  Sorry, I have not seen this error Shane.
  
  Perhaps check that your environment is up to date with the latest versions of the deep learning libraries?
  
  Reply
Tejes March 24, 2017 at 1:04 am #

Hi Jason,
Thanks for this awesome post.
I ran your code with tensorflow back end, just out of curiosity. The accuracy returned was different every time I ran the code. That didn’t happen with Theano. Can you tell me why?

Thanks in advance!

Reply
- Jason Brownlee March 24, 2017 at 7:56 am #
  
  You will get different accuracy each time you run the code because neural networks are stochastic.
  
  This is not related to the backend (I expect).
  
  More on randomness in machine learning here:
  https://machinelearningmastery.com/randomness-in-machine-learning/
  
  Reply
Saurabh Bhagvatula March 27, 2017 at 9:49 pm #

Hi Jason,
I’m new to deep learning and learning it from your tutorials, which previously helped me understand Machine Learning very well.
In the following code, I want to know why the number of neurons differs from input_dim in the first layer of the Neural Net.
# create model
model = Sequential()
model.add(Dense(12, input_dim=8, init='uniform', activation='relu'))
model.add(Dense(8, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))

Reply
- Jason Brownlee March 28, 2017 at 8:22 am #
  
  You can specify the number of inputs via “input_dim”, you can specify the number of neurons in the first hidden layer as the first parameter to Dense().
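
  One way to see how the two numbers interact is to count the trainable parameters of each fully connected layer in the 8-input, 12-8-1 network from the question. This is plain arithmetic rather than a Keras call, and the helper name dense_params is made up for the example:

```python
def dense_params(n_inputs, n_neurons):
    """Weights plus one bias per neuron for a fully connected layer."""
    return n_inputs * n_neurons + n_neurons

# (inputs, neurons) for each Dense layer in the 8 -> 12 -> 8 -> 1 network
layers = [(8, 12), (12, 8), (8, 1)]
counts = [dense_params(i, n) for i, n in layers]
print(counts)       # [108, 104, 9]
print(sum(counts))  # 221
```

  The first layer has 12 neurons, each connected to all 8 inputs, so input_dim fixes the width of the incoming data and the first argument fixes how many neurons receive it.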
  
  Reply
  - Saurabh Bhagvatula March 28, 2017 at 4:15 pm #
    
    Thanx a lot.
    
    Reply
    - Jason Brownlee March 29, 2017 at 9:05 am #
      
      You’re welcome.
      
      Reply
Nalini March 29, 2017 at 2:52 am #

Hi Jason

while running this code for k fold cross validation it is not working.please give the code for k fold cross validation in binary class

Reply
- Jason Brownlee March 29, 2017 at 9:10 am #
  
  Generally neural nets are too slow/large for k-fold cross validation.
  
  Nevertheless, you can use a sklearn wrapper for a keras model and use it with any sklearn resampling method:
  https://machinelearningmastery.com/evaluate-performance-machine-learning-algorithms-python-using-resampling/
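
  For intuition, here is a hand-rolled sketch of how k-fold cross validation partitions row indices. It is a simplified stand-in (contiguous folds, no shuffling or stratification) for what a library such as scikit-learn's KFold provides, and kfold_indices is a name invented for this example:

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

folds = list(kfold_indices(768, 10))
print([len(test) for _, test in folds])
```

  Each row lands in exactly one test fold, and a fresh model would be fit on the corresponding training indices for every fold, which is why the procedure costs k full training runs.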
  
  Reply
trangtruong March 29, 2017 at 7:04 pm #

Hi Jason, why i use function evaluate to get accuracy score my model with test dataset, it return result >1, i can’t understand.

Reply
enixon April 3, 2017 at 3:08 am #

Hey Jason, thanks for this great article! I get the following error when running the code above:

TypeError: Received unknown keyword arguments: {‘epochs’: 150}

Any ideas on why that might be? I can’t get ‘epochs’, nb_epochs, etc to work…

Reply
- Jason Brownlee April 4, 2017 at 9:07 am #
  
  You need to update to Keras version 2.0 or higher.
  
  Reply
Ananya Mohapatra April 5, 2017 at 9:30 pm #

def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(10, input_dim=25, init='normal', activation='softplus'))
    model.add(Dense(3, init='normal', activation='softmax'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
    return model
sir here mean_square_error has been used for loss calculation. Is it the same as LMS algorithm. If not, can we use LMS , NLMS or RLS to calculate the loss?

Reply
Ahmad Hijazi April 5, 2017 at 10:19 pm #

Hello Jason, thank you a lot for this example.

My question is, after I trained the model and an accuracy of 79.2% for example is obtained successfully, how can I test this model on new data?

for example if a new patient with new records appear, I want to guess the result (0 or 1) for him, how can I do that in the code?

Reply
- Jason Brownlee April 9, 2017 at 2:36 pm #
  
  You can fit your model on all available training data then make predictions on new data as follows:
  
  yhat = model.predict(X)
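
  A minimal sketch of turning the predicted probabilities into a 0 or 1 answer. The yhat values are made up to stand in for the output of model.predict(X), and the 0.5 cut-off is an assumed decision rule rather than something fixed by the model:

```python
# Hypothetical stand-in for the output of model.predict(X): one sigmoid
# probability per row, shaped like a list of single-element rows.
yhat = [[0.91], [0.12], [0.58], [0.03]]

# Assumed decision rule: probability above 0.5 means class 1.
labels = [1 if row[0] > 0.5 else 0 for row in yhat]
print(labels)  # [1, 0, 1, 0]
```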
  
  
  Reply
Perick Flaus April 6, 2017 at 12:16 am #

Thanks Jason, how can we test if new patient will be diabetic or no (0 or 1) ?

Reply
- Jason Brownlee April 9, 2017 at 2:36 pm #
  
  Fit the model on all training data and call:
  
  yhat = model.predict(X)
  
  
  Reply
Gangadhar April 12, 2017 at 1:28 am #

Dr Jason,

In compiling the model i got below error

TypeError: compile() got an unexpected keyword argument ‘metrics’

unable to resolve the below error

Reply
- Jason Brownlee April 12, 2017 at 7:53 am #
  
  Ensure you have the latest version of Keras, v2.0 or higher.
  
  Reply
Omogbehin Azeez April 13, 2017 at 1:48 am #

Hello sir,
Thank you for the post. A quick question, my dataset has 24 input and 1 binary output( 170 instances, 100 epoch , hidden layer=6 and 10 batch, kernel_initializer=’normal’) . I adapted your code using Tensor flow and keras. I am having an accuracy of 98 to 100 percent. I am scared of over-fitting in my model. I need your candid advice. Kind regards sir

Reply
- Jason Brownlee April 13, 2017 at 10:07 am #
  
  Yes, evaluate your model using k-fold cross-validation to ensure you are not tricking yourself.
  
  Reply
  - Omogbehin Azeez April 14, 2017 at 1:08 am #
    
    Thank you sir
    
    Reply
Sethu Baktha April 13, 2017 at 5:19 am #

Hi Jason,
If I want to use the diabetes dataset (NOT Pima) https://archive.ics.uci.edu/ml/datasets/Diabetes to predict Blood Glucose which tutorials and e-books of yours would I need to start with…. Also, the data in its current format with time, code and value is it usable as is or do I need to convert the data in another format to be able to use it.

Thanks for your help

Reply
- Jason Brownlee April 13, 2017 at 10:13 am #
  
  This process will help you frame and work through your dataset:
  https://machinelearningmastery.com/start-here/#process
  
  I hope that helps as a start.
  
  Reply
  - Sethu Baktha April 13, 2017 at 10:25 am #
    
    Dr. Jason,
    The data is time series(time based data) with categorical(20) with two numbers one for insulin level and another for blood sugar level… Each time series data does not have every categorical data… For example one category is blood sugar before breakfast, another category is blood sugar after breakfast, before lunch and after lunch… Some times some of these category data is missing… I read through the above link, but does not talk about time series, categorical data with some category of data missing what to do in those cases…. Please let me know if any of your books will help clarify these points?
    
    Reply
    - Jason Brownlee April 14, 2017 at 8:43 am #
      
      Hi Sethu,
      
      I have many posts on time series that will help. Get started here:
      https://machinelearningmastery.com/start-here/#timeseries
      
      With categorical data, I would recommend an integer encoding perhaps followed by a one-hot encoding. You can learn more about these encodings here:
      https://machinelearningmastery.com/data-preparation-gradient-boosting-xgboost-python/
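
      The two encodings can be sketched in plain Python. The category names here are invented for illustration and are only loosely modelled on the measurement types described in the question:

```python
# Hypothetical category codes, loosely modelled on the measurement types mentioned above.
observations = ["pre_breakfast", "post_lunch", "pre_breakfast", "pre_lunch"]

# Integer encoding: map each distinct category to an index.
categories = sorted(set(observations))
to_int = {name: i for i, name in enumerate(categories)}
integer_encoded = [to_int[name] for name in observations]

# One-hot encoding: one binary column per category.
one_hot = [[1 if i == code else 0 for i in range(len(categories))]
           for code in integer_encoded]

print(integer_encoded)  # [1, 0, 1, 2]
print(one_hot[0])       # [0, 1, 0]
```

      The one-hot step avoids implying an order between categories, which the bare integer codes would otherwise suggest to the network.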
      
      I hope that helps.
      
      Reply
Omogbehin Azeez April 14, 2017 at 9:49 am #

Hello sir,

Is it compulsory to normalize the data before using ANN model. I read it somewhere I which the author insisted that each attribute be comparable on the scale of [0,1] for a meaningful model. What is your take on that sir. Kind regards.

Reply
- Jason Brownlee April 15, 2017 at 9:29 am #
  
  Yes. You must scale your data to the bounds of the activation used.
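
  As one example of such scaling, the sketch below rescales a single attribute to the range [0, 1] with min-max normalization. The function name and the sample values are made up for illustration; libraries such as scikit-learn offer an equivalent MinMaxScaler:

```python
def min_max_scale(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: nothing to scale, map everything to 0.0.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]

ages = [21, 35, 50, 81]  # made-up values for one input attribute
scaled = min_max_scale(ages)
print(scaled)
```

  In practice the minimum and maximum would be taken from the training data and then reused for any new data, so that both are scaled the same way.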
  
  Reply
shiva April 14, 2017 at 10:38 am #

Hi Jason, You are simply awesome. I’m one of the many who got benefited from your book “machine learning mastery with python”. I’m working with a medical image classification problem. I have two classes of medical images (each class having 1000 images of 32*32) to be worked upon by the convolutional neural networks. Could you guide me how to load this data to the keras dataset? Or how to use my data while following your simple steps? kindly help.

Reply
- Jason Brownlee April 15, 2017 at 9:30 am #
  
  Load the data as numpy arrays and then you can use it with Keras.
  
  Reply
Omogbehin Azeez April 18, 2017 at 12:09 am #

Hello sir,

I adapted your code with the cross validation pipelined with ANN (Keras) for my model. It gave me 100% still. I got the data from UCI ( Chronic Kidney Disease). It was 400 instances, 24 input attributes and 1 binary attribute. When I removed the rows with missing data I was left with 170 instances. Is my dataset too small for (24 input layer, 24 hidden layer and 1 output layer ANN, using adam and kernel initializer as uniform )?

Reply
- Jason Brownlee April 18, 2017 at 8:32 am #
  
  It is not too small.
  
  Generally, the size of the training dataset really depends on how you intend to use the model.
  
  Reply
  - Omogbehin Azeez April 18, 2017 at 11:10 pm #
    
    Thank you sir for the response, I guess I have to contend with the over-fitting of my model.
    
    Reply
Padmanabhan Krishnamurthy April 19, 2017 at 6:26 pm #

Hi Jason,

Great tutorial. Love the site 🙂
Just a quick query : why have you used adam as an optimizer over sgd? Moreover, when do we use sgd optimization, and what exactly does it involve?

Thanks

Reply
- Jason Brownlee April 20, 2017 at 9:23 am #
  
  ADAM seems to consistently work well with little or no customization.
  
  SGD requires configuration of at least the learning rate and momentum.
  
  Try a few methods and use the one that works best for your problem.
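
  To show why the learning rate and momentum matter, here is a toy sketch of gradient descent with momentum on a one-dimensional loss. It uses the exact gradient of a made-up function rather than noisy mini-batch gradients, so it illustrates the update rule only and is not how Keras implements its optimizers:

```python
def minimise(learning_rate, momentum, steps=200):
    """Gradient descent with momentum on the toy loss f(w) = (w - 3)^2."""
    w, velocity = 0.0, 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        velocity = momentum * velocity - learning_rate * grad
        w += velocity
    return w

print(minimise(learning_rate=0.1, momentum=0.0))    # close to the minimum at 3
print(minimise(learning_rate=0.1, momentum=0.9))    # also close to 3
print(minimise(learning_rate=0.001, momentum=0.0))  # too small: still far from 3
```

  With the same number of steps, a poorly chosen learning rate leaves the weight far from the minimum, which is the kind of tuning that Adam often lets you skip.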
  
  Reply
  - Padmanabhan Krishnamurthy April 20, 2017 at 4:32 pm #
    
    Thanks 🙂
    
    Reply
Omogbehin Azeez April 25, 2017 at 8:13 am #

Hello sir,

Good day sir, how can I get all the weights and biases of the keras ANN. Kind regards.

Reply
- Jason Brownlee April 26, 2017 at 6:19 am #
  
  You can save the network weights, see this post:
  https://machinelearningmastery.com/save-load-keras-deep-learning-models/
  
  You can also use the API to access the weights directly.
  
  Reply
Shiva April 27, 2017 at 5:43 am #

Hi Jason,
I am currently working with the IMDB sentiment analysis problem as mentioned in your book. Am using Anaconda 3 with Python 3.5.2. In an attempt to summarize the review length as you have mentioned in your book, When i try to execute the command:

result = map(len, X)
print("Mean %.2f words (%f)" % (numpy.mean(result), numpy.std(result)))

it returns the error: unsupported operand type(s) for /: ‘map’ and ‘int’

kindly help with the modified syntax. looking forward…

Reply
- Jason Brownlee April 27, 2017 at 8:47 am #
  
  I’m sorry to hear that. Perhaps comment out that line?
  Or change it to remove the formatting and just print the raw mean and stdev values for you to review?
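
  Another option, sketched here under the assumption that X is a list of sequences, is to convert the map object to a list before passing it to NumPy. In Python 3, map() returns a lazy iterator rather than a list, which is what triggers the error:

```python
import numpy

# Hypothetical stand-in for the review data: a list of word-index sequences.
X = [[1, 2, 3], [4, 5], [6]]

# In Python 3, map() returns a lazy iterator, which numpy.mean cannot average.
# Materialise it as a list first.
result = list(map(len, X))
print("Mean %.2f words (%f)" % (numpy.mean(result), numpy.std(result)))
```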
  
  Reply
Elikplim May 1, 2017 at 1:58 am #

Hello, quite new to Python, Numpy and Keras(background in PHP, MYSQL etc). If there are 8 input variables and 1 output varable(9 total), and the Array indexing starts from zero(from what I’ve gathered it’s a Numpy Array, which is built on Python lists) and the order is [rows, columns], then shouldn’t our input variable(X) be X = dataset[:,0:7] (where we select from the 1st to 8th columns, ie. 0th to 7th indices) and output variable(Y) be Y = dataset[:,8] (where we the 9th column, ie. 8th index)?

Reply
- Jason Brownlee May 1, 2017 at 5:59 am #
  
  You can learn more about array indexing in numpy here:
  https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
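
  The key point is that the stop index of a slice is excluded. A small sketch with a toy array standing in for the dataset (3 rows, 9 columns) shows the difference:

```python
import numpy

# Toy stand-in for the dataset: 3 rows, 9 columns (8 inputs + 1 output).
dataset = numpy.arange(27).reshape(3, 9)

X = dataset[:, 0:8]   # columns 0..7; the stop index 8 is excluded
Y = dataset[:, 8]     # column 8 only

print(X.shape)                  # (3, 8)
print(Y.shape)                  # (3,)
print(dataset[:, 0:7].shape)    # (3, 7): one input column short
```

  So 0:8 selects the 8 input columns, while 0:7 would drop the last input.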
  
  Reply
Jackie Lee May 1, 2017 at 12:47 pm #

I’m having troubles with the predictions part. It says ValueError: Error when checking model input: expected dense_1_input to have shape (None, 502) but got array with shape (170464, 502)

### MAKE PREDICTIONS ###
testset = numpy.loadtxt("right_stim_FD1.csv", delimiter=",")
A = testset[:,0:502]
B = testset[:,502]
probabilities = model.predict(A, batch_size=10, verbose=1)
predictions = float(round(a) for a in probabilities)
accuracy = numpy.mean(predictions == B)
#round predictions
#rounded = [round(x[0]) for x in predictions]
print(predictions)
print("Prediction Accuracy: %.2f%%" % (accuracy*100))

Reply
- Jason Brownlee May 2, 2017 at 5:55 am #
  
  It looks like you might be giving the entire dataset as the output (y) rather than just the output variable.
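
  Separately from the shape error, the rounding line in the snippet would fail because float() cannot take a generator. A sketch of one way to round the probabilities and compute accuracy follows; the arrays are made-up stand-ins for the model output and the labels:

```python
import numpy

# Hypothetical stand-ins: model.predict() output of shape (n, 1)
# and true labels of shape (n,).
probabilities = numpy.array([[0.9], [0.2], [0.7], [0.4]])
B = numpy.array([1, 0, 0, 0])

# Round to 0/1 and flatten to shape (n,) so the comparison is element-wise.
predictions = numpy.round(probabilities).ravel()
accuracy = numpy.mean(predictions == B)
print("Prediction Accuracy: %.2f%%" % (accuracy * 100))  # Prediction Accuracy: 75.00%
```

  The ravel() call matters: comparing an (n, 1) array with an (n,) array would broadcast to an (n, n) grid and give a misleading accuracy.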
  
  Reply
Anastasios Selalmazidis May 2, 2017 at 12:27 am #

Hi there,

I have a question regarding deep learning. In this tutorial we build a MLP with Keras. Is this Deep Learning or is it just a MLP Backpropagation ?

Reply
- Jason Brownlee May 2, 2017 at 5:59 am #
  
  Deep learning is MLP backprop these days:
  https://machinelearningmastery.com/what-is-deep-learning/
  
  Generally, deep learning refers to MLPs with lots of layers.
  
  Reply
Eric T May 2, 2017 at 8:59 pm #

Hi,
Would you mind if I use this code as an example of a simple network in a school project of mine?
Need to ask before using it, since I cannot find anywhere in this tutorial that you are OK with anyone using the code, and the ethics moment of my course requires me to ask (and of course give credit where credit is due).
Kind regards
Eric T

Reply
- Jason Brownlee May 3, 2017 at 7:35 am #
  
  Yes it’s fine but I take no responsibility and you must credit the source.
  
  I answer this question in my FAQ:
  https://machinelearningmastery.com/start-here/#faq
  
  Reply
BinhLN May 7, 2017 at 3:11 am #

Hi Jason
I have a problem
My Dataset have 500 record. But My teacher want my dataset have 100.000 record. I must have a new algorithm for data generation. Please help me

Reply
Dp May 11, 2017 at 2:26 am #

Can you give code for a deep CNN with 25 layers? In the first conv layer the filter size should be 39×39 with a total of 64 filters, in the 2nd conv layer 21×21 with 32 filters, in the 3rd conv layer 11×11 with 64 filters, and in the 4th conv layer 7×7 with 32 layers, for an input image size of 256×256. I'm completely new to deep learning, but if you could code that for me it would be a great help. Thanks

Reply
- Jason Brownlee May 11, 2017 at 8:33 am #
  
  Consider using an off-the-shelf model like VGG:
  https://keras.io/applications/
  
  Reply
Maple May 13, 2017 at 12:58 pm #

I have to follow with the facebook metrics. But the result is very low. Help me.
I changed the input but did not improve
http://archive.ics.uci.edu/ml/datasets/Facebook+metrics

Reply
- Jason Brownlee May 14, 2017 at 7:24 am #
  
  I have a list of suggestions that may help as a start:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
Alessandro May 14, 2017 at 1:01 am #

Hi Jason,

Great Tutorial and thanks for your effort.

I have a question, since I am beginner with keras and tensorflow.
I have installed both of them, keras and tensorflow, the latest version and I have run your example but I get always the same error:

Traceback (most recent call last):
  File "CNN.py", line 18, in <module>
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
  File "/Users/MacBookPro1/.virtualenvs/keras_tf/lib/python2.7/site-packages/keras/models.py", line 777, in compile
    **kwargs)
  File "/Users/MacBookPro1/.virtualenvs/keras_tf/lib/python2.7/site-packages/keras/engine/training.py", line 910, in compile
    sample_weight, mask)
  File "/Users/MacBookPro1/.virtualenvs/keras_tf/lib/python2.7/site-packages/keras/engine/training.py", line 436, in weighted
    score_array = fn(y_true, y_pred)
  File "/Users/MacBookPro1/.virtualenvs/keras_tf/lib/python2.7/site-packages/keras/losses.py", line 51, in binary_crossentropy
    return K.mean(K.binary_crossentropy(y_pred, y_true), axis=-1)
  File "/Users/MacBookPro1/.virtualenvs/keras_tf/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2771, in binary_crossentropy
    logits=output)
TypeError: sigmoid_cross_entropy_with_logits() got an unexpected keyword argument 'labels'

Could you help? Thanks

Alessandro

Reply
- Jason Brownlee May 14, 2017 at 7:30 am #
  
  Ouch, I have not seen this error before.
  
  Some ideas:
  – Consider trying the theano backend and see if that makes a difference.
  – Try searching/posting on the keras user group and slack channel.
  – Try searching/posting on stackoverflow or cross validated.
  
  Let me know how you go.
  
  Reply
  - Alessandro May 14, 2017 at 9:44 am #
    
    Hi Jason,
    
    I found the issue. The tensorflow installation was outdated; so I have updated it and everything
    is working nicely.
    
    Good night,
    Alessandro
    
    Reply
    - Jason Brownlee May 15, 2017 at 5:50 am #
      
      I’m glad to hear it Alessandro.
      
      Reply
Sheikh Rafiul Islam May 25, 2017 at 3:36 pm #

Thank you Mr. Brownlee for your wonderful easy to understand explanation

Reply
- Jason Brownlee June 2, 2017 at 11:41 am #
  
  Thanks.
  
  Reply
WAZED May 29, 2017 at 12:31 am #

Hi Jason,
Thank you very much for your wonderful tutorial. I have a question regarding the metrics. Is there a default way to declare the metrics “Precision” and “Recall” in addition to “Accuracy”?

Br
WAZED

Reply
- Jason Brownlee June 2, 2017 at 12:15 pm #
  
  Yes, see here:
  https://keras.io/metrics/
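As a rough illustration of what those two metrics measure, here is a minimal sketch that computes precision and recall by hand from predicted probabilities. It uses only NumPy rather than the Keras metrics API, and the labels and probabilities are made-up values, not results from the tutorial model.

```python
import numpy as np

def precision_recall(y_true, y_prob, threshold=0.5):
    """Return (precision, recall) for binary labels and predicted probabilities."""
    y_true = np.asarray(y_true).astype(int)
    # Turn probabilities into 0/1 class labels.
    y_pred = (np.asarray(y_prob) > threshold).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical labels and model outputs, for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.8, 0.2, 0.1]
precision, recall = precision_recall(y_true, y_prob)
print("precision: %.2f, recall: %.2f" % (precision, recall))
```

With a trained model, `y_prob` would come from `model.predict(X)` flattened to one dimension.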
  
  Reply
chiranjib konwar May 29, 2017 at 4:30 am #

Hi Jason,

please send me a small note containing resources from where i can learn deep learning from scratch. thanks for the wonderful read you had prepared.

Thanks in advance

yes, my email id is chiranjib.konwar@gmail.com

Reply
- Jason Brownlee June 2, 2017 at 12:16 pm #
  
  Here:
  https://machinelearningmastery.com/start-here/#deeplearning
  
  Reply
Jeff June 1, 2017 at 11:48 am #

Why the NN have mistakes many times?

Reply
- Jason Brownlee June 2, 2017 at 12:54 pm #
  
  What do you mean exactly?
  
  Reply
kevin June 2, 2017 at 5:53 pm #

Hi Jason,

I seem to be getting an error when applying the fit method:

ValueError: Error when checking input: expected dense_1_input to have shape (None, 12) but got array with shape (767, 8)

I looked this up and the most prominent suggestion seemed to be to upgrade Keras and Theano, which I did, but that didn’t resolve the problem.

Reply
- Jason Brownlee June 3, 2017 at 7:24 am #
  
  Ensure you have copied the code exactly from the post.
  
  Reply
Hemanth Kumar K June 3, 2017 at 2:15 pm #

hi Jason,
I am stuck with an error
TypeError: sigmoid_cross_entropy_with_logits() got an unexpected keyword argument ‘labels’
my TensorFlow and Keras versions are
keras: 2.0.4
Tensorflow: 0.12

Reply
- Jason Brownlee June 4, 2017 at 7:46 am #
  
  I’m sorry to hear that, I have not seen that error before. Perhaps you could post a question to stackoverflow or the keras user group?
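Alessandro reported the same error earlier in this thread and resolved it by updating an outdated TensorFlow installation, so checking the installed versions is a reasonable first step. The sketch below uses only the standard library (Python 3.8 or later); the helper name `installed_version` is made up for this example.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Print the versions so they can be compared against what the tutorial expects.
for name in ("tensorflow", "keras"):
    print(name, installed_version(name))
```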
  
  Reply
xena June 4, 2017 at 6:36 pm #

can anyone tell me which neural network is being used here? Is it MLP??

Reply
- Jason Brownlee June 5, 2017 at 7:40 am #
  
  Yes, it is a multilayer perceptron (MLP) feedforward neural network.
  
  Reply
Nirmesh Shah June 9, 2017 at 11:00 pm #

Hi Jason,

I have run this code successfully on PC with CPU.

If I have to run the same code on another PC which has a GPU, what line should I add to make sure that it runs on the GPU?

Reply
- Jason Brownlee June 10, 2017 at 8:24 am #
  
  The code would stay the same, your configuration of the Keras backend would change.
  
  Please refer to TensorFlow or Theano documentation.
  
  Reply
Prachi June 12, 2017 at 7:30 pm #

What if I want to train my neural which should detect whether the luggage is abandoned or not ? How do i proceed for it ?

Reply
- Jason Brownlee June 13, 2017 at 8:18 am #
  
  This process will help you work through your predictive modeling problem end to end:
  https://machinelearningmastery.com/start-here/#process
  
  Reply
Ebtesam June 14, 2017 at 11:15 pm #

Hi
I built a neural machine translation model, but the score I get is 0 and I am not sure why.

Reply
- Jason Brownlee June 15, 2017 at 8:45 am #
  
  Here is a good list of things to try:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
Sarvottam Patel June 20, 2017 at 7:31 pm #

Hey Jason, first of all thank you very much from the core of my heart for helping me understand this perfectly. I have an error after completing 150 iterations.

File “keras_first_network.py”, line 53, in
print(“\n%s: %.2f” %(model.metrics_names[1]*100))
TypeError: not enough arguments for format string

Reply
- Sarvottam Patel June 20, 2017 at 8:05 pm #
  
  Sorry Sir my bad , actually I wrote it wrongly
  
  Reply
- Jason Brownlee June 21, 2017 at 8:12 am #
  
  Confirm that you have copied the line exactly:
  
  print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
  
  Reply
Joydeep June 30, 2017 at 4:15 pm #

Hi Dr Jason,

Thanks for the tutorial to get started using Keras.

I used the below snippet to directly load the dataset from the URL rather than downloading and saving as this makes the code more streamlined without having to navigate elsewhere.

# load pima indians dataset
datasource = numpy.DataSource().open("http://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data")
dataset = numpy.loadtxt(datasource, delimiter=",")

Reply
- Jason Brownlee July 1, 2017 at 6:28 am #
  
  Thanks for the tip.
  
  Reply
Yvette July 7, 2017 at 9:01 pm #

Thanks for this helpful resource!

Reply
- Jason Brownlee July 9, 2017 at 10:38 am #
  
  I’m glad it helped.
  
  Reply
Andeep July 10, 2017 at 1:14 am #

Hi Dr Brownlee,

thank you very much for this great tutorial!
I would be grateful, if you could answer some questions:

1. What does the 7 in “numpy.random.seed(7)” means?

2. In my case I have 3 input neurons and 2 output neurons. Is the correct notation:
X = dataset[:,0:3]
Y = dataset[:,3:4] ?

3. The batch size means how many training data are used in one epoch, am I right?
I have thought we have to use the whole training data set for the training. In this case I would determine the batch size as the number of training data pairs I have achieved through experiments etc.. In your example, does the batch (sized 10) means that the computer always uses the same 10 training data in every epoch or are the 10 training data randomly chosen among all training data before every epoch?

4. When evaluating the model what does the loss means (e.g. in loss: 0.5105 – acc: 0.7396)?
Is it the sum of values of the error function (e.g. mean_squared_error) of the output neurons?

Reply
- Jason Brownlee July 11, 2017 at 10:19 am #
  
  You can use any random seed you like, more here:
  https://machinelearningmastery.com/reproducible-results-neural-networks-keras/
  
  You are referring to the columns in your data. Your network will also need to be configured with the correct number of inputs and outputs (e.g. input and output layers).
  
  Batch size is the number of samples in the dataset to work through before updating network weights. One epoch is comprised of one or more batches.
  
  Loss is the term being optimized by the network. Here we use log loss:
  https://en.wikipedia.org/wiki/Cross_entropy
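To make the relationship between batch size and epochs concrete, here is a small sketch of the arithmetic, assuming the 768-row dataset from the tutorial. The helper name `batches_per_epoch` is made up for this example. Every epoch still passes over all rows; the batch size only decides how many weight updates happen along the way, and by default Keras `fit` shuffles the rows before each epoch, so the batches are not the same 10 rows every time.

```python
import math

def batches_per_epoch(n_samples, batch_size):
    """Number of weight updates in one pass over the training data."""
    return math.ceil(n_samples / batch_size)

n_samples = 768  # rows in the Pima Indians dataset used in the tutorial

# Batch size 10: 76 full batches plus one final batch of 8 rows.
print(batches_per_epoch(n_samples, 10))
# Batch size equal to the dataset size: a single update per epoch.
print(batches_per_epoch(n_samples, 768))
```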
  
  Reply
  - Andeep July 16, 2017 at 7:43 am #
    
    Thank you for your response, Dr Brownlee !!
    
    Reply
    - Jason Brownlee July 16, 2017 at 8:00 am #
      
      I hope it helps.
      
      Reply
Patrick Zawadzki July 11, 2017 at 5:35 am #

Is there anyway to see the relationship between these inputs? Essentially understand which inputs affect the output the most, or perhaps which pairs of inputs affect the output the most?

Maybe pairing this with unsupervised deep learning? I want to have less of a “black box” for the developed network if at all possible. Thank you for your great content!

Reply
- Jason Brownlee July 11, 2017 at 10:34 am #
  
  Yes, try RFE (recursive feature elimination):
  https://machinelearningmastery.com/feature-selection-machine-learning-python/
  
  Reply
Bernt July 13, 2017 at 10:12 pm #

Hi Jason,
Thank you for sharing your skills and competence.

I want to study the change in weights and predictions between each epoch run.
Have tried to use the model.train_on_batch method and the model.fit method with epoch=1 and batch_size equal all the samples.

But it seems like the model doesn’t save the new updated weights.
I print predictions before and after I dont see a change in the evaluation scores.

Parts of the code is printed below.

Any idea?
Thanks.

# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

# Run one update of the model trained run with X and compared with Y
model.train_on_batch(X, Y)

# Fit the model
model.fit(X, Y, epochs=1, batch_size=768)

scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

Reply
- Jason Brownlee July 14, 2017 at 8:29 am #
  
  Sorry, I have not explored evaluating a Keras model this way.
  
  Perhaps it is a fault, I would recommend preparing the smallest possible example that demonstrates the issue and post to the Keras GitHub issues.
  
  Reply
iman July 18, 2017 at 11:18 pm #

Hi, I tried to apply this to the titanic data set, however the predictions were all 0.4. What do you suggest for:
# create model
model = Sequential()
model.add(Dense(12, input_dim=4, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']) # 'sgd'

model.fit(X, Y, epochs=15, batch_size=10)

Reply
- Jason Brownlee July 19, 2017 at 8:26 am #
  
  This post will give you some ideas to lift the skill of your model:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
Camus July 19, 2017 at 2:14 am #

Hi Dr Jason,
This is probably a stupid question but I cannot find out how to do it … and I am beginner on Neural Network.
I have relatively same number of inputs (7) and one output. This output can take numbers between -3000 and +3000.
I want to build a neural network model in python but I don’t know how to do it.
Do you have an example with outputs different from 0-1.
Thanks in advance

Camus

Reply
- Jason Brownlee July 19, 2017 at 8:28 am #
  
  Ensure you scale your data then use the above tutorial to get started.
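As one possible way to do that scaling, the sketch below maps a target in the range -3000 to +3000 onto 0 to 1 and back again with NumPy. The helper names and sample values are made up for illustration; for a real-valued target like this, a linear output layer with a mean squared error loss is a common choice instead of the sigmoid and binary cross entropy used in the tutorial.

```python
import numpy as np

def minmax_scale(values, lo, hi):
    """Map values from the range [lo, hi] onto [0, 1]."""
    values = np.asarray(values, dtype=float)
    return (values - lo) / (hi - lo)

def minmax_invert(scaled, lo, hi):
    """Map values from [0, 1] back onto the original range [lo, hi]."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

# Hypothetical target values between -3000 and +3000.
y = np.array([-3000.0, -1500.0, 0.0, 2250.0, 3000.0])
y_scaled = minmax_scale(y, -3000.0, 3000.0)
y_restored = minmax_invert(y_scaled, -3000.0, 3000.0)
print(y_scaled)
print(y_restored)
```

The model would be trained on `y_scaled`, and its predictions passed through `minmax_invert` to get values back in the original units.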
  
  Reply
Khalid Hussain July 21, 2017 at 11:28 pm #

Hi Jason Brownlee

I am using the same data “pima-indians-diabetes.csv” but all predicted values are less then 1 and are in fraction which could not distinguish any class.

If I round off then all become 0.

I am using model.predict(x) function

You are requested to kindly guide me what I am doing wrong are how can I achieve correct predicted value.

Thank you

Reply
- Jason Brownlee July 22, 2017 at 8:36 am #
  
  Confirm that you have copied all of the code exactly from the tutorial.
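Fractional outputs are expected here: the sigmoid output layer produces a probability between 0 and 1 for each row, and the class label comes from thresholding it. The sketch below shows that step with made-up probabilities shaped like the output of `model.predict` for a single output neuron. If every value rounds to 0, the model is predicting below 0.5 for all rows, which may point to a problem in training rather than in the rounding.

```python
import numpy as np

# Hypothetical probabilities, shaped (n_rows, 1) as predict() returns them.
probabilities = np.array([[0.12], [0.73], [0.50], [0.91], [0.38]])

# Values above 0.5 become class 1, everything else class 0.
labels = (probabilities > 0.5).astype(int).ravel()
print(labels)
print("fraction predicted positive: %.2f" % labels.mean())
```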
  
  Reply
Ludo July 25, 2017 at 6:59 pm #

Hello Jason,

Thanks you for your great example. I have some comments.

– Why did you choose “12” for the hidden layer and not 24 / 32? Is it arbitrary?
– Same question about epochs and batch_size?

These values are very sensitive! I tried 32 in the first layer, epochs=500 and batch_size=1000 and the result is very different: I am at 65% accuracy.

Thx for you help.
Regards.

Reply
- Jason Brownlee July 26, 2017 at 7:50 am #
  
  Yes, it is arbitrary. Tune the parameters of the model to your problem.
  
  Reply
Almoutasem Bellah Rajab July 25, 2017 at 7:32 pm #

Wow, you’re still replying to comments more than a year later!!!… you’re great,, thanks..

Reply
- Jason Brownlee July 26, 2017 at 7:50 am #
  
  Yep.
  
  Reply
Jane July 26, 2017 at 1:23 am #

Thanks for your tutorial, I found it very useful to get me started with Keras. I’ve previously tried TensorFlow, but found it very difficult to work with. I do have a question for you though. I have both Theano and TensorFlow installed, how do I know which back-end Keras is using? Thanks again

Reply
- Jason Brownlee July 26, 2017 at 8:02 am #
  
  Keras will print which backend it uses every time you run your code.
  
  You can change the backend in the Keras configuration file (~/.keras/keras.json) which looks like:
  
  {
      "image_data_format": "channels_last",
      "backend": "tensorflow",
      "epsilon": 1e-07,
      "floatx": "float32"
  }
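To check the setting without opening the file by hand, a small sketch like the following reads the "backend" entry with the standard library. The helper name `backend_from_config` is made up for this example, and the path assumes the default location mentioned above.

```python
import json
import os

def backend_from_config(config_text):
    """Return the "backend" entry from Keras config JSON text, or None."""
    return json.loads(config_text).get("backend")

# The same content as the configuration file shown above.
example = ('{"image_data_format": "channels_last", "backend": "tensorflow", '
           '"epsilon": 1e-07, "floatx": "float32"}')
print(backend_from_config(example))

# Read the real file only if it exists at the default location.
config_path = os.path.join(os.path.expanduser("~"), ".keras", "keras.json")
if os.path.exists(config_path):
    with open(config_path) as handle:
        print(backend_from_config(handle.read()))
```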
  
  Reply
Masood Imran July 28, 2017 at 12:00 am #

Hello Jason,

My understanding of Machine Learning or evaluating deep learning models is almost 0. But, this article gives me lot of information. It is explained in a simple and easy to understand language.

Thank you very much for this article. Would you suggest any good read to further explore Machine Learning or deep learning models please?

Reply
- Jason Brownlee July 28, 2017 at 8:31 am #
  
  Thanks.
  
  Yes, start right here:
  https://machinelearningmastery.com/start-here/#deeplearning
  
  Reply
Peggy August 3, 2017 at 7:14 pm #

If I have trained prediction models or neural network function scripts. How can I use them to make predictions in an application that will be used by end users? I want to use python but it seems I will have to redo the training in Python again. Is there a way I can rewrite the scripts in Python without retraining and just call the function of predicting?

Reply
- Jason Brownlee August 4, 2017 at 6:58 am #
  
  You need to train and save the final model then load it to make predictions.
  
  This post will make it clear:
  https://machinelearningmastery.com/train-final-machine-learning-model/
  
  Reply
Shane August 8, 2017 at 2:38 pm #

Jason, I used your tutorial to install everything needed to run this tutorial. I followed your tutorial and ran the resulting program successfully. Can you please describe what the output means? I would like to thank you for your very informative tutorials.

Reply
- Shane August 8, 2017 at 2:39 pm #
  
  768/768 [==============================] – 0s – loss: 0.4807 – acc: 0.7826
  Epoch 148/150
  768/768 [==============================] – 0s – loss: 0.4686 – acc: 0.7812
  Epoch 149/150
  768/768 [==============================] – 0s – loss: 0.4718 – acc: 0.7617
  Epoch 150/150
  768/768 [==============================] – 0s – loss: 0.4772 – acc: 0.7812
  32/768 [>………………………..] – ETA: 0s
  acc: 77.99%
  
  Reply
  - Jason Brownlee August 8, 2017 at 5:12 pm #
    
    It is summarizing the training of the model.
    
    The final line evaluates the accuracy of the model’s predictions – really just to demonstrate how to make predictions.
    
    Reply
- Jason Brownlee August 8, 2017 at 5:11 pm #
  
  Well done Shane.
  
  Which output?
  
  Reply
Bene August 9, 2017 at 1:02 am #

Hello Jason, i really liked your Work and it helped me a lot with my first steps.

But i am not really familiar with the numpy stuff:

So here is my Question:

dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]

I get that numpy.loadtxt is extracting the information from the CSV file

but what does the stuff in the Brackets mean like X = dataset[:,0:8]

why the “:” and why , 0:8

its probably pretty dumb but i can’t find a good explanation online 😀

thanks really much!

Reply
- Jason Brownlee August 9, 2017 at 6:37 am #
  
  Good question Bene, it’s called array slicing:
  https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
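In short, the part before the comma selects rows and the part after it selects columns. A bare `:` means all rows, `0:8` means columns 0 through 7 (the end index is excluded), and `8` means the single column at index 8. The sketch below shows this on a tiny stand-in array with 9 columns instead of the real CSV file.

```python
import numpy as np

# Stand-in for the loaded dataset: 3 rows, 9 columns (8 inputs + 1 output).
dataset = np.arange(27, dtype=float).reshape(3, 9)

X = dataset[:, 0:8]  # all rows, columns 0..7 (the input variables)
Y = dataset[:, 8]    # all rows, column 8 only (the output variable)

print(X.shape)
print(Y.shape)
print(Y)
```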
  
  Reply
  - Bene August 9, 2017 at 10:59 pm #
    
    That helped me out thank you Jason 🙂
    
    Reply
Chen August 12, 2017 at 5:43 pm #

Can I translate it to Chinese and put it to Internet in order to let other Chinese people can read your article?

Reply
- Jason Brownlee August 13, 2017 at 9:46 am #
  
  No, please do not.
  
  Reply
Deep Learning August 12, 2017 at 7:36 pm #

It seems that using this line:

np.random.seed(5)

…is redundant i.e. the Keras output in a loop running the same model with the same configuration will yield a similar variety of results regardless if it’s set at all, or which number it is set to. Or am I missing something?

Reply
- Jason Brownlee August 13, 2017 at 9:52 am #
  
  Deep learning algorithms are stochastic (random within a range). That means that they will make different predictions/learn different things when the same model is trained on the same data. This is a feature:
  https://machinelearningmastery.com/randomness-in-machine-learning/
  
  You can fix the random seed to ensure you get the same result, and it is a good idea for tutorials to help beginners out:
  https://machinelearningmastery.com/reproducible-results-neural-networks-keras/
  
  When evaluating the skill of a model, I would recommend repeating the experiment n times and taking skill as the average of the runs. See here for the procedure:
  https://machinelearningmastery.com/evaluate-skill-deep-learning-models/
  
  Does that help?
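A minimal sketch of what seeding does at the NumPy level: resetting the seed replays the same sequence of random numbers. Note that this only fixes NumPy's generator. A backend such as TensorFlow keeps its own random state that NumPy's seed does not control, which may explain runs that still vary after the seed is set.

```python
import numpy as np

np.random.seed(7)
first = np.random.rand(3)

# Resetting the seed restarts the sequence from the same point.
np.random.seed(7)
second = np.random.rand(3)

# Without resetting, the generator simply continues the sequence.
third = np.random.rand(3)

print(np.array_equal(first, second))
print(np.array_equal(second, third))
```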
  
  Reply
  - Deep Learning August 14, 2017 at 3:08 am #
    
    Thanks Jason 🙂
    
    I totally get what it should do, but as I had pointed out, it does not do it. If you run the codes you have provided above in a loop for say 10 times. First 10 with random seed set and the other 10 times without that line of code all together. Then compare the result. At least the result I’m getting, is suggesting the effect is not there i.e. both sets of 10 times will have similar variation in the result.
    
    Reply
    - Jason Brownlee August 14, 2017 at 6:26 am #
      
      It may suggest that the model is overprescribed and easily addresses the training data.
      
      Reply
Deep Learning August 14, 2017 at 3:12 am #

Nice post by the way > https://machinelearningmastery.com/evaluate-skill-deep-learning-models/

Thanks for sharing it. Been lately thinking about the aspect of accuracy a lot, it seems that at the moment it’s a “hot mess” in terms of the way common tools do it out of the box. I think a lot of non PhD / non expert crowd (most people) will at least initially be easily confused and make the kinds of mistakes you point out in your post.

Thanks for all the amazing contributions you are making in this field!

Reply
- Jason Brownlee August 14, 2017 at 6:26 am #
  
  I’m glad it helped.
  
  Reply
  - Haneesh December 7, 2019 at 10:36 pm #
    
    Hi Jason,
    
    i’m actually trying to find “spam filter for quora questions” where i have a dataset with label-0’s and 1’s and questions columns. please let me know the approach and path to build a model for this.
    
    Thanks
    
    Reply
    - Jason Brownlee December 8, 2019 at 6:10 am #
      
      Sounds like a great project.
      
      The tutorials here on text classification will help:
      https://machinelearningmastery.com/start-here/#nlp
      
      Reply
RATNA NITIN PATIL August 14, 2017 at 8:16 pm #

Hello Jason, Thanks for a wonderful tutorial.
Can I use Genetic Algorithm for feature selection??
If yes, Could you please provide the link for it???
Thanks in advance.

Reply
- Jason Brownlee August 15, 2017 at 6:34 am #
  
  Sure. Sorry, I don’t have any examples.
  
  Generally, computers are so fast it might be easier to test all combinations in an exhaustive search.
  
  Reply
sunny1304 August 15, 2017 at 3:44 pm #

Hi Jason,
Thank you for your awesome tutorial.
I have a question for you.

Is there any guideline on how to decide on neuron number for our network.
for example you used 12 for the 1st layer and 8 for the second layer.
how do you decide on that ?

Thanks

Reply
- Jason Brownlee August 15, 2017 at 4:58 pm #
  
  No, there is no way to analytically determine the configuration of the network.
  
  I use trial and error. You can grid search, random search, or copy configurations from tutorials or papers.
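A grid search over layer sizes can be sketched as below. The `evaluate` function is a placeholder with a made-up score so that the example runs on its own; it is not a claim about which configuration is best. In practice it would build, fit and evaluate a Keras model with the given layer sizes and return its accuracy. The names `grid`, `evaluate` and `search_space` are hypothetical.

```python
from itertools import product

def grid(options):
    """Yield one dict per combination of the listed hyperparameter values."""
    names = sorted(options)
    for values in product(*(options[name] for name in names)):
        yield dict(zip(names, values))

def evaluate(config):
    # Placeholder score for illustration only. Replace this with code that
    # trains a model using config["layer1"] and config["layer2"] neurons
    # and returns its accuracy on held-out data.
    return -abs(config["layer1"] - 12) - abs(config["layer2"] - 8)

search_space = {"layer1": [8, 12, 24], "layer2": [4, 8]}

# Try every combination and keep the one with the highest score.
best = max(grid(search_space), key=evaluate)
print(best)
```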
  
  Reply
yihadad August 16, 2017 at 6:53 pm #

Hi Jason,
Thanks for a wonderful tutorial.

How much RAM and CPU does it take to run a model generated by a CNN?

Thanks

Reply
- Jason Brownlee August 17, 2017 at 6:39 am #
  
  It depends on the data you are using to fit the model and the size of the model.
  
  Very large models could be 500MB of RAM or more.
  
  Reply
Ankur September 1, 2017 at 3:15 am #

Hi ,
Please let me know how I can visualise the complete neural network in Keras.

I am looking for the complete architecture – like number of neurons in the Input Layer, hidden layer , output layer with weights.

Please have a look at the link below, where someone has created a beautiful visualisation/architecture using the neuralnet package in R.
Please let me know, can we create such type of model in KERAS

https://www.r-bloggers.com/fitting-a-neural-network-in-r-neuralnet-package/

Reply
- Jason Brownlee September 1, 2017 at 6:50 am #
  
  Use the Keras visualization API:
  https://keras.io/visualization/
  
  Reply
- ASAD October 17, 2017 at 3:23 am #
  
  Hello ANKUR,,,, how are you?
  
  Have you tried the visualization in Keras suggested by Jason Brownlee?
  If you have, please send me the code. I am also trying but it did not work.
  
  please guide me
  
  Reply
Adam September 3, 2017 at 1:45 am #

Thank you Dr. Brownlee for the great tutorial,

I have a question about your code:
is the argument metrics=[‘accuracy’] necessary in the code and does it change the results of the neural network or is it just for showing me the accuracy during compiling?

thank you!!

Reply
- Jason Brownlee September 3, 2017 at 5:48 am #
  
  No, it just prints out the accuracy of the model at the end of each epoch. Learn more about Keras metrics here:
  https://machinelearningmastery.com/custom-metrics-deep-learning-keras-python/
  
  Reply
PottOfGold September 5, 2017 at 12:14 am #

Hi Jason,

your work here is really great. It helped me a lot.
I recently stumbled upon one thing I cannot understand:

For the pimas dataset you state:
<>
When I look at the table of the pimas dataset, the examples are in rows and the features in columns, so your input dimension is the number of columns. As far as I can see, you don’t change the table.

For neural networks, isn’t the input normally: examples = columns, features=rows?
Is this different for Keras? Or can I use both shapes? An if yes, what’s the difference in the construction of the net?

Thank you!!

Reply
- Jason Brownlee September 7, 2017 at 12:36 pm #
  
  No, features are columns, rows are instances or examples.
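If your data happens to be stored the other way round, with features in rows and examples in columns, a transpose puts it into the layout Keras expects. The sketch below uses a made-up array with 8 features and 3 examples.

```python
import numpy as np

# Hypothetical data stored as features x examples: 8 features, 3 examples.
X_columns = np.arange(24, dtype=float).reshape(8, 3)

# Transpose to examples x features, the layout used in this tutorial.
X_rows = X_columns.T

print(X_rows.shape)
```

After the transpose, the number of columns (8) is the value to pass as `input_dim` in the first layer.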
  
  Reply
  - PottOfGold September 7, 2017 at 3:35 pm #
    
    Thanks! 🙂
    I had a lot of discussions because of that.
    In Andrew Ng new Coursera course it’s explained as examples = columns, features=rows, but he doesn’t use Keras of course, but programms the neural networks from scratch.
    
    Reply
    - Jason Brownlee September 9, 2017 at 11:38 am #
      
      I doubt that, I think you may have mixed it up. Columns are never examples.
      
      Reply
      - PottOfGold October 6, 2017 at 6:26 pm #
        
        That’s what I thought, but I looked it up in the notation for the new Coursera course (deeplearning.ai) and there it says: m is the number of examples in the dataset and n is the input size, where X superscript n x m is the input matrix …
        But either way, you helped me! Thank you. 🙂
Lin Li September 16, 2017 at 1:50 am #

Hi Jason, thank you so much for your tutorial, it helps me a lot. I need your help for the question below:
I copy the code and run it. Although I got the classification results, there were some warning messages in the process. As follows:

Warning (from warnings module):
File “C:\Users\llfor\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\callbacks.py”, line 120
% delta_t_median)
UserWarning: Method on_batch_end() is slow compared to the batch update (0.386946). Check your callbacks.

I don’t know why, and cannot find any answer to this question. I’m looking forward to your reply. Thanks again!

Reply
- Jason Brownlee September 16, 2017 at 8:43 am #
  
  Sorry, I have not seen this message before. It looks like a warning, you might be able to ignore it.
  
  Reply
  - Lin Li September 16, 2017 at 12:24 pm #
    
    Thanks for your reply. I’m a start-learner on deep learning.I’d like to put it aside temporarily.
    
    Reply
Sagar September 22, 2017 at 2:51 pm #

Hi Jason,
Great article, thumbs up for that. I am getting this error when I try to run the file on the command prompt. Any suggestions. Thanks for you response.

#######################################################################
C:\Work\ML>python keras_first_network.py
Using TensorFlow backend.
2017-09-22 10:11:11.189829: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-22 10:11:11.190829: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
32/768 [>………………………..] – ETA: 0s
acc: 78.52%
#######################################################################

Reply
- Jason Brownlee September 23, 2017 at 5:35 am #
  
  Looks like warning messages that you can ignore.
  
  Reply
  - Sagar September 24, 2017 at 3:52 am #
    
    Thanks I got to know what the problem was. According to section 6 I had set verbose argument to 0 while calling “model.fit()”. Now all the epochs are getting printed.
    
    Reply
    - Jason Brownlee September 24, 2017 at 5:17 am #
      
      Glad to hear it.
      
      Reply
Valentin September 26, 2017 at 6:35 pm #

Hi Jason,

Thanks for the amazing article . Clear and straightforward.
I had some problems installing Keras but was advised to prefix
with tf.contrib.keras
so I have code like

model=tf.contrib.keras.models.Sequential()
Dense=tf.contrib.keras.layers.Dense

Now I try to train Keras on some small datafile to see how things work out:
1,1,0,0,8
1,2,1,0,4
1,0,0,1,5
1,0,1,0,7
0,1,0,0,8
1,4,1,0,4
1,0,2,1,1
1,0,1,0,7

The first 4 columns are inputs and the 5-th column is output.
I use the same code for training (adjust number of inputs) as in your article,
but the network only gets to 12.5% accuracy.
Any advice?

Thanks,
Valentin

Reply
- Jason Brownlee September 27, 2017 at 5:40 am #
  
  Thanks Valentin.
  
  I have a good list of suggestions for improving model performance here:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
Priya October 3, 2017 at 2:28 pm #

Hi Jason,

I tried replacing the pima data with random data as follows:

X_train = np.random.rand(18,61250)
X_test = np.random.rand(18,61250)
Y_train = np.array([0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0,
0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0,])
Y_test = np.array([1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0,
1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,])

_, input_size = X_train.shape #put this in input_dim in the first dense layer

I took the round() off of the predictions so I could see the full value and then inserted my random test data in model.fit():

predictions = model.predict(X_test)
preds = [x[0] for x in predictions]
print(preds)

model.fit(X_train, Y_train, epochs=100, batch_size=10, verbose=2, validation_data=(X_test,Y_test))

I found something slightly odd; I expected the predicted values to be around 0.50, plus or minus some, but instead, I got this:

[0.49525392, 0.49652839, 0.49729034, 0.49670222, 0.49342978, 0.49490061, 0.49570397, 0.4962129, 0.49774086, 0.49475089, 0.4958384, 0.49506786, 0.49696651, 0.49869373, 0.49537542, 0.49613148, 0.49636957, 0.49723724]

which is near 0.50 but always less than 0.50. I ran this a few times with different random seeds, so it’s not coincidental. Would you have any explanation for why it does this?

Thanks,
Priya

Reply
- Jason Brownlee October 3, 2017 at 3:46 pm #
  
  Perhaps calculate the mean of your training data and compare it to the predicted value. It might be simple sampling error.
  
  Reply
- Priya October 4, 2017 at 1:02 am #
  
  I found out I was doing predictions before fitting the model. (I suppose that would mean the network hadn’t adjusted to the data’s distribution yet.)
  
  Reply
Saurabh October 7, 2017 at 5:59 am #

Hello Jason,

I tried to train this model on my laptop, it is working fine. But I tried to train this model on google-cloud with the same instructions as in your example-5. But it is failing.
Can you just let me know which changes are required for the model, so that I can train this on the cloud?

Reply
- Jason Brownlee October 7, 2017 at 7:37 am #
  
  Sorry, I don’t know about google cloud.
  
  I have instructions here for running on AWS:
  https://machinelearningmastery.com/develop-evaluate-large-deep-learning-models-keras-amazon-web-services/
  
  Reply
tobegit3hub October 12, 2017 at 6:40 pm #

Great post. Thanks for sharing.

Reply
- Jason Brownlee October 13, 2017 at 5:45 am #
  
  You’re welcome.
  
  Reply
Manoj October 12, 2017 at 11:43 pm #

Hi Jason,
Is there a way to store the model, once it is created so that I can use it for different input data sets as and when needed.

Reply
- Jason Brownlee October 13, 2017 at 5:48 am #
  
  Yes, you can save it to file. See this tutorial:
  https://machinelearningmastery.com/save-load-machine-learning-models-python-scikit-learn/
  
  Reply
Cam October 23, 2017 at 6:11 pm #

I get a syntax error for the

model.fit() line in this example. Is it due to library conflicts with theano and tensorflow if i have both installed?

Reply
- Jason Brownlee October 24, 2017 at 5:28 am #
  
  Perhaps ensure your environment is up to date and that you copied the code exactly.
  
  This tutorial can help with setting up your environment:
  https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
  
  Reply
  - Cam October 24, 2017 at 2:11 pm #
    
    Thanks, fixed!
    
    Reply
    - Jason Brownlee October 24, 2017 at 4:01 pm #
      
      Glad to hear it.
      
      Reply
Diego Quintana October 25, 2017 at 7:37 am #

Hi Jason, thanks for the example.

How would you predict a single element from X? X[0] raises a ValueError

ValueError: Error when checking : expected dense_1_input to have shape (None, 8) but got array with shape (8, 1)

Thanks!

Reply
- Jason Brownlee October 25, 2017 at 3:56 pm #
  
  You can reshape it to have 1 row and 8 columns:
  
  X = X.reshape((1,8))
  
  This post will give you further advice:
  https://machinelearningmastery.com/index-slice-reshape-numpy-arrays-machine-learning-python/
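For a concrete picture of the shapes involved, the sketch below uses a small stand-in array in place of the real `X`. Indexing a single row drops the row dimension, which is what the model complains about; reshaping, or slicing with a range, keeps it.

```python
import numpy as np

# Stand-in for the input data: 2 rows, 8 columns.
X = np.arange(16, dtype=float).reshape(2, 8)

row = X[0]                    # shape (8,): the row dimension is gone
single = row.reshape((1, 8))  # shape (1, 8): one row, eight columns
also_single = X[0:1]          # slicing with a range keeps the row dimension

print(row.shape, single.shape, also_single.shape)
```

Either `single` or `also_single` has the (rows, 8) layout the tutorial model expects as input to `predict`.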
  
  Reply
  - harald April 10, 2019 at 8:26 pm #
    
    Should it be: X[0].reshape((1,8)) ?
    
    Reply
    - Jason Brownlee April 11, 2019 at 6:35 am #
      
      Yep!
      
      Reply
Shahbaz Wasti October 28, 2017 at 1:30 pm #

Dear Sir ,
I have installed and configured the environment according to your directions but while running the program i have following error

“from keras.utils import np_utils”

Reply
- Jason Brownlee October 29, 2017 at 5:50 am #
  
  What is the error exactly?
  
  Reply
Zhengping October 30, 2017 at 12:12 am #

Hi Jason, thanks for the great tutorials. I just learnt and repeated the program in your “Your First Machine Learning Project in Python Step-By-Step” without problem. Now I am trying this one and getting stuck at the line “model = Sequential()”, where the Interactive window throws: NameError: name ‘Sequential’ is not defined. I tried to google it but can’t find a solution. I did import Sequential from keras.models as in your example code, copy-pasted as it is. Thanks in advance for your help.

Reply
- Zhengping October 30, 2017 at 12:14 am #
  
  I’m running your examples in an Anaconda 4.4.0 environment in Visual Studio Community. The relevant packages have been installed as instructed in your earlier tutorials.
  
  Reply
  - Zhengping October 30, 2017 at 12:18 am #
    
    >> # create model
    … model = Sequential()
    …
    Traceback (most recent call last):
    File “”, line 2, in
    NameError: name ‘Sequential’ is not defined
    >>> model.add(Dense(12, input_dim=8, init=’uniform’, activation=’relu’))
    Traceback (most recent call last):
    File “”, line 1, in
    AttributeError: ‘SVC’ object has no attribute ‘add’
    
    Reply
    - Jason Brownlee October 30, 2017 at 5:39 am #
      
      This does not look good. Perhaps post the error to stack exchange or other keras support. I have a list of keras support sites here:
      https://machinelearningmastery.com/get-help-with-keras/
      
      Reply
- Jason Brownlee October 30, 2017 at 5:38 am #
  
  Looks like you need to install Keras. I have a tutorial here on how to do that:
  https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
  
  Reply
Akhil October 30, 2017 at 5:04 pm #

Hi Jason,

Thanks a lot for this wonderful tutorial.

I have a question:

I want to use your code to predict the classification (1 or 0) of unknown samples. Should I create one common csv file having the train (known) as well as the test (unknown) data. Whereas the ‘classification’ column for the known data will have a known value, 1 or 0, for the unknown data, should I leave the column empty (and let the code decide the outcome)?

Thanks a lot

Reply
- Jason Brownlee October 31, 2017 at 5:29 am #
  
  Great question.
  
  No, you only need the inputs and the model can predict the outputs, call model.predict(X).
  
  Also, this post will give a general idea on how to fit a final model:
  https://machinelearningmastery.com/train-final-machine-learning-model/
  
  Reply
Guilherme November 3, 2017 at 1:26 am #

Hi Jason,

This is really cool! I am blown away! Thanks so much for making it so simple for a beginner to have some hands on. I have a couple questions:

1) where are the weights, can I save and/or retrieve them?

2) if I want to train images with dogs and cats and later ask the neural network whether a new image has a cat or a dog, how do I get my input image to pass as an array and my output result to be “cat” or “dog”?

Thanks again and great job!

Reply
- Jason Brownlee November 3, 2017 at 5:20 am #
  
  The weights are in the model, you can save them:
  https://machinelearningmastery.com/save-load-keras-deep-learning-models/
  
  Yes, you would save your model, then call model.predict() on the new data.
  
  Reply
Michael November 5, 2017 at 8:33 am #

Hi Jason,

Are you familiar with a python tool/package that can build neural network as in the tutorial, but suitable for data stream mining?

Thanks,
Michael

Reply
- Jason Brownlee November 6, 2017 at 4:46 am #
  
  Not really, sorry.
  
  Reply
bea November 8, 2017 at 1:58 am #

Hi, there. Could you please clarify why exactly you’ve built your network with 12 neurons in the first layer?

“The first layer has 12 neurons and expects 8 input variables. The second hidden layer has 8 neurons and finally, the output layer has 1 neuron to predict the class (onset of diabetes or not)…”

Shouldn’t it have 8 neurons at the start?

Thanks

Reply
- Jason Brownlee November 8, 2017 at 9:28 am #
  
  The input layer has 8, the first hidden layer has 12. I chose 12 through a little trial and error.
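
As a rough illustration of the 8-12-8-1 layout described in the tutorial, the sketch below counts the weights and biases that such fully connected layers imply. The `layer_sizes` list and the `count_parameters` helper are made up for this example; this is plain Python, not Keras code:

```python
# Layer widths: 8 inputs, hidden layers of 12 and 8 neurons, 1 output neuron.
layer_sizes = [8, 12, 8, 1]

def count_parameters(sizes):
    """Count weights and biases for fully connected layers of the given widths."""
    total = 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        total += n_in * n_out  # one weight per (input, neuron) pair
        total += n_out         # one bias per neuron
    return total

total_params = count_parameters(layer_sizes)
print(total_params)
```

The 8 inputs are not neurons with weights of their own; the first set of weights belongs to the 12-neuron hidden layer that reads them.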
  
  Reply
Guilherme November 9, 2017 at 12:54 am #

Hi Jason,

Do you have or else could you recommend a beginner’s level image segmentation approach that uses deep learning? For example, I want to train some neural net to automatically “find” a particular feature out of an image.

Thanks!

Reply
- Jason Brownlee November 9, 2017 at 10:00 am #
  
  Sorry, I don’t have image segmentation examples, perhaps in the future.
  
  Reply
Andy November 12, 2017 at 6:56 pm #

Hi Jason,

I just started my DL training a few weeks ago. According to what I learned in course, in order to train the parameters for the NN, we need to run the Forward and Backward propagation; however, looking at your Keras example, i don’t find any of these propagation processes. Does it mean that Keras has its own mechanism to find the parameters instead of using Forward and Backward propagation?

Thanks!

Reply
- Jason Brownlee November 13, 2017 at 10:13 am #
  
  It is performing those operations under the covers for you.
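
To give a feel for what happens under the covers, here is a simplified NumPy sketch of the forward and backward passes for a single sigmoid neuron trained with plain gradient descent. The toy data, learning rate and variable names are all assumptions for illustration; Keras does this for every layer and, in the tutorial, uses the Adam optimizer rather than this basic update:

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy data: 16 samples with 8 features and binary labels (random, for illustration only).
X = rng.random((16, 8))
y = (X.sum(axis=1) > 4.0).astype(float).reshape(-1, 1)

# Parameters of one sigmoid neuron.
W = np.zeros((8, 1))
b = 0.0
learning_rate = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y_true, y_pred):
    eps = 1e-7
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)))

losses = []
for step in range(100):
    # Forward pass: compute predictions and the loss.
    y_pred = sigmoid(X @ W + b)
    losses.append(log_loss(y, y_pred))

    # Backward pass: gradient of the loss with respect to W and b.
    error = y_pred - y
    grad_W = X.T @ error / len(X)
    grad_b = float(error.mean())

    # Update step (plain gradient descent).
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

print(losses[0], losses[-1])
```

Calling `model.fit()` repeats this kind of loop over batches and epochs, which is why no explicit propagation code appears in the tutorial.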
  
  Reply
Badr November 13, 2017 at 11:42 am #

Hi Jason,

Can you explain why I got the following output:

ValueError Traceback (most recent call last)
in ()
—-> 1 model.compile(loss=’binary_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])
2 model.fit(X, Y, epochs=150, batch_size=10)
3 scores = model.evaluate(X, Y)
4 print(“\n%s: %.2f%%” % (model.metrics_names[1], scores[1]*100))

/Users/badrshomrani/anaconda/lib/python3.5/site-packages/keras/models.py in compile(self, optimizer, loss, metrics, sample_weight_mode, **kwargs)
545 metrics=metrics,
546 sample_weight_mode=sample_weight_mode,
–> 547 **kwargs)
548 self.optimizer = self.model.optimizer
549 self.loss = self.model.loss

/Users/badrshomrani/anaconda/lib/python3.5/site-packages/keras/engine/training.py in compile(self, optimizer, loss, metrics, loss_weights, sample_weight_mode, **kwargs)
620 loss_weight = loss_weights_list[i]
621 output_loss = weighted_loss(y_true, y_pred,
–> 622 sample_weight, mask)
623 if len(self.outputs) > 1:
624 self.metrics_tensors.append(output_loss)

/Users/badrshomrani/anaconda/lib/python3.5/site-packages/keras/engine/training.py in weighted(y_true, y_pred, weights, mask)
322 def weighted(y_true, y_pred, weights, mask=None):
323 # score_array has ndim >= 2
–> 324 score_array = fn(y_true, y_pred)
325 if mask is not None:
326 # Cast the mask to floatX to avoid float64 upcasting in theano

/Users/badrshomrani/anaconda/lib/python3.5/site-packages/keras/objectives.py in binary_crossentropy(y_true, y_pred)
46
47 def binary_crossentropy(y_true, y_pred):
—> 48 return K.mean(K.binary_crossentropy(y_pred, y_true), axis=-1)
49
50

/Users/badrshomrani/anaconda/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in binary_crossentropy(output, target, from_logits)
1418 output = tf.clip_by_value(output, epsilon, 1 – epsilon)
1419 output = tf.log(output / (1 – output))
-> 1420 return tf.nn.sigmoid_cross_entropy_with_logits(output, target)
1421
1422

/Users/badrshomrani/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/nn_impl.py in sigmoid_cross_entropy_with_logits(_sentinel, labels, logits, name)
147 # pylint: disable=protected-access
148 nn_ops._ensure_xent_args(“sigmoid_cross_entropy_with_logits”, _sentinel,
–> 149 labels, logits)
150 # pylint: enable=protected-access
151

/Users/badrshomrani/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py in _ensure_xent_args(name, sentinel, labels, logits)
1696 if sentinel is not None:
1697 raise ValueError(“Only call %s with ”
-> 1698 “named arguments (labels=…, logits=…, …)” % name)
1699 if labels is None or logits is None:
1700 raise ValueError(“Both labels and logits must be provided.”)

ValueError: Only call sigmoid_cross_entropy_with_logits with named arguments (labels=…, logits=…, …)

Reply
- Jason Brownlee November 14, 2017 at 10:05 am #
  
  Perhaps double check you have the latest versions of the keras and tensorflow libraries installed?!
  
  Reply
Badr November 14, 2017 at 10:50 am #

keras was outdated

Reply
- Jason Brownlee November 15, 2017 at 9:44 am #
  
  Glad to hear you fixed it.
  
  Reply
Mikael November 22, 2017 at 8:20 am #

Hi Jason, thanks for your short tutorial, helps a lot to actually get your hands dirty with a simple example.
I have tried 5 different parameters and got some interesting results to see what would happen. Unfortunately, I didn’t record the running time.

                  Test 1    Test 2    Test 3    Test 4    Test 5    Test 6    Test 7
Number of layers  3         3         3         3         3         3         4
Train set         768       768       768       768       768       768       768
Iterations        150       100       1000      1000      1000      150       150
Rate of update    10        10        10        5         1         1         5
Errors            173       182       175       139       161       169       177
Values            768       768       768       768       768       768       768
% Error           23,0000%  23,6979%  22,7865%  18,0990%  20,9635%  22,0052%  23,0469%

I can’t seem to see a trend here that could put me on the right track to adjust my hyperparameters.

Do you have any advice on that?

Reply
- Jason Brownlee November 22, 2017 at 11:17 am #
  
  Something is wrong. Here is a good list of things to try:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply

Nikolaos November 28, 2017 at 10:58 am #

Hi, I am trying to implement the above example with fer2013.csv but I receive an error. Could you help me implement this correctly?

from keras.models import Sequential
from keras.layers import Dense
import numpy
import numpy as np

# fix Random seed for reproducibility
numpy.random.seed(7)
Y = []
X = []
#load dataset
for line in open("fer2013.csv"):
    row = line.split(',')
    Y.append(int(row[0]))
    X.append([int(p) for p in row[1].split()])
X, Y = np.array(X) / 255.0, np.array(Y)
print(Y.shape)
print(X.shape)


#create model
model = Sequential()
model.add(Dense(12, input_dim=(35887, 2304), activation='tanh'))
model.add(Dense(8, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))

#Compile Model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

#Fit Model
model.fit(X, Y, epochs=150, batch_size=1)

# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

# calculate predictions
predictions = model.predict(X)
# round predictions
rounded = [round(x[0]) for x in predictions]
print(rounded)

- Jason Brownlee November 29, 2017 at 8:10 am #
  
  Sorry, I cannot debug your code.
  
  What is the problem exactly?
  
  Reply

Tanya December 2, 2017 at 12:06 am #

Hello,
I have a bit of a general question.
I have to do forecasting for restaurant sales (meaning that I have to predict 4 meals based on historical daily sales data), weather conditions (such as temperature, rain, etc.), official holidays and in/off season. I have to perform that forecasting using neural networks.
Unfortunately, I am not very skilled in Python. On my computer I have Python 2.7 and I have installed Anaconda. I am trying to learn by working through your code, Mr. Brownlee, but somehow I cannot run the code at all (in Spyder). Can you tell me which versions of Python and Anaconda I have to install, and in which environment (JupyterLab, Notebook, Qt Console, Spyder, etc.) I can run the code so that it works and does not give errors from the very beginning?
I will be very thankful for your response
KG
Tanya

Reply
- Jason Brownlee December 2, 2017 at 9:02 am #
  
  Perhaps this tutorial will help you setup and confirm your environment:
  https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
  
  I would also recommend running code from the command line, as IDEs and notebooks can introduce and hide errors.
  
  Reply
Eliah December 3, 2017 at 10:53 am #

Hi Dr. Brownlee.

I looked over the tutorial and I have a question regarding reading the data from a binary file. For instance, I am working on solving the sliding-tile n-puzzle using neural networks, but I seem to have trouble getting my data, which is in a binary file and gives the number of moves required to solve the n-puzzle. I am not sure if you have dealt with this before, but any help would be appreciated.

Reply
- Jason Brownlee December 4, 2017 at 7:43 am #
  
  Sorry, I don’t know about your binary file.
  
  Perhaps after you load your data, you can convert it to a numpy array so that you can provide it to a neural net?
  
  Reply
  - Eliah December 4, 2017 at 9:28 am #
    
    Thanks for the tip, I’ll try it.
    
    Reply
Wafaa December 7, 2017 at 4:59 pm #

Thank you very very much for all your great tutorials.

If I wanted to add batch layer after the input layer, how should I do it?

Because I applied this tutorial to a different dataset and features, I think I need normalization or standardization, and I want to do it the easiest way.

Thank you,

Reply
- Jason Brownlee December 8, 2017 at 5:35 am #
  
  I recommend preparing the data prior to fitting the model.
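
One simple way to prepare the data before fitting is to standardize each input column. The sketch below uses NumPy on a random stand-in matrix; the column scales and variable names are assumptions for illustration:

```python
import numpy as np

# Stand-in feature matrix with columns on very different scales (random values).
rng = np.random.default_rng(0)
X = rng.random((100, 3)) * np.array([1.0, 50.0, 1000.0])

# Standardize each column to zero mean and unit variance.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_scaled = (X - mean) / std

print(X_scaled.mean(axis=0).round(6), X_scaled.std(axis=0).round(6))
```

When predicting on new data later, the same `mean` and `std` computed from the training data would normally be reused rather than recomputed.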
  
  Reply
zaheer December 9, 2017 at 3:03 am #

Thanks for sharing such nice tutorials, they helped me a lot. I want to print the confusion matrix for the above example. One more question:
if I have
20 input variables,
1 class label (binary),
and 400 instances,
how would I know how to set the Dense layer parameters in the first layer, hidden layer and output layer, like the 12, 8, 1 you used in the example above?

Reply
- Jason Brownlee December 9, 2017 at 5:44 am #
  
  I recommend trial and error to configure the number of neurons in the hidden layer to see what works best for your specific problem.
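
On the confusion matrix part of the question, a minimal NumPy sketch is below. The labels and probabilities are invented; in practice the probabilities would come from `model.predict()` and the labels from the dataset:

```python
import numpy as np

# Hypothetical true labels and predicted probabilities (e.g. the output of model.predict).
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
probabilities = np.array([0.1, 0.6, 0.8, 0.4, 0.9, 0.2, 0.7, 0.3])

# Threshold at 0.5 to get class labels.
y_pred = (probabilities > 0.5).astype(int)

# Rows are true classes, columns are predicted classes.
confusion = np.zeros((2, 2), dtype=int)
for t, p in zip(y_true, y_pred):
    confusion[t, p] += 1

print(confusion)
```

If scikit-learn is installed, its `confusion_matrix` function in `sklearn.metrics` produces the same kind of table.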
  
  Reply
zaheer December 9, 2017 at 3:29 am #

C:\Users\zaheer\AppData\Local\Programs\Python\Python36\python.exe C:/Users/zaheer/PycharmProjects/PythonBegin/Bin-CLNCL-Copy.py
Using TensorFlow backend.
Traceback (most recent call last):
File “C:/Users/zaheer/PycharmProjects/PythonBegin/Bin-CLNCL-Copy.py”, line 28, in
model.fit(x_train , y_train , epochs=100, batch_size=100)
File “C:\Users\zaheer\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\models.py”, line 960, in fit
validation_steps=validation_steps)
File “C:\Users\zaheer\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py”, line 1574, in fit
batch_size=batch_size)
File “C:\Users\zaheer\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py”, line 1407, in _standardize_user_data
exception_prefix=’input’)
File “C:\Users\zaheer\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py”, line 153, in _standardize_input_data
str(array.shape))
ValueError: Error when checking input: expected dense_1_input to have shape (None, 20) but got array with shape (362, 1)

Reply
- Jason Brownlee December 9, 2017 at 5:45 am #
  
  Ensure the input shape matches your data.
  
  Reply
Anam Zahra December 10, 2017 at 7:40 pm #

Dear Jason! Great job a very simple guide.
I am trying to run the exact code but there is an error:
str(array.shape))

ValueError: Error when checking target: expected dense_3 to have shape (None, 1) but got array with shape (768, 8)

How can I resolve.

I have windows 10 and spyder.

Reply
- Jason Brownlee December 11, 2017 at 5:24 am #
  
  Sorry to hear that, perhaps confirm that you have the latest version of Numpy and Keras installed?
  
  Reply
nazek hassouneh December 11, 2017 at 7:33 am #

After running this code, I want to calculate the accuracy. How do I do that?
I want to split the data set into test data and training data,
evaluate the model and calculate the accuracy.
Thank you, Dr.

Reply
Suchith December 21, 2017 at 2:35 pm #

In the model how many hidden layers are there ?

Reply
- Jason Brownlee December 21, 2017 at 3:35 pm #
  
  There are 2 hidden layers, 1 input layer and 1 output layer.
  
  Reply
Amare Mahtesenu December 22, 2017 at 9:55 am #

Hi there. This blog is very awesome, like Adrian’s pyimagesearch blog. I have one question: do you have, or will you have, a tutorial on the Keras framework with SSD or YOLO architectures?

Reply
- Jason Brownlee December 22, 2017 at 4:16 pm #
  
  Thanks for the suggestion, I hope to cover them in the future.
  
  Reply
Kyujin Chae January 8, 2018 at 2:22 pm #

Thanks for your awesome article.
I am really enjoying
‘Machine Learning Mastery’!!

Reply
- Jason Brownlee January 8, 2018 at 3:54 pm #
  
  Thanks!
  
  Reply
Luis Galdo January 9, 2018 at 8:41 am #

Hello Jason!

This is an awesome article!
I am writing a report for a subject in university and I have used your code during my implementation, would it be possible to cite this post in bibtex?

Thank you!

Reply
- Jason Brownlee January 9, 2018 at 3:17 pm #
  
  Sure, you can cite the webpage directly.
  
  Reply
Nikhil Gupta January 25, 2018 at 8:05 pm #

My question is regarding predict. I used to get decimals in the prediction array. Suddenly, I started seeing only Integers (0 or 1) in the run. Any idea what could be causing the change?

predictions = model.predict(X2)

predictions
Out[3]:
array([[ 0.],
[ 0.],
[ 0.],
…,
[ 0.],
[ 0.],
[ 0.]], dtype=float32)

Reply
- Jason Brownlee January 26, 2018 at 5:39 am #
  
  Perhaps check the activation function on the output layer?
  
  Reply
  - Nikhil Gupta January 28, 2018 at 3:30 am #
    
    # create model. Fully connected layers are defined using the Dense class
    model = Sequential()
    model.add(Dense(12, input_dim=len(x_columns), activation=’relu’)) #12 neurons, 8 inputs
    model.add(Dense(8, activation=’relu’)) #Hidden layer with 8 neurons
    model.add(Dense(1, activation=’sigmoid’)) #1 output layer. Sigmoid give 0/1
    
    Reply
joe January 27, 2018 at 1:25 am #

================== RESTART: /Users/apple/Documents/deep1.py ==================
Using TensorFlow backend.

Traceback (most recent call last):
File “/Users/apple/Documents/deep1.py”, line 20, in
model.compile(loss=’binary_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/models.py”, line 826, in compile
**kwargs)
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/engine/training.py”, line 827, in compile
sample_weight, mask)
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/engine/training.py”, line 426, in weighted
score_array = fn(y_true, y_pred)
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/losses.py”, line 77, in binary_crossentropy
return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py”, line 3069, in binary_crossentropy
logits=output)
TypeError: sigmoid_cross_entropy_with_logits() got an unexpected keyword argument ‘labels’
>>>

Reply
- Jason Brownlee January 27, 2018 at 5:58 am #
  
  I have not seen this error, sorry. Perhaps try posting to Stack Overflow?
  
  Reply
Atefeh January 27, 2018 at 4:04 pm #

Hello Mr.Janson
After installing Anaconda and deep learning libraries, I read your Free mini-course and I tried to write the code about the handwritten digit recognition.
I wrote the codes in jupyter notebook, am I right?
if not where should I write the codes ?
and if I want to use another dataset (my own data set) how can I use in the code?
and how can I see the result, for example the accuracy percentage?
I am really sorry for my simple questions! I have written a lot of code in “Matlab” but I am really a beginner in Python and Anaconda, my teacher force me to use Python and keras for my project.

thank you very much for your help

Reply
- Jason Brownlee January 28, 2018 at 8:22 am #
  
  A notebook is fine.
  
  You can write code in a Python script and then run the script directly.
  
  Reply
Atefeh January 28, 2018 at 12:01 am #

Hello Mr.Janson again
I wrote the code below from your Free mini course for hand written digit recognition, but after running I faced the syntaxerror:

from keras.datasets import mnist
…
(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(X_train.shape[0], 1, 28, 28)
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28)

from keras.utils import np_utils
…
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)

model = Sequential()
model.add(Conv2D(32, (3, 3), padding=’valid’, input_shape=(1, 28, 28),
activation=’relu’))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation=’relu’))
model.add(Dense(num_classes, activation=’softmax’))
model.compile(loss=’categorical_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])

File “”, line 2
2 model.add(Conv2D(32, (3, 3), padding=’valid’, input_shape=(1, 28, 28),
^
SyntaxError: invalid syntax

would you please help me?!

thanks a lot

Reply
- Jason Brownlee January 28, 2018 at 8:25 am #
  
  This:
  
  model.add(Conv2D(32, (3, 3), padding=’valid’, input_shape=(1, 28, 28),
  activation=’relu’))
  
  should be:
  
  model.add(Conv2D(32, (3, 3), padding=’valid’, input_shape=(1, 28, 28), activation=’relu’))
  
  Reply
Lila January 29, 2018 at 8:04 am #

Thank you for the awesome blog and explanations. I have just one question: how can we get the values predicted by the model? Many thanks.

Reply
- Jason Brownlee January 29, 2018 at 8:21 am #
  
  As follows:
  
  X = ...
  yhat = model.predict(X)
  
  Reply
  - Lila January 30, 2018 at 1:22 am #
    
    Thank you for your prompt answer. I am trying to learn how keras models work and I used. I trained the model like this:
    
    model.compile(loss=’mean_squared_error’, optimizer=’sgd’, metrics=[‘MSE’])
    
    As output I have those lines
    
    Epoch 10000/10000
    
    10/200 [>………………………..] – ETA: 0s – loss: 0.2489 – mean_squared_error: 0.2489
    200/200 [==============================] – 0s 56us/step – loss: 0.2652 – mean_squared_error: 0.2652
    
    My question is: what is the difference between the two lines (MSE values)?
    
    Reply
    - Jason Brownlee January 30, 2018 at 9:53 am #
      
      They should be the same thing. One may be calculated at the end of each batch, and one at the end of each epoch.
      
      Reply
Atefeh January 30, 2018 at 4:28 am #

hello

after running again it show an error:

NameError Traceback (most recent call last)
in ()
—-> 1 model = Sequential()
2 model.add(Conv2D(32, (3, 3), padding=’valid’, input_shape=(1, 28, 28), activation=’relu’))
3 model.add(MaxPooling2D(pool_size=(2, 2)))
4 model.add(Flatten())
5 model.add(Dense(128, activation=’relu’))

NameError: name ‘Sequential’ is not defined

Reply
- Jason Brownlee January 30, 2018 at 9:55 am #
  
  You are missing the imports. Ensure you copy all code from the complete example at the end.
  
  Reply
Atefeh January 31, 2018 at 1:02 am #

from keras.datasets import mnist
…
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28)
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28)
from keras.utils import np_utils
…
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)

model = Sequential()
2 model.add(Conv2D(32, (3, 3), padding=’valid’, input_shape=(1, 28, 28), activation=’relu’))
3 model.add(MaxPooling2D(pool_size=(2, 2)))
4 model.add(Flatten())
5 model.add(Dense(128, activation=’relu’))
6 model.add(Dense(num_classes, activation=’softmax’))
7 model.compile(loss=’categorical_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])

Reply
Atefeh February 2, 2018 at 5:01 am #

hello
please tell me how can I find out that tensorflow and keras are correctly installed on my system.
maybe the problem is that, because no code runs in my jupyter. and no “import” acts well(for example import pandas)
thank you

Reply
- Jason Brownlee February 2, 2018 at 8:23 am #
  
  See this post:
  https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
  
  Reply
Dan February 3, 2018 at 12:29 am #

Hi. I’m totally new to machine learning and I’m trying to wrap my head around it.
I have a problem I can’t quite solve yet. And don’t know where to start actually.
I have a dictionary with a few key:value pairs. The key is a random 4-digit number from 0000 to 9999. The value for each key is set as follows: if a digit in the number is 0, 6 or 9, its weight is 1; if a digit is 8, its weight is 2; any other digit has a weight of 0. The weights are then summed to give the value for the key (example: { ‘0000’: 4, ‘1234’: 0, ‘1692’: 2, ‘8800’: 6} and so on).

Now I’m trying to build a model that will predict the correct value of a given key (i.e. if I give it 2222 the answer is 0, if I give it 9011 it’s 2). What I did first was create a CSV file with 5 columns: the first four are the key from my dictionary split into single digits, and the fifth column is the value for each key. Next I created a dataset and defined a model (like this tutorial but with input_dim=4). Now when I train the model the accuracy won’t go higher than ~30%. Also, your model is based on binary output, whereas mine should output an integer from 0 to 8. Where do I go from here?

Thank you for all your effort in advance! 🙂

Reply
- Jason Brownlee February 3, 2018 at 8:42 am #
  
  This post might help you nail down your problem as a predictive modeling problem:
  https://machinelearningmastery.com/how-to-define-your-machine-learning-problem/
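
The labelling rule described in the question can be written down directly, which also makes it easy to generate training data. The sketch below is plain Python; the helper names are hypothetical, and treating the target as one of 9 classes (0 to 8) with a one-hot encoding is only one possible framing, assumed here for illustration:

```python
# Digit weights as described in the question: 0, 6 and 9 count 1, 8 counts 2.
DIGIT_WEIGHTS = {"0": 1, "6": 1, "9": 1, "8": 2}

def key_value(key):
    """Sum the per-digit weights of a 4-digit key given as a string."""
    return sum(DIGIT_WEIGHTS.get(digit, 0) for digit in key)

def one_hot(value, num_classes=9):
    """Encode an integer target in 0..8 as a one-hot list."""
    return [1 if i == value else 0 for i in range(num_classes)]

examples = {key: key_value(key) for key in ["0000", "1234", "1692", "8800", "9011"]}
print(examples)
print(one_hot(examples["8800"]))
```

With targets encoded this way, the output layer would need 9 neurons rather than the single sigmoid neuron used for the binary problem in the tutorial.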
  
  Reply
Alex February 5, 2018 at 5:22 am #

There is one thing I just don’t get.

An example of row data is 6,148,72,35,0,33.6,0.627,50,1

I guess the number at the end is whether the person has diabetes (1) or does not (0), but what I don’t understand is how I know the ‘prediction’ is about that 0 or 1. There are a lot of other variables in the data, and I don’t see ‘diabetes’ being a label for any of them.

So, how do I know, or how do I set, which variable (number) I want to predict?

Reply
- Jason Brownlee February 5, 2018 at 7:49 am #
  
  You interpret the prediction in your application or usage.
  
  The model does not care what the inputs and outputs are, it does the best it can. It does not intrinsically care about diabetes.
  
  Reply
blaisexen February 6, 2018 at 9:14 am #

hi,
@Jason Brownlee, Master of Keras Python.

I’m developing a face recognition test. I successfully used Rprop, which was good for static images or face pictures, and I also have SVM test results.

In your experience, do you think Keras is better or more powerful than Rprop?

I ask because I was also thinking of using Keras (1:1) on the final result of Rprop (1:many).

Or which do you think is the better system?

Thanks in advance for the advice.

I also heard that one of the leading commercial face recognizers uses PNN (using libopenblas), so I am really unsure which one to choose for my final thesis and application.

Reply
- Jason Brownlee February 6, 2018 at 9:29 am #
  
  What do you mean by rprop? I believe it is just an optimization algorithm, whereas Keras is a deep learning library.
  https://en.wikipedia.org/wiki/Rprop
  
  Reply
  - blaisexen February 17, 2018 at 10:46 am #
    
    Ok, I think I understand you.
    
    I used Accord.Net
    Rprop testing was good
    MLR testing was good
    SVM testing was good
    RBM testing was good
    
    I used classification for face images
    They are only good for static face pictures 100×100
    
    but if I used another picture from them,
    these 4 testing I have failed.
    
    Do you think if I used Keras in image face recognition will have a good result or good prediction?
    
    because if Keras will have a good result then I’ll have to used cesarsouza keras c#
    https://github.com/cesarsouza/keras-sharp
    
    thanks for the reply.
    
    Reply
    - Jason Brownlee February 18, 2018 at 6:45 am #
      
      Try it and see.
      
      Reply
CHIRANJEEVI February 8, 2018 at 8:52 pm #

What is the difference between the accuracy we get when we fit the model and the accuracy_score() of sklearn.metrics , what they mean exactly ?

Reply
- Jason Brownlee February 9, 2018 at 9:05 am #
  
  Accuracy is a summary of the number of predictions that were made correctly out of all predictions that were made.
  
  It is used as an estimate of model skill on new out of sample data.
  
  Reply
Shinan February 8, 2018 at 9:09 pm #

Can weather forecasting be done using an RNN?

Reply
- Jason Brownlee February 9, 2018 at 9:06 am #
  
  No. Weather forecasting is done with ensembles of physics simulations on very large computers.
  
  Reply
CHIRANJEEVI February 9, 2018 at 3:56 pm #

We aren’t predicting anything during the fit (it’s just training, like mapping F(x)=Y),
but we still get an acc value. What is this acc?

Epoch 1/150
768/768 [==============================] – 1s 1ms/step – loss: 0.6771 – acc: 0.6510

Thank you in advance

Reply
- Jason Brownlee February 10, 2018 at 8:50 am #
  
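  Predictions are made in the forward pass on each batch during training, before the error is back-propagated. The reported accuracy is calculated from those predictions.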
  
  Reply
lcy1031 February 12, 2018 at 1:00 pm #

Hi Jason,

Many thanks for a great tutorial. I have a couple of questions, as follows.
1) How can I get the score of a prediction?
2) How can I output the result of a predict run to a file in which the output is listed vertically?

I see you everywhere to answer questions and help people. Your time and patience were greatly appreciated!

Charles

Reply
- Jason Brownlee February 12, 2018 at 2:50 pm #
  
  You can make predictions with a model as follows:
  
  yhat = model.predict(X)
  
  You can then save the numpy array result to file.
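
A small sketch of saving the predictions with one value per line, using NumPy. The `yhat` array here is a made-up stand-in for the output of `model.predict(X)`, and the file name and location are arbitrary:

```python
import os
import tempfile
import numpy as np

# Stand-in for the array returned by model.predict(X): one probability per row.
yhat = np.array([[0.12], [0.87], [0.45]])

# Write one prediction per line; the file name is arbitrary.
path = os.path.join(tempfile.gettempdir(), "predictions.csv")
np.savetxt(path, yhat, fmt="%.4f")

# Read the file back to confirm the layout.
loaded = np.loadtxt(path)
print(loaded)
```

Because `yhat` has one column, each row is written on its own line, which gives the vertical listing asked about.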
  
  Reply
Callum February 21, 2018 at 10:11 am #

Hi I’ve just finished this tutorial but the only problem is what are we actually finding in the results as in what do accuracy and loss mean and what we are actually finding out.

I’m really new to the whole neural networks thing and don’t really understand them yet, I’d be very grateful if you’re able to reply

Many Thanks

Callum

Reply
- Jason Brownlee February 22, 2018 at 11:12 am #
  
  Accuracy is the model skill in terms of the number of correct predictions divided by the total number of predictions.
  
  Loss is the function that the network is optimising: something differentiable and relatable to the metric of interest for the model, in this case the logarithmic loss used for classification.
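
Both quantities can be computed by hand on a few invented values, which may help make the definitions concrete. The labels and probabilities below are hypothetical, and the formula is the standard binary cross-entropy (log loss) written out with NumPy:

```python
import numpy as np

# Hypothetical labels and predicted probabilities for five samples.
y_true = np.array([1, 0, 1, 1, 0])
y_prob = np.array([0.9, 0.2, 0.4, 0.8, 0.1])

# Accuracy: fraction of thresholded predictions that match the labels.
y_pred = (y_prob > 0.5).astype(int)
accuracy = np.mean(y_pred == y_true)

# Binary cross-entropy (log loss): penalizes confident wrong probabilities.
log_loss = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

print(accuracy, log_loss)
```

Accuracy only changes when a prediction crosses the 0.5 threshold, whereas the loss moves smoothly with every probability, which is why the loss is what gets optimised.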
  
  Reply
Pedro Wenner February 23, 2018 at 1:27 am #

Hi Jason,

First of all congratulations for your awesome work, I finally got the hang of ML (hopefully, haha).
So, testing some changes in the number of neurons and batch size/epochs, I achieved 99.87% of accuracy.

The parameters I used were:

# create model
model = Sequential()
model.add(Dense(240, input_dim=8, init='uniform', activation='relu'))
model.add(Dense(160, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X, Y, epochs=1500, batch_size=100, verbose=2)

And when I run it, I always get 99.87% accuracy, which I think is a good thing, right? Please tell me if I did something wrong or if this is a false positive.

Thank you in advance and sorry for the bad English 😉

Reply
- Jason Brownlee February 23, 2018 at 12:00 pm #
  
  That accuracy is great. There will always be some error.
  
  Reply
Shiny March 2, 2018 at 12:56 am #

The above example is very good, sir. I want to predict price changes of electronics for an online shopping project. Can you give me any suggestions for my project? If you have an example of price prediction using a neural network, please send a link, sir.

Reply
- Jason Brownlee March 2, 2018 at 5:33 am #
  
  I would recommend following this process:
  https://machinelearningmastery.com/start-here/#process
  
  Reply
awaludin March 6, 2018 at 12:38 am #

Hi, very helpful example. But I still don’t understand why you load
X = dataset[:,0:8]
Y = dataset[:,8]
If I do
X = dataset[:,0:7] it won’t work

Reply
- Jason Brownlee March 6, 2018 at 6:16 am #
  
  You can learn more about indexing and slicing numpy arrays here:
  https://machinelearningmastery.com/index-slice-reshape-numpy-arrays-machine-learning-python/
  
  Reply
Jeong Kim March 8, 2018 at 1:48 pm #

Thank you for the tutorial.
Perhaps, someone already told you this. The data set is no longer available.

Reply
- Jason Brownlee March 8, 2018 at 2:55 pm #
  
  Thanks for the note, I’ll fix that up ASAP.
  
  Reply
Wesley Campbell March 9, 2018 at 1:24 am #

Thanks very much for the concise example! As an “interested amateur” with more experience coding for scientific data manipulation than for software development, I much appreciate a simple, high-level explanation like this one. I sometimes find that documentation pages can be a bit low-level for my liking, even with coding experience in multiple languages. This article was all I needed to get started, and was much more helpful than other “official tutorials.”

Reply
- Jason Brownlee March 9, 2018 at 6:24 am #
  
  Thanks, I’m glad to hear that Wesley.
  
  Reply
Trung March 10, 2018 at 12:55 am #

Thank you for your tutorial, but the data set is not accessible. Could you please fix it.

Reply
- Jason Brownlee March 10, 2018 at 6:33 am #
  
  Thanks, I’ll fix it.
  
  Reply
atefeh March 16, 2018 at 10:11 pm #

hello

I have found some code for converting my image data to MNIST format, but I get the error below.
Would you please help me?

import os
from PIL import Image
from array import *
from random import shuffle

# Load from and save to
Names = [['./training-images','train'], ['./test-images','test']]

for name in Names:

    data_image = array('B')
    data_label = array('B')

    FileList = []
    for dirname in os.listdir(name[0])[1:]: # [1:] Excludes .DS_Store from Mac OS
        path = os.path.join(name[0],dirname)
        for filename in os.listdir(path):
            if filename.endswith(".png"):
                FileList.append(os.path.join(name[0],dirname,filename))

    shuffle(FileList) # Useful for further segmenting the validation set

    for filename in FileList:

        label = int(filename.split('/')[2])

        Im = Image.open(filename)

        pixel = Im.load()

        width, height = Im.size

        for x in range(0,width):
            for y in range(0,height):
                data_image.append(pixel[y,x])

        data_label.append(label) # labels start (one unsigned byte each)

    hexval = "{0:#0{1}x}".format(len(FileList),6) # number of files in HEX

    # header for label array

    header = array('B')
    header.extend([0,0,8,1,0,0])
    header.append(int('0x'+hexval[2:][:2],16))
    header.append(int('0x'+hexval[2:][2:],16))

    data_label = header + data_label

    # additional header for images array

    if max([width,height]) <= 256:
        header.extend([0,0,0,width,0,0,0,height])
    else:
        raise ValueError('Image exceeds maximum size: 256x256 pixels')

    header[3] = 3 # Changing MSB for image data (0x00000803)

    data_image = header + data_image

    output_file = open(name[1]+'-images-idx3-ubyte', 'wb')
    data_image.tofile(output_file)
    output_file.close()

    output_file = open(name[1]+'-labels-idx1-ubyte', 'wb')
    data_label.tofile(output_file)
    output_file.close()

# gzip resulting files

for name in Names:
    os.system('gzip '+name[1]+'-images-idx3-ubyte')
    os.system('gzip '+name[1]+'-labels-idx1-ubyte')

FileNotFoundError Traceback (most recent call last)
in ()
13
14 FileList = []
---> 15 for dirname in os.listdir(name[0])[1:]: # [1:] Excludes .DS_Store from Mac OS
16 path = os.path.join(name[0],dirname)
17 for filename in os.listdir(path):

FileNotFoundError: [WinError 3] The system cannot find the path specified: './training-images'

Reply
- Jason Brownlee March 17, 2018 at 8:37 am #
  
  Looks like the code cannot find your images. Perhaps change the path in the code?
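One way to confirm this is to check the folder before listing it and to print the absolute path that Python actually resolved. This is only a sketch: the helper name list_image_dirs is hypothetical, and ./training-images is the relative path taken from the code in the question.

```python
import os

def list_image_dirs(root):
    """Return the sub-folders of root, or fail with a message showing the resolved path."""
    if not os.path.isdir(root):
        raise FileNotFoundError(
            "Image folder not found: %s (resolved to %s)" % (root, os.path.abspath(root))
        )
    # Keep only directories, which also skips stray files such as .DS_Store.
    return sorted(d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d)))

try:
    print(list_image_dirs("./training-images"))
except FileNotFoundError as err:
    print(err)
```

A relative path such as ./training-images is resolved against the current working directory, so running the script or notebook from a different folder is a common cause of this error.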
  
  Reply
Sayan March 17, 2018 at 4:57 pm #

Thanks a lot sir, this was a very good and intuitive tutorial

Reply
- Jason Brownlee March 18, 2018 at 6:01 am #
  
  Thanks, I’m glad it helped.
  
  Reply
Nikhil Gupta March 19, 2018 at 11:12 pm #

I got a prediction model running successfully for fraud detection. My dataset has over 50 million rows and is growing. I am seeing a peculiar issue.
When the loaded data is 10 million rows or fewer, my predictions are OK.
As soon as I load 11 million rows, my predictions saturate at a particular value (say 0.48) and keep repeating. That is, all predictions will be 0.48, irrespective of the input.

I have tried multiple combinations of the dense model.
# create model
model = Sequential()
model.add(Dense(32, input_dim=4, activation='tanh'))
model.add(Dense(28, activation='tanh'))
model.add(Dense(24, activation='tanh'))
model.add(Dense(20, activation='tanh'))
model.add(Dense(16, activation='tanh'))
model.add(Dense(12, activation='tanh'))
model.add(Dense(8, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))

Reply
- Jason Brownlee March 20, 2018 at 6:21 am #
  
  Perhaps check whether you need to train on all data, often a small sample is sufficient.
  
  Reply
  - Nikhil Gupta March 22, 2018 at 2:45 am #
    
    Oh. I believe that the machine learning accuracy will improve as we get more data over time.
    
    Reply
Chandra Sutrisno Tjhong March 28, 2018 at 4:43 pm #

Hi,

How do you decide on the number of hidden layers and neurons per layer?

Reply
- Jason Brownlee March 29, 2018 at 6:30 am #
  
  There are no good heuristics. Trial and error is a good approach: discover what works best for your specific data.
  
  Reply
Aravind March 30, 2018 at 12:12 am #

I executed the code and got the output, but how do I use this prediction in an application?

Reply
- Jason Brownlee March 30, 2018 at 6:39 am #
  
  Depends on the application.
  
  Reply
Sabarish March 30, 2018 at 12:16 am #

What do the values 1.0 and 0.0 signify?

Reply
- Jason Brownlee March 30, 2018 at 6:39 am #
  
  In what context?
  
  Reply
Anand April 1, 2018 at 3:51 pm #

If the number of inputs is 8, then why did you use 12 neurons in the input layer? Moreover, why is an activation function used in the input layer?

Reply
- Jason Brownlee April 2, 2018 at 5:19 am #
  
  The number of neurons in the first hidden layer can be different to the number of neurons in the input layer (e.g. number of input features). They are only loosely related.
  
  Reply
Lia April 1, 2018 at 11:49 pm #

Hello Sir,
Does the neural network standardize the independent variable values itself, or should we feed it standardized values in the fitting and predicting stages? Thanks

Reply
- Jason Brownlee April 2, 2018 at 5:23 am #
  
  Try both and see what works best for your specific predictive modeling problem.
  
  Reply
  - Mark Littlewood October 27, 2021 at 9:18 am #
    
    Hi, I was playing with a 2-input data set, and when I had the first layer set to Dense(4) it only output NaN for the loss. However, when I reduced this to 3, I got meaningful loss output. Is there something about the maximum Dense value in relation to the inputs that causes this?
    
    Reply
    - Adrian Tam October 27, 2021 at 12:56 pm #
      
      There should not be. It is more likely due to how the layers are initialized than to the number of neurons in the Dense layer.
      
      Reply
tareknahool April 4, 2018 at 5:17 am #

You are always fantastic, and it’s a great lesson. But, frankly, I don’t know the meaning of
“\n%s: %.2f%%” % or why you used the index 1 in that code: (model.metrics_names[1], scores[1]*100)

Reply
- Jason Brownlee April 4, 2018 at 6:19 am #
  
  This is Python string formatting:
  https://pyformat.info/
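As a quick sketch of how that format string expands: the metric names and scores below are made-up stand-ins for model.metrics_names and the list returned by model.evaluate(), where index 0 holds the loss and index 1 holds the accuracy.

```python
# Made-up stand-ins for model.metrics_names and the scores from model.evaluate().
metrics_names = ["loss", "accuracy"]
scores = [0.4827, 0.7826]

# "\n"   starts a new line
# "%s"   is replaced by a string (the metric name)
# "%.2f" is replaced by a number with two decimal places
# "%%"   prints a literal percent sign
line = "\n%s: %.2f%%" % (metrics_names[1], scores[1] * 100)
print(line)
```

Index 1 is used because index 0 is the loss; multiplying by 100 turns the fraction into a percentage.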
  
  Reply
Abhilash Menon April 5, 2018 at 6:27 am #

Dr. Brownlee,

When we predict, is it possible to have the predictions for each row in the test data set right next to it in the same row. I thought of printing predictions and then copying it in excel but I am not sure if Keras preserves order. Could you please help me out with this issue? Thanks so much for all your help!

Reply
- Jason Brownlee April 5, 2018 at 3:05 pm #
  
  Yes, the order of predictions matches the order of input values.
  
  Does that help?
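Because the order is preserved, the predictions can be placed beside their inputs as an extra column. A minimal sketch with NumPy follows; the input rows and predictions are made up, and the file name is hypothetical.

```python
import numpy as np

# Made-up input rows (two features each) and matching predictions.
X = np.array([[6.0, 148.0], [1.0, 85.0], [8.0, 183.0]])
yhat = np.array([[0.81], [0.07], [0.93]])  # stand-in for model.predict(X)

# Row i of yhat belongs to row i of X, so stacking side by side keeps them aligned.
table = np.column_stack([X, yhat])

# Save as CSV, which opens directly in a spreadsheet.
np.savetxt("inputs_with_predictions.csv", table, delimiter=",", fmt="%.2f")
print(table)
```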
  
  Reply
Andrea Grandi April 9, 2018 at 6:37 am #

Is Deep Learning some kind of “black magic” 🙂 ?

I had previously used scikit-learn and machine learning on the same dataset, applying all the techniques I had learned both here and from books, and got 76% accuracy.

I tried this Keras tutorial, using TensorFlow as the backend, and I’m getting 80% accuracy on the first try O_o

Reply
- Jason Brownlee April 10, 2018 at 6:08 am #
  
  No, not magic, just different.
  
  Well done though!
  
  Reply
Manny Corrao April 11, 2018 at 8:30 am #

Can you tell us the column names? I think that is important because it helps us understand what the network is evaluating and learning about.

Thanks,

Manny

Reply
- Jason Brownlee April 11, 2018 at 4:11 pm #
  
  Yes, they are listed here:
  https://github.com/jbrownlee/Datasets/blob/master/pima-indians-diabetes.names
  
  Reply
rachit April 11, 2018 at 7:13 pm #

While Executing versions.py

i am getting this error

Traceback (most recent call last):
File “versions.py”, line 2, in
import scipy
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\scipy\__init__.py”, line 61, in
from numpy import show_config as show_numpy_config
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\numpy\__init__.py”, line 142, in
from . import add_newdocs
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\numpy\add_newdocs.py”, line 13, in
from numpy.lib import add_newdoc
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\numpy\lib\__init__.py”, line 8, in
from .type_check import *
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\numpy\lib\type_check.py”, line 11, in
import numpy.core.numeric as _nx
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\numpy\core\__init__.py”, line 74, in
from numpy.testing import _numpy_tester
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\numpy\testing\__init__.py”, line 12, in
from . import decorators as dec
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\numpy\testing\decorators.py”, line 6, in
from .nose_tools.decorators import *
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\numpy\testing\nose_tools\decorators.py”, line 20, in
from .utils import SkipTest, assert_warns
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\numpy\testing\nose_tools\utils.py”, line 15, in
from tempfile import mkdtemp, mkstemp
File “C:\Users\ATIT GARG\Anaconda3\lib\tempfile.py”, line 45, in
from random import Random as _Random
File “C:\Users\ATIT GARG\random.py”, line 7, in
from keras.models import Sequential
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\keras\__init__.py”, line 3, in
from . import utils
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\keras\utils\__init__.py”, line 4, in
from . import data_utils
File “C:\Users\ATIT GARG\Anaconda3\lib\site-packages\keras\utils\data_utils.py”, line 23, in
from six.moves.urllib.error import HTTPError
ImportError: cannot import name ‘HTTPError’

Reply
- Jason Brownlee April 12, 2018 at 8:35 am #
  
  Perhaps you need to update your environment?
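It may also be worth noting that the traceback passes through a file called random.py in the user's home folder (C:\Users\ATIT GARG\random.py) at the point where tempfile.py imports random. That suggests a local script with the same name as the standard library module is being imported instead. A small sketch to check which file Python would load for a module (the helper name module_origin is hypothetical):

```python
import importlib.util

def module_origin(name):
    """Return the file path Python would load for the module, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

# If this prints a path inside your own project or home folder rather than
# the Python installation, a local file is shadowing the standard library.
print(module_origin("random"))
```

If that is the case here, renaming the local random.py (and deleting any random.pyc beside it) would be the thing to try.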
  
  Reply
Gray April 14, 2018 at 4:25 am #

Jason – very impressive work! Even more impressive is your detailed answer to every question. I went through them all and got a lot of useful information. Great job!

Reply
- Jason Brownlee April 14, 2018 at 6:50 am #
  
  Thanks Gray!
  
  Reply
octdes April 14, 2018 at 2:39 pm #

Hello Jason,
Thanks for the good tutorial!
How would you name/describe the structure of this neural network?
The point is that I find it strange that you can have a different number of inputs and of neurons in the input layer. In most of the neural network diagrams I have seen, each input is directly connected to one neuron of the input layer. I have never seen a neural network diagram where the number of inputs is different from the number of neurons in the input layer.
Do you have a counterexample, or is there something I have misunderstood?
Thank you for your work and sharing your knowledge 🙂

Reply
- Jason Brownlee April 15, 2018 at 6:24 am #
  
  The type of neural network in this post is a multi-layer perceptron or an MLP for short.
  
  The first “layer” in the code actually defines both the input layer and the first hidden layer at the same time.
  
  The number of inputs must match the number of columns in the input data. The number of neurons in the first hidden layer can be anything you want.
  
  Does that help?
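To make the shapes concrete, here is a rough NumPy sketch of the same 8-input, 12-8-1 network as a chain of matrix multiplications. The random weights are placeholders and not what Keras would learn; the point is only that the first weight matrix has 8 rows (one per input column) and 12 columns (one per neuron in the first hidden layer).

```python
import numpy as np

rng = np.random.default_rng(0)

# 8 input features, then layers of 12, 8 and 1 neurons.
layer_sizes = [8, 12, 8, 1]

# One weight matrix and one bias vector per layer of neurons.
weights = [rng.normal(size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def forward(x):
    """Forward pass: relu on the hidden layers, sigmoid on the output layer."""
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        if i < len(weights) - 1:
            a = np.maximum(z, 0.0)
        else:
            a = 1.0 / (1.0 + np.exp(-z))
    return a

batch = rng.normal(size=(5, 8))  # 5 rows, 8 columns
out = forward(batch)
n_params = sum(W.size + b.size for W, b in zip(weights, biases))

print([W.shape for W in weights])
print(out.shape, n_params)
```

The input "layer" holds no neurons of its own in this picture; it is just the 8 columns of data feeding the first weight matrix.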
  
  Reply
Ashley April 16, 2018 at 7:29 am #

Thank you VERY much for this tutorial, Jason! It is the best I have found on the internet. As a political scientist pursuing complex outcomes like this one, I was looking for models that allow for more complicated relationships. Your code and post are so clearly articulated; I was able to adapt it for my purposes more easily than I thought would be possible. One possible extension of your work, and possibly this tutorial, would be to map the layers and nodes onto a theory of the data generating process.

Reply
- Jason Brownlee April 16, 2018 at 2:54 pm #
  
  Thanks Ashley, I’m glad it helped.
  
  Thanks for the suggestion.
  
  Reply
Eric Miles April 20, 2018 at 1:22 am #

I’m just starting out working through your site – thanks for the great resource! I wanted to point out what I think is a typo: in the code block just before Section 2 “Define Model” I believe we just want X = dataset[:,0:7] so that we don’t include the output variables in our inputs.

Reply
- Jason Brownlee April 20, 2018 at 6:00 am #
  
  No, it is correct Eric.
  
  X will have 8 columns (0-7), the original dataset has 9.
  
  You can learn more about array slicing and ranges in Python here:
  https://machinelearningmastery.com/index-slice-reshape-numpy-arrays-machine-learning-python/
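A quick way to check this is to slice a small array with 9 columns and look at the shapes. The array below is a made-up stand-in for the dataset.

```python
import numpy as np

# Stand-in for the loaded dataset: 3 rows, 9 columns (8 inputs + 1 output).
dataset = np.arange(27).reshape(3, 9)

X = dataset[:, 0:8]   # columns 0 to 7: the end index 8 is excluded
y = dataset[:, 8]     # column 8 only: the output variable
X7 = dataset[:, 0:7]  # columns 0 to 6: this would drop the last input

print(X.shape, y.shape, X7.shape)  # (3, 8) (3,) (3, 7)
```

So 0:8 selects exactly the 8 input columns, while 0:7 selects only 7, which no longer matches input_dim=8 in the model.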
  
  Reply
Rafa April 28, 2018 at 12:50 am #

Great tutorial, finally I have found a good web about deep learning (Y)

Reply
- Jason Brownlee April 28, 2018 at 5:31 am #
  
  Thanks.
  
  Reply
Vivek May 7, 2018 at 8:31 pm #

Great tutorial, thanks for the help. I have a project in which I have to classify CAD images (basically 3-D mechanical images). Can you please give me a road map for how to proceed?
I am new to this and I don’t have any idea.

Reply
- Jason Brownlee May 8, 2018 at 6:12 am #
  
  This is my general roadmap for a predictive modeling problem:
  https://machinelearningmastery.com/start-here/#process
  
  Reply
  - Vivek May 9, 2018 at 10:03 pm #
    
    Thanks a lot sir. This will help me to proceed
    
    Reply
    - Jason Brownlee May 10, 2018 at 6:31 am #
      
      I’m glad to hear that.
      
      Reply
Rahmad ars May 8, 2018 at 1:36 am #

Thanks sir for the tutorial.
Actually, I still have some questions:
1. Is this a backpropagation neural network?
2. How can I initialize the weights with the Nguyen-Widrow method?
3. I have my own dataset, where each sample is a 1×64 matrix. Which is correct: normalizing each column or each row?

Thanks.
I’m the one who asked you on the backpropagation from scratch page.

Reply
- Jason Brownlee May 8, 2018 at 6:16 am #
  
  Yes, it uses backpropagation to update the weights.
  
  Sorry, I don’t know about that initialization method, you can see the supported methods here:
  https://keras.io/initializers/
  
  Try a suite of data preparation schemes to see what works best for your specific dataset and chosen model.
  
  Reply
Hussein May 9, 2018 at 10:33 pm #

Hi Jason,

This is a very nice intro to a daunting but intriguing technology! I wanted to play around with your code and see if I could come up with some simple dataset and see how the predictions will work out – one idea that occurred to me is, can I make a model that predicts what country a telephone number belongs to. So the training dataset looks like a 2 column CSV, phone number and country…that’s basically one feature. Do you think this would be effective at all? What other features could be added here? I’ll still give this a shot, but would appreciate any thoughts/ideas!

Thanks!

Reply
- Jason Brownlee May 10, 2018 at 6:33 am #
  
  The country code would make it too simple a problem – e.g. it can be solved with a look-up table.
  
  Reply
  - Hussein May 10, 2018 at 4:24 pm #
    
    True, I just wanted to see if machine learning could be used to “figure out” the lookup table as opposed to be provided with one by the user, given enough data..not a practical use-case, but as a learning exercise. As it turns out, my data-set of about 700 phone numbers wasn’t effective for this. But again, is this because the problem had too few features, i.e in my case, just one? What if I increased the number of features, say phone number, country code, city the phone number belongs to, maybe even the cellphone company the number is registered to, do you think that would make the training more effective?
    
    Reply
    - Jason Brownlee May 11, 2018 at 6:33 am #
      
      If you can write an if statement or use a look-up table to solve the problem, then it might be a bad fit for machine learning.
      
      This post will help you frame your problem:
      https://machinelearningmastery.com/how-to-define-your-machine-learning-problem/
      
      Reply
      - Hussein May 11, 2018 at 5:15 pm #
        
        Thanks Jason for that resource. I’ll check it out. I also came across this (https://elitedatascience.com/machine-learning-projects-for-beginners) that I’m reading through, for anyone else that’s looking for a small ML problem to solve as a learning experience.
      - Jason Brownlee May 12, 2018 at 6:27 am #
        
        Great.
Frank Lu May 14, 2018 at 7:44 pm #

Great tutorial, very helpful. I have a question: which of the 8 inputs contributes the most? We have 8 factors in the dataset, like pregnancies, glucose, blood pressure and the others. So, which factor is most related to diabetes? How can we find this out through the MLP?
Thanks!

Reply
- Jason Brownlee May 15, 2018 at 7:53 am #
  
  We might not know. This is the difference between descriptive and predictive models.
  
  This is really the issue of model interpretability, I write more about it here:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-interpret-the-predictions-from-my-model
  
  Reply
Paolo May 16, 2018 at 7:59 pm #

Hi Jason,
thanks for your tutorials.

I have a question: do you use Keras with pandas too? In that case, is it better to import the data with numpy anyway? What do you suggest?

Thank you again,
Paolo

Reply
- Jason Brownlee May 17, 2018 at 6:31 am #
  
  Yes, and yes.
  
  Reply
  - Stefan November 10, 2018 at 1:06 am #
    
    How so? I usually see pandas.read_csv() used to read files. Does Keras only accept numpy arrays?
    
    Reply
    - Jason Brownlee November 10, 2018 at 6:07 am #
      
      Correct.
      
      Reply
zohreh May 20, 2018 at 9:14 am #

Thanks for your great tutorial. I have a credit card dataset and I want to do fraud detection on it. It has 312 columns, so should I do dimension reduction before using a DNN? My other question is: is it possible to use a CNN on my dataset as well?

Thank you

Reply
- Jason Brownlee May 21, 2018 at 6:24 am #
  
  Yes, choose the features that best map to the output variable.
  
  A CNN can be used if there is a spatial relationship in the data, such as a sequence of transactions over space or time.
  
  Reply
  - zohreh May 23, 2018 at 6:44 am #
    
    Thanks for your answer, So I think CNN doesn’t make sense for my dataset,
    Do you have any tutorial for active learning?
    thanks for your time.
    
    Reply
    - Jason Brownlee May 23, 2018 at 2:37 pm #
      
      I don’t know if it is appropriate, I was trying to provide enough information for you to make that call.
      
      I hope to cover active learning in the future.
      
      Reply
      - zohreh May 24, 2018 at 3:13 am #
        
        yes I understand, I said according to your provided information, thank you so much for your answers and great tutorials.
Miguel García May 24, 2018 at 11:55 am #

Can you share a tutorial for a first neural network with multilabel support?

Reply
- Jason Brownlee May 24, 2018 at 1:51 pm #
  
  Thanks for the suggestion.
  
  Reply
Sathish May 24, 2018 at 12:57 pm #

how to create convolutional layers and visualize features in keras

Reply
- Jason Brownlee May 24, 2018 at 1:51 pm #
  
  Good question, sorry, I don’t have a worked example.
  
  Reply
Anam May 28, 2018 at 3:52 am #

Dear Jason,
I get the error “ValueError: could not convert string to float: ”. Kindly help me solve the issue. I am using my own dataset, which consists of text, not numbers (unlike the dataset you have used).
Thanks!

Reply
- Jason Brownlee May 28, 2018 at 6:04 am #
  
  This might give you some ideas:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
Anam May 29, 2018 at 7:26 am #

Dear Jason,
I am running your code example from section 6, but I get an error in the following code snippet:

Code Snippet:
dataset = numpy.loadtxt("pima_indians.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]

Error:
ValueError: could not convert string to float: “6

Kindly guide me to solve the issue. Thanks for your precious time.

Reply
- Jason Brownlee May 29, 2018 at 2:49 pm #
  
  I’m sorry to hear that, I have some suggestions here:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
- Gautam Sharma June 19, 2018 at 1:20 am #
  
  Did you find any solution as I am getting the same error?
  
  Reply
moti June 4, 2018 at 3:34 am #

Hi Doctor, in this python code where shall I get the “keras” package?

Reply
- Jason Brownlee June 4, 2018 at 6:34 am #
  
  This tutorial shows you how to install Keras:
  https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
  
  Reply
Ammara Habib June 5, 2018 at 5:13 am #

Hi Jason, thanks for an amazing post. I have a question: can we use a dense layer as the input for text classification (e.g. sentiment classification of movie reviews)? If yes, then how can we convert the text dataset into numeric form for the dense layer?

Reply
- Jason Brownlee June 5, 2018 at 6:47 am #
  
  You can, although it is common to one hot encode the text or use an embedding layer.
  
  I have examples of both on the blog.
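As a rough illustration of the encoding idea, here is a hand-rolled sketch that marks which vocabulary words appear in each review (a binary bag-of-words, that is, the one-hot vectors of the words combined per document). The reviews are made up, and in practice a library tokenizer or an embedding layer would likely replace this.

```python
import numpy as np

# Made-up movie reviews.
reviews = ["great movie", "terrible movie", "great acting great plot"]

# Build a vocabulary: one column per distinct word.
vocab = sorted({word for review in reviews for word in review.split()})
index = {word: i for i, word in enumerate(vocab)}

def encode(text):
    """Return a fixed-length 0/1 vector marking which vocabulary words occur in text."""
    vec = np.zeros(len(vocab))
    for word in text.split():
        if word in index:
            vec[index[word]] = 1.0
    return vec

X = np.array([encode(review) for review in reviews])
print(vocab)
print(X)
```

Each review becomes a numeric row of the same length, which is the shape a Dense layer expects (here input_dim would be len(vocab)).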
  
  Reply
Ammara Habib June 5, 2018 at 9:18 am #

Thanks for your precious time. Sir, do you mean that I first use an embedding layer as the input layer and then use a dense layer as the hidden layer?

Reply
- Jason Brownlee June 5, 2018 at 3:05 pm #
  
  Yes.
  
  Reply
Lisa Xie June 15, 2018 at 1:12 pm #

Hi, thanks for your tutorial. I am wondering how you chose the number of neurons and the activation function for each layer, e.g. 12 neurons for the first layer and 8 for the second.

Reply
- Jason Brownlee June 15, 2018 at 2:50 pm #
  
  I used a little trial and error.
  
  Reply
Marwa June 18, 2018 at 1:25 am #

Hi jason,

I developed two neural networks using Keras, but I get this error:

line 1336, in _do_call
raise type(e)(node_def, op, message)

ResourceExhaustedError: OOM when allocating tensor with shape[7082368,50]
[[Node: training_1/Adam/Variable_14/Assign = Assign[T=DT_FLOAT, _class=[“loc:@training_1/Adam/Variable_14″], use_locking=true, validate_shape=true, _device=”/job:localhost/replica:0/task:0/device:GPU:0”](training_1/Adam/Variable_14, training_1/Adam/zeros_14)]]

Have you an idea?
Thanks.

Reply
- Jason Brownlee June 18, 2018 at 6:42 am #
  
  Sorry, I have not seen this error before. Perhaps try posting/searching on stackoverflow?
  
  Reply
prateek bhadauria June 23, 2018 at 11:38 pm #

Sir, I have a regression dataset that contains an array of 49999 rows and 20 columns. I want to implement a CNN on this dataset.

I have written the code below as I understand it. Kindly give me suggestions to correct it. I am mainly stuck on setting the dimensions of the dense layers.

from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import tensorflow as tf
from matplotlib import pyplot
from sklearn.datasets import make_regression
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.preprocessing import StandardScaler
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD

seed = 7
np.random.seed(seed)
from scipy.io import loadmat
dataset = loadmat('matlab2.mat')
Bx=basantix[:, 50001:99999]
Bx=np.transpose(Bx)
Fx=fx[:, 50001:99999]
Fx=np.transpose(Fx)

from sklearn.cross_validation import train_test_split
Bx_train, Bx_test, Fx_train, Fx_test = train_test_split(Bx, Fx, test_size=0.2, random_state=0)

scaler = StandardScaler() # Class is create as Scaler
scaler.fit(Bx_train) # Then object is created or to fit the data into it
Bx_train = scaler.transform(Bx_train)
Bx_test = scaler.transform(Bx_test)

model = Sequential()
def base_model():
    keras.layers.Dense(Dense(49999, input_shape=(20,), activation='relu'))
    model.add(Dense(20))
    model.add(Dense(49998, init='normal', activation='relu'))
    model.add(Dense(49998, init='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

scale = StandardScaler()
Bx = scale.fit_transform(Bx)
Bx = scale.fit_transform(Bx)

clf = KerasRegressor(build_fn=base_model, nb_epoch=100, batch_size=5,verbose=0)

clf.fit(Bx,Fx)
res = clf.predict(Bx)

## line below throws an error
clf.score(Fx,res)

Reply
- Jason Brownlee June 24, 2018 at 7:33 am #
  
  Sorry, I cannot debug your code for you. Perhaps post your code and error to stackoverflow?
  
  Reply
Madhav Prakash June 24, 2018 at 3:01 am #

Hi Jason,
Looking at the dataset, I found that there are many attributes, each with different units. Why haven’t you rescaled/normalised the data, and how did you still manage to get an accuracy of 75%?

Reply
- Jason Brownlee June 24, 2018 at 7:35 am #
  
  Ideally, we should rescale the data.
  
  The relu activation function is more flexible with unscaled data.
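As a minimal sketch of rescaling, each column can be standardized to zero mean and unit variance with NumPy. The rows below are made-up values in different units, and standardization is only one of several possible rescaling schemes.

```python
import numpy as np

# Made-up rows with columns measured in very different units.
X = np.array([
    [6.0, 148.0, 33.6],
    [1.0, 85.0, 26.6],
    [8.0, 183.0, 23.3],
])

# Standardize each column: subtract its mean and divide by its standard deviation.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_scaled = (X - mean) / std

print(X_scaled.mean(axis=0).round(6))
print(X_scaled.std(axis=0).round(6))
```

When there is a separate test set, the mean and standard deviation should be calculated on the training data only and then reused to transform the test data.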
  
  Reply
  - Madhav Prakash June 24, 2018 at 4:23 pm #
    
    Ohkay, thanks.
    Also, I’ve implemented a NN on a database similar to this, where the accuracy varies b/w 70-75%. I’ve tried to increase the accuracy by tuning various parameters and functions (learning rate, no. of layers, neurons per level, earlystopping, activation fn, initialization, optimizer etc…) but it was not a success. My question is when do i come to know that i’ve reached the maximum accuracy possible for my implementation? Do i stay content with the current accuracy?
    
    Reply
    - Jason Brownlee June 25, 2018 at 6:19 am #
      
      When we run out of time or ideas.
      
      I list some more ideas here:
      https://machinelearningmastery.com/machine-learning-performance-improvement-cheat-sheet/
      
      And here:
      https://machinelearningmastery.com/improve-deep-learning-performance/
      
      Reply
Aarron Wilson July 8, 2018 at 8:19 am #

First of all thanks for the tutorial. Also I acknowledge that this network is more for educational purposes. Yet this network can be improved to 83-84% accuracy with standard normalization alone. Also it can hit 93-95% accuracy by using a deeper model.

#Standard normalization
X= StandardScaler().fit_transform(X)

#and a deeper model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

Reply
- Jason Brownlee July 9, 2018 at 6:30 am #
  
  Thanks, yes, normalization is a good idea in general when working with neural nets.
  
  Reply
Alex July 10, 2018 at 3:47 am #

Hi, thank you for this great article

Imagine that in my dataset instead of diabetes being a 0 or 1 I have 3 results, I mean, the data rows are like this

data1, data2, sickness
123, 124, 0
142, 541, 0
156, 418, 1
142, 541, 1
156, 418, 2

So, I need to categorize for 3 values, If I use this same example you gave us how can I determine the output?

Reply
- Jason Brownlee July 10, 2018 at 6:51 am #
  
  The output will be sickness Alex. Perhaps I don’t understand your question?
  
  Reply
  - Alex July 10, 2018 at 7:11 am #
    
    The output will be sickness yes
    
    Reply
Alex July 10, 2018 at 10:17 am #

Sorry for my English, it is not my native language. I will rephrase my question. What I mean is this: I will have a label with more than 2 possible values, where 0 is one sickness, 1 is another and 2 is another.

How can I use the model you showed us to fit the 3 results?

Reply
- Jason Brownlee July 10, 2018 at 2:26 pm #
  
  I see, this is called a multi-class classification problem.
  
  This tutorial will help:
  https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
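In short, the usual changes are to one hot encode the label and to have one output per class instead of a single sigmoid output. Here is a small NumPy sketch of those two pieces, using the 0/1/2 labels from the question; the raw output scores are made up.

```python
import numpy as np

# Integer class labels as in the question: 0, 1 or 2.
labels = np.array([0, 0, 1, 1, 2])
n_classes = 3

# One hot encoding: one column per class, with a 1 in the column of the true class.
one_hot = np.eye(n_classes)[labels]

def softmax(z):
    """Turn raw scores into probabilities that sum to 1 across the classes."""
    z = z - z.max(axis=1, keepdims=True)  # subtract the row maximum for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Made-up raw scores from a 3-neuron output layer for two rows.
raw_scores = np.array([[2.0, 0.5, 0.1], [0.2, 0.1, 3.0]])
probs = softmax(raw_scores)
predicted = probs.argmax(axis=1)  # the class with the highest probability

print(one_hot)
print(predicted)
```

In Keras terms, this usually corresponds to a final Dense layer with 3 neurons and a softmax activation, trained with a categorical cross-entropy loss.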
  
  Reply
adsad July 11, 2018 at 1:06 am #

is it possible to predict the lottery outcome. if so how?

Reply
- Jason Brownlee July 11, 2018 at 5:59 am #
  
  No. I explain more here:
  https://machinelearningmastery.com/faq/single-faq/can-i-use-machine-learning-to-predict-the-lottery
  
  Reply
Tom July 14, 2018 at 2:32 am #

Hi Jason, I run your first example code in this tutorial. but what makes me confused is:

Why is the final training accuracy (0.7656) different from the evaluated score (78.26%) on the same dataset (the training set)? I can’t figure it out. Can you tell me, please? Thanks a lot!

Epoch 150/150
768/768 [==============================] – 0s – loss: 0.4827 – acc: 0.7656
32/768 [>………………………..] – ETA: 0s
acc: 78.26%

Reply
- Jason Brownlee July 14, 2018 at 6:20 am #
  
  One is the performance on the training set, the other on the validation set.
  
  You can learn more about the difference here:
  https://machinelearningmastery.com/difference-test-validation-datasets/
  
  Reply
Tom July 14, 2018 at 9:09 pm #

Thanks for the rapid reply. But I noticed that in your code the training set and validation set are exactly the same dataset. Please check it for confirmation. The code is in the part “6. Tie It All Together”.

# Fit the model
model.fit(X, Y, epochs=150, batch_size=10)
# evaluate the model
scores = model.evaluate(X, Y)

So, my question is still the same: why is the final training accuracy (0.7656) different from the evaluated score (78.26%) on the same dataset?
Thanks!

Reply
- Jason Brownlee July 15, 2018 at 6:14 am #
  
  Perhaps the verbose output is accumulated over the batches in the epoch, while the weights are still being updated, rather than summarizing skill at the end of the training epoch.
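That effect can be reproduced without Keras. The sketch below trains a tiny logistic regression with NumPy for one epoch on synthetic data (everything here is made up for illustration) and compares the accuracy averaged over the batches during training with the accuracy measured once at the end using the final weights.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic binary classification data: 200 rows, 2 features.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
lr = 0.1
batch_size = 10

def predict(data):
    """Sigmoid output of the current model."""
    return 1.0 / (1.0 + np.exp(-(data @ w + b)))

batch_accs = []
for start in range(0, len(X), batch_size):
    xb = X[start:start + batch_size]
    yb = y[start:start + batch_size]
    p = predict(xb)
    # Accuracy on this batch, measured BEFORE the weights are updated.
    batch_accs.append(np.mean((p > 0.5) == yb))
    # Gradient step for the logarithmic loss.
    grad = p - yb
    w -= lr * xb.T @ grad / len(xb)
    b -= lr * grad.mean()

running_acc = float(np.mean(batch_accs))                  # like the progress bar
final_acc = float(np.mean((predict(X) > 0.5) == y))       # like model.evaluate(X, Y)

print("accuracy averaged over batches: %.4f" % running_acc)
print("accuracy with final weights:    %.4f" % final_acc)
```

The first number mixes predictions made with older, less-trained weights, so the two values typically differ even though they are calculated on the same rows.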
  
  Reply
ami July 16, 2018 at 2:01 am #

Hello Jason,
Do you have a tutorial on signal processing using a CNN? I have CSV files of some biomedical signals, such as ECG, and I want to classify normal and abnormal signals using deep learning.

With Regards

Reply
- Jason Brownlee July 16, 2018 at 6:11 am #
  
  Yes, I have a suite of tutorials scheduled on this topic. They should be out soon.
  
  Reply
EL July 16, 2018 at 7:19 pm #

Hi, thank you so much for your tutorial. I am trying to make a neural network that will take a dataset and return whether it is suitable to be analyzed by another program I have. Is it possible to train it on acceptable and unacceptable datasets, then call it on a new dataset and have it return whether that dataset is acceptable? Thank you for your help, I am very new to machine learning.

Reply
- Jason Brownlee July 17, 2018 at 6:14 am #
  
  Try it and see how you go.
  
  Reply
ami July 18, 2018 at 2:37 pm #

Oh, really! Thank you so much. Can you please notify me when the tutorials are out? I am doing a project and I am stuck right now.

With Regards

Reply
Diagrams July 30, 2018 at 2:45 pm #

It would be very very helpful for newcomers if you had a diagram of the network, showing individual nodes and graph edges (and bias nodes and activation functions), and indicating on it which parts were generated by which model.add commands/parameters. Similar to https://zhu45.org/posts/2017/May/25/draw-a-neural-network-through-graphviz/

I’ve tried visualizing it with keras.utils.plot_model and TensorBoard, but neither produces a node-level diagram.

Reply
- Jason Brownlee July 31, 2018 at 5:58 am #
  
  Thanks for the suggestion.
  
  Reply
Aravind July 30, 2018 at 7:57 pm #

Can anyone tell me a simple way to run my Keras ANN (TensorFlow backend) on a GPU? Thanks.

Reply
- Jason Brownlee July 31, 2018 at 6:00 am #
  
  The simplest way I know how:
  https://machinelearningmastery.com/develop-evaluate-large-deep-learning-models-keras-amazon-web-services/
  
  Reply
farli August 6, 2018 at 1:08 pm #

Did you use back propagation here?

Reply
- Jason Brownlee August 6, 2018 at 2:54 pm #
  
  Yes.
  
  Reply
  - farli August 13, 2018 at 9:40 am #
    
    Can you please make a tutorial on convolutional neural nets? That would be really helpful. :)
    
    Reply
    - Jason Brownlee August 13, 2018 at 2:27 pm #
      
      Yes, I have many on the blog already. Try the blog search.
      
      Reply
Karim Gamal August 7, 2018 at 8:52 pm #

I have a problem where I get the result as shown below

Epoch 146/150 – 0s – loss: -1.2037e+03 – acc: 0.0000e+00
Epoch 147/150 – 0s – loss: -1.2037e+03 – acc: 0.0000e+00
Epoch 148/150 – 0s – loss: -1.2037e+03 – acc: 0.0000e+00
Epoch 149/150 – 0s – loss: -1.2037e+03 – acc: 0.0000e+00
Epoch 150/150 – 0s – loss: -1.2037e+03 – acc: 0.0000e+00

In my dataset the output is a value between 0 and 500, not only 0 and 1.
How can I fix this in my code?

Reply
- Jason Brownlee August 8, 2018 at 6:18 am #
  
  Sounds like a regression problem. Change the activation function in the output layer to linear and the loss function to ‘mse’.
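
One way to see why the reported loss can go negative is to plug a target outside the 0 to 1 range into the binary cross-entropy formula. This is a sketch of the textbook formula for a single sample, not Keras's exact implementation, and the prediction value 0.999 is a hypothetical choice:

```python
import math


def binary_crossentropy(y_true, y_pred):
    # Textbook binary cross-entropy for one sample.
    return -(y_true * math.log(y_pred)
             + (1.0 - y_true) * math.log(1.0 - y_pred))


p = 0.999  # a sigmoid output close to 1

valid_loss = binary_crossentropy(1.0, p)      # label of 1: small positive loss
invalid_loss = binary_crossentropy(500.0, p)  # target of 500: negative loss

print(valid_loss)
print(invalid_loss)
```

With a target of 500 the term (1 - y_true) is negative, so pushing the sigmoid output toward 1 keeps lowering the loss without the predictions ever matching the targets. That is consistent with switching to a linear output and the 'mse' loss for targets like these.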
  
  See this tutorial:
  https://machinelearningmastery.com/regression-tutorial-keras-deep-learning-library-python/
  
  Reply
Tim August 15, 2018 at 5:54 am #

AWESOME!!! Thanks so much for this.

Reply
- Jason Brownlee August 15, 2018 at 6:11 am #
  
  You’re welcome, I’m happy it helped.
  
  Reply
tania August 27, 2018 at 8:35 pm #

Hi Jason,

Thank you for the tutorial. I am relatively new to ML and I am currently working on a classification problem that is non binary.

My dataset consists of a number of labeled samples – all measuring the same quantity/unit. The amount typically ranges from 10 to 20 labeled samples/inputs. However, the feed forward or testing sample will only contain 7 of those inputs (at random).

I’m struggling to find a solution to designing a system that accepts fewer inputs than what is typically found in the training set.

Reply
- Jason Brownlee August 28, 2018 at 5:59 am #
  
  Perhaps try following this process:
  https://machinelearningmastery.com/start-here/#process
  
  Reply
Vaibhav Jaiswal September 10, 2018 at 6:28 pm #

Great tutorial there! But the main purpose of the model is to predict on a sample. If I print the first predicted value, it shows me some values for all the columns of categorical features. How do I get the predicted number for the sample?

Reply
- Jason Brownlee September 11, 2018 at 6:26 am #
  
  The order of the predictions matches the order of the inputs.
  
  Reply
Glen September 19, 2018 at 10:45 pm #

I think I must be doing something wrong, I keep getting the error:
File “C:\Users\glens\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py”, line 519, in __exit__
c_api.TF_GetCode(self.status.status))

InvalidArgumentError: Input to reshape is a tensor with 10 values, but the requested shape has 1
[[Node: training_19/Adam/gradients/loss_21/dense_64_loss/Mean_1_grad/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _class=[“loc:@training_19/Adam/gradients/loss_21/dense_64_loss/Mean_1_grad/truediv”], _device=”/job:localhost/replica:0/task:0/device:GPU:0″](training_19/Adam/gradients/loss_21/dense_64_loss/mul_grad/Sum, training_19/Adam/gradients/loss_21/dense_64_loss/Mean_1_grad/DynamicStitch/_1703)]]

Are you able to shed any light on why I would get this error?

Thank you

Reply
- Jason Brownlee September 20, 2018 at 7:59 am #
  
  I have not seen this error, I have some suggestions here:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
Snehasish September 19, 2018 at 11:15 pm #

Hi Jason, thanks for this awesome tutorial. I have one doubt – why did the evaluation not produce 100% accuracy? After all, we used the same dataset for evaluation as the one used for training itself.

Reply
- Jason Brownlee September 20, 2018 at 8:00 am #
  
  Good question!
  
  We are approximating a challenging mapping function, not memorizing examples. As such, there will always be error.
  
  I explain more here:
  https://machinelearningmastery.com/faq/single-faq/why-cant-i-get-100-accuracy-or-zero-error-with-my-model
  
  Reply
Mark C September 27, 2018 at 12:49 am #

How do you make a prediction on new data? For example, I built a spam detector but don’t know how to predict whether a sentence I write is spam or not.

Reply
- Jason Brownlee September 27, 2018 at 6:01 am #
  
  You can call model.predict() with a finalized model. More here:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-make-predictions
  
  Reply
Vivek October 1, 2018 at 3:17 am #

Hello Sir,

I am new and understood part of your code. I have a question about prediction: normally we divide our data into a training set and a test set. In the example above, the entire dataset is used as the training dataset. How can we train the model on the training set and use it for prediction on the test set?

Reply
- Jason Brownlee October 1, 2018 at 6:28 am #
  
  Great question, yes, train the model on all available data and then use it to start making predictions.
  
  More here:
  https://machinelearningmastery.com/train-final-machine-learning-model/
  
  Reply
Vivek35 October 1, 2018 at 7:11 am #

Hello Sir,
It’s a great tutorial and easy to understand. However, I am new and want to clarify something. In the above code we have treated the entire dataset as the training set. Can we divide it into a training set and a test set, fit the model on the training set, and use it for prediction on the test set? How can we achieve this with the above code?

Reply
- Jason Brownlee October 1, 2018 at 2:39 pm #
  
  Thanks.
  
  Yes, you can split the dataset manually or use scikit-learn to make the split for you. I explain more here:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-evaluate-a-machine-learning-algorithm
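
A minimal sketch of a manual split, assuming a random stand-in for the loaded dataset and a hypothetical 33% hold-out fraction. scikit-learn's train_test_split performs the same kind of shuffle-and-split in one call:

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-in for the loaded dataset: 768 rows, 8 inputs plus 1 output column.
dataset = rng.random((768, 9))
X = dataset[:, 0:8]
Y = dataset[:, 8]

# Shuffle the row indices, then hold back 33% of the rows for testing.
indices = rng.permutation(len(X))
n_test = int(len(X) * 0.33)
test_idx, train_idx = indices[:n_test], indices[n_test:]

X_train, Y_train = X[train_idx], Y[train_idx]
X_test, Y_test = X[test_idx], Y[test_idx]

print(X_train.shape, X_test.shape)
```

The model would then be fit with model.fit(X_train, Y_train, ...) and scored with model.evaluate(X_test, Y_test), so the reported accuracy comes from rows the model has not seen.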
  
  Reply
Lipi October 5, 2018 at 6:26 am #

Hi Jason,

I am trying to predict using my neural network. I used MinMaxScaler on the features when training. I don’t get a good prediction unless I apply the same transform to the prediction dataset that I applied to the features during training. Could you suggest the correct approach in this situation?

Reply
- Jason Brownlee October 5, 2018 at 2:29 pm #
  
  You must use the same transform to both prepare training data and to make predictions on new data.
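
A minimal sketch of what "the same transform" means for min-max scaling, written in plain NumPy with hypothetical values. The scaling parameters are learned from the training data only and then reused unchanged for new data:

```python
import numpy as np

# Hypothetical training inputs and one new sample to predict on.
X_train = np.array([[1.0, 200.0], [3.0, 400.0], [5.0, 600.0]])
X_new = np.array([[2.0, 500.0]])

# "Fit": learn the per-column min and max from the training data only.
col_min = X_train.min(axis=0)
col_max = X_train.max(axis=0)


def scale(X):
    # Apply the training-set min/max to any data, old or new.
    return (X - col_min) / (col_max - col_min)


X_train_scaled = scale(X_train)
X_new_scaled = scale(X_new)

print(X_new_scaled)
```

With scikit-learn's MinMaxScaler this corresponds to calling fit() on the training features once and then transform() on both the training features and any new data, rather than fitting a second scaler on the new data.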
  
  Reply
  - Lipi October 5, 2018 at 10:12 pm #
    
    Thank you!
    
    Reply
neenu October 6, 2018 at 3:57 pm #

Hi, I am new to this. I wrote the following code in Spyder:
from keras.models import Sequential
from keras.layers import Dense
import numpy
# fix random seed for reproducibility
numpy.random.seed(7)
# load pima indians dataset
dataset = numpy.loadtxt("pima-indians-diabetes.txt", encoding="UTF8", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]

# create model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X, Y, epochs=150, batch_size=10)
# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

And i got this as output

runfile(‘C:/Users/DELL/Anaconda3/Scripts/temp.py’, wdir=’C:/Users/DELL/Anaconda3/Scripts’)
Using TensorFlow backend.
Traceback (most recent call last):

File “”, line 1, in
runfile(‘C:/Users/DELL/Anaconda3/Scripts/temp.py’, wdir=’C:/Users/DELL/Anaconda3/Scripts’)

File “C:\Users\DELL\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py”, line 668, in runfile
execfile(filename, namespace)

File “C:\Users\DELL\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py”, line 108, in execfile
exec(compile(f.read(), filename, ‘exec’), namespace)

File “C:/Users/DELL/Anaconda3/Scripts/temp.py”, line 1, in
from keras.models import Sequential

File “C:\Users\DELL\Anaconda3\lib\site-packages\keras\__init__.py”, line 3, in
from . import utils

File “C:\Users\DELL\Anaconda3\lib\site-packages\keras\utils\__init__.py”, line 6, in
from . import conv_utils

File “C:\Users\DELL\Anaconda3\lib\site-packages\keras\utils\conv_utils.py”, line 9, in
from .. import backend as K

File “C:\Users\DELL\Anaconda3\lib\site-packages\keras\backend\__init__.py”, line 89, in
from .tensorflow_backend import *

File “C:\Users\DELL\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py”, line 5, in
import tensorflow as tf

File “C:\Users\DELL\Anaconda3\lib\site-packages\tensorflow\__init__.py”, line 22, in
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import

File “C:\Users\DELL\Anaconda3\lib\site-packages\tensorflow\python\__init__.py”, line 49, in
from tensorflow.python import pywrap_tensorflow

File “C:\Users\DELL\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow.py”, line 74, in
raise ImportError(msg)

ImportError: Traceback (most recent call last):
File “C:\Users\DELL\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py”, line 14, in swig_import_helper
return importlib.import_module(mname)
File “C:\Users\DELL\Anaconda3\lib\importlib\__init__.py”, line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File “”, line 994, in _gcd_import
File “”, line 971, in _find_and_load
File “”, line 955, in _find_and_load_unlocked
File “”, line 658, in _load_unlocked
File “”, line 571, in module_from_spec
File “”, line 922, in create_module
File “”, line 219, in _call_with_frames_removed
ImportError: DLL load failed with error code -1073741795

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “C:\Users\DELL\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow.py”, line 58, in
from tensorflow.python.pywrap_tensorflow_internal import *
File “C:\Users\DELL\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py”, line 17, in
_pywrap_tensorflow_internal = swig_import_helper()
File “C:\Users\DELL\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py”, line 16, in swig_import_helper
return importlib.import_module(‘_pywrap_tensorflow_internal’)
File “C:\Users\DELL\Anaconda3\lib\importlib\__init__.py”, line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named ‘_pywrap_tensorflow_internal’

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.

Reply
- Jason Brownlee October 7, 2018 at 7:24 am #
  
  I recommend this tutorial to help you setup your environment:
  https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
  
  I recommend that you don’t use an IDE or notebook:
  https://machinelearningmastery.com/faq/single-faq/why-dont-use-or-recommend-notebooks
  
  Instead, I recommend you save code to a .py file and run from the command line:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-run-a-script-from-the-command-line
  
  Reply
kamal October 15, 2018 at 1:08 am #

Sir, please provide the Python code for an adaptive neuro-fuzzy classifier.

Reply
- Jason Brownlee October 15, 2018 at 7:31 am #
  
  Thanks for the suggestion.
  
  Reply
  - Rajan Kumar June 29, 2021 at 3:44 pm #
    
    I am waiting too for it.
    
    Reply
Shahbaz October 24, 2018 at 4:44 am #

Blessings on you, sir.
Can you give me some ideas about an OCR system for my final-year project? Please suggest a back-end strategy for OCR. Do you have any code for OCR?

Reply
- Jason Brownlee October 24, 2018 at 6:32 am #
  
  Perhaps start here:
  https://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/
  
  Reply
Andrew Agib October 29, 2018 at 10:39 pm #

model.compile(loss=’binary_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])

This line shows a syntax error. What could be the reason?

Reply
- Jason Brownlee October 30, 2018 at 6:02 am #
  
  I have some suggestions here:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
VASUDEV K P November 3, 2018 at 10:13 pm #

Hello Jason,

I have the Theano backend installed. I am using Windows, and during execution I get the error “No module named TensorFlow”. Please help.

Reply
- Jason Brownlee November 4, 2018 at 6:27 am #
  
  You may have to change the configuration of Keras to use Theano instead.
  
  More details here:
  https://keras.io/backend/
  
  Reply
Imen Drs November 4, 2018 at 7:09 am #

Hi Jason,
How can we calculate the precision and recall for this example, please?
Thanks.

Reply
- Jason Brownlee November 5, 2018 at 6:06 am #
  
  You can use scikit-learn metrics:
  http://scikit-learn.org/stable/modules/classes.html#sklearn-metrics-metrics
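
A minimal sketch of the two metrics computed by hand, using hypothetical labels and rounded predictions rather than results from the tutorial's model:

```python
import numpy as np

# Hypothetical true labels and rounded model predictions (0 or 1).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives

precision = tp / (tp + fp)  # of the predicted positives, how many were right
recall = tp / (tp + fn)     # of the actual positives, how many were found

print("precision: %.2f, recall: %.2f" % (precision, recall))
```

In scikit-learn the same values come from precision_score(y_true, y_pred) and recall_score(y_true, y_pred) in sklearn.metrics, where y_pred would be the rounded output of model.predict().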
  
  Reply
Stefan November 10, 2018 at 2:59 am #

I thought sigmoid and softmax were quite similar activation functions. But when trying the same model with softmax as activation for the last layer instead of sigmoid, my accuracy is much much worse.

Does that make sense to you? If so why? I feel like I see softmax more often in other code than sigmoid.

Reply
- Jason Brownlee November 10, 2018 at 6:09 am #
  
  Nope.
  
  Sigmoid for 2 classes.
  Softmax for >2 classes
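
A short sketch of why softmax on a single output node performs so poorly: softmax normalizes across the output nodes, so with only one node the result is always 1.0 regardless of the input. The logit values here are hypothetical:

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    # Normalizes across the last axis, i.e. across the output nodes.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


# Raw outputs (logits) of a single output node for three samples.
logits = np.array([[-2.0], [0.0], [3.0]])

sig_out = sigmoid(logits).ravel()
soft_out = softmax(logits).ravel()

print(sig_out)   # varies with the input
print(soft_out)  # 1.0 for every sample
```

For a two-class problem, softmax needs two output nodes with one-hot encoded labels, while sigmoid works with the single node used in this tutorial.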
  
  Reply
Amuda Kamorudeen November 10, 2018 at 4:46 pm #

I’m working on a model that will predict the propensity of customers to terminate their service with a company. I have a dataset of 70000 rows and 500 columns. How can I pass numeric data as input to a convolutional neural network (CNN)?

Reply
- Jason Brownlee November 11, 2018 at 5:59 am #
  
  CNNs are only appropriate for data with a spatial relationship, such as images, time series and text.
  
  Reply
irfan November 18, 2018 at 3:22 pm #

Hi Jason,

I am using TensorFlow as the backend.
from keras.models import Sequential
from keras.layers import Dense
import sys
from keras import layers
from keras.utils import plot_model

print (model.layer())

Error:

—————————————————————————
AttributeError Traceback (most recent call last)
in
9 model.add(Dense(512, activation=’relu’))
10 model.add(Dense(10, activation=’sigmoid’))
—> 11 print (model.layer())
12 # Compile model
13 model.compile(loss=’binary_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])

AttributeError: ‘Sequential’ object has no attribute ‘layer’

Reply
- Jason Brownlee November 19, 2018 at 6:44 am #
  
  Why are you trying to print model.layer()?
  
  Reply
Mario December 2, 2018 at 5:30 am #

Hi Jason
First, thanks for the amazing tutorial. Your scripts use lists of values, while my inputs are lists of 24×20 matrices, filled with values in the specific order in which they were measured, for 3 parameters over 3000 cycles. How can I feed this type of matrix data, or, put differently, a stream of images for the 3 parameters? I have already extracted them from the raw dataset and, after preprocessing, converted them to 24×20 matrices or .png images. How should I change this script so that I can use my dataset?

Reply
- Jason Brownlee December 2, 2018 at 6:26 am #
  
  When using an MLP with images, you must flatten each matrix of pixel data to a single row vector.
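
A minimal sketch of that flattening step, assuming the samples are stacked into one NumPy array. The array of zeros is a hypothetical stand-in for 3000 matrices of 24×20 values:

```python
import numpy as np

# Hypothetical stand-in: 3000 samples, each a 24x20 matrix.
samples = np.zeros((3000, 24, 20))

# Flatten each matrix into a row vector of 24 * 20 = 480 values.
X = samples.reshape(len(samples), -1)

print(X.shape)
```

The first Dense layer would then be defined with input_dim=480 instead of input_dim=8.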
  
  Reply
Evangelos Argyropoulos December 18, 2018 at 6:15 am #

Hi Jason,
Thanks for the tutorial. One question.
I use the algorithm for time series prediction (0=buy, 1=sell). Does this model overfit?

Reply
- Jason Brownlee December 18, 2018 at 6:27 am #
  
  You can only know if you try fitting it and evaluating learning curves on train and validation datasets.
  
  Reply
SOURAV MONDAL December 28, 2018 at 7:42 am #

Great tutorial Sir.
Is there a way to visualize the different layers of a model created in Keras, with their nodes and the interconnections among them (I mean the basic structure of a neural network, with layers of nodes and the connections between them)?

Reply
- Jason Brownlee December 29, 2018 at 5:46 am #
  
  Yes, check out this tutorial:
  https://machinelearningmastery.com/visualize-deep-learning-neural-network-model-keras/
  
  Reply
Imen Drs December 28, 2018 at 11:29 pm #

Thanks for this tutorial.

I have a problem when I try to compile and fit my model. It returns a value error: ValueError: could not convert string to float: '24, 26, 99, 31, 623, 863, 77, 32, 362, 998, 1315, 33, 291, 14123, 39, 8, 335, 2308, 349, 403, 409, 1250, 417, 47, 1945, 50, 188, 51, 4493, 3343, 13419, 6107, 84, 18292, 339, 9655, 22498, 1871, 782, 1276, 2328, 56, 17633, 24004, 24236, 1901, 6112, 22506, 26397, 816, 502, 352, 24238, 18330, 7285, 2160, 220, 511, 17680, 68, 5137, 26398, 875, 542, 354, 2045, 555, 2145, 93, 327, 26399, 3158, 7501, 26400, 8215'.

Can you help me, please?

Reply
- Jason Brownlee December 29, 2018 at 5:52 am #
  
  Perhaps your data contains a string?
  
  Reply
  - Imen Drs December 29, 2018 at 7:59 am #
    
    The data contains ” user, number_of_followers, list_of_followers, number_of_followee, list_of_followee, number_of_mentions, list_of_user_mentioned…”
    the values in the list are separated by commas.
    For example: “36 ; 3 ; 52,3,87 ; 5 ; 63,785,22,11,6 ; 0 ; “
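
A rough sketch of how such a record could be turned into numbers before it reaches the model, assuming fields are separated by ";" and list values by ",". The field positions follow the column order described above, and keeping only the count columns as inputs is an assumption made for illustration:

```python
# One record in the format described above.
record = "36 ; 3 ; 52,3,87 ; 5 ; 63,785,22,11,6 ; 0 ; "

# Split into fields on ";" and strip the surrounding spaces.
fields = [f.strip() for f in record.split(";")]

# The count columns convert to floats directly.
features = [float(fields[i]) for i in (1, 3, 5)]

# A list column must be split again; it cannot be passed to float() whole.
followers = [int(v) for v in fields[2].split(",") if v]

print(features)
print(followers)
```

The "could not convert string to float" error is what float() raises when it receives a whole comma-separated list such as "52,3,87" in one piece.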
    
    Reply
Somashekhar January 2, 2019 at 4:39 am #

Hi, Is there a solution posted for solving pima-indians-diabetes.csv for prediction using LSTM?

Reply
- Jason Brownlee January 2, 2019 at 6:42 am #
  
  No. LSTMs are for sequential data only, and the pima indians dataset is not a sequence prediction problem.
  
  Reply
Imen Drs January 4, 2019 at 9:56 pm #

Is there a way to use specific fields in the dataset instead of the entire uploaded dataset?
Thanks.

Reply
- Jason Brownlee January 5, 2019 at 6:56 am #
  
  Yes, fields are columns in the dataset matrix and you can remove those columns that you do not want to use as inputs to your model.
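
A minimal sketch of both approaches with NumPy indexing, using a small hypothetical array in place of the loaded dataset. The column numbers are arbitrary examples:

```python
import numpy as np

# Hypothetical stand-in for the loaded dataset: 3 rows, 9 columns.
dataset = np.arange(27, dtype=float).reshape(3, 9)

# Option 1: keep only the chosen input columns (here 1, 2 and 5).
X_selected = dataset[:, [1, 2, 5]]

# Option 2: drop unwanted columns (here 0 and 3) from the 8 input columns.
X_dropped = np.delete(dataset[:, 0:8], [0, 3], axis=1)

# The output column is unchanged.
Y = dataset[:, 8]

print(X_selected.shape, X_dropped.shape)
```

The input_dim argument of the first layer must match the number of columns that remain, 3 and 6 in these two cases.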
  
  Reply
Kahina January 5, 2019 at 12:43 am #

Thank you so much! It’s helpful.

Reply
- Jason Brownlee January 5, 2019 at 6:58 am #
  
  I’m happy to hear that it was helpful.
  
  Reply
Khemmarut January 12, 2019 at 11:35 pm #

Traceback (most recent call last):
File “C:/Users/Admin/PycharmProjects/NN/nnt.py”, line 119, in
rounded = [round(X[:1]) for x in predictions]
File “C:/Users/Admin/PycharmProjects/NN/nnt.py”, line 119, in
rounded = [round(X[:1]) for x in predictions]
TypeError: type numpy.ndarray doesn’t define __round__ method

Help me please

Thank you.

Reply
- Jason Brownlee January 13, 2019 at 5:41 am #
  
  Perhaps ensure that your libraries are up to date?
  
  This might help:
  https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
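
The traceback above shows round(X[:1]), which passes a slice of the input array X to round() instead of a single predicted value x, and that may be the cause of the TypeError. A minimal sketch of rounding predictions, with a hypothetical array standing in for the output of model.predict(X):

```python
import numpy as np

# Hypothetical output of model.predict(X): one probability per row, shape (n, 1).
predictions = np.array([[0.12], [0.87], [0.51], [0.49]])

# Take the single value out of each row (x[0]) before rounding it.
rounded = [int(round(float(x[0]))) for x in predictions]
print(rounded)

# Vectorized alternative: threshold the whole array at 0.5.
labels = (predictions > 0.5).astype(int).ravel()
print(labels)
```

Note the lowercase loop variable x and the index [0]: each x is one row of the predictions, not the input matrix X.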
  
  Reply
Priti Pachpande January 31, 2019 at 2:50 am #

Hi Jason,
Thank you for the amazing tutorial. I am trying to build an autoencoder model in Keras using the TensorFlow backend.
I need to use TensorFlow functions (like tf.ifft, tf.fft) in the model. Can you guide me on how to do it? I tried using a Lambda layer, but the accuracy decreases when I use it.

Also, I’m using the model.predict() function to check the values between the intermediate layers. Am I doing it right?

Also, can you guide me on how to use the reshape function in Keras?

Thanks for your help

Reply
- Jason Brownlee January 31, 2019 at 5:36 am #
  
  Sorry, I don’t know about the functions you are using. Perhaps post on stackoverflow?
  
  Reply
Crawford January 31, 2019 at 9:34 pm #

Hi Jason,
Your tutorials are brilliant, thanks for putting all this together.
In this tutorial the result is either a 1 or 0, but what if you have data with more than two possible results, e.g. 0, 1, 2, or similar?
Can I do something with the code you have presented here, or is a whole other approach required?
I have somewhat achieved what I’m trying to do using your “first machine learning project” using a knn model, but I had to simplify my data by stripping out some variables. I believe there is value in these extra variables, so thought the neural network might be useful, but like I said I have three classifications not two.
Thanks.

Reply
- Jason Brownlee February 1, 2019 at 5:37 am #
  
  Yes, here is an example of a multi-class classification with a neural net:
  https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
  
  Reply
  - Crawford February 1, 2019 at 10:11 pm #
    
    Brilliant, thanks.
    
    Reply
Sergio February 1, 2019 at 10:18 am #

Hi, I’m trying to construct a neural network using complex numbers as inputs. I followed your recommendations, but I get the following warning:

ComplexWarning: Casting complex values to real discards the imaginary part return array(a, dtype, copy=False, order=order)

The code runs without problems, but the predictions are only 25% accurate.

Is it possible to use complex numbers in neural networks?

Do you have any advice?

Reply
- Jason Brownlee February 1, 2019 at 11:06 am #
  
  I don’t think the Keras API supports complex numbers as input.
  
  Reply
  - Sergio February 1, 2019 at 2:17 pm #
    
    Do you have any suggestions for dealing with complex numbers?
    
    Reply
    - Jason Brownlee February 2, 2019 at 6:06 am #
      
      Not off hand, sorry.
      
      Perhaps post to the Keras users group to see if anyone has tried this before:
      https://machinelearningmastery.com/get-help-with-keras/
      
      Reply
Arnab Kumar Mishra February 1, 2019 at 9:47 pm #

Hi Jason,

I am trying to run the code in the tutorial with some minor modifications, but I am facing a problem with the training.

The training loss and accuracy both are staying the same across epochs (Please take a look at the code snippet and the output below). This is for a different dataset, not the diabetes dataset.

I have tried to solve this problem using the suggestions given in https://stackoverflow.com/questions/37213388/keras-accuracy-does-not-change

But the problem is still there.

Can you please take a look at this and help me solve this problem? Thanks.

CODE and OUTPUT Snippets:

# create model
model = Sequential()
model.add(Dense(15, input_dim=9, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(5, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model
model.fit(xTrain, yTrain, epochs=500, batch_size=10)

Epoch 1/200
81/81 [==============================] – 0s 177us/step – loss: -8.4632 – acc: 0.4691
Epoch 2/200
81/81 [==============================] – 0s 148us/step – loss: -8.4632 – acc: 0.4691
Epoch 3/200
81/81 [==============================] – 0s 95us/step – loss: -8.4632 – acc: 0.4691
Epoch 4/200
81/81 [==============================] – 0s 116us/step – loss: -8.4632 – acc: 0.4691
Epoch 5/200
81/81 [==============================] – 0s 106us/step – loss: -8.4632 – acc: 0.4691
Epoch 6/200
81/81 [==============================] – 0s 98us/step – loss: -8.4632 – acc: 0.4691
Epoch 7/200
81/81 [==============================] – 0s 145us/step – loss: -8.4632 – acc: 0.4691
Epoch 8/200
81/81 [==============================] – 0s 138us/step – loss: -8.4632 – acc: 0.4691
Epoch 9/200
81/81 [==============================] – 0s 105us/step – loss: -8.4632 – acc: 0.4691
Epoch 10/200
81/81 [==============================] – 0s 128us/step – loss: -8.4632 – acc: 0.4691
Epoch 11/200
81/81 [==============================] – 0s 129us/step – loss: -8.4632 – acc: 0.4691
Epoch 12/200
81/81 [==============================] – 0s 111us/step – loss: -8.4632 – acc: 0.4691
Epoch 13/200
81/81 [==============================] – 0s 106us/step – loss: -8.4632 – acc: 0.4691
Epoch 14/200
81/81 [==============================] – 0s 144us/step – loss: -8.4632 – acc: 0.4691
Epoch 15/200
81/81 [==============================] – 0s 106us/step – loss: -8.4632 – acc: 0.4691
Epoch 16/200
81/81 [==============================] – 0s 180us/step – loss: -8.4632 – acc: 0.4691
Epoch 17/200
81/81 [==============================] – 0s 125us/step – loss: -8.4632 – acc: 0.4691
Epoch 18/200
81/81 [==============================] – 0s 183us/step – loss: -8.4632 – acc: 0.4691
Epoch 19/200
81/81 [==============================] – 0s 149us/step – loss: -8.4632 – acc: 0.4691
Epoch 20/200
81/81 [==============================] – 0s 146us/step – loss: -8.4632 – acc: 0.4691
Epoch 21/200
81/81 [==============================] – 0s 206us/step – loss: -8.4632 – acc: 0.4691
Epoch 22/200
81/81 [==============================] – 0s 135us/step – loss: -8.4632 – acc: 0.4691
Epoch 23/200
81/81 [==============================] – 0s 116us/step – loss: -8.4632 – acc: 0.4691
Epoch 24/200
81/81 [==============================] – 0s 135us/step – loss: -8.4632 – acc: 0.4691
Epoch 25/200
81/81 [==============================] – 0s 121us/step – loss: -8.4632 – acc: 0.4691
Epoch 26/200
81/81 [==============================] – 0s 110us/step – loss: -8.4632 – acc: 0.4691
Epoch 27/200
81/81 [==============================] – 0s 104us/step – loss: -8.4632 – acc: 0.4691
Epoch 28/200
81/81 [==============================] – 0s 122us/step – loss: -8.4632 – acc: 0.4691
Epoch 29/200
81/81 [==============================] – 0s 117us/step – loss: -8.4632 – acc: 0.4691
Epoch 30/200
81/81 [==============================] – 0s 111us/step – loss: -8.4632 – acc: 0.4691
Epoch 31/200
81/81 [==============================] – 0s 123us/step – loss: -8.4632 – acc: 0.4691
Epoch 32/200
81/81 [==============================] – 0s 116us/step – loss: -8.4632 – acc: 0.4691
Epoch 33/200
81/81 [==============================] – 0s 120us/step – loss: -8.4632 – acc: 0.4691
Epoch 34/200
81/81 [==============================] – 0s 156us/step – loss: -8.4632 – acc: 0.4691
Epoch 35/200
81/81 [==============================] – 0s 131us/step – loss: -8.4632 – acc: 0.4691
Epoch 36/200
81/81 [==============================] – 0s 122us/step – loss: -8.4632 – acc: 0.4691
Epoch 37/200
81/81 [==============================] – 0s 110us/step – loss: -8.4632 – acc: 0.4691
Epoch 38/200
81/81 [==============================] – 0s 121us/step – loss: -8.4632 – acc: 0.4691
Epoch 39/200
81/81 [==============================] – 0s 123us/step – loss: -8.4632 – acc: 0.4691
Epoch 40/200
81/81 [==============================] – 0s 111us/step – loss: -8.4632 – acc: 0.4691
Epoch 41/200
81/81 [==============================] – 0s 115us/step – loss: -8.4632 – acc: 0.4691
Epoch 42/200
81/81 [==============================] – 0s 119us/step – loss: -8.4632 – acc: 0.4691
Epoch 43/200
81/81 [==============================] – 0s 115us/step – loss: -8.4632 – acc: 0.4691
Epoch 44/200
81/81 [==============================] – 0s 133us/step – loss: -8.4632 – acc: 0.4691
Epoch 45/200
81/81 [==============================] – 0s 114us/step – loss: -8.4632 – acc: 0.4691
Epoch 46/200
81/81 [==============================] – 0s 112us/step – loss: -8.4632 – acc: 0.4691
Epoch 47/200
81/81 [==============================] – 0s 143us/step – loss: -8.4632 – acc: 0.4691
Epoch 48/200
81/81 [==============================] – 0s 124us/step – loss: -8.4632 – acc: 0.4691
Epoch 49/200
81/81 [==============================] – 0s 129us/step – loss: -8.4632 – acc: 0.4691
Epoch 50/200

The same goes on for the rest of the epochs as well.

Reply
- Jason Brownlee February 2, 2019 at 6:14 am #
  
  I have some suggestions here that might help:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
Nagesh February 4, 2019 at 1:50 am #

Hi Jason,

Can you please tell me whether we can plot a graph of epoch vs. accuracy?
If yes, then how?

Reply
- Jason Brownlee February 4, 2019 at 5:49 am #
  
  I show how here:
  https://machinelearningmastery.com/display-deep-learning-model-training-history-in-keras/
  
  Reply
Nils February 5, 2019 at 1:28 am #

Great stuff, thanks!

I noticed that chapter 2 describes the “init” parameter, but it is missing from all of the source code.
I added it like this:

model.add(Dense(12, input_dim=8, init='uniform', activation='relu'))

Then I got this warning:
pima_diabetes.py:25: UserWarning: Update your Dense call to the Keras 2 API: Dense(12, input_dim=8, activation="relu" , kernel_initializer="uniform")
model.add(Dense(12, input_dim=8, init='uniform', activation='relu'))

The solution for me was to use “kernel_initializer” instead:
model.add(Dense(12, input_dim=8, activation="relu", kernel_initializer="uniform"))

I have one question about the same line: is it correct that it adds one input layer with 8 neurons AND another hidden layer with 12 neurons?
If so, would the following result in the same ANN?
model.add(Dense(8, input_dim=8, kernel_initializer='uniform'))
model.add(Dense(8, activation="relu", kernel_initializer='uniform'))

Reply
- Jason Brownlee February 5, 2019 at 8:29 am #
  
  Yes, perhaps your version of the book is out of date. Email me to get the latest version.
  
  Yes, the definition of the first hidden layer also defines the input layer via an argument.
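
A small sketch that counts trainable parameters for the two definitions discussed above, assuming the usual fully connected layer with one weight per input-to-node connection and one bias per node. The helper dense_params is hypothetical and not part of Keras:

```python
def dense_params(n_inputs, n_units):
    # One weight per input-to-node connection, plus one bias per node.
    return n_inputs * n_units + n_units


# Dense(12, input_dim=8): 8 inputs feed a single layer of 12 nodes.
single = dense_params(8, 12)

# Dense(8, input_dim=8) followed by Dense(8): two weighted layers of 8 nodes.
stacked = dense_params(8, 8) + dense_params(8, 8)

print(single, stacked)
```

Under that assumption the input layer contributes no weights of its own, so the input_dim argument only sets the expected number of inputs. The two-layer variant is therefore a different network from the single Dense(12, input_dim=8) layer, not an equivalent way of writing it.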
  
  Reply
Shuja February 8, 2019 at 12:00 am #

Hi Jason
I am getting the following error
(env) shuja@latitude:~$ python keras_test.py
Using TensorFlow backend.
Traceback (most recent call last):
File “keras_test.py”, line 8, in
dataset = numpy.loadtxt(“pima-indians-diabetes.csv”, delimiter=”,”)
File “/home/shuja/env/lib/python3.6/site-packages/numpy/lib/npyio.py”, line 955, in loadtxt
fh = np.lib._datasource.open(fname, ‘rt’, encoding=encoding)
File “/home/shuja/env/lib/python3.6/site-packages/numpy/lib/_datasource.py”, line 266, in open
return ds.open(path, mode, encoding=encoding, newline=newline)
File “/home/shuja/env/lib/python3.6/site-packages/numpy/lib/_datasource.py”, line 624, in open
raise IOError(“%s not found.” % path)
OSError: pima-indians-diabetes.csv not found.

Reply
- Jason Brownlee February 8, 2019 at 7:52 am #
  
  Looks like the dataset was not downloaded and placed in the same directory as your script.
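
A quick check before loading can confirm this. The sketch below assumes the filename from the traceback above; the helper name `find_dataset` is hypothetical and not part of the tutorial. A bare filename is resolved against the current working directory, which is the script's directory when you run the script from there.

```python
from pathlib import Path


def find_dataset(filename, base_dir=None):
    """Return the full path to the dataset, or None if it is missing.

    base_dir defaults to the current working directory, which is where
    a bare filename passed to numpy.loadtxt is looked up.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    candidate = base / filename
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    path = find_dataset("pima-indians-diabetes.csv")
    if path is None:
        print("Dataset not found in", Path.cwd())
    else:
        print("Dataset found at", path)
```

If the file is reported as missing, download it again or move it next to the script before calling loadtxt.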
  
  Reply
Shubham February 12, 2019 at 4:55 am #

Hi, Jason

Thanks for the tutorial.
Do you have some good reference or an example where I can learn about setting up “Adversarial Neural Networks”.

Shubham

Reply
- Jason Brownlee February 12, 2019 at 8:08 am #
  
  Not at this stage, I hope to cover the topic in the future.
  
  Reply
Daniel March 13, 2019 at 8:14 am #

Hey Jason,

I’ve been reading your tutorials for a while now on a variety of ML topics, and I think that you write very cleanly and concisely. Thank you for making almost every topic I’ve encountered understandable.

However, one thing I have noticed is that the comment sections on your pages sometimes cover the bulk of the webpage. The first couple times I saw this site, I saw how tiny my scroll bar was and I assumed that the tutorial would be 15 pages long, only to find that your introductions were in fact “gentle” as promised and everything but the first sliver of the page were people’s responses and your responses back. I think it would be very useful if you could somehow condense the responses (maybe a “show responses” button?) to only show the actual content. Not only would everything look better, but I think it would also prevent people from initially thinking your blog was exceptionally long, like I did a few times.

Reply
- Jason Brownlee March 13, 2019 at 8:26 am #
  
  Great feedback, thanks Daniel. I’ll see if there are some good wordpress plugins for this.
  
  Reply
ismael March 22, 2019 at 5:22 am #

do not work why

Reply
- Jason Brownlee March 22, 2019 at 8:39 am #
  
  Sorry to hear that you’re having trouble, what is the problem exactly?
  
  Reply
Felix Daniel March 30, 2019 at 7:09 am #

Awesome work on machine learning… I was just thinking on how to start my journey into Machine Learning, I randomly searched for people in Machine Learning on LinkedIn that’s how I find myself here… I’m delighted to see this… Here is my final bus stop to start building up in ML. Thanks for accepting my connection on LinkedIn.

I have a project that I am about to start, but I don’t know how to begin or what the road map should be. Please, I need your detailed guidance.

Here is the topic

Human Activity Recognition System that Controls overweight in Children and Adults.

Reply
- Jason Brownlee March 31, 2019 at 9:22 am #
  
  Sounds like a great project, you can get started here:
  https://machinelearningmastery.com/start-here/#deep_learning_time_series
  
  Reply
Akshaya E April 13, 2019 at 11:38 pm #

can you please explain me why we use 12 neurons in the first layer ? 8 are inputs and are the rest 4 biases ?

Reply
- Jason Brownlee April 14, 2019 at 5:49 am #
  
  No, the 12 refers to the 12 nodes in the first hidden layer, not the input layer.
  
  The input layer is defined by an input_dim argument on the first hidden layer.
  
  I explain more here:
  https://machinelearningmastery.com/faq/single-faq/how-do-you-define-the-input-layer-in-keras
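
One way to see the difference is to count the weights. This is a rough sketch in plain Python, not Keras code, and the function name `dense_param_counts` is made up for illustration. Each node in a fully connected layer has one weight per input plus a single bias, so the 8 inputs only determine how many weights each of the 12 hidden nodes has.

```python
def dense_param_counts(input_dim, layer_sizes):
    """Return the number of trainable parameters in each fully connected layer.

    Each layer has one weight per (input, node) pair plus one bias per node.
    """
    counts = []
    n_inputs = input_dim
    for n_nodes in layer_sizes:
        counts.append(n_inputs * n_nodes + n_nodes)
        n_inputs = n_nodes  # this layer's outputs feed the next layer
    return counts


# The tutorial network: 8 inputs -> 12 nodes -> 8 nodes -> 1 node
print(dense_param_counts(8, [12, 8, 1]))  # [108, 104, 9]
```

The first hidden layer therefore has 8 * 12 weights and 12 biases (one per node), not 4.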
  
  Reply
  - Akshaya E April 14, 2019 at 8:09 pm #
    
    thank you for the immediate response. my doubt has been cleared.
    
    Reply
    - Jason Brownlee April 15, 2019 at 7:52 am #
      
      Happy to hear that.
      
      Reply
Abhiram April 19, 2019 at 11:50 pm #

Hi Jason, the predictions above are between 0 and 1. My labels are 1,1,1,2,2,2,3,3,3……..36,36,36.
Now I want to predict class 36. What should I do?

Reply
- Jason Brownlee April 20, 2019 at 7:39 am #
  
  What problem are you having exactly?
  
  Reply
Akash April 22, 2019 at 12:56 am #

Hi Jason,

I am learning NLP and facing difficulties with understanding NLP with Deep Learning.
Please, can you help with converting the following N:N to N:1 model?
I want to change my vec_y from max_input_words_amount length to 1.
How should I define the layers and use LSTM or RNN or …?
Thank You.

x=df1['Question'].tolist()
y=df1['Answer'].tolist()

max_input_words_amount = 0
tok_x = []
for i in range(len(x)) :
    tokenized_q = nltk.word_tokenize(re.sub(r"[^a-z0-9]+", " ", x[i].lower()))
    max_input_words_amount = max(len(tokenized_q), max_input_words_amount)
    tok_x.append(tokenized_q)

vec_x=[]
for sent in tok_x:
    sentvec = [ft_cbow_model[w] for w in sent]
    vec_x.append(sentvec)

vec_y=[]
for sent in y:
    sentvec = [ft_cbow_model[sent]]
    vec_y.append(sentvec)

for tok_sent in vec_x:
    tok_sent[max_input_words_amount-1:]=[]
    tok_sent.append(ft_cbow_model['_E_'])

for tok_sent in vec_x:
    if len(tok_sent)<max_input_words_amount:
        for i in range(max_input_words_amount-len(tok_sent)):
            tok_sent.append(ft_cbow_model['_E_'])

for tok_sent in vec_y:
    tok_sent[max_input_words_amount-1:]=[]
    tok_sent.append(ft_cbow_model['_E_'])

for tok_sent in vec_y:
    if len(tok_sent)<max_input_words_amount:
        for i in range(max_input_words_amount-len(tok_sent)):
            tok_sent.append(ft_cbow_model['_E_'])

vec_x=np.array(vec_x,dtype=np.float64)
vec_y=np.array(vec_y,dtype=np.float64)

x_train,x_test, y_train,y_test = train_test_split(vec_x, vec_y, test_size=0.2, random_state=1)

model=Sequential()
model.add(LSTM(output_dim=100,input_shape=x_train.shape[1:],return_sequences=True, init='glorot_normal', inner_init='glorot_normal', activation='sigmoid'))
model.add(LSTM(output_dim=100,input_shape=x_train.shape[1:],return_sequences=True, init='glorot_normal', inner_init='glorot_normal', activation='sigmoid'))
model.add(LSTM(output_dim=100,input_shape=x_train.shape[1:],return_sequences=True, init='glorot_normal', inner_init='glorot_normal', activation='sigmoid'))
model.add(LSTM(output_dim=100,input_shape=x_train.shape[1:],return_sequences=False, init='glorot_normal', inner_init='glorot_normal', activation='sigmoid'))
model.compile(loss='cosine_proximity', optimizer='adam', metrics=['accuracy'])

model.fit(x_train, y_train, nb_epoch=100,validation_data=(x_test, y_test),verbose=0)

Reply
- Jason Brownlee April 22, 2019 at 6:25 am #
  
  I’m happy to answer questions, but I don’t have the capacity to review your code, sorry.
  
  Reply
Charlie April 22, 2019 at 8:41 am #

Jason – I think you are honestly the best teacher of these concepts on the web. Would you do a graph convolutions post? Maybe working through the concepts in Kipf and Welling 2016 GCN (https://arxiv.org/abs/1609.02907) paper, and/or (ideally) a worked example applying to a graph network problem in Keras, maybe using Spektral, the recent graph convolutions Keras library (https://github.com/danielegrattarola/spektral ) – would HUGELY appreciate it, and with the rise of graph ML eg per this DeepMind paper (https://arxiv.org/abs/1806.01261) I’m sure there will be lots of great applications and interest for people but there’s not much online that’s easy to follow. Thanks so much in hope.

Reply
- Jason Brownlee April 22, 2019 at 2:26 pm #
  
  Thanks.
  
  Thanks for the suggestion.
  
  Reply
Kuda April 23, 2019 at 10:01 pm #

Hi Jason

Thank you so much for your examples they are crystal clear. Do you have the implementation of RBF neural network in python?

Reply
- Jason Brownlee April 24, 2019 at 8:00 am #
  
  Not at this stage, sorry.
  
  Reply
Tom Cole April 25, 2019 at 5:18 am #

Do you have updated python code for this model on github? I’m enjoying working through the model but having some difference in the library loads required to do the data splitting and the model fitting steps.
Thanks

Reply
- Jason Brownlee April 25, 2019 at 8:26 am #
  
  What problem are you having exactly?
  
  Reply
Mridul April 26, 2019 at 3:20 pm #

Hi Jason Brownlee,
I tried to implement the model in a Jupyter notebook.
But when I try to run it, I get the error message “module ‘tensorflow’ has no attribute ‘get_default_graph'” when creating the model with model = Sequential(). I have tried a lot to overcome it but couldn’t solve it.
Will you please help with this?

Reply
- Jason Brownlee April 27, 2019 at 6:27 am #
  
  I recommend running code from the command line and not from a notebook, here’s how:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-run-a-script-from-the-command-line
  
  Reply
Royal May 5, 2019 at 10:18 pm #

Hi Jason,
Super tutorials!

If I run Your First Neural Network once and then repeat several times (without resetting the seed, during the same python session) using only this code:

model.fit(X, Y, epochs=150, batch_size=10, verbose=0)
scores = model.evaluate(X, Y)
print(“\n%s: %.2f%%” % (model.metrics_names[1], scores[1]*100))

then I get on average a ca. 3% improvement in accuracy (range 77.85% – 83.07%). Apparently the initialization values are benefitting from the previous runs.
Does it make sense to use a model based on the best fit found after running several times? That would provide an almost 5% greater accuracy!
Or are we overfitting?

Reply
- Jason Brownlee May 6, 2019 at 6:48 am #
  
  Yes, see this post:
  https://machinelearningmastery.com/faq/single-faq/why-do-i-get-different-results-each-time-i-run-the-code
  
  Reply
Roger May 12, 2019 at 1:53 am #

(base) C:\Users\Roger\Documents\Python Scripts>python firstnn.py
Using Theano backend.
Traceback (most recent call last):
File “firstnn.py”, line 14, in
model.add(Dense(12, input_dim=8, activation=’relu’))
File “C:\Users\Roger\Anaconda3\lib\site-packages\keras\engine\sequential.py”, line 165, in add
layer(x)
File “C:\Users\Roger\Anaconda3\lib\site-packages\keras\engine\base_layer.py”, line 431, in __call__
self.build(unpack_singleton(input_shapes))
File “C:\Users\Roger\Anaconda3\lib\site-packages\keras\layers\core.py”, line 866, in build
constraint=self.kernel_constraint)
File “C:\Users\Roger\Anaconda3\lib\site-packages\keras\legacy\interfaces.py”, line 91, in wrapper
return func(*args, **kwargs)
File “C:\Users\Roger\Anaconda3\lib\site-packages\keras\engine\base_layer.py”, line 249, in add_weight
weight = K.variable(initializer(shape),
File “C:\Users\Roger\Anaconda3\lib\site-packages\keras\initializers.py”, line 218, in __call__
dtype=dtype, seed=self.seed)
File “C:\Users\Roger\Anaconda3\lib\site-packages\keras\backend\theano_backend.py”, line 2600, in random_uniform
return rng.uniform(shape, low=minval, high=maxval, dtype=dtype)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\sandbox\rng_mrg.py”, line 872, in uniform
rstates = self.get_substream_rstates(nstreams, dtype)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\configparser.py”, line 117, in res
return f(*args, **kwargs)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\sandbox\rng_mrg.py”, line 779, in get_substream_rstates
multMatVect(rval[0], A1p72, M1, A2p72, M2)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\sandbox\rng_mrg.py”, line 62, in multMatVect
[A_sym, s_sym, m_sym, A2_sym, s2_sym, m2_sym], o, profile=False)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\compile\function.py”, line 317, in function
output_keys=output_keys)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\compile\pfunc.py”, line 486, in pfunc
output_keys=output_keys)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\compile\function_module.py”, line 1841, in orig_function
fn = m.create(defaults)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\compile\function_module.py”, line 1715, in create
input_storage=input_storage_lists, storage_map=storage_map)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\gof\link.py”, line 699, in make_thunk
storage_map=storage_map)[:3]
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\gof\vm.py”, line 1091, in make_all
impl=impl))
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\gof\op.py”, line 955, in make_thunk
no_recycling)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\gof\op.py”, line 858, in make_c_thunk
output_storage=node_output_storage)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\gof\cc.py”, line 1217, in make_thunk
keep_lock=keep_lock)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\gof\cc.py”, line 1157, in __compile__
keep_lock=keep_lock)
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\gof\cc.py”, line 1609, in cthunk_factory
key = self.cmodule_key()
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\gof\cc.py”, line 1300, in cmodule_key
c_compiler=self.c_compiler(),
File “C:\Users\Roger\Anaconda3\lib\site-packages\theano\gof\cc.py”, line 1379, in cmodule_key_
np.core.multiarray._get_ndarray_c_version())
AttributeError: (‘The following error happened while compiling the node’, DotModulo(A, s, m, A2, s2, m2), ‘\n’, “module ‘numpy.core.multiarray’ has no attribute ‘_get_ndarray_c_version'”)

Reply
- Roger May 12, 2019 at 1:58 am #
  
  I followed all the steps to set up the environment but when I ran the code I got an attribute error ‘module ‘numpy.core.multiarray’ has no attribute ‘_get_ndarray_c_version”
  
  Reply
  - Jason Brownlee May 12, 2019 at 6:45 am #
    
    Perhaps try searching/posting on stackoverflow?
    
    Reply
- Jason Brownlee May 12, 2019 at 6:45 am #
  
  Ouch, perhaps numpy is not installed correctly?
  
  Reply
Roger May 12, 2019 at 8:34 pm #

No numpy 1.16.2 does not work with theano 1.0.3 as served up currently by Anaconda. I downgraded to numpy 1.13.0.

Reply
- Jason Brownlee May 13, 2019 at 6:46 am #
  
  Thanks Roger.
  
  Reply
Aditya May 21, 2019 at 5:02 pm #

Hi Jason,
Thanks for this amazing example!
What I observe in the example is the database used is purely numeric.
My doubt is:
How can the example be modified to handle categorical input?
Will it work if the inputs are One Hot Encoded?

Reply
- Jason Brownlee May 22, 2019 at 7:38 am #
  
  Yes, you can use a one hot encoding for your categorical input variables.
  
  Reply
  - Aditya May 31, 2019 at 3:41 pm #
    
    Can you please provide a good reference point for OHE in python?
    Thanks in advance! 🙂
    
    Reply
    - Jason Brownlee June 1, 2019 at 6:09 am #
      
      Sure:
      https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning/
      
      Reply
      - Aditya June 2, 2019 at 3:36 am #
        
        I read the link and it was helpful. Now, I have a doubt specific to my network.
        I have 3 categorical inputs which have different sizes. One has around 15 ‘categories’ while the other two have 5. So after I one hot encode each of them, do I have to make their sizes the same by padding? Or will it work as it is?
      - Jason Brownlee June 2, 2019 at 6:42 am #
        
        You can encode each variable and concatenate them together into one vector.
        
        Or you can have a model with one input for each variable and let the model concatenate them.
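
The first option can be sketched in plain Python as follows. It assumes the categories have already been integer-coded from 0, and the helper names `one_hot` and `encode_row` are made up for illustration. No padding is needed because each variable keeps its own width and the pieces are simply joined end to end.

```python
def one_hot(index, n_categories):
    """Return a list of zeros with a single 1 at position `index`."""
    if not 0 <= index < n_categories:
        raise ValueError("index out of range")
    vec = [0] * n_categories
    vec[index] = 1
    return vec


def encode_row(indices, sizes):
    """One hot encode each variable separately and concatenate the results."""
    encoded = []
    for index, size in zip(indices, sizes):
        encoded.extend(one_hot(index, size))
    return encoded


# Three categorical variables with 15, 5 and 5 categories (integer-coded).
sizes = [15, 5, 5]
row = encode_row([3, 0, 4], sizes)
print(len(row))  # 25
```

With this layout the network would take 25 inputs per row, so the first layer's input_dim would be 25.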
Sri June 17, 2019 at 7:29 pm #

Hi,

If there is one independent variable (say country) with more than 100 labels, how to resolve it.
I think only one hot encoding will not work including scaling.

Is there any alternative for it

Reply
- Jason Brownlee June 18, 2019 at 6:37 am #
  
  You can try:
  
  – integer encoding
  – one hot encoding
  – embedding
  
  Test each and see what works best for your specific dataset.
  
  Reply
MK June 21, 2019 at 7:05 pm #

Hi jason,

thanks a lot for your posts, helped me a lot.

1. How can I add confusion matrix?

2. How can I change learning rate?

Cheers Martin

Reply
- Jason Brownlee June 22, 2019 at 6:35 am #
  
  Add a confusion matrix:
  https://machinelearningmastery.com/custom-metrics-deep-learning-keras-python/
  
  Tune learning rate:
  https://machinelearningmastery.com/understand-the-dynamics-of-learning-rate-on-deep-learning-neural-networks/
  
  Reply
Guhan palanivel July 1, 2019 at 10:35 pm #

hi jason,
I have trained a neural network model with 6 months data and deployed at a remote site ,
when receiving the new data for upcoming months ,
is there any way to automatically update the model with addition of new training data ?

Reply
- Jason Brownlee July 2, 2019 at 7:31 am #
  
  Yes, perhaps the easiest way is to refit the model on the new data or on all available data.
  
  Reply
Shubham July 5, 2019 at 8:46 pm #

Hi jason,

I want to print the neural network score as a function of one of the variables. How do I do that?

Regards
Shubham

Reply
- Jason Brownlee July 6, 2019 at 8:35 am #
  
  Perhaps try a linear activation unit and a mse loss function?
  
  Reply
Maha Lakshmi July 17, 2019 at 7:37 pm #

Sir, I am working with sklearn.neural_network.MLPClassifier in Python. Now I want to give my own initial weights to the classifier. How do I do that? Please help me. Thanks in advance.

Reply
- Jason Brownlee July 18, 2019 at 8:25 am #
  
  Sorry, I don’t have an example of this.
  
  Perhaps try posting on stackoverflow?
  
  Reply
Maha Lakshmi July 18, 2019 at 4:09 pm #

Thank you for your response

Reply
Ron July 24, 2019 at 8:39 am #

Normalization of the data increases the accuracy into the 90s.
https://stackoverflow.com/questions/39525358/neural-network-accuracy-optimization

Reply
- Jason Brownlee July 24, 2019 at 2:19 pm #
  
  Thanks for sharing.
  
  Reply
Hammad July 29, 2019 at 6:12 pm #

Dear sir,

I would like to apply above shared example on arrays produced by “train_test_split” but it does not work, as these arrays are not in the form of numpy.

Let me give you the details, I have “XYZ” dataset. The dataset has the following specifications:

Total Images = 630
2500 features has been extracted from each image. Each feature has float type.
Total Classes = 7

Now, after processing the feature file, I have got results in the following variables:

XData: contains features data in two dimensional array form (rows: 630, columns: 2500)
YData: contain original labels of classes in one dimensional array form (rows: 630, column: 1)

So, by using the following code, I split the data set into train and testing data:

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(XData, YData, stratify=YData, test_size=0.25)

Now, I would like to apply the deep-learning examples shared on this blog on my dataset which is now in the form arrays, and generate output as prediction of testing data and accuracy.

Can you please let me know about it, which can work on the above arrays?

Reply
- Jason Brownlee July 30, 2019 at 6:05 am #
  
  Yes, the Keras model can operate on numpy arrays directly.
  
  Perhaps I don’t follow the problem that you’re having exactly?
  
  Reply
  - Hammad July 30, 2019 at 6:01 pm #
    
    Dear sir,
    
    Thanks, I converted my arrays into numpy format.
    
    Now, I have followed your tutorial on multi-classification problem (https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/) and use the following code:
    
    ############################################################
    import pandas
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.wrappers.scikit_learn import KerasClassifier
    from keras.utils import np_utils
    from sklearn.model_selection import cross_val_score
    from sklearn.model_selection import KFold
    from sklearn.preprocessing import LabelEncoder
    from sklearn.pipeline import Pipeline
    from sklearn.metrics import accuracy_score
    
    seed=5
    totalclasses=7 # Class Labels are: ‘p1’, ‘p2’, ‘p3’, ‘p4’, ‘p5’, ‘p6’, ‘p7′
    totalimages=630
    totalfeatures=2500 #features generated from images
    
    # Data has been imported from feature file, which results two arrays XData and YData
    # XData contains features dataset without numpy array form
    # YData contains labels without numpy array form
    
    # encode class values as integers
    encoder = LabelEncoder()
    encoder.fit(YData)
    encoded_Y = encoder.transform(YData)
    # convert integers to dummy variables (i.e. one hot encoded)
    dummy_y = np_utils.to_categorical(encoded_Y)
    
    # define baseline model
    def baseline_model():
        # create model
        model = Sequential()
        model.add(Dense(8, input_dim=totalfeatures+1, activation='relu'))
        model.add(Dense(totalclasses, activation='softmax'))
        # Compile model
        model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model
    
    estimator = KerasClassifier(build_fn=baseline_model, nb_epoch=200, batch_size=5, verbose=0)
    
    x_train, x_test, y_train, y_test = train_test_split(XData, dummy_y, test_size=0.25, random_state=seed)
    
    x_train = np.array(x_train)
    x_test = np.array(x_test)
    y_train = np.array(y_train)
    y_test = np.array(y_test)
    
    estimator.fit(x_train, y_train)
    predictions = estimator.predict(x_test)
    
    print(predictions)
    print(encoder.inverse_transform(predictions))
    
    ########################################################
    
    The code generates no syntax error.
    
    Now, I would like to ask:
    
    1. Have I applied the deep learning (Neural Network Model) in the right way?
    2. How could I calculate the accuracy, confusion matrix, and classification_report?
    3. Can you please suggest what other type of deep learning algorithms could I apply on this type of problem?
    
    After applying different deep learning algorithm, I would like to compare their accuracies such as, you did in tutorial https://machinelearningmastery.com/machine-learning-in-python-step-by-step/, by plotting graphs.
    
    Reply
    - Jason Brownlee July 31, 2019 at 6:46 am #
      
      Sorry, I don’t have the capacity to review your code.
      
      This post shows how to calculate metrics:
      https://machinelearningmastery.com/how-to-calculate-precision-recall-f1-and-more-for-deep-learning-models/
      
      I recommend testing a suite of methods in order to discover what works best for your specific dataset:
      https://machinelearningmastery.com/faq/single-faq/what-algorithm-config-should-i-use
      
      Reply
Tyson September 3, 2019 at 10:07 pm #

Hi Jason,
Great tutorial. I am now trying new data sets from the UCI archive. However I am running into problems when the data is incomplete. Rather than a number there is a ‘?’ indicating that the data is missing or unknown. So I am getting
ValueError: could not convert string to float: ‘?’

Is there a way to ignore that data? I am sure many data sets have this issue where pieces are missing.

Thanks in advance!

Reply
- Jason Brownlee September 4, 2019 at 5:58 am #
  
  Yes, you can replace missing data with the mean or median of the variable – at least as a starting point.
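
A minimal sketch of mean imputation in plain Python is shown below. It assumes the rows have been read as strings and that '?' marks a missing value, as in the error message above; the function name `impute_column_means` and the sample rows are made up for illustration.

```python
def impute_column_means(rows, missing="?"):
    """Replace missing markers with the mean of the observed values per column.

    rows is a list of lists of strings, as read from a CSV file.
    Returns a new list of lists of floats.
    """
    n_cols = len(rows[0])
    means = []
    for col in range(n_cols):
        observed = [float(r[col]) for r in rows if r[col] != missing]
        if not observed:
            raise ValueError("column %d has no observed values" % col)
        means.append(sum(observed) / len(observed))
    return [
        [means[col] if r[col] == missing else float(r[col]) for col in range(n_cols)]
        for r in rows
    ]


# Made-up rows for illustration only.
raw = [["1.0", "10"], ["?", "20"], ["3.0", "?"]]
print(impute_column_means(raw))  # [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0]]
```

When the data is split into train and test sets, it is usually safer to compute the means on the training rows only and reuse them for the test rows.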
  
  Reply
Srinu September 10, 2019 at 9:07 pm #

Can you provide GUI code for the same data like calling the ANN model from a website or from android application.

Reply
- Jason Brownlee September 11, 2019 at 5:33 am #
  
  I don’t see why not.
  
  Reply
Hemanth Kumar September 20, 2019 at 12:58 pm #

dear sir
ValueError: Error when checking input: expected conv2d_5_input to have 4 dimensions, but got array with shape (250, 250, 3)
I am getting this error

what steps I did
original_image->resized to same resolution->converted to numpy array ->saved and loaded to x_train -> fed into network model ->modal.fit(x_train .. getting this error

Reply
- Jason Brownlee September 20, 2019 at 1:42 pm #
  
  Perhaps start with this tutorial for image classification:
  https://machinelearningmastery.com/how-to-develop-a-convolutional-neural-network-to-classify-photos-of-dogs-and-cats/
  
  Reply
Hemanth Kumar September 20, 2019 at 3:14 pm #

thanks for response sir 🙂
after that I am getting list index out of range error at model.fit

Reply
- Jason Brownlee September 21, 2019 at 6:43 am #
  
  I’m sorry to hear that, I have some suggestions here that may help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply

Anthony The Koala September 26, 2019 at 2:58 am #

Dear Dr Jason,
Thank you for this tutorial.
I have been playing around with the number of layers and the number of neurons.
In the current code

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

I have played around with increasing the numbers in the first layer:

model = Sequential()
model.add(Dense(100, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

The result is that the accuracy didn’t improve much.
There was an improvement in the addition of layers.
When each layer had say a large number of neurons, the accuracy improved.
This is not the only example, but playing around with the following code:

model = Sequential()
model.add(Dense(200, input_dim=8, activation='relu'))
model.add(Dense(800, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(400, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

The accuracy achieved was 91.1%

I added two more layers

model = Sequential()
model.add(Dense(200, input_dim=8, activation='relu'))
model.add(Dense(800, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(400, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(400, activation='relu'))
model.add(Dense(800, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

The accuracy dropped slightly to 88%

From these brief experiments, increasing the number of neurons as in your first example did not increase accuracy.
However adding more layers especially with a large number of neurons did increase the accuracy to about 91%
BUT if there are too many layers there is a slight drop in accuracy to 88%.

My question is there a way to increase the accuracy any further than 91%?

Thank you,
Anthony of Sydney

Jason Brownlee September 26, 2019 at 6:45 am #

If this is the pima indians dataset, then the best accuracy is about 78% via 10-fold cross validation, anything more is probably overfitting.

Yes, I have tons of tutorials on diagnosing issues with models and lifting performance, you can start here:
https://machinelearningmastery.com/start-here/#better
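
For readers unfamiliar with the 10-fold cross validation mentioned above, here is a rough sketch of how the rows are partitioned. It is plain Python rather than a library call, and the function name `k_fold_indices` is made up for illustration. The 768 rows match the size of the Pima Indians dataset.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k folds and yield (train, test) index lists.

    Fold sizes differ by at most one when n_samples is not divisible by k.
    """
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 768 rows, as in the Pima Indians dataset; shuffle the rows first in practice.
folds = list(k_fold_indices(768, 10))
print([len(test) for _, test in folds])
```

A fresh model is trained on each train split and scored on the matching test split, and the 10 scores are averaged. Because every row is held out exactly once, the average is a fairer estimate than accuracy measured on the training data.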

Reply

Anthony The Koala September 26, 2019 at 6:05 am #

Dear Dr Jason,
Further experimentation, I played with the following code

model = Sequential()
model.add(Dense(25, input_dim=8, activation='relu'))
model.add(Dense(89, activation='relu'))
model.add(Dense(377, activation='relu'))
model.add(Dense(233, activation='relu'))
model.add(Dense(55, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

I obtained an accuracy of 95% by playing around with the number of neurons increasing then decreasing.
I cannot work out a systematic way of improving the accuracy.

Thank you,
Anthony of Sydney

Jason Brownlee September 26, 2019 at 6:46 am #

Haha, yes. That is the great open problem with neural nets (no good theories for how to configure them) and why we must use empirical methods.

Reply

Anthony The Koala September 26, 2019 at 1:57 pm #

Dear Dr Jason,
thank you for those replies.

Yes, it was the Pima Indian dataset that is covered in this tutorial.

Before I indulge in further readings on 10-fold cross validation, please briefly answer:
* what is the meaning of overfit.
* why is an accuracy of 96% regarded as overfit.

To do:
Play around with simple functions and play around with this tutorial and then look at overfitting:
For example suppose we have x = 0, 1, 2, 3, 4, 5 and f(x) = x^2


x : 0, 1, 2 , 3, 4, 5
f(x) : 0, 1, 4, 9, 16, 25

The aim:
* to see if there is an accurate mapping of the function of x and f(x) for x = 0..5
* to see what happens when we predict for x = 6, 7, 8. Will it be 36, 49, 64?
* to ask whether such a thing as overfitting the model exists.

Thank you,
Anthony of Sydney

Reply
- Jason Brownlee September 27, 2019 at 7:43 am #
  
  Overfit means better performance on the training set at the cost of performing worse on the test set.
  
  It can also mean better performance on a test/validation set at the cost of worse performance on new data.
  
  I know from experience that the limit on that dataset is 77-78% after having worked with it in tutorials for about 20 years.
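
The first definition can be illustrated with a toy example. This is only a sketch: the data is synthetic, the label noise is deliberately planted, and a nearest-neighbor "memorizer" stands in for an oversized neural network. All names here are made up for illustration.

```python
def true_label(x):
    """Underlying rule: class 1 for x >= 50, else class 0."""
    return 1 if x >= 50 else 0


def noisy_label(x):
    """Observed label: the true label, flipped for every multiple of 7."""
    label = true_label(x)
    return 1 - label if x % 7 == 0 else label


def predict_memorizer(train, x):
    """Predict with the label of the closest training point (memorizes noise)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def accuracy(train, data):
    """Fraction of points in data whose prediction matches the observed label."""
    correct = sum(1 for x, y in data if predict_memorizer(train, x) == y)
    return correct / len(data)


train = [(x, noisy_label(x)) for x in range(0, 100, 2)]  # even x values
test = [(x, noisy_label(x)) for x in range(1, 100, 2)]   # odd x values

print("train accuracy: %.2f" % accuracy(train, train))
print("test accuracy: %.2f" % accuracy(train, test))
```

The memorizer reproduces every training label, noise included, so it scores perfectly on the data it has seen and noticeably worse on the held-out points. A very high accuracy measured on the training data alone says little about new data for the same reason.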
  
  Reply
Andrey September 29, 2019 at 8:32 pm #

Hi Jason,

I see the data is not divided for that of training and for the test. Why is that? What does prediction mean in this case?

Andrey

Reply
- Jason Brownlee September 30, 2019 at 6:07 am #
  
  It might mean that the result is a little optimistic.
  
  I did that to keep this example very simple and easy to follow.
  
  Reply

Anthony The Koala September 29, 2019 at 9:00 pm #

Dear Dr Jason,
I tried to do the same for a deterministic model of x and fx where x = [0,1,2,3,4,5] and fx = x**2
I want to see how machine learning operates with a deterministic function.
However I am only getting 16.67% accuracy.
Here is the code based on this tutorial

from keras.models import Sequential
from keras.layers import Dense
import numpy as np

#Aim is to see how a deterministic function will operate using machine learning
#In year 7 algebra we have x and y. y is known as f(x). 

#So here we aim to have a structure of [indep var, dep var]
#that is [x, fx]

#Making a 2D (like) list
x = [i for i in range(6)]; # have a list of x = [0,1,2,3,4,5]
#x = np.array(x) 

fx = [x**2 for x in x]; # have a list of fx = [0,1,4,9,16,25]
#fx = np.array(fx) 

model = Sequential()
model.add(Dense(100, input_dim=1,activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(1, activation='relu'))


model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x,fx,epochs=150, batch_size=2,verbose=0)
_,accuracy = model.evaluate(x,fx)
print('Accuracy: %.2f' % (accuracy*100))

We know that fx = x**2 is predictable. What do I need to do?

Thank you,
Anthony of Sydney

Jason Brownlee September 30, 2019 at 6:10 am #

Perhaps you need hundreds of thousands of examples?

And perhaps the model will need to be tuned for your problem, e.g. perhaps using mse loss and a linear activation function in the output layer because it is a regression problem.

Anthony The Koala October 1, 2019 at 5:15 am #

Dear Dr Jason,
I tried with mse-loss and linear activation function and still only obtained 1% accuracy.

from keras.models import Sequential
from keras.layers import Dense
import numpy as np

#Aim is to see how a deterministic function will operate using machine learning
#In year 7 algebra we have x and y. y is known as f(x). 

x = [i for i in range(100)]; # have a list of x = [0,1,2,...,99]
x = np.array(x)

fx = [x**2 for x in x]; # have a list of fx = [0,1,4,...,9801]
fx = np.array(fx)

model = Sequential()
model.add(Dense(12, input_dim=1,activation='linear'))
model.add(Dense(33, activation='linear')) 
model.add(Dense(1, activation='linear')) 


#model.compile(loss='mean_squared_error', optimizer='softmax', metrics=['accuracy'])
model.compile(loss='mean_squared_error', optimizer='sgd')
#model.compile(loss='mean_squared_error')

model.fit(x,fx,epochs=10, batch_size=1,verbose=0)
_,accuracy = model.evaluate(x,fx)
print('Accuracy: %.2f' % (accuracy*100))

However I get this:

 32/100 [========>.....................] - ETA: 0s
100/100 [==============================] - 0s 312us/step
Traceback (most recent call last):
  File "C:\Python36\deterministicII.py", line 25, in <module>
    _,accuracy = model.evaluate(x,fx)
TypeError: 'float' object is not iterable

I want to map a deterministic function to see if machine learning will work out f(x) without the formula.

Jason Brownlee October 1, 2019 at 7:00 am #

Accuracy is not a valid metric for regression problems:
https://machinelearningmastery.com/faq/single-faq/how-do-i-calculate-accuracy-for-regression

You are very close!

Also, try a much larger dataset of examples. Hundreds or thousands.
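
To make the point about accuracy concrete, here is a minimal sketch using NumPy only. The prediction values are made up for illustration and are not from any run in this thread. It shows that exact-match accuracy is zero for real-valued predictions even when they are close to the targets, and computes MSE, RMSE and MAE instead. Separately, Keras's model.evaluate() returns a single float loss when the model is compiled without metrics, which is consistent with the 'float' object is not iterable error raised when that value is unpacked into two names.

```python
import numpy as np

# Hypothetical targets and predictions for y = x**2 (illustrative values only).
y_true = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0])
y_pred = np.array([0.2, 0.9, 4.3, 8.8, 16.1, 24.6])

# Exact-match "accuracy" is meaningless for real-valued targets:
# no prediction equals its target exactly, so the score is 0.
accuracy = np.mean(y_pred == y_true)

# Regression error measures instead.
errors = y_pred - y_true
mse = np.mean(errors ** 2)
rmse = np.sqrt(mse)
mae = np.mean(np.abs(errors))

print('accuracy=%.2f mse=%.4f rmse=%.4f mae=%.4f' % (accuracy, mse, rmse, mae))
```

For a regression model, one of these error measures (rather than accuracy) is what you would report and compare between runs.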

Anthony The Koala October 1, 2019 at 10:18 am #

Dear Dr Jason,
I removed the model.evaluate from the program. BUT still I have not got a satisfactory match of the expected and actual values.

from keras.models import Sequential
from keras.layers import Dense
import numpy as np

#Aim is to see how a deterministic function will operate using machine learning
#In year 7 algebra we have x and y. y is known as f(x). 

x = [i for i in range(100)]; # have a list of x = [0,1,2,...,99]
x = np.array(x)

fx = [x**2 for x in x]; # have a list of fx = [0,1,4,...,9801]
fx = np.array(fx)

model = Sequential()
model.add(Dense(100, input_dim=1,activation='linear'))
model.add(Dense(100, activation='linear')) 
model.add(Dense(1, activation='linear')) 
model.compile(loss='mean_squared_error', optimizer='adam')

model.fit(x,fx,epochs=1000, batch_size=1000,verbose=0)

#Removing the model.evaluate code

predictions = model.predict_classes(x)
for i in range(6):
    print('%s => %d (expected %d)' % (x[i],predictions[i],fx[i]))

Output

0 => 0 (expected 0)
1 => 0 (expected 1)
2 => 0 (expected 4)
3 => 0 (expected 9)
4 => 1 (expected 16)
5 => 1 (expected 25)

Not yet getting a match of the expected and the actual values

Thank you,
Anthony of Sydney

Jason Brownlee October 1, 2019 at 2:17 pm #

Perhaps the model architecture (layers and nodes) needs tuning?
Perhaps the learning rate needs tuning?
Perhaps you need more training examples?
Perhaps you need more or fewer epochs?
…

More ideas here:
https://machinelearningmastery.com/start-here/#better

Anthony The Koala October 2, 2019 at 7:41 am #

Dear Dr Jason,
I cannot find a systematic way to get a machine learning algorithm to compute a deterministic equation such as y = f(x) where f(x) = x**2.

I am still having trouble. I will be posting this on the page. Essentially, I have tried (i) adding/dropping layers, (ii) adjusting the number of epochs, and (iii) adjusting the batch_size. But I haven’t come close yet.

I am also now using the function model.predict rather than model.predict_classes.

Here is the program with most of the commented out lines deleted.

from keras.models import Sequential
from keras.layers import Dense
import numpy as np

#Aim is to see how a deterministic function will operate using machine learning
#In year 7 algebra we have x and y. y is known as f(x). Here y = f(x) = x**2 

x = [i  for i in range(100)]; # have a list of x = [0,1,2,3,4,5,.....,99]
x = np.array(x)   

fx = [x**2  for x in x]; # have a list of fx = x**2 = [0,1,4,9,16,25,...,9801] 
fx = np.array(fx) 

model = Sequential()
model.add(Dense(55, input_dim=1,activation='linear'))
model.add(Dense(34, activation='linear')) 
model.add(Dense(21, activation='linear'))
model.add(Dense(13, activation='linear'))
model.add(Dense(1, activation='linear')) 

model.compile(loss='mean_squared_error', optimizer='adam')

model.fit(x,fx,epochs=89, batch_size=144,verbose=0) 

predictions = model.predict(x); #This seems to work instead of model.predict_classes

print("x, predicted, expected")
for i in range(6):
    print('%s => %d (expected %d)' % (x[i],predictions[i],fx[i]))

The output is:

x, predicted, expected
0 => 29 (expected 0)
1 => 110 (expected 1)
2 => 191 (expected 4)
3 => 272 (expected 9)
4 => 353 (expected 16)
5 => 434 (expected 25)

No matter how much I adjust the number of neurons per layer, the number of layers, the number of epochs and the batch size, the “predicted” values look like an arithmetic progression, not the quadratic sequence I expect.

Note that the difference t(n+1) – t(n) is 81 for all the predicted values in the machine learning model.

BUT we know that the difference between successive terms in y = f(x) is not constant.

For example, in a non-linear relation such as f(x) = x**2, f(x) = 0, 1, 4, 9, 16, 25, 36, the differences between the terms are 1, 3, 5, 7, 9, 11, that is t(n+1) – t(n) != t(n+2) – t(n+1).

So I am still having trouble working out how to get a machine learning algorithm to evaluate f(x) without the formula.
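
A note on the constant difference: a stack of Dense layers that all use the 'linear' activation is mathematically a single straight line, however many layers or neurons it has, so its outputs for evenly spaced inputs always form an arithmetic progression. The sketch below illustrates this with NumPy only; the layer sizes and random weights are arbitrary assumptions chosen for illustration, not the trained weights from the program above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three Dense-style layers with linear activation: h = x @ W + b.
# Sizes and weights are arbitrary, chosen only for illustration.
W1, b1 = rng.normal(size=(1, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 5)), rng.normal(size=5)
W3, b3 = rng.normal(size=(5, 1)), rng.normal(size=1)

def linear_net(x):
    """Forward pass through the three linear layers."""
    return ((x @ W1 + b1) @ W2 + b2) @ W3 + b3

# The same network collapsed into a single slope and intercept.
slope = W1 @ W2 @ W3                      # shape (1, 1)
intercept = (b1 @ W2 + b2) @ W3 + b3      # shape (1,)

x = np.arange(6, dtype=float).reshape(-1, 1)
out = linear_net(x)

# Successive differences are constant, as in an arithmetic progression.
diffs = np.diff(out[:, 0])
print(diffs)
```

A non-linear activation such as 'relu' in the hidden layers is what lets the network bend away from a straight line.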

Jason Brownlee October 2, 2019 at 8:15 am #

Here is the solution, hope it helps

# fit an mlp on x vs x^2
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from numpy import asarray
from matplotlib import pyplot
# define data
x = asarray([i for i in range(1000)])
y = asarray([a**2 for a in x])
# reshape into rows and cols
x = x.reshape((len(x), 1))
y = y.reshape((len(y), 1))
# scale data
x_s = MinMaxScaler()
x = x_s.fit_transform(x)
y_s = MinMaxScaler()
y = y_s.fit_transform(y)
# fit a model
model = Sequential()
model.add(Dense(10, input_dim=1, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(x, y, epochs=150, batch_size=10, verbose=0)
mse = model.evaluate(x, y, verbose=0)
print(mse)
# predict
yhat = model.predict(x)
# plot real vs predicted
pyplot.plot(x,y,label='y')
pyplot.plot(x,yhat,label='yhat')
pyplot.legend()
pyplot.show()

I guess you could also do an inverse_transform() on the predicted values to get back to original units.

Anthony The Koala October 2, 2019 at 9:05 am #

Dear Dr Jason,
Thank you very much for your reply. I got an mse in the order of 3 x 10**-6.

Despite this, I will be studying the program to teach myself about (i) the MinMaxScaler and why we use it, (ii) fit_transform(y), (iii) the single hidden layer of 10 neurons, and (iv) the choice of activation function and loss function. The Keras website has a section on loss functions at https://keras.io/losses/ but, looking at from keras import losses in the Python “IDLE” program, there are many more loss functions available for compiling a model.

In addition, the predicted values will have to be converted back to their unscaled values. So I will also look up ‘rescaling’.

Thank you again,
Anthony, Sydney NSW

Jason Brownlee October 2, 2019 at 10:10 am #

Yes, you can use inverse_transform to unscale the predictions, as I mentioned.

Anthony The Koala October 3, 2019 at 6:26 am #

Dear Dr Jason,
I know how to use the inverse_transform function.
First, apply the MinMaxScaler to scale the data to the range 0 to 1:

x_s = MinMaxScaler()
x = x_s.fit_transform(x)
y_s = MinMaxScaler()
y = y_s.fit_transform(y)

If we want to reconstitute x and y, it is simple to:

x_original = x_s.inverse_transform(x); # where x was transformed/scaled
y_original = y_s.inverse_transform(y);# where y was transformed/scaled

x_s and y_s store the min and max values of the original, pre-transformed data.

BUT how do you transform yhat back to its original scale when it was never passed through the scaler in the first place?

If I rely on y_s.inverse_transform(yhat), I get this:

yhat_restored = y_s.inverse_transform(yhat); #using the values of ymin and ymax of original data
yhat_restored[0:10]
array([[6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43]], dtype=float32)

I was ‘hoping’ for something close to the original:

>>> y_restored = y_s.inverse_transform(y)
>>> y_restored[0:10]
array([[ 0.],
       [ 1.],
       [ 4.],
       [ 9.],
       [16.],
       [25.],
       [36.],
       [49.],
       [64.],
       [81.]])

BUT yhat does not use the MinMaxScaler at the start.

Do I have to rewrite my own function?

Thanks,
Anthony of Sydney NSW

Jason Brownlee October 3, 2019 at 6:54 am #

The model predicts scaled values, apply the inverse transform on yhat directly.
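
To illustrate why yhat's own min and max are not needed, here is a minimal sketch using NumPy only. The three helper functions are hypothetical stand-ins that mirror the min-max formula used by MinMaxScaler with its default 0 to 1 range, and the "prediction" is the scaled target plus a made-up error of 0.001, not real model output.

```python
import numpy as np

def fit_min_max(values):
    """Return the (min, max) a min-max scaler would store for this column."""
    return float(values.min()), float(values.max())

def transform(values, lo, hi):
    """Scale to the range 0..1 using the stored min and max."""
    return (values - lo) / (hi - lo)

def inverse_transform(scaled, lo, hi):
    """Map scaled values back to original units using the same min and max."""
    return scaled * (hi - lo) + lo

y = np.array([i ** 2 for i in range(100)], dtype=float)
y_lo, y_hi = fit_min_max(y)          # 0.0 and 9801.0, learned from y only
y_scaled = transform(y, y_lo, y_hi)

# Stand-in for model.predict(): scaled targets plus a small made-up error.
yhat_scaled = y_scaled + 0.001

# yhat is unscaled with y's stored min and max; yhat's own range is never used.
yhat = inverse_transform(yhat_scaled, y_lo, y_hi)
print(yhat[:3])
```

Note that a scaled error of only 0.001 becomes about 9.8 in original units here, because the inverse transform multiplies by the range of y.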

Anthony The Koala October 3, 2019 at 2:39 pm #

Dear Dr Jason,
I did apply the inverse transform to yhat directly, BUT GOT the result below.
Cut-down version of the code:

y_s = MinMaxScaler()
y = y_s.fit_transform(y); #y_s stores the min and max values according to the sklearn doc
                                      #Note the above is for y. WE DON'T KNOW yhat(min) & yhat(max)


yhat = model.predict(x);  #we have the scaled estimate.


x_original = x_s.inverse_transform(x); #this printed okay

#Printout of yhat transformed
#Calculate yhat scaled using the min and max values of f(x) = y

yhat_restored = y_s.inverse_transform(yhat)

#Print yhat
print(yhat_restored[0:10])
array([[6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43],
       [6838.43]], dtype=float32)

I don’t understand how to get an inverse transform of yhat when I don’t know the ‘untransformed’ value because I have not estimated it.

Thank you,
Anthony of Sydney

Jason Brownlee October 4, 2019 at 5:39 am #

You can inverse transform y and yhat and plot both.

Anthony The Koala October 4, 2019 at 3:20 am #

Dear Dr Jason,
I tried it again to illustrate that, although the scaled predicted values fit a parabola that matches the scaled expected values of f(x), the values obtained when ‘unscaling’ back to the original units seem quite absurd.
Relevant code:

# plot real vs predicted
pyplot.plot(x,y,label='y')
pyplot.plot(x,yhat,label='yhat')
pyplot.legend()
print("The graph of the (x, predicted f(x) and (x, f(x) is on a separate window")
pyplot.show()
y_predicted = y_s.inverse_transform(yhat)
y_expected = y_s.inverse_transform(y)

x_original = x_s.inverse_transform(x)
#print(y_predicted[0:10,].tolist(), x_original[0:10,].tolist())
print("Printing the first 10, predicted, expected, and x")
for i in range(10):
    print(y_predicted[i], y_expected[i], x_original[i])


print("let's try some other arbitrary section, say 10:20")
#print(y_predicted[9:21,].tolist(),y_predicted[9:21,].tolist(), x_original[9:21,].tolist())
print("printing 10th to 20th, predicted, expected, and x")
for i in range(10):
    print(y_predicted[i+10],y_expected[i+10], x_original[i+10])

The resulting output:

5.472537487406726e-06
Printing the first 10, predicted, expected, and x
[1030.0833] [0.] [0.]
[1030.0833] [1.] [1.]
[1030.0833] [4.] [2.]
[1030.0833] [9.] [3.]
[1030.0833] [16.] [4.]
[1030.0833] [25.] [5.]
[1030.0833] [36.] [6.]
[1030.0833] [49.] [7.]
[1030.0833] [64.] [8.]
[1030.0833] [81.] [9.]
let's try some other arbitrary section, say 10:20
printing 10th to 20th, predicted, expected, and x
[1030.0833] [100.] [10.]
[1030.0833] [121.] [11.]
[1030.0833] [144.] [12.]
[1030.0833] [169.] [13.]
[1030.0833] [196.] [14.]
[1030.0833] [225.] [15.]
[1030.0833] [256.] [16.]
[1030.0833] [289.] [17.]
[1030.0833] [324.] [18.]
[1030.0833] [361.] [19.]

When I plotted (x, yhat) and (x,f(x)), the plot was as expected. BUT when I rescaled the yhat back, all the values of unscaled yhat were 1030.0833 which is quite odd.

Why?

Thank you,
Anthony of Sydney NSW

Anthony The Koala October 4, 2019 at 3:31 am #

Dear Dr Jason,
I printed the yhat, and they were all the same.

This is despite the fact that the plot of the scaled values (x, yhat) looked like a parabola.
Note: this is prior to reversing the scaling.

# plot real vs predicted
pyplot.plot(x,y,label='y')
pyplot.plot(x,yhat,label='yhat')
pyplot.legend()
print("the graph is printed on another window")
pyplot.show()

print("Printing the output of the scaled values of yhat, f(x) and x")
print("printing the first 10")
for i in range(10):
    print(yhat[i],y[i],x[i])
print("printing the 10th to 20th")
for i in range(10):
    print(yhat[i+10],y[i+10],x[i+10])

Yet despite the expected plots of the scaled values (x, yhat) and (x, y), yhat’s values are all the same:

Printing the output of the scaled values
printing the first 10
[0.00117336] [0.] [0.]
[0.00117336] [1.002003e-06] [0.001001]
[0.00117336] [4.00801202e-06] [0.002002]
[0.00117336] [9.01802704e-06] [0.003003]
[0.00117336] [1.60320481e-05] [0.004004]
[0.00117336] [2.50500751e-05] [0.00500501]
[0.00117336] [3.60721081e-05] [0.00600601]
[0.00117336] [4.90981472e-05] [0.00700701]
[0.00117336] [6.41281923e-05] [0.00800801]
[0.00117336] [8.11622433e-05] [0.00900901]
printing the 10th to 20th
[0.00117336] [0.0001002] [0.01001001]
[0.00117336] [0.00012124] [0.01101101]
[0.00117336] [0.00014429] [0.01201201]
[0.00117336] [0.00016934] [0.01301301]
[0.00117336] [0.00019639] [0.01401401]
[0.00117336] [0.00022545] [0.01501502]
[0.00117336] [0.00025651] [0.01601602]
[0.00117336] [0.00028958] [0.01701702]
[0.00117336] [0.00032465] [0.01801802]
[0.00117336] [0.00036172] [0.01901902]

I don’t get it. You would expect a similarity between yhat and f(x).

I would appreciate a response

Thank you,
Anthony of Sydney

Jason Brownlee October 4, 2019 at 5:49 am #

Sorry, I don’t have the capacity to debug your examples further. I hope that you can understand.

Anthony The Koala October 4, 2019 at 6:36 am #

Dear Dr Jason,
I asked the question at https://datascience.stackexchange.com/questions/61223/reconstituting-estimated-predicted-values-to-original-scale-from-minmaxscaler and hope that there is an answer.
Thanks
Anthony Of Sydney

Jason Brownlee October 4, 2019 at 8:35 am #

Here is the solution

# fit an mlp on x vs x^2
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from numpy import asarray
from matplotlib import pyplot
# define data
x = asarray([i for i in range(1000)])
y = asarray([a**2 for a in x])
# reshape into rows and cols
x = x.reshape((len(x), 1))
y = y.reshape((len(y), 1))
# scale data
x_s = MinMaxScaler()
x = x_s.fit_transform(x)
y_s = MinMaxScaler()
y = y_s.fit_transform(y)
# fit a model
model = Sequential()
model.add(Dense(10, input_dim=1, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(x, y, epochs=150, batch_size=10, verbose=0)
mse = model.evaluate(x, y, verbose=0)
print(mse)
# predict
yhat = model.predict(x)
# inverse transforms
x = x_s.inverse_transform(x)
y = y_s.inverse_transform(y)
yhat = y_s.inverse_transform(yhat)
# plot real vs predicted
pyplot.plot(x,y,label='y')
pyplot.plot(x,yhat,label='yhat')
pyplot.legend()
pyplot.show()

The three missing lines were:

# inverse transforms
x = x_s.inverse_transform(x)
y = y_s.inverse_transform(y)
yhat = y_s.inverse_transform(yhat)

Anthony The Koala October 4, 2019 at 9:59 am #

Dear Dr Jason,
I am coming to the conclusion that there must be a bug that is NOT in your solution and not in my solution either. I think it is coming from a bug in the lower-level implementation of the language.

I printed the scaled version of yhat, f(x) actual and x and got this.
NOTE the values are the same for the scaled version of yhat.
That is:

model.fit(x, y, epochs=277, batch_size=200, verbose=0)
mse = model.evaluate(x, y, verbose=0)
print("the value of the mse")
print(mse)
# predict
yhat = model.predict(x)

DESPITE the successful plot of (x, yhat) and (x, f(x)),
the first 10 values of the scaled output of yhat are all the same.

That is, we would get a FLAT LINE if we plotted (x, yhat), BUT THE PLOT WAS A PARABOLA.

[0.00161531] [0.] [0.]
[0.00161531] [1.002003e-06] [0.001001]
[0.00161531] [4.00801202e-06] [0.002002]
[0.00161531] [9.01802704e-06] [0.003003]
[0.00161531] [1.60320481e-05] [0.004004]
[0.00161531] [2.50500751e-05] [0.00500501]
[0.00161531] [3.60721081e-05] [0.00600601]
[0.00161531] [4.90981472e-05] [0.00700701]
[0.00161531] [6.41281923e-05] [0.00800801]
[0.00161531] [8.11622433e-05] [0.00900901]

When we did the following transforms:

x = x_s.inverse_transform(x)
y = y_s.inverse_transform(y)
yhat = y_s.inverse_transform(yhat)

WE STILL GOT THE SAME FAULT FOR THE UNSCALED VALUES of yhat. The 2nd column is f(x) and third column is x.

[1612.0857] [0.] [0.]
[1612.0857] [1.] [1.]
[1612.0857] [4.] [2.]
[1612.0857] [9.] [3.]
[1612.0857] [16.] [4.]
[1612.0857] [25.] [5.]
[1612.0857] [36.] [6.]
[1612.0857] [49.] [7.]
[1612.0857] [64.] [8.]
[1612.0857] [81.] [9.]

Conclusion: It is not a programming bug in either your solution or my solution. I believe it may be a lower-level implementation problem.

Why am I ‘persistent’ in this matter? Because when I have more complex models, I want to see the predicted/yhat values rescaled to the original units.

I don’t know if there are people at stackexchange who may have an insight.

I appreciate your time, many blessings to you,

Anthony of Sydney

Jason Brownlee October 6, 2019 at 8:05 am #

I believe it is correct. Given that the target is a quadratic that grows quickly, the model has decided that it can give up correctness at the low end for correctness at the high end, given the reduction in MSE.

Consider changing the number of examples from 1K to 100, then review all 100 values manually – you’ll see what I mean.
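
A rough way to check this reasoning, sketched with NumPy only: take the 1,000 scaled targets, assume the prediction is flat at 0.00117336 over the first 20 points (the value copied from the printout earlier in this thread), and measure how much those points add to the overall MSE. The choice of 20 points is an assumption made for illustration.

```python
import numpy as np

n = 1000
x = np.arange(n, dtype=float)
y = x ** 2
y_scaled = (y - y.min()) / (y.max() - y.min())   # same formula as min-max scaling

# Flat prediction over the first 20 points, value copied from the printout above.
flat_value = 0.00117336
first = 20
sq_err = (flat_value - y_scaled[:first]) ** 2

# Share of the overall MSE (a mean over all n points) caused by those 20 points.
mse_contribution = sq_err.sum() / n

# The same flat value expressed in original units.
flat_in_original_units = flat_value * (y.max() - y.min()) + y.min()

print('contribution to MSE: %.3e' % mse_contribution)
print('flat value in original units: %.1f' % flat_in_original_units)
```

Under these assumptions the flat points add only a few times 1e-08 to the MSE, a small fraction of the totals around 5e-06 reported in this thread, yet the same flat value is more than 1,000 in original units. The first 20 of 1,000 points also cover only 2% of the x-axis, so the plot can still look like a parabola.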

All of this is a good exercise, well done.

Anthony The Koala October 13, 2019 at 10:57 pm #

Dear Dr Jason,
I did this problem again and got very good results!
I cannot explain why I got accurate results when I expected to get inaccurate results, BUT they are certainly an improvement.

The rescaled original and fitted values produced an RMS of 0.0.

Here is the code with variable names changed slightly.

from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from numpy import asarray, sqrt
from matplotlib import pyplot

x = asarray([i for i in range(100)])
y = asarray([i**2 for i in x])
x = x.reshape((len(x),1))
y = y.reshape((len(y),1))

x_s = MinMaxScaler()
xscaled = x_s.fit_transform(x)
y_s = MinMaxScaler()
yscaled = y_s.fit_transform(y)

model = Sequential()
model.add(Dense(100, input_dim=1, activation = 'relu'))
model.add(Dense(1))
model.compile(loss='mse',optimizer='adam')
model.fit(xscaled,yscaled,epochs=150, batch_size=10,verbose=0)

mse= model.evaluate(xscaled,yscaled,verbose=0)
mse
2.9744908551947447e-05

yhat = model.predict(x)
yhat_original = y_s.inverse_transform(yscaled)

#First five elements of predicted values
yhat_original[:5].T
array([[ 0.,  1.,  4.,  9., 16.]])
#First five elements of original y
y[:5].T
array([[ 0,  1,  4,  9, 16]])

#Last five elements of the original series.
y[-5:].T
array([[9025, 9216, 9409, 9604, 9801]])
#Last five elements of predicted values
yhat_original[-5:].T
array([[9025., 9216., 9409., 9604., 9801.]])		      

#Now determining the RMS of the predicted and original values of y
cum_sum = 0
for i in range(len(yhat_original)):
	cum_sum += (y[i] - yhat_original[i])**2/len(yhat_original)
rms = sqrt(cum_sum)
rms[0]
0.0

#Plotting the rescaled original and rescaled yhat
pyplot.plot(x,y,label='y')
pyplot.plot(x,yhat_original,label='fitted')
pyplot.legend()
pyplot.show()

It works: the rescaled yhat is as expected, but I cannot explain why it was “cuckoo” in the previous attempt. More experimentation is needed on this.

Nevertheless, my next project is k-folds sampling on a deterministic function to see if the gaps in the resampled data fold will give us an accurate prediction despite the random sampling in each fold.

Thank you,
Anthony of Sydney

Anthony The Koala October 13, 2019 at 11:44 pm #

Dear Dr Jason,
Apologies, I thought the RMS was ‘unrealistic’. I had a programming error.
Nevertheless, I did it again, and still produced results which looked pleasing.

x = x.reshape((len(x),1))
y = y.reshape((len(y),1))
x_s = MinMaxScaler()
y_s = MinMaxScaler()
x_scaled = x_s.fit_transform(x)
y_scaled = y_s.fit_transform(y)
model = Sequential()
model.add(Dense(100,input_dim=1,activation='relu'))
model.add(Dense(1))

model.compile(loss='mse',optimizer='adam')
model.fit(x_scaled,y_scaled, epochs=100, batch_size=10,verbose=0)

mse = model.evaluate(x_scaled, y_scaled,verbose=0)
mse
1.0475558547113905e-05

yhat = model.predict(x_scaled)
yhat_original = y_s.inverse_transform(yhat)

#First five of yhat_original (yhat rescaled)
yhat_original[:5].T
array([[11.835742, 11.835742, 11.835742, 11.835742, 11.835742]])
#compared to first original 5 elements of y = 0,1,4,9,16

#Last five of yhat_original (yhat rescaled)
yhat_original[-5:].T
array([[8985.839, 9154.454, 9323.067, 9491.684, 9660.3  ]])
#compared to last original 5 elements of y = 9025, 9216, 9409, 9604, 9801

#Now determine the RMS of the predicted and original values
cum_sum = 0
for i in range(len(yhat_original)):
	cum_sum+= (y[i]-yhat_original[i])**2/len(yhat_original)
mse = sqrt(cum_sum)
mse
array([31.72189417])
pyplot.plot(x,y,label='y')
pyplot.plot(x,yhat_original,label='estimated')
pyplot.legend()
pyplot.show()
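
The loop above accumulates the squared errors one element at a time. As a hedged aside, the same root-mean-square error can be computed in a single vectorized NumPy expression. The helper name `rmse` and the toy `y_pred` values below are illustrative assumptions, not part of the original comment.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between two arrays of the same shape."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy data: y = x**2 and a made-up prediction that is off by a constant 3.
x = np.arange(100).reshape(-1, 1)
y = x ** 2
y_pred = y + 3.0

print(rmse(y, y_pred))
```

An RMSE of exactly 0.0 from a trained network is a hint that the two arrays being compared hold the same data, for example when the inverse transform is applied to the scaled targets rather than to the predictions.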

In sum, the rescaled yhat produced results closer to the original values, although the lowest values of the rescaled yhat look odd.

The values at the bottom end need to be more realistic, even though the plots of y against x and of the rescaled yhat against x look close.

More investigation is needed on the batch size, epochs and optimizers.

Next, to do k-folds sampling on a deterministic function to see if the gaps in the resampled data fold will give us an accurate prediction despite the random sampling in each fold.

Again apologies for the mistake in the previous post.

Anthony of Sydney

Jason Brownlee October 14, 2019 at 8:08 am #

Well done.

Anthony The Koala November 21, 2019 at 5:03 am #

Dear Dr Jason,
‘Serali’, a particle physicist, replied to me at “StackExchange” and suggested that I shuffle the original data. The shuffling of data in this context has nothing to do with the shuffling in k-folds. According to the contributor, the results should improve. Source https://datascience.stackexchange.com/questions/61223/reconstituting-estimated-predicted-values-to-original-scale-from-minmaxscaler

The code is exactly the same as what I was experimenting with, so I will show only the code needed to shuffle at the start and de-shuffle at the end.

Shuffling code at the beginning:

from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from numpy import asarray
from numpy import sqrt
from matplotlib import pyplot

from numpy.random import seed
from numpy.random import shuffle
from numpy.random import sample

import numpy as np

#We will want to reshuffle the data
x = [i for i in range(100)]
y = [i**2 for i in x]

xfx = np.vstack((x,y)).T
xy = xfx
shuffle(xy)

#x = asarray([i for i in range(100)])
#y = asarray([i**2 for i in x])

#x = asarray(xy[:,0]);#x.reshape((len(x),1))
x = np.reshape(xy[:,0], (100,1))
#print('debug, size x = %d ' + str(np.shape(x)))
y = np.reshape(xy[:,1], (100,1))
#y = asarray(xy[:,1]);#y.reshape((len(y),1))
#print('debug, size y = %d ' + str(np.shape(y)))
x_s = MinMaxScaler()
y_s = MinMaxScaler()
x_scaled = x_s.fit_transform(x)
y_scaled = y_s.fit_transform(y)
#The rest is fed into model
......
.......
yhat = model.predict(x_scaled)
yhat_original = y_s.inverse_transform(yhat)

The end code was ‘unshuffled’/sorted in order to display the difference between the actual and predicted.

#Plotting dots instead of lineplot otherwise we get a zig-zag plot 
pyplot.plot(x,y,'r.',label='y')
pyplot.plot(x,yhat_original,'b.', label='estimated')

#printing the first values - we have to sort the values in order to see them in 
#their proper context.
xy = np.vstack((x[:,0],y[:,0])).T
xyhat = np.vstack((x[:,0],yhat_original[:,0])).T

xyy = np.sort(xy,axis=0)
xyhatt = np.sort(xyhat,axis=0)

print("printing x, y, yhat")
for loop in range(10):
        print(xyy[loop,0],xyy[loop,1],xyhatt[loop,1])


pyplot.legend()
pyplot.show()
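
One caveat with the listing above: `np.sort(xy, axis=0)` sorts each column independently, so the x and yhat columns stay correctly paired only when both happen to increase together. A minimal sketch of an alternative, assuming the goal is to restore ascending-x order, reorders whole rows with `argsort` on the x column. The stand-in `yhat` values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

x = np.arange(10)
y = x ** 2
xy = np.column_stack((x, y))

# Shuffle whole rows so each x stays next to its own y.
rng.shuffle(xy)  # in-place shuffle along the first axis

# Stand-in for model predictions, in the same shuffled order as xy
# (illustrative values only, not real model output).
yhat = xy[:, 1] + 0.5

# Recover ascending-x order by reordering rows, not individual columns.
order = np.argsort(xy[:, 0])
x_sorted = xy[order, 0]
y_sorted = xy[order, 1]
yhat_sorted = yhat[order]

for xi, yi, yh in zip(x_sorted, y_sorted, yhat_sorted):
    print(xi, yi, yh)
```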

Here is a listing of x, f(x) and yhat

printing x, y, yhat
0 0 1.4915295839309692
1 1 2.66086745262146
2 4 4.75526237487793
3 9 9.125076293945312
4 16 15.723174095153809
5 25 24.287418365478516
6 36 35.04938507080078
7 49 47.73912811279297
8 64 62.95930480957031
9 81 80.16889190673828

Things to improve:
* adjusting the number of layers.
* adjusting how many neurons in each layer
* adjusting the batch size
* adjusting the number of epochs
In addition
* look at k-folds for further model refinement.

Thank you
Anthony of Sydney

Anthony The Koala November 24, 2019 at 4:12 pm #

Dear Dr Jason,
Here is a further improved version with very close results.
Instead of using MinMaxScaler, I took the logs (to the base e) of the inputs x and f(x), applied my model, then transformed the predictions back to their original scale.

Snippets of code transforming the data

#We will want to reshuffle the data
x = [i for i in range(100)]
y = [i**2 for i in x]

xfx = np.vstack((x,y)).T
xy = xfx
seed(1)
shuffle(xy)
 
x = np.reshape(xy[:,0], (100,1))
print('debug, shape x = ' + str(np.shape(x))) #shape is (100,1)
y = np.reshape(xy[:,1], (100,1))

print('debug, shape y = ' + str(np.shape(y)))
#x_s = MinMaxScaler()
#x_s = MinMaxScaler(feature_range =(0,200))
#y_s = MinMaxScaler()
#x_scaled = x_s.fit_transform(x)
#y_scaled = y_s.fit_transform(y)
x_scaled = np.log(x+1) # we add 1 because log(0) is -inf (NumPy warns rather than raising an error)
y_scaled = np.log(y+1) # same offset for y, since f(0) = 0

model = Sequential()
...# the model is applied on the transformed data
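
The log transform and its inverse can be sketched as follows. This is an illustrative round trip only and does not include the model: it uses NumPy's `log1p` and `expm1`, which compute log(1 + x) and exp(x) - 1, and lets the transformed targets stand in for predictions.

```python
import numpy as np

x = np.arange(100, dtype=float).reshape(-1, 1)
y = x ** 2

# Forward transform: log(1 + value), which is defined at 0.
x_scaled = np.log1p(x)
y_scaled = np.log1p(y)

# ... a model would be fit on (x_scaled, y_scaled) and would predict on the
# same log scale; here the targets simply stand in for predictions.
yhat_scaled = y_scaled

# Inverse transform back to the original scale: exp(value) - 1.
yhat_original = np.expm1(yhat_scaled)

print(np.allclose(yhat_original, y))
```

A possible reason this transform helps: for large x, log(1 + x**2) is close to 2 * log(1 + x), so the relationship the network has to learn is nearly linear on the log scale.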

The

#We need to resort the numbers
#in order to print the first 10 values
xy = np.vstack((x[:,0],y[:,0])).T
xyhat = np.vstack((x[:,0],yhat_original[:,0])).T

xyy = np.sort(xy,axis=0)
xyhatt = np.sort(xyhat,axis=0)
print("printing x, y, yhat")
for loop in range(10):
        print(xyy[loop,0],xyy[loop,1],xyhatt[loop,1])

#want to predict for the values 100 and 200
Xnew = np.reshape([100,200],(2,1))

print("let's predict for values 100 and 200")
print("the values of x = Xnew before transform %s, %s " % (Xnew[0],Xnew[1]))

Xnew = np.log(Xnew+1)
print("values of scaled xnew to put into the model %s, %s " % (Xnew[0],Xnew[1]))
ynew = model.predict(Xnew)

#Re-transform  the original values
ynew = np.exp(ynew) - 1
print("The values of Xnew and its predicted yhat")
for loop in range(len(Xnew)):
        print("Xnew[%s] = %s, ynew[%s] = %s " % (loop,Xnew[loop],loop,ynew[loop]))

The resulting output: Note how close the actual f(x) is to the predicted f(x)

printing x, y, yhat
0 0 0.00208890438079834
1 1 0.9818048477172852
2 4 4.111057281494141
3 9 9.025933265686035
4 16 15.918327331542969
5 25 24.944564819335938
6 36 36.00426483154297
7 49 49.05435562133789
8 64 63.969764709472656
9 81 80.93276977539062

let's predict for values 100 and 200
the values of x = Xnew before transform [100], [200] 
values of scaled xnew to put into the model [4.61512052], [5.30330491] 

The values of Xnew and its predicted yhat
Xnew[0] = [100.], ynew[0] = [10008.037] 
Xnew[1] = [200.], ynew[1] = [40082.062]

Jason Brownlee November 25, 2019 at 6:21 am #

Nice work.

kamu October 6, 2019 at 7:51 pm #

Hi Jason,
Thank you very much for “Your First Deep Learning Project in Python with Keras Step-By-Step” tutorial. It is very useful for me. I want to ask you:
Can I code:

model.add(Dense(8)) # input layer
model.add(Dense(12, activation='relu')) # first hidden layer

Instead of:

model.add(Dense(12, input_dim=8, activation='relu')) # input layer and first hidden layer

Sincerely.

Reply
- Jason Brownlee October 7, 2019 at 8:29 am #
  
  No.
  
  The input_dim argument defines the input layer.
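
To see why, here is a hedged NumPy sketch (not Keras itself) of what `Dense(12, input_dim=8, activation='relu')` computes. The input layer contributes no weights of its own: `input_dim=8` only fixes the number of columns expected in the data, and the first hidden layer owns an 8 x 12 weight matrix. The random weights below are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_hidden = 8, 12

# Parameters of the first hidden layer: one weight per (input, unit) pair
# plus one bias per unit.
W = rng.normal(size=(n_inputs, n_hidden))
b = np.zeros(n_hidden)

def dense_relu(X):
    """Forward pass of a fully connected layer followed by ReLU."""
    return np.maximum(0.0, X @ W + b)

X = rng.normal(size=(5, n_inputs))  # 5 samples, 8 features each
hidden = dense_relu(X)

print(hidden.shape)      # (5, 12)
print(W.size + b.size)   # 108 trainable parameters
```

By contrast, `Dense(8)` as a first layer would add a real layer with 8 units and its own weights, which is a different model.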
  
  Reply
keryums October 17, 2019 at 1:22 am #

Hi Jason, is it not necessary to use the Keras utility ‘to_categorical’ to convert your y vector into a matrix before fitting the model?

Reply
- Jason Brownlee October 17, 2019 at 6:37 am #
  
  You can, or you can use the sklearn tools to do the same thing.
  
  Reply
Aquilla Setiawan Kanadi October 17, 2019 at 6:35 am #

Hi Jason,

Thanks a lot for your tutorial about a deep learning project; it has really helped me a lot in my journey to learn machine learning.

I have a question about the data splitting in the code above: how is the data split between training data and validation data? I’ve tried to read your tutorial about data splitting, but I have no idea how the splitting works in the code above.

Thank you,

Aquilla

Reply
- Jason Brownlee October 17, 2019 at 6:47 am #
  
  We did not split the data, we fit and evaluated on one set. We did this for brevity.
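
For readers who do want a hold-out set, a minimal sketch of a random train/test split in NumPy follows. The 67/33 ratio, the seed, and the synthetic 768 x 9 array standing in for the dataset are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic stand-in for the dataset: 768 rows, 8 input columns plus 1 label column.
dataset = rng.random((768, 9))
X = dataset[:, 0:8]
y = dataset[:, 8]

# Shuffle the row indices once, then cut them into two disjoint groups.
indices = rng.permutation(len(X))
n_train = int(0.67 * len(X))
train_idx, test_idx = indices[:n_train], indices[n_train:]

X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

print(X_train.shape, X_test.shape)
```

The model would then be fit on `X_train`, `y_train` and evaluated on `X_test`, `y_test`. Keras `fit` also accepts a `validation_split` argument that holds back a fraction of the training data.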
  
  Reply
Love your work! October 17, 2019 at 11:43 am #

Hi Jason,

I just wanted to thank you. This tutorial is incredibly clear and well presented. Unlike many other online tutorials you explain very eloquently the intuition behind the lines of code and what is being accomplished which is very useful. As someone just starting out with Keras I had been finding some of the coding, as well as how Keras and Tensorflow interact, confusing. After your explanations Keras seems incredibly basic. I’ve been looking over some of my recent code from other Keras tutorials and I now understand how everything works.

Thanks again!

Reply
- Jason Brownlee October 17, 2019 at 1:50 pm #
  
  Well done on your progress and thanks for your support!
  
  Reply
Ahmed October 19, 2019 at 6:16 am #

Dear Jason. I am deeply grateful for this amazing work. Everything works well so far. Kind Regards

Reply
- Jason Brownlee October 19, 2019 at 6:55 am #
  
  Thanks, well done on your progress!
  
  Reply
JAMES JONAH October 28, 2019 at 10:56 am #

Please, I need help: which algorithm is best for cyber threat detection, and how do I implement it? Thanks

Reply
- Jason Brownlee October 28, 2019 at 1:18 pm #
  
  This is a common question that I answer here:
  https://machinelearningmastery.com/faq/single-faq/what-algorithm-config-should-i-use
  
  Reply
shivan October 29, 2019 at 7:01 am #

Hello sir,
do you have an implementation of medical image analysis with deep learning?
I need to start with medical images, NOT real-world images.
Thanks for your help.

Reply
- Jason Brownlee October 29, 2019 at 1:47 pm #
  
  Not really, sorry.
  
  Reply
  - shivan October 31, 2019 at 9:18 am #
    
    So, what do you recommend?
    Thanks.
    
    Reply
    - Jason Brownlee October 31, 2019 at 1:36 pm #
      
      Perhaps start by collecting a dataset.
      
      Then consider reviewing the literature to see what types of data prep and models others have used for similar data.
      
      Reply
Nasir Shah October 30, 2019 at 7:27 am #

Sir, I am new to neural networks. Where should I start, and which tutorial should I follow? I don’t have any idea about it.

Reply
- Jason Brownlee October 30, 2019 at 1:55 pm #
  
  Yes, you can start here:
  https://machinelearningmastery.com/start-here/#deeplearning
  
  Reply
hima hansi November 3, 2019 at 1:35 pm #

Hello sir, I’m new to this field. I’m going to develop a monophonic musical instrument classification system using Python and Keras. I want to find a monophonic dataset; how can I find one?
I tried getting piano music from YouTube, converting it to .wav files and splitting them. Is that a good or a bad approach? Or are there other methods to get a free dataset on the web? Please give your suggestions.

Reply
- Jason Brownlee November 4, 2019 at 6:37 am #
  
  Perhaps this will help:
  https://machinelearningmastery.com/faq/single-faq/where-can-i-get-a-dataset-on-___
  
  Reply
Mona Ahmed November 20, 2019 at 3:14 am #

I got a score of 76.69.

Reply
- Jason Brownlee November 20, 2019 at 6:20 am #
  
  Well done!
  
  Reply
Niall Xie November 26, 2019 at 8:26 am #

Hello, I just want to say that I am elated to use your tutorial. I am working on a group project with my team, and I used datasets representing heart disease, diabetes and breast cancer for this tutorial. However, this code example gives an error when a cell contains a string value; in this case, title names like clump_thickess and ? will produce an error. How do I fix this?

Reply
- Jason Brownlee November 26, 2019 at 1:28 pm #
  
  Thanks.
  
  Perhaps try encoding your categories using a one hot encoding first:
  https://machinelearningmastery.com/how-to-prepare-categorical-data-for-deep-learning-in-python/
  
  Reply
Mohamed November 28, 2019 at 10:46 pm #

Thank you, sir, for this article. Would you please suggest an example with testing data?

Reply
- Jason Brownlee November 29, 2019 at 6:49 am #
  
  Sorry I don’t understand your question, can you elaborate?
  
  Reply
Chris December 3, 2019 at 10:49 pm #

I believe there is something wrong with the (150/10) 15 updates to the model weights. The internal coefficients are updated after every single batch. Our data is comprised of 768 samples. Since batch_size=10, we obtain 77 batches (76 with 10 samples and one with 8). Therefore, at each epoch we should see 77 updates of weights and coefficients and not 15. Moreover, the total number of updates must be: 150*77=11550. Am I missing something important?

Really good job and very well-written article (all your articles). Keep up the good job. Cheers

Reply
- Jason Brownlee December 4, 2019 at 5:37 am #
  
  You’re right. Not sure what I was thinking there. Simplified.
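
The arithmetic in the comment above can be checked with a few lines of Python, using only the numbers quoted in the thread (768 samples, a batch size of 10, 150 epochs).

```python
import math

n_samples = 768
batch_size = 10
epochs = 150

# One weight update per batch; the last batch holds whatever is left over.
updates_per_epoch = math.ceil(n_samples / batch_size)
last_batch_size = n_samples - (updates_per_epoch - 1) * batch_size
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, last_batch_size, total_updates)
```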
  
  Reply
Justine December 14, 2019 at 9:58 am #

Thanks! This is my first foray into keras, and the tutorial went swimmingly. Am now training on my own data. It is not performing worse than on my other machine learning models (that’s a win :).

Reply
- Jason Brownlee December 15, 2019 at 6:02 am #
  
  Well done!
  
  Reply
x December 17, 2019 at 8:37 am #

Hi, Jason. Thanks so much for your answer. Now my question is why I can’t find my directory in Jupyter to put ‘pima-indians-diabetes.csv’ in it.
OSError Traceback (most recent call last)
in
4 from keras.layers import Dense
5 # load the dataset
—-> 6 dataset = loadtxt(‘pima-indians-diabetes.csv’, delimiter=’,’)
7 # split into input (X) and output (y) variables
8 X = dataset[:,0:8]

D:\anaconda\lib\site-packages\numpy\lib\npyio.py in loadtxt(fname, dtype, comments, delimiter, converters, skiprows, usecols, unpack, ndmin, encoding, max_rows)
966 fname = os_fspath(fname)
967 if _is_string_like(fname):
–> 968 fh = np.lib._datasource.open(fname, ‘rt’, encoding=encoding)
969 fencoding = getattr(fh, ‘encoding’, ‘latin1’)
970 fh = iter(fh)

D:\anaconda\lib\site-packages\numpy\lib\_datasource.py in open(path, mode, destpath, encoding, newline)
267
268 ds = DataSource(destpath)
–> 269 return ds.open(path, mode, encoding=encoding, newline=newline)
270
271

D:\anaconda\lib\site-packages\numpy\lib\_datasource.py in open(self, path, mode, encoding, newline)
621 encoding=encoding, newline=newline)
622 else:
–> 623 raise IOError(“%s not found.” % path)
624
625

OSError: pima-indians-diabetes.csv not found.

Reply
- Jason Brownlee December 17, 2019 at 1:36 pm #
  
  Perhaps try running the code file from the command line, as follows:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-run-a-script-from-the-command-line
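
A relative filename such as 'pima-indians-diabetes.csv' is resolved against the current working directory of the Python process, which in Jupyter is usually the folder containing the notebook. A small sketch for checking this before calling `loadtxt` follows; the helper name `describe_path` is hypothetical.

```python
import os
from pathlib import Path

def describe_path(filename):
    """Return the absolute path a filename resolves to and whether it exists."""
    path = Path(filename).resolve()
    return path, path.exists()

print("Working directory:", os.getcwd())
path, found = describe_path("pima-indians-diabetes.csv")
print(path, "found" if found else "NOT found")
```

If the file is reported as not found, either move it into the printed working directory or pass its full path to `loadtxt`.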
  
  Reply
Manohar Nookala December 22, 2019 at 9:32 pm #

Hi sir,
My name is Manohar. I trained a deep learning model for car price prediction. I got

loss: nan – acc: 0.0000e+00. If you give me your email ID, I will send it to you so you can tell me what the problem is. Please help, because I am a beginner.

Reply
- Jason Brownlee December 23, 2019 at 6:48 am #
  
  Perhaps you need to scale the data prior to fitting?
  Perhaps you need to use relu activation?
  Perhaps you need some type of regularization?
  Perhaps you need a larger or smaller model?
  
  Reply
Shone Xu January 5, 2020 at 1:24 am #

Hi Jason,

Thanks, it is a great tutorial. Just one question: do we have to train the model with “model.fit(x, y, epochs=150, batch_size=10)” every time before making a prediction? It takes a very long time to train the model. I am wondering whether it is possible to save the trained model (e.g. with pickle) and go straight to the prediction, skipping model.fit.

many thanks for your advice in advance

cheers

Reply
- Jason Brownlee January 5, 2020 at 7:06 am #
  
  No, you can fit the model once, then save it:
  https://machinelearningmastery.com/save-load-keras-deep-learning-models/
  
  Then later load it and make predictions.
  
  Reply
  - Shone Xu January 7, 2020 at 2:16 pm #
    
    Thanks and will check it out
    
    Reply
ustengg January 8, 2020 at 7:40 pm #

Thank you so much for this tutorial, sir, but how can I use the model to predict using data outside the dataset?

Reply
- Jason Brownlee January 9, 2020 at 7:24 am #
  
  Call model.predict() with the new inputs.
  
  See the “Make Predictions” section.
  
  Reply
  - ustengg January 9, 2020 at 4:07 pm #
    
    Nice! Thank you so much, Sir. I figured it out using the link on the “Make predictions” section. I’ve learned a lot from your tutorials. You’re the best!
    
    Reply
    - Jason Brownlee January 10, 2020 at 7:22 am #
      
      Nice work!
      
      Thanks.
      
      Reply
monica January 23, 2020 at 4:00 am #

Hi Jason,

Thanks for sharing this post.

I have a question, when I tried to split the dataset
(X = dataset[:,0:8]
y = dataset[:,8])

it gives me an error: TypeError: ‘(slice(None, None, None), slice(0, 8, None))’ is an invalid key

how can I fix it?

Thanks,

monica

Reply
- Jason Brownlee January 23, 2020 at 6:41 am #
  
  Sorry to hear that, this might help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
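
One possible cause, offered as an assumption since the comment does not show how the data was loaded: this message typically appears when `dataset` is a pandas DataFrame rather than a NumPy array, because `dataset[:, 0:8]` is NumPy-style indexing. The sketch below shows the slicing working on an array; the three inline rows are sample values in a nine-column layout standing in for the CSV file.

```python
from io import StringIO
import numpy as np

# Inline sample rows standing in for the CSV file: 8 input columns, 1 label column.
csv_text = StringIO(
    "6,148,72,35,0,33.6,0.627,50,1\n"
    "1,85,66,29,0,26.6,0.351,31,0\n"
    "8,183,64,0,0,23.3,0.672,32,1\n"
)

dataset = np.loadtxt(csv_text, delimiter=",")  # returns a 2-D NumPy array

X = dataset[:, 0:8]   # all rows, first 8 columns
y = dataset[:, 8]     # all rows, last column

print(type(dataset).__name__, X.shape, y.shape)
```

If the data is held in a DataFrame `df` instead, `df.to_numpy()` (or `df.iloc[:, 0:8]`) provides the equivalent selection.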
  
  Reply
Sam Sarjant January 23, 2020 at 9:11 pm #

Thanks for the tutorial! This is a wonderful ‘Hello World’ to Deep Learning

Reply
- Jason Brownlee January 24, 2020 at 7:51 am #
  
  Thanks, I’m happy it was helpful.
  
  Reply
Keerthan January 24, 2020 at 4:01 pm #

Hello Jason! hope you are doing good.
I am actually doing a project on classification of thyroid disease using backpropagation with the stochastic gradient descent method. Can you help me out with the code a little bit?

Reply
- Jason Brownlee January 25, 2020 at 8:31 am #
  
  Perhaps start by adapting the code in the above tutorial?
  
  Reply
Shakir January 25, 2020 at 1:29 am #

Dear Sir
I want to predict air pollution using deep learning techniques. Please suggest how to go about it with my datasets.

Reply
- Jason Brownlee January 25, 2020 at 8:39 am #
  
  Start here:
  https://machinelearningmastery.com/start-here/#deep_learning_time_series
  
  Reply
Yared February 7, 2020 at 4:36 pm #

AttributeError: module 'tensorflow' has no attribute 'get_default_graph'

Reply
- Jason Brownlee February 8, 2020 at 7:05 am #
  
  Perhaps confirm you are using TF 2 and Keras 2.3.
  
  Reply
Yared February 7, 2020 at 4:41 pm #

I want to detect agreement errors in a sentence using LSTM techniques. Please suggest how to go about it with my datasets.

Reply
- Jason Brownlee February 8, 2020 at 7:05 am #
  
  You can get started with NLP problems here:
  https://machinelearningmastery.com/start-here/#nlp
  
  Reply
Pavitra Nayak February 29, 2020 at 2:55 pm #

Hello Jason
I am using this code for my project. It works perfectly for your dataset. But I have a dataset which has too many 0’s and 1’s. So I am getting the wrong prediction. What can I do to solve this problem?

Reply
- Jason Brownlee March 1, 2020 at 5:22 am #
  
  Here are some suggestions:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
nurul March 6, 2020 at 5:50 pm #

Hi. I want to ask: I have followed all the steps, but I’m stuck at fitting the model. This error occurred. How can I solve this problem?

Reply
- kiki March 6, 2020 at 6:44 pm #
  
  I have already tried this step and stuck at the fit phase and got this error. Do you have any solution for my problem?
  
  —————————————————————————
  ValueError Traceback (most recent call last)
  in
  1 # fit the keras model on the dataset
  —-> 2 model.fit(x, y, batch_size=10,epochs=150)
  
  ~\Anaconda4\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
  1152 sample_weight=sample_weight,
  1153 class_weight=class_weight,
  -> 1154 batch_size=batch_size)
  1155
  1156 # Prepare validation data.
  
  ~\Anaconda4\lib\site-packages\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
  577 feed_input_shapes,
  578 check_batch_axis=False, # Don’t enforce the batch size.
  –> 579 exception_prefix=’input’)
  580
  581 if y is not None:
  
  ~\Anaconda4\lib\site-packages\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
  143 ‘: expected ‘ + names[i] + ‘ to have shape ‘ +
  144 str(shape) + ‘ but got array with shape ‘ +
  –> 145 str(data_shape))
  146 return data
  147
  
  ValueError: Error when checking input: expected dense_133_input to have shape (16,) but got array with shape (17,)
  
  Reply
  - Jason Brownlee March 7, 2020 at 7:15 am #
    
    Perhaps this will help you copy the code from the tutorial:
    https://machinelearningmastery.com/faq/single-faq/how-do-i-copy-code-from-a-tutorial
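
The final line of the traceback above says the first layer was built for 16 input columns while the array passed to `fit` has 17. A common cause (an assumption here, since the data-loading code is not shown) is that the label column was left in `x`. A sketch of a pre-flight check follows; the helper `check_input_width` and the synthetic array are hypothetical.

```python
import numpy as np

def check_input_width(X, expected):
    """Raise a readable error if X does not have the expected number of columns."""
    X = np.asarray(X)
    if X.ndim != 2 or X.shape[1] != expected:
        raise ValueError(
            "expected %d input columns, got array with shape %s" % (expected, X.shape)
        )
    return X

# Synthetic stand-in: 16 input columns followed by 1 label column.
dataset = np.zeros((20, 17))

x = dataset[:, 0:16]   # inputs only
y = dataset[:, 16]     # label column

check_input_width(x, 16)            # passes silently
try:
    check_input_width(dataset, 16)  # label column still included
except ValueError as err:
    print(err)
```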
    
    Reply
- Jason Brownlee March 7, 2020 at 7:13 am #
  
  I’m sorry to hear that, perhaps this will help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
  - kiki March 9, 2020 at 12:18 pm #
    
    Thanks for the answer jason
    
    Reply
    - Jason Brownlee March 10, 2020 at 5:34 am #
      
      You’re welcome.
      
      Reply
laz March 7, 2020 at 2:59 pm #

Hey, Jason!

Again… Thanks for your awesome tutorials and for giving your knowledge to the public! >800 comments and nearly all answered, you’re great. I can’t understand how you manage all that, writing great content, do ml stuff, teach, learn, great respect!

2 general questions:

Question(1):

Why and when do we need to flatten() inputs and in which cases not?

For example 4 numeric inputs, a lag of 2 of every input means 4*2=8 values per batch:

I always do this, no matter how many inputs or lags, i give that as flat array to the input:

1 set/batch: [[1.0,1.1, 2.0,2.1, 3.0,3.1, 4.0,4.1]]

Input(shape=(8,)) # keras func api

Does it make sense to input a structure like this, if so – why/when?

Better? [[[1.0,1.1], [2.0,2.1], [3.0,3.1], [4.0,4.1]]]

Question(2):

Are you still using Theano? As it is no longer updated, it becomes older, but not worse ;). I tried TensorFlow a lot, but always with lower performance in terms of speed. Theano is much faster (by a factor of 3-10) for me. But using more than 1 core is always slower for me, in both Theano and TF. Have you experienced similar things? I also tried Torch; it is nice, but it was also slower than good old Theano. Any ideas or alternatives (I can’t use gpu/external/aws)?

I would be happy to see you doing some deep reinforcement learning (DRL) stuff, what do you think? Are you?

Regards, keep it up 😉

Reply
- Jason Brownlee March 8, 2020 at 6:07 am #
  
  You need to flatten when the output shape of one layer does not match the input shape of another, e.g. CNN output to a Dense.
  
  No. I use and recommend TensorFlow and have for years. TensorFlow used to not work for Windows users, so I recommended Theano for them, and still do if they have trouble. Theano works fine and will continue to work fine for most applications.
  
  No, RL is not practical/useful:
  https://machinelearningmastery.com/faq/single-faq/do-you-have-tutorials-on-deep-reinforcement-learning
  
  Reply
laz March 8, 2020 at 11:29 am #

Dear Jason, thanks for your answer ;)…

“flatten when the output shape of one layer does not match the input shape of another, e.g. CNN output to a Dense.”

Thanks. The question about the “flatten” operation was not about the flatten() between layers, it was about how to present inputs to the input layer. Sorry for being vague. Maybe I misunderstood something, are there use cases where the FEATURES/INPUTS/LAGS are not flattened?

“RL is not practical/useful”
Is this statement based on your experience or do you take the opinion of others without checking it yourself here ;)? Please do not misunderstand, you are the expert here. However, i can refute some arguments against RL.

Rewards are hard to create: depends on your environment
Unstable: depends on your environment, code, setup

I started experimenting with a simple DQN, I expanded it step by step and now I have a “Dueling Double DQN”. It learns well and quick. I admit – on simple data. But it does it repeatable and reproducible! So i would say: In general, it works.

I have to see how it works with more complicated data. That is why I emphasized that the performance of this method strongly depends on the area of application.

But there is a huge problem, most public sources contain incorrect code or incorrect implementations. I have never reported or found so many bugs on any subject. These errors are copied again and again and in the end many think that they are correct. I have collected tons of links and pdf files to understand and debug this beast.

No matter, you have to decide for yourself. If you want to take a look at it, take a simple example, even the DQN (without dueling or double) is able to learn – if the code is correct. And although I’m not a mathematician: to understand how it works and what possibilities it offers – made me smile 😉 …

Reply
- Jason Brownlee March 9, 2020 at 7:14 am #
  
  For more on the input shape of LSTMs/1d CNNs, see this:
  https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input
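
Using the numbers from the question above, here is a sketch of how the same eight values can be presented either flat or as a 3-D array. The `(samples, timesteps, features)` ordering is the convention used for LSTM and 1D CNN inputs; treating the two lags as timesteps is an assumption about the data.

```python
import numpy as np

# One sample: 4 input variables, each with 2 lagged values, laid out variable by variable.
flat = np.array([[1.0, 1.1, 2.0, 2.1, 3.0, 3.1, 4.0, 4.1]])
print(flat.shape)  # (1, 8): suitable for a Dense input of shape (8,)

# Group the values per variable: (samples, variables, lags).
by_variable = flat.reshape(1, 4, 2)

# Swap the last two axes to get (samples, timesteps, features).
sequence = by_variable.transpose(0, 2, 1)
print(sequence.shape)  # (1, 2, 4)
print(sequence[0])
```

A network made only of Dense layers expects the flat form, while recurrent and convolutional layers expect the 3-D form.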
  
  I don’t yet see an ROI for “developers at work” in covering RL as described in the link.
  
  Reply
laz March 8, 2020 at 10:44 pm #

Interesting read:

“We use a double deep Q-learning network (DDQN) to find the right material type and the optimal geometrical design for metasurface holograms to reach high efficiency. The DDQN acts like an intelligent sweep and could identify the optimal results in ~5.7 billion states after only 2169 steps. The optimal results were found between 23 different material types and various geometrical properties for a three-layer structure. The computed transmission efficiency was 32% for high-quality metasurface holograms; this is two times bigger than the previously reported results under the same conditions.”

https://www.nature.com/articles/s41598-019-47154-z

Reply
- Jason Brownlee March 9, 2020 at 7:16 am #
  
  Thanks for sharing.
  
  Reply
YzN March 11, 2020 at 4:01 am #

Literally the best “first neural network tutorial”
Got 85.68 acc by adding layers and decreasing batch size

Reply
- Jason Brownlee March 11, 2020 at 5:29 am #
  
  Thanks.
  
  Well done!
  
  Reply
Neha March 14, 2020 at 12:05 am #

Hello Jason,
I have a quick question.
I am trying to build just 1 sigmoid neuron for a binary classification task, basically I am implying this is how 1 sigmoid model is:

model = Sequential()
model.add(Dense(1, activation='sigmoid'))

My inputs are images of size = (39*39*3)

I am unsure as to how to input these images to my Dense layer (which is the only layer I am using)

I am currently using below for inputting my images:

train_generator = train_datagen.flow_from_directory(train_data_dir,
    target_size=(39, 39),
    batch_size=batch_size,
    class_mode='binary')

But somehow Dense layer cannot accept input shape (39, 39, 3).

So my question is, how do I input my images data to the Dense layer?

Reply
- Jason Brownlee March 14, 2020 at 8:13 am #
  
  You can flatten the input or use a CNN as the input instead that is designed for 3d input samples.
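
As a rough sketch of the "flatten the input" option, assuming the images are already loaded into a NumPy array (the `images` array below is a made-up stand-in, not data from the question), each 39 x 39 x 3 image can be reshaped into a single row of 4,563 values that a Dense layer can accept:

```python
import numpy as np

# Hypothetical stand-in for a batch of 4 RGB images of size 39x39.
images = np.zeros((4, 39, 39, 3), dtype="float32")

# Collapse height, width and channels into one feature axis per image.
flat = images.reshape(len(images), -1)

print(flat.shape)  # (4, 4563)
```

When the images come from a generator instead of an array, a Keras `Flatten` layer placed in front of the Dense layer should achieve the same effect inside the model.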
  
  Reply
Bertrand Bru March 29, 2020 at 12:38 am #

Hi Jason,

Thank you very much for your tutorial.

I am new to the world of deep learning. I have been able to modify your code and make it work for a set of data I recorded with a 3-axis accelerometer. My goal was to detect whether I was walking or running. I recorded around 50 trials of each activity. From the signal, I calculated specific parameters that enable the code to differentiate the two activities. Among the parameters, I calculated for all axes the mean, min and max values, and some parameters in the frequency domain (the first 3 peaks of the power spectrum and their respective positions).

It works very well and I am able to easily detect if I am running or walking.

I then decided to add a third activity: standing. I also recorded 50 trials of this activity. If I train my model with standing and running, I can identify the two activities. Same if I train it with standing and walking or with walking and running.

It is more complicated if I train my model with the three activities. In fact, it can’t do it. It can only recognise the first two activities. So for example if standing, walking and running have the following IDs: 0, 1 and 2, then it can only detect 0 and 1 (standing and walking). It thinks that all running trials are walking trials. If standing, running and walking have the following IDs: 0, 1 and 2, then it can only detect 0 and 1 (standing and running). It thinks that all walking trials are running trials.

So here is my question: Assuming you have the dataset, if you needed to adapt your code so it can detect if people are 0: not diabetic, 1: people are diabetic type 1, and 2: people are diabetic type 2, how would you modify your script?

Thank you very much for your help.

Reply
- Jason Brownlee March 29, 2020 at 6:00 am #
  
  You’re welcome.
  
  Well done.
  
  This is called multi-class classification, this tutorial will help:
  https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
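
A likely reason only classes 0 and 1 are detected is that a single sigmoid output can only separate two classes. A common approach for three classes is to one hot encode the labels and use an output layer with three nodes and a softmax activation. As a minimal sketch of the encoding step only (the `labels` array is made up for illustration):

```python
import numpy as np

# Hypothetical integer class ids: 0 = standing, 1 = walking, 2 = running.
labels = np.array([0, 1, 2, 1, 0])
n_classes = 3

# One row per sample, with a 1 in the column of its class.
one_hot = np.eye(n_classes, dtype="float32")[labels]

print(one_hot.shape)  # (5, 3)
```

The encoded array would then be used as y when fitting, typically with a categorical cross-entropy loss in place of binary cross-entropy.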
  
  Reply
  - Bertrand Bru March 29, 2020 at 7:15 am #
    
    Thank you so much for getting back to me so quickly.
    This is exactly what I was looking for.
    Cheers,
    
    Reply
    - Jason Brownlee March 30, 2020 at 5:27 am #
      
      You’re welcome.
      
      I’m happy to hear that.
      
      Reply
Dipak Kambale March 31, 2020 at 10:16 pm #

Hi Jason,

I got an accuracy of 75.52. Is that OK? Please let me know.

Reply
- Jason Brownlee April 1, 2020 at 5:49 am #
  
  Well done. Try running the example a few times.
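
Because the results vary from run to run, one way to report them is as a mean and standard deviation over several runs. The sketch below assumes a function that builds, fits and evaluates the model once and returns its accuracy; `fake_run` is a made-up stand-in for that step so the example runs on its own:

```python
import random
from statistics import mean, stdev

def summarize_runs(run_once, n_repeats=5):
    """Call run_once() n_repeats times and return (mean, std) of the scores."""
    scores = [run_once() for _ in range(n_repeats)]
    return mean(scores), stdev(scores)

# Hypothetical stand-in for "build, fit and evaluate the model once".
# In practice this would return the accuracy from model.evaluate().
rng = random.Random(1)

def fake_run():
    return 75.0 + rng.uniform(-3.0, 3.0)

avg, spread = summarize_runs(fake_run, n_repeats=5)
print("Accuracy: %.2f%% (+/- %.2f%%)" % (avg, spread))
```

A single score such as 75% or 78% is then less important than where the average sits and how wide the spread is.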
  
  Reply
islamuddin April 1, 2020 at 6:20 pm #

Hello Sir Jason,
Sir, how can I get a stable accuracy? Each time I run the code it gives a different result, for example 86% one time and 82% the next. How can I solve this?

# import
from numpy import loadtxt
from keras.models import Sequential
from keras.layers import Dense

# load the dataset
dataset = loadtxt('E:/ms/impotnt/iwp1.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
# define the keras model

#model = Sequential()
model = Sequential()
#model.add(Dense(25, input_dim=8, init='uniform', activation='relu'))
model.add(Dense(30, input_dim=8, activation='relu'))
model.add(Dense(95, activation='relu'))
model.add(Dense(377, activation='relu'))
model.add(Dense(233, activation='relu'))
model.add(Dense(55, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10)
# evaluate the keras model
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))

output

0.1153 – accuracy: 0.9531
Epoch 149/150
768/768 [==============================] – 0s 278us/step – loss: 0.1330 – accuracy: 0.9401
Epoch 150/150
768/768 [==============================] – 0s 277us/step – loss: 0.1468 – accuracy: 0.9375
768/768 [==============================] – 0s 41us/step
Accuracy: 94.01

Reply
- Jason Brownlee April 2, 2020 at 5:46 am #
  
  This is a common question that I answer here:
  https://machinelearningmastery.com/faq/single-faq/can-you-read-review-or-debug-my-code
  
  Reply
M Husnain Ali Nasir April 3, 2020 at 2:55 am #

Traceback (most recent call last):
File “keras_first_network.py”, line 7, in
dataset = loadtxt(‘pima-indians-diabetes.csv’, delimiter=’,’)
File “C:\Users\Hussnain\anaconda3\lib\site-packages\numpy\lib\npyio.py”, line 1159, in loadtxt
for x in read_data(_loadtxt_chunksize):
File “C:\Users\Hussnain\anaconda3\lib\site-packages\numpy\lib\npyio.py”, line 1087, in read_data
items = [conv(val) for (conv, val) in zip(converters, vals)]
File “C:\Users\Hussnain\anaconda3\lib\site-packages\numpy\lib\npyio.py”, line 1087, in
items = [conv(val) for (conv, val) in zip(converters, vals)]
File “C:\Users\Hussnain\anaconda3\lib\site-packages\numpy\lib\npyio.py”, line 794, in floatconv
return float(x)
ValueError: could not convert string to float: ‘”6’

I am having the above error while running it. Please help. I am using Anaconda 3, Python 3.7, TensorFlow and Keras.

Reply
- Jason Brownlee April 3, 2020 at 6:57 am #
  
  Sorry to hear that, this will help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
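
The value in the error message ('"6') suggests that the rows in the CSV file are wrapped in double quotes, which can happen when a file is re-saved from a spreadsheet or copied from a web page. Assuming that is the cause, one possible workaround is to strip the quotes from each line before NumPy parses it. The `raw` text below is a small stand-in for the file contents:

```python
from io import StringIO
from numpy import loadtxt

# Hypothetical stand-in for a CSV file whose rows were saved with quotes.
raw = StringIO('"6,148,72,35,0,33.6,0.627,50,1"\n"1,85,66,29,0,26.6,0.351,31,0"\n')

# Remove the double quotes from every line before loadtxt sees it.
cleaned = (line.replace('"', '') for line in raw)
dataset = loadtxt(cleaned, delimiter=',')

print(dataset.shape)  # (2, 9)
```

With a real file, the same generator expression can be applied to the object returned by `open('pima-indians-diabetes.csv')`. Downloading a fresh copy of the raw file is the other obvious thing to try.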
  
  Reply
Madhawa Akalanka April 9, 2020 at 6:22 pm #

(base) C:\Users\Madhawa Akalanka\python codes>python keras_first_network.py
Using TensorFlow backend.
2020-04-09 13:42:28.003791: I tensorflow/core/platform/cpu_feature_guard.cc:142]
Your CPU supports instructions that this TensorFlow binary was not compiled to
use: AVX AVX2
2020-04-09 13:42:28.014066: I tensorflow/core/common_runtime/process_util.cc:147
] Creating new thread pool with default inter op setting: 2. Tune using inter_op
_parallelism_threads for best performance.
Traceback (most recent call last):
File “keras_first_network.py”, line 12, in
model.fix(X,Y,epochs=150,batch_size=10)
AttributeError: ‘Sequential’ object has no attribute ‘fix’

I got this error when running it. Please help.

Reply
- Jason Brownlee April 10, 2020 at 8:25 am #
  
  Sorry to hear that, see this:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
Rahim Dehkharghani April 14, 2020 at 2:05 am #

Dear Jason
Thanks for your wonderful website and books. I am a PhD holder and one of your fans in Deep Learning. Sometimes I get disappointed because I cannot achieve my goal in this area. My goal is to discover something new and publish it. Although I mostly understand your code, making a contribution in this field is difficult and requires understanding the whole theory, which I have not been able to do so far. Can you please give me some tips to continue? Thanks a lot

Reply
- Jason Brownlee April 14, 2020 at 6:25 am #
  
  You’re welcome.
  
  Keep working on it every day. That’s my best advice.
  
  Reply
MattGurney April 16, 2020 at 10:52 pm #

There is a typo “input to the model lis defined”

Reply
- Jason Brownlee April 17, 2020 at 6:21 am #
  
  Thanks! Fixed.
  
  Reply
MattGurney April 16, 2020 at 11:28 pm #

Using the latest libraries today I get a number of warnings due to latest numpy: 1.18.1 not being compatible with latest TensorFlow: 1.13.1.

i.e:
FutureWarning: Passing (type, 1) or ‘1type’ … (6 times)
to_int32 (from tensorflow.python.ops.math_ops) is deprecated

Options are to revert to an older numpy or suppress the warnings, I took the suppress route with this code:

# first neural network with keras tutorial

# Suppress warnings due to TF / numpy version incompatibility: https://github.com/tensorflow/tensorflow/issues/30427#issuecomment-527891497
import warnings
warnings.filterwarnings('ignore', category=FutureWarning)

import tensorflow

# Suppress warning from TF: to_int32 (from tensorflow.python.ops.math_ops) is deprecated: https://github.com/aamini/introtodeeplearning/issues/25#issuecomment-578404772
import logging
logging.getLogger('tensorflow').setLevel(logging.ERROR)

import keras
from numpy import loadtxt
from keras.models import Sequential
from keras.layers import Dense

Reply
- Jason Brownlee April 17, 2020 at 6:21 am #
  
  I recommend using Keras 2.3 and TensorFlow 2.1.
  
  Reply
  - MattGurney April 17, 2020 at 12:18 pm #
    
    Yes, upgrading to tensorFlow 2.1 fixed it, I have now removed my warnings suppression and I don’t see the warnings in the output
    
    I upgraded TF like this:
    pip install --upgrade tensorflow
    
    I did follow your installation instructions from https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/ and ended up with TF version 1.13.1. The command I ran was:
    conda install -c conda-forge tensorflow
    
    I am on Mac, I see possible relevant discussion here on TF2.1 not on conda: https://github.com/tensorflow/tensorflow/issues/35754
    
    Reply
    - Jason Brownlee April 17, 2020 at 1:31 pm #
      
      Well done!
      
      I use macports myself:
      https://machinelearningmastery.com/install-python-3-environment-mac-os-x-machine-learning-deep-learning/
      
      Reply
meryem April 17, 2020 at 1:25 am #

Thank you Jason for the tutorial. I applied your example to my own data, adding dropout and standardization of X:

X = dataset[:, 0:7]
y = dataset[:, 7]

scaler = MinMaxScaler(feature_range=(0, 1))
X = scaler.fit_transform(X)
# define the keras model
model = Sequential()
model.add(Dense(6, input_dim=7, activation='relu'))
model.add(Dropout(rate=0.3))
model.add(Dense(6, activation='relu'))
model.add(Dropout(rate=0.3))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X, y, epochs=30, batch_size=30, validation_split=0.1)
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))

It shows me an accuracy of 100, which is not normal. What should I do to adjust my model?

Reply
- Jason Brownlee April 17, 2020 at 6:22 am #
  
  Well done!
  
  Perhaps evaluate your model using k-fold cross validation.
  
  Reply
meryem April 17, 2020 at 7:29 am #

Yes, I followed your example using k-fold cross validation and it always gives me 100%.

If I remove the standardization it gives 83%. Can you guide me, please?

import numpy
from numpy import loadtxt
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import Dense, Dropout

seed = 4
numpy.random.seed(seed)
dataset = loadtxt('data.csv', delimiter=',')
X = dataset[:, 0:7]
Y = dataset[:, 7]
# standardize the inputs
sc = StandardScaler()
X = sc.fit_transform(X)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
cvscores = []
for train, test in kfold.split(X, Y):
    # define, compile and fit a fresh model for each fold
    model = Sequential()
    model.add(Dense(12, input_dim=7, activation="relu"))
    model.add(Dropout(rate=0.2))
    model.add(Dense(6, activation="relu"))
    model.add(Dropout(rate=0.2))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit(X[train], Y[train], epochs=20, batch_size=10, verbose=1)
    # evaluate on the held-out fold
    scores = model.evaluate(X[test], Y[test], verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)
print("%.2f%% (+/- %.2f%%)" % (numpy.mean(cvscores), numpy.std(cvscores)))

Reply
- Jason Brownlee April 17, 2020 at 7:48 am #
  
  Nice work! Perhaps your prediction task is trivial?
  
  Reply
meryem April 17, 2020 at 8:08 am #

You are very helpful.
Or is it because I don’t have enough data? So there is nothing else I can use?

Reply
- Jason Brownlee April 17, 2020 at 1:28 pm #
  
  Perhaps.
  
  Reply
Farjad Haider April 17, 2020 at 11:00 pm #

Sir Jason, you are awesome! Such a nice and easy-to-comprehend tutorial. Great work!

Reply
- Jason Brownlee April 18, 2020 at 5:57 am #
  
  Thanks!
  
  Reply
Joan Estrada April 19, 2020 at 3:51 am #

“Note, the most confusing thing here is that the shape of the input to the model is defined as an argument on the first hidden layer. This means that the line of code that adds the first Dense layer is doing 2 things, defining the input or visible layer and the first hidden layer.”

Could you better explain this? Thanks, nice work!

Reply
- Jason Brownlee April 19, 2020 at 6:02 am #
  
  Yes, see this:
  https://machinelearningmastery.com/faq/single-faq/how-do-you-define-the-input-layer-in-keras
  
  Reply
Hany April 19, 2020 at 9:57 am #

Actually, I cannot thank you enough Dr. Brownlee.

God Bless you.

Reply
- Jason Brownlee April 19, 2020 at 1:14 pm #
  
  Thanks. You’re very welcome!
  
  Reply
Rahim April 22, 2020 at 5:56 am #

Dear Jason
Thanks for this interesting code. I tested this code on pima-indians-diabetes in my computer with keras 2.3.1 but strangely I got the accuracy of 52%. I wonder why there is this much difference between your accuracy (76%) and mine (52%).

Reply
- Jason Brownlee April 22, 2020 at 6:10 am #
  
  You’re welcome.
  
  Perhaps try running the example a few times?
  
  Reply
Sarmad April 24, 2020 at 8:04 pm #

I want to ask:
1) In the first (hidden) layer we defined input_dim=8 to match the number of features, and we specified 12 neurons. But what I studied is that the number of neurons should match the number of inputs (features), meaning that with 8 inputs there would also be 8 neurons. You specified 12. Why?
2) For any problem we have to specify a type of neural network, right? It can be, e.g., convolutional, recurrent, etc. Which type of neural network have we chosen here, and where?
3) We have to assign weights. Where have we assigned them?
Please let me know. Thanks, sir.

Reply
- Jason Brownlee April 25, 2020 at 6:44 am #
  
  The first line of the model defines 2 things, the input or visible layer (8) and the first hidden layer (12). More here:
  https://machinelearningmastery.com/faq/single-faq/how-do-you-define-the-input-layer-in-keras
  
  These two things can have different values, they are not directly related.
  
  Yes, this will help you choose models:
  https://machinelearningmastery.com/when-to-use-mlp-cnn-and-rnn-neural-networks/
  
  Weights are assigned small random numbers automatically when the model’s layers are built, before training begins:
  https://machinelearningmastery.com/why-initialize-a-neural-network-with-random-weights/
  
  Reply
  - Sarmad April 26, 2020 at 7:44 pm #
    
    Sir, I am still confused. With ML algorithms we specify which algorithm to use depending on the scenario, e.g. for regression we can choose linear regression, logistic regression, etc.
    So in this case, which neural net have we chosen? Convolutional, RNTN, etc.?
    
    Reply
    - Jason Brownlee April 27, 2020 at 5:33 am #
      
      Linear regression is for regression, logistic regression is for classification.
      
      Here are some regression algorithms to try on a regression task:
      https://machinelearningmastery.com/spot-check-regression-machine-learning-algorithms-python-scikit-learn/
      
      Reply
Sarmad April 24, 2020 at 8:31 pm #

where are the weights, bias and input values?

Reply
- Jason Brownlee April 25, 2020 at 6:46 am #
  
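  Weights and biases are initialized automatically when the model’s layers are built (weights to small random values), before training begins. The input values come from the dataset (X).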
  
  Reply
mouna April 26, 2020 at 8:51 pm #

Hello Jason,

Congratulations on all the good work. I want to ask you:
How can we find the average training time and validation time per epoch for a model?

Reply
- Jason Brownlee April 27, 2020 at 5:34 am #
  
  You could extrapolate the time of one epoch to the number of epochs you want to train.
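
As a rough sketch of that extrapolation, assuming one epoch can be run on its own (for example by fitting with epochs=1), the time for a single epoch can be measured and scaled up. `fake_epoch` below is a made-up stand-in for the real training call:

```python
import time

def estimate_training_time(train_one_epoch, n_epochs):
    """Time a single epoch and scale it to n_epochs (a rough estimate)."""
    start = time.perf_counter()
    train_one_epoch()
    one_epoch = time.perf_counter() - start
    return one_epoch, one_epoch * n_epochs

# Hypothetical stand-in for model.fit(X, y, epochs=1, ...).
def fake_epoch():
    time.sleep(0.01)

per_epoch, total = estimate_training_time(fake_epoch, n_epochs=150)
print("about %.3f s per epoch, about %.1f s for 150 epochs" % (per_epoch, total))
```

The first epoch often includes some one-off setup cost, so the estimate may be a little pessimistic.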
  
  Reply
Jason Chia April 28, 2020 at 2:41 pm #

Hi Jason,
I am very new to deep learning. I understand that you do model.fit to fit the data and model.predict to predict the values of the class variable y. However, is it also possible to extract the parameter estimate and derive f(X) = y (similar to regression)?

Reply
- Jason Brownlee April 29, 2020 at 6:15 am #
  
  Perhaps for small models, but it would be a mess with thousands of coefficients. The model is a complex circuit.
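
To give a sense of what writing out f(X) = y involves, here is a minimal NumPy sketch of the forward pass for a network with the tutorial’s layer sizes (8 inputs, 12 and 8 hidden nodes, 1 output). The weights are random stand-ins; with a trained Keras model the arrays could be taken from `model.get_weights()`, which is assumed here to return the weight matrix and bias vector for each Dense layer in order:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Apply each (weights, bias) pair: relu on hidden layers, sigmoid on the last."""
    a = x
    for W, b in params[:-1]:
        a = relu(a @ W + b)
    W, b = params[-1]
    return sigmoid(a @ W + b)

# Hypothetical weights with the tutorial's layer sizes (8 -> 12 -> 8 -> 1).
rng = np.random.default_rng(0)
params = [(rng.normal(size=(8, 12)), np.zeros(12)),
          (rng.normal(size=(12, 8)), np.zeros(8)),
          (rng.normal(size=(8, 1)), np.zeros(1))]

x = rng.normal(size=(5, 8))   # five made-up input rows
y_hat = forward(x, params)
print(y_hat.shape)  # (5, 1)
```

Even this small network has 221 parameters, so the written-out equation exists but is not something you would interpret the way you would interpret regression coefficients.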
  
  Reply
Dina April 28, 2020 at 4:34 pm #

Hi Jason, do you have an idea of how to predict a price or a range of values?

Reply
- Dina April 28, 2020 at 4:39 pm #
  
  If I use a Keras model to predict a price or range of values, is it possible for me to find the accuracy of the model? I ask because your article only predicts a binary output.
  
  Reply
  - Jason Brownlee April 29, 2020 at 6:19 am #
    
    You are describing a regression problem, I recommend starting here:
    https://machinelearningmastery.com/regression-tutorial-keras-deep-learning-library-python/
    
    Reply
- Jason Brownlee April 29, 2020 at 6:17 am #
  
  A prediction range is called a prediction interval, learn more here:
  https://machinelearningmastery.com/prediction-intervals-for-machine-learning/
  
  Reply
Hume May 5, 2020 at 10:54 am #

Thank you for your explanation. I am a beginner in machine learning as well as Python. Would you please help me get the exact CSV data file for predicting the Hepatitis B virus?

Reply
- Jason Brownlee May 5, 2020 at 1:37 pm #
  
  This will help you locate a dataset:
  https://machinelearningmastery.com/faq/single-faq/where-can-i-get-a-dataset-on-___
  
  Reply
Ababou Nabil May 12, 2020 at 2:01 pm #

768/768 [==============================] – 2s 3ms/step
Accuracy: 76.56

Reply
- Jason Brownlee May 13, 2020 at 6:21 am #
  
  Well done!
  
  Reply
MAHESH MADHUSHAN May 24, 2020 at 11:29 am #

Why didn’t you normalize the data? Is that not necessary? I have seen some tutorials normalize the data to a common scale using “from sklearn.preprocessing import StandardScaler”. What is the difference between that method and your method?

Reply
- Jason Brownlee May 25, 2020 at 5:43 am #
  
  It can help for some algorithms to normalize or standardize the input data. Perhaps try it and see.
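
For anyone who wants to try it, standardizing means subtracting each column’s mean and dividing by its standard deviation, which is also what scikit-learn’s StandardScaler does. A minimal NumPy sketch, using made-up data and statistics taken from the training rows only:

```python
import numpy as np

def standardize(train, test):
    """Scale both arrays with the mean and std of the training columns."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0          # avoid dividing by zero for constant columns
    return (train - mean) / std, (test - mean) / std

# Made-up data standing in for the input columns X.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=100.0, scale=20.0, size=(80, 8))
X_test = rng.normal(loc=100.0, scale=20.0, size=(20, 8))

X_train_s, X_test_s = standardize(X_train, X_test)
print(X_train_s.mean(axis=0).round(6))  # close to 0 for every column
```

Computing the statistics on the training rows and reusing them for the test rows keeps information about the test data out of the model.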
  
  Reply
Henry Levkine May 26, 2020 at 7:49 am #

Jason,

You are the best!

My name for your program here is “helloDL.py”

I am sure your future book “Hello Deep Learning” will be the most popular on the market.

People need programs like

helloClassification.py
helloRegression.py
helloHelloPrediction.py
helloDogsCats.py
helloFaces.py

and so on!

Thank you for your hard work!

Reply
- Jason Brownlee May 26, 2020 at 1:19 pm #
  
  Thanks.
  
  You can find all of these on the blog, use the search.
  
  Reply
Thijs June 12, 2020 at 12:46 am #

Hello,

is there a possibility to access the accuracy of the last epoch? If yes, how can i access this and save it?

Kind regards

Reply
- Jason Brownlee June 12, 2020 at 6:14 am #
  
  Yes, the history object contains the scores calculated on each epoch:
  https://machinelearningmastery.com/display-deep-learning-model-training-history-in-keras/
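
As a small sketch of reading and saving the last epoch’s score: `history_dict` below is a made-up stand-in for the `history.history` dictionary on the object returned by `model.fit()`, and the file name is hypothetical. The key is assumed to be "accuracy", which is what recent Keras versions use (older versions used "acc"):

```python
import json
import os
import tempfile

# Stand-in for history.history; the numbers are made up for illustration.
history_dict = {"loss": [0.69, 0.61, 0.55], "accuracy": [0.62, 0.68, 0.73]}

# The last entry of each list belongs to the final epoch.
last_accuracy = history_dict["accuracy"][-1]

# Save it (hypothetical file name) so it survives the end of the script.
path = os.path.join(tempfile.gettempdir(), "last_epoch.json")
with open(path, "w") as f:
    json.dump({"accuracy": last_accuracy}, f)

print(last_accuracy)
```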
  
  Reply
Krishan June 16, 2020 at 11:08 am #

Accuracy: 82.42
epochs=1500
batch_size=1

I don’t know if what I did was appropriate. Any advice is appreciated.

Reply
- Jason Brownlee June 16, 2020 at 1:39 pm #
  
  Well done!
  
  Reply
Saad June 19, 2020 at 9:20 pm #

Hi Jason,

Thanks a lot for this wonderful learning platform.

Why were 12 neurons used in the first hidden layer, and what are the criteria behind that choice? Is it random, or is there an underlying reason/calculation?

(I presumed that the number of neurons in a hidden layer would always be between the number of inputs and the number of outputs)

Reply
- Jason Brownlee June 20, 2020 at 6:12 am #
  
  I chose the configuration after a little trial and error.
  
  There is no good theory for configuring neural nets:
  https://machinelearningmastery.com/faq/single-faq/how-many-layers-and-nodes-do-i-need-in-my-neural-network
  
  Reply
Paras Memon July 30, 2020 at 9:05 am #

Hello Jason,

I have this shape of training and testing data sets:
xTrain_CN.shape, yTrain_CN.shape, xTest_CN.shape
((320, 56, 6251), (320,), (80, 56, 6251))

I am getting this error: ValueError: Error when checking input: expected dense_20_input to have 2 dimensions, but got array with shape (320, 56, 6251)

Below is the code:

def nn_keras(xTrain_CN, yTrain_CN, xTest_CN):
    model = Sequential()
    model.add(Dense(12, input_dim=6251, activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # compile the keras model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit the keras model on the dataset
    model.fit(xTrain_CN, yTrain_CN, epochs=150, batch_size=10)
    # evaluate the keras model
    _, accuracy = model.evaluate(xTrain_CN, yTrain_CN)
    print('Training Accuracy: %.2f' % (accuracy*100))

    # note: this call also evaluates on the training data, since no test labels are passed in
    _, accuracy = model.evaluate(xTrain_CN, yTrain_CN)
    print('Testing Accuracy: %.2f' % (accuracy*100))

nn_keras(xTrain_CN, yTrain_CN, xTest_CN)

Reply
- Jason Brownlee July 30, 2020 at 1:44 pm #
  
  An MLP must take 2d data as input (rows and columns) and 1d data as output during training.
  
  Reply
Joanne August 12, 2020 at 1:25 am #

Hi Jason,

This is a great tutorial, very easy to understand!! Is there a tutorial for how to add weight and bias into our model?

Reply
- Jason Brownlee August 12, 2020 at 6:11 am #
  
  Thanks!
  
  Reply
Luis Cordero August 20, 2020 at 12:05 pm #

Hello, if I have a prediction problem, is it absolutely necessary to scale the input variables in order to use the sigmoid or relu activation functions, or whichever one I decide to use?

Reply
- Jason Brownlee August 20, 2020 at 1:37 pm #
  
  No, but try it and compare results.
  
  Reply
Luis Cordero August 20, 2020 at 1:15 pm #

How can I create a configuration that has more than one output, i.e. where the output layer has 2 or more values?

Reply
- Jason Brownlee August 20, 2020 at 1:39 pm #
  
  Yes, just specify the number of targets in the output layer and prepare your training data accordingly.
  
  I have a tutorial on exactly this written and scheduled – for next week I think.
  
  Reply
  - Luis Cordero September 1, 2020 at 4:29 pm #
    
    What will the name of the tutorial be, so I can find it?
    
    Reply
    - Jason Brownlee September 2, 2020 at 6:24 am #
      
      Right here:
      https://machinelearningmastery.com/deep-learning-models-for-multi-output-regression/
      
      Reply
Simon Suarez August 30, 2020 at 8:27 am #

Hi Jason.

I thank you for the great quality of this article. I am experienced with Machine Learning using Scikit-Learn, and reading this post (and some of your previous on the topic) helped me a lot to get into making Multilayer Perceptrons.
I tested the knowledge I learned here with the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. I got around 92.965% Accuracy for train and 96.491% for test, only using 3 features (radius, texture, smoothness) and the following topology:
• Epochs = 250
• Batch_size = 60
• Activation function = ReLu
• Optimizer = ‘Nadam’

Layer; Number of neurons; Activation function
Input; 3; None
Hidden 1; 4; ReLu
Hidden 2; 4; ReLu
Hidden 3; 2; ReLu
Output; 1; Sigmoid

Train and test were split using: train_test_split(X, y, test_size=0.33, random_state=42)
Thanks!

Reply
- Jason Brownlee August 31, 2020 at 5:58 am #
  
  Thanks.
  
  Well done on your results Simon!
  
  Reply
Berns Buenaobra September 7, 2020 at 7:32 am #

0s 833us/step – loss: 0.4607 – accuracy: 0.7773

Reply
- Jason Brownlee September 7, 2020 at 8:36 am #
  
  Well done!
  
  Reply
Berns Buenaobra September 7, 2020 at 7:37 am #

Second iteration with laptop GPU gives:
0s 958us/step – loss: 0.4119 – accuracy: 0.8216
Accuracy: 82.16

Reply
Ahmed Nuru September 8, 2020 at 5:01 pm #

Hi Jason, how can I classify images as forged or genuine using a pretrained deep learning model?

Reply
- Jason Brownlee September 9, 2020 at 6:44 am #
  
  Perhaps prepare a dataset of real and fake images and train a binary classification model to differentiate the two.
  
  Perhaps this tutorial will help you to get started:
  https://machinelearningmastery.com/how-to-develop-a-convolutional-neural-network-to-classify-photos-of-dogs-and-cats/
  
  Reply
Fatma Zohra September 11, 2020 at 2:30 am #

Hello Jason ,

Can you please guide me how to make a query and a document as an input in our NN (knowing that they both are represented by frequency vectors ) ?

Reply
- Jason Brownlee September 11, 2020 at 6:01 am #
  
  Perhaps start here:
  https://machinelearningmastery.com/start-here/#nlp
  
  Reply
fatma zohra September 13, 2020 at 2:41 am #

Hi Dr Jason,

Thanks a lot for the reply, the link was useful for me.
Yet I am still a bit lost since I am new to NNs. Actually, I want to calculate the similarity between the query and the doc using the NN. The inputs are the TF vector of the doc and the TF vector of the query, and the output is the similarity (0 if no, 1 if yes). I have the idea for my NN but I don’t know where to start…
I would be grateful if you could help me (maybe with similar code that I can take as an example).

Waiting for your reply..thanks in advance

Reply
- Jason Brownlee September 13, 2020 at 6:10 am #
  
  I think you’re asking about calculating text similarity. If so, sorry I don’t have tutorials on that topic.
  
  Reply
  - fatma zohra September 13, 2020 at 6:38 am #
    
    yeah , this is what i was asking for , anyways thanks a lot for your tutorials they are very clear and fruitful..
    
    Reply
    - Jason Brownlee September 13, 2020 at 8:28 am #
      
      You’re welcome.
      
      Reply
yibrah fisseha September 22, 2020 at 11:41 pm #

I would like to thank you a lot for your tutorials. can you please guide me on how to evaluate the model using confusion matrix parameters such as recall, precision, f1 score?

Reply
- Jason Brownlee September 23, 2020 at 6:40 am #
  
  Yes, here are examples:
  https://machinelearningmastery.com/how-to-calculate-precision-recall-f1-and-more-for-deep-learning-models/
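
For a binary problem like the one in this tutorial, these scores all follow from the four cells of the confusion matrix. A minimal NumPy sketch, using made-up labels and assuming the predictions have already been converted to crisp 0/1 values:

```python
import numpy as np

def binary_scores(y_true, y_pred):
    """Return the confusion-matrix counts plus precision, recall and F1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}

# Made-up labels and 0/1 predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_scores(y_true, y_pred)
print(scores)
```

scikit-learn provides the same metrics ready-made, which is what the linked tutorial uses.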
  
  Reply
derya September 23, 2020 at 5:03 am #

great tutorial helped a lot !

Reply
- Jason Brownlee September 23, 2020 at 6:44 am #
  
  Thanks!
  
  Reply
Sean H. Kelley September 23, 2020 at 6:38 am #

Hi Jason, thank you very much for this.

I appreciate the extra in depth explanations in the links to other pages.

I am wondering how to keep the state of mind. Like you train it while it runs and get a level of accuracy. If you finally get the level of accuracy from training a certain configuration, how do you keep that configuration/state of mind/level of accuracy of the artificial neural net without having to train it all over again?

Can you store a snapshot of that “state of mind” somewhere so that when you have a good working model, you just use that to run new data against or am I still missing some key elements in my attempting to grasp this?

Thank you!

Reply
- Jason Brownlee September 23, 2020 at 6:46 am #
  
  You can save your model and load it later to make predictions, see this tutorial:
  https://machinelearningmastery.com/save-load-machine-learning-models-python-scikit-learn/
  
  Reply
  - Sean H. Kelley September 24, 2020 at 12:53 am #
    
    Thank you very much!
    
    Reply
    - Jason Brownlee September 24, 2020 at 6:16 am #
      
      You’re welcome.
      
      Reply
Muhammad Asad Arshed October 10, 2020 at 12:34 am #

Awesome blog and technical skill. Could you refer me to some other blogs?

Reply
- Jason Brownlee October 10, 2020 at 7:06 am #
  
  Thanks!
  
  Reply
Brijesh October 10, 2020 at 5:57 pm #

Hi

Can we use only CSV file format?

Reply
- Jason Brownlee October 11, 2020 at 6:44 am #
  
  No, deep learning can use images, text data, audio data, almost anything that can be represented with numbers.
  
  Reply
imene October 18, 2020 at 4:49 am #

With epochs = 10000 and batch_size = 20 I got accuracy = 84% and loss = 0.3434

Reply
- Jason Brownlee October 18, 2020 at 6:12 am #
  
  Well done!
  
  Reply
- YAŞAR SAİD DERDİMAN December 27, 2020 at 4:12 pm #
  
  This is good, but your model’s generalization error is probably higher, because more epochs usually means more overfitting. Therefore, you should generally use fewer epochs when training.
  
  Reply
  - Jason Brownlee December 28, 2020 at 5:58 am #
    
    Good advice.
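
Rather than guessing a smaller number of epochs, one common option is early stopping: watch the loss on held-out validation data and keep the epoch after which it stops improving. The sketch below shows only the stopping rule, in plain Python with made-up loss values. Keras offers an `EarlyStopping` callback with a `patience` argument for doing this during training.

```python
def best_epoch(val_losses, patience=2):
    """Return the index of the epoch to keep, stopping after `patience`
    epochs in a row without a new best validation loss."""
    best_idx, best_loss, waited = 0, float("inf"), 0
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss, waited = i, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_idx

# Made-up validation losses: improving at first, then getting worse.
losses = [0.60, 0.52, 0.48, 0.49, 0.51, 0.55]
print(best_epoch(losses))  # 2
```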
    
    Reply
imene October 18, 2020 at 4:59 am #

First, thanks for your good explanation.
How can I save the trained model so it can be used for testing? The training repeats each time I execute the program.
Thanks.

Reply
- Jason Brownlee October 18, 2020 at 6:12 am #
  
  Good question, this will show you how:
  https://machinelearningmastery.com/save-load-keras-deep-learning-models/
  
  Reply
Fatima October 24, 2020 at 5:18 am #

Hi Jason, I applied the Deep Neural Network algorithm(DNN) to do the prediction, It works and it is perfect, I have a problem in evaluating the predicted results I used (metrics.confusion_matrix), It gave me this error:
ValueError: Classification metrics can’t handle a mix of binary and continuous targets

any suggestions to solve the error?
note: my class label (outcome variable) is binary (0,1)

Thanks in advance

Reply
- Jason Brownlee October 24, 2020 at 7:12 am #
  
  See this tutorial:
  https://machinelearningmastery.com/how-to-calculate-precision-recall-f1-and-more-for-deep-learning-models/
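
That error usually means one array holds 0/1 class labels while the other holds real-valued numbers, which is the case when the raw probabilities from a sigmoid output are passed straight to the metric. Assuming that is what happened here, rounding the probabilities to crisp class labels first should resolve it. The `probabilities` array is a made-up stand-in for the output of `model.predict(X)`:

```python
import numpy as np

# Stand-in for model.predict(X): one probability per row, shape (n, 1).
probabilities = np.array([[0.91], [0.08], [0.47], [0.66]])

# Convert to crisp 0/1 labels and drop the extra axis.
predicted = (probabilities > 0.5).astype("int32").ravel()

print(predicted.tolist())  # [1, 0, 0, 1]
```

The resulting 1d array of 0s and 1s can then be compared against the true labels in a confusion matrix.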
  
  Reply
K Al October 27, 2020 at 2:53 am #

First of all, please allow me to thank you for this great tutorial and for your valuable time.
I wonder: you trained and evaluated the network on the same data set. Why did it not produce 100% accuracy then?

Thanks

Reply
- Jason Brownlee October 27, 2020 at 6:46 am #
  
  All models have error.
  
  If we get perfect skill/100% accuracy then the problem is likely too simple and machine learning is not required:
  https://machinelearningmastery.com/faq/single-faq/what-does-it-mean-if-i-have-0-error-or-100-accuracy
  
  Reply
Zuzana November 1, 2020 at 11:15 pm #

Hi, great tutorial, everything works, except when trying to add predictions, I get the following error message. Could you please, help? Thanks a lot.

WARNING:tensorflow:From C:/Users/ZuzanaŠútová/Desktop/RTP new/3_training_deep_learning/data_PDS/keras_first_network_including_predictions.py:27: Sequential.predict_classes (from tensorflow.python.keras.engine.sequential) is deprecated and will be removed after 2021-01-01.
Instructions for updating:
Please use instead:* np.argmax(model.predict(x), axis=-1), if your model does multi-class classification (e.g. if it uses a softmax last-layer activation).* (model.predict(x) > 0.5).astype("int32"), if your model does binary classification (e.g. if it uses a sigmoid last-layer activation).

Warning (from warnings module):
File “C:\Users\ZuzanaŠútová\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\keras\engine\sequential.py”, line 457
return (proba > 0.5).astype(‘int32’)
RuntimeWarning: invalid value encountered in greater
Traceback (most recent call last):
File “C:\Users\ZuzanaŠútová\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\base.py”, line 2895, in get_loc
return self._engine.get_loc(casted_key)
File “pandas\_libs\index.pyx”, line 70, in pandas._libs.index.IndexEngine.get_loc
File “pandas\_libs\index.pyx”, line 101, in pandas._libs.index.IndexEngine.get_loc
File “pandas\_libs\hashtable_class_helper.pxi”, line 1032, in pandas._libs.hashtable.Int64HashTable.get_item
File “pandas\_libs\hashtable_class_helper.pxi”, line 1039, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File “C:/Users/ZuzanaŠútová/Desktop/RTP new/3_training_deep_learning/data_PDS/keras_first_network_including_predictions.py”, line 30, in
print(‘%s => %d (expected %d)’ % (X[i].tolist(), predictions[i], y[i]))
File “C:\Users\ZuzanaŠútová\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\frame.py”, line 2902, in __getitem__
indexer = self.columns.get_loc(key)
File “C:\Users\ZuzanaŠútová\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\base.py”, line 2897, in get_loc
raise KeyError(key) from err
KeyError: 0

Reply
- Jason Brownlee November 2, 2020 at 6:40 am #
  
  Sorry to hear that, this may help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
Zuzana November 2, 2020 at 6:50 am #

I am sorry but none of that helped :/

Reply
Julian A Epps November 3, 2020 at 7:58 am #

Where can I find documentation on these keras functions that you are using. I don’t know how any of these functions work.

Reply
- Jason Brownlee November 3, 2020 at 10:08 am #
  
  Good question, here:
  https://keras.io/api/
  
  Reply
Umair Rasool November 8, 2020 at 4:42 am #

Hello Sir, I am not familiar with ML, so someone else is doing my prediction task for me, using a raster dataset with Python. He only gives me the final results and a CSV file rather than a final prediction map as a raster. Could you please tell me whether ML normally works like this, or is he missing a step needed to generate the final map? Please respond. Thanks.

Reply
- Umair Rasool November 8, 2020 at 4:44 am #
  
  Sorry, I made a small mistake: I meant “final results as a CSV file”.
  
  Reply
- Jason Brownlee November 8, 2020 at 6:42 am #
  
  Perhaps this framework will help:
  https://machinelearningmastery.com/start-here/#process
  
  Reply
Halil November 27, 2020 at 6:09 am #

Thank you for this brilliantly explained tutorial ! Actually, I am bored of watching videos which have lots of boring talks and superficial explanations. I discovered my main resource now

By the way, I guess there is an error here. No?
rounded = [round(x[0]) for x in predictions] —> should be “round(X…..”

Reply
- Jason Brownlee November 27, 2020 at 6:44 am #
  
  You’re welcome.
  
  There are many ways to round an array.
  
  Reply
  - Halil November 30, 2020 at 5:45 am #
    
    I mean, that “x” should be “X”. No?
    
    Reply
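On the x versus X question: in that list comprehension the lowercase x is only a loop variable that is bound to each row of predictions in turn, so it is independent of the uppercase input matrix X. A small sketch with made-up prediction values (not results from the tutorial) shows this:

```python
import numpy as np

# Hypothetical predictions shaped like the output of model.predict(X):
# one row per sample, one sigmoid probability per row.
predictions = np.array([[0.2], [0.7], [0.9], [0.4]])

# Lowercase x is just the loop variable: it takes each row of predictions
# in turn, so x[0] is that row's single probability. It is unrelated to
# the uppercase input matrix X.
rounded = [int(round(float(x[0]))) for x in predictions]

# A vectorised alternative that avoids the loop variable entirely.
rounded_vectorised = (predictions > 0.5).astype(int).ravel().tolist()

print(rounded)
```

Renaming the loop variable (for example to `row`) would not change the result; for these values both forms give the same labels.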
RAJSHREE SRIVASTAVA November 28, 2020 at 4:05 am #

Hi jason,

Hope you are doing well. I am working on ANN for image classification in google colab. I am getting this error , can you help me to find solution for this?

InvalidArgumentError: Incompatible shapes: [100,240,240,1] vs. [100,1]
[[node gradient_tape/mean_squared_error/BroadcastGradientArgs (defined at :14) ]] [Op:__inference_train_function_11972]

Function call stack:
train_function

Waiting for your reply.

Reply
- Jason Brownlee November 28, 2020 at 6:41 am #
  
  Sorry, I don’t know about colab:
  https://machinelearningmastery.com/faq/single-faq/do-code-examples-run-on-google-colab
  
  Reply
RAJSHREE SRIVASTAVA November 28, 2020 at 8:14 pm #

Hi jason thanks for your reply.

ok in python I am working on ANN for image classification . I am getting this error , can you help me to find solution for this?

InvalidArgumentError: Incompatible shapes: [100,240,240,1] vs. [100,1]
[[node gradient_tape/mean_squared_error/BroadcastGradientArgs (defined at :14) ]] [Op:__inference_train_function_11972]

Function call stack:
train_function

Reply
- Jason Brownlee November 29, 2020 at 8:12 am #
  
  Sorry, the cause of the error is not clear, you may need to debug your model.
  
  Here are some suggestions:
  https://machinelearningmastery.com/faq/single-faq/can-you-read-review-or-debug-my-code
  
  Reply
Hanem December 17, 2020 at 11:07 am #

Thanks a million, it helped me a lot. Actually, all of your articles are informative and good guides for me.

Reply
- Jason Brownlee December 17, 2020 at 12:59 pm #
  
  You’re welcome, I’m happy to hear that!
  
  Reply
John Smith December 28, 2020 at 7:58 am #

This was a brilliant tutorial. I think it could be improved by adding an example of actual predictions.

The prediction bit is quite brief; I don’t quite understand how to use that array of “predictions” to actually predict something.

Like if I wanted to feed it some test data and get a prediction how could I do that?

I will consult some of your other helpful guides but would be great to have it all in this 1 tutorial.

Reply
- John Smith December 28, 2020 at 8:07 am #
  
  I did not have my coffee when I wrote this.
  
  I see now we are passing the original variables back into the model, predicting, and printing out the prediction vs. the actual value.
  
  🙂
  
  Thanks – you made a great tutorial!
  
  Have a good christmas and new year.
  
  Reply
  - Jason Brownlee December 28, 2020 at 8:19 am #
    
    No problem at all!
    
    I’m happy it helped you kick start your journey with deep learning.
    
    Reply
Joe January 3, 2021 at 5:00 am #

Hi Jason,

Happy new year!

You are predicting on the same data set, X, that you used to train the model.

I would have thought that the model would’ve produced close to 100% accuracy in this case since the model is so well trained specifically with respect to X (maybe even overfitted).

Why are we only getting 76.9% accuracy, not close to 100%?

Thanks
Joe

Reply
- Jason Brownlee January 3, 2021 at 6:00 am #
  
  Yes, I did that to keep the example simple. I explain more here:
  https://machinelearningmastery.com/faq/single-faq/why-do-you-use-the-test-dataset-as-the-validation-dataset
  
  No model is perfect, they are all trying to generalize from the training data.
  
  Reply
Roberto Aguirre Maturana January 7, 2021 at 12:19 pm #

Excellent tutorial, well explained and very easy to follow. It seems you have to update one line that was deprecated in 2021:

#instead of
#predictions = model.predict(X)

#now you have to use
predictions = (model.predict(X) > 0.5).astype("int32")

Reply
- Jason Brownlee January 7, 2021 at 2:04 pm #
  
  Thanks.
  
  I don’t think so:
  https://keras.io/api/models/model_training_apis/#predict-method
  
  And:
  https://www.tensorflow.org/api_docs/python/tf/keras/Sequential#predict
  
  If you want labels you can use model.predict_classes(), this will help:
  https://machinelearningmastery.com/how-to-make-classification-and-regression-predictions-for-deep-learning-models-in-keras/
  
  Reply
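To separate the two ideas in this exchange: model.predict() itself returns probabilities, and the deprecation warning quoted earlier in the thread concerns predict_classes(), suggesting a threshold for sigmoid outputs and an argmax for softmax outputs. The sketch below applies both conversions to placeholder arrays that stand in for model.predict(X); the numbers are invented for illustration.

```python
import numpy as np

# Placeholder outputs standing in for model.predict(X).
sigmoid_out = np.array([[0.08], [0.93], [0.51]])             # binary model
softmax_out = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])   # 3-class model

# Binary classification with a sigmoid output layer: threshold at 0.5.
binary_labels = (sigmoid_out > 0.5).astype("int32")

# Multi-class classification with a softmax output layer: take the index
# of the largest probability in each row.
multiclass_labels = np.argmax(softmax_out, axis=-1)

print(binary_labels.ravel(), multiclass_labels)
```

The 0.5 threshold is the conventional default, not a requirement; a different cut-off may suit problems where the two kinds of error have different costs.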
Girish Ahire January 8, 2021 at 8:27 pm #

I got 65%

Reply
- Jason Brownlee January 9, 2021 at 6:41 am #
  
  Well done!
  
  Reply
Tom Rauch January 15, 2021 at 6:37 am #

Hi, I have these installed in my VirtualEnv (along with other libraries)

Keras==2.4.3
Keras-Preprocessing==1.1.2

But when I run this:

# first neural network with keras tutorial
from numpy import loadtxt
from keras.models import Sequential
from keras.layers import Dense

I get a ‘Dead Kernel’ error message in jupyter; the first line runs fine but the ‘dead kernel’ message appears when it gets to keras.

Any idea on how to fix?

Thanks!

Reply
- Jason Brownlee January 15, 2021 at 8:46 am #
  
  I recommend not using a notebook as they cause problems for almost everyone:
  https://machinelearningmastery.com/faq/single-faq/why-dont-use-or-recommend-notebooks
  
  Instead save the code using a simple text editor like sublime or atom and run the script from the command line:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-run-a-script-from-the-command-line
  
  Reply
Tom Rauch January 15, 2021 at 9:32 am #

Thank you Jason! I will give the command line a try.

Tom

Reply
- Jason Brownlee January 15, 2021 at 11:32 am #
  
  You’re welcome.
  
  Reply
Tom Rauch January 15, 2021 at 12:22 pm #

Hi Jason, I followed your instructions but still running into issues with Keras, maybe I did not install it correctly?

(rec_engine) tom@machine:~/code$ python keras.py
Traceback (most recent call last):
File “keras.py”, line 3, in
from keras.models import Sequential
File “/home/tom/code/keras.py”, line 3, in
from keras.models import Sequential
ModuleNotFoundError: No module named ‘keras.models’; ‘keras’ is not a package

but when I run this, I do see it installed

(rec_engine) tom@machine:~/code$ pip list | grep Keras
Keras 2.4.3
Keras-Preprocessing 1.1.2

I followed the pip install found in this guide:

https://www.liquidweb.com/kb/how-to-install-keras/

I think my next step may be to create a new VirtualEnv for just Keras and TensorFlow.

Thanks, Tom

Reply
- Jason Brownlee January 15, 2021 at 1:26 pm #
  
  I think there may be an issue with your environment, perhaps this tutorial will help:
  https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
  
  Reply
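One detail in the traceback above is worth noting: the import of keras resolves to /home/tom/code/keras.py, which is the script itself. A script named keras.py shadows the installed library, which is what "'keras' is not a package" indicates, so renaming the script (and removing any stale compiled copy next to it) is a likely fix. The helper below is a hypothetical sketch for spotting such a clash; `shadowed_module_name` and the example file name first_network.py are assumptions for illustration.

```python
import os

def shadowed_module_name(script_path, imported_names):
    """Return the imported module name that the script's own file name
    shadows, or None if there is no clash.

    The directory of the script being run is searched first on import,
    so "import keras" inside keras.py finds the script, not the library.
    """
    stem = os.path.splitext(os.path.basename(script_path))[0]
    return stem if stem in imported_names else None

# The first path comes from the traceback above; the set of imported
# names is an assumption based on the tutorial's opening lines.
clash = shadowed_module_name("/home/tom/code/keras.py", {"numpy", "keras"})
safe = shadowed_module_name("/home/tom/code/first_network.py", {"numpy", "keras"})
print(clash, safe)
```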
Govind Kelkar January 15, 2021 at 10:58 pm #

Hi Dr. Jason,

I executed your code in Google Colab and it ran. The only difference I found was when predicting on new data:
you listed the sequence as 10101 and I got 01010.
I also made a few changes to the code.
Nonetheless, I got the code working at least. Now I will play with it to try to get better accuracy.

Reply
- Jason Brownlee January 16, 2021 at 6:55 am #
  
  Well done!
  
  Reply
Tom Rauch January 16, 2021 at 9:18 am #

Hi Jason, I created a new virtual env and loaded Keras, TensorFlow etc and created a .py with all of your code, then ran it at the command line in the directory that contains both the csv and py.

But, I got this error:

(ML) tom@machine:~/code$ python mykerasloader.py
Illegal instruction (core dumped)

Is there a logger I should be using to see more detail?

Thanks, Tom

Reply
- Jason Brownlee January 16, 2021 at 1:20 pm #
  
  That does not look good, I suspect there is something up with your environment.
  
  Perhaps you can try posting/searching on stackoverflow.com
  
  Reply
Francisco Santiago January 17, 2021 at 9:51 am #

Creating neural network
24/24 [==============================] – 0s 756us/step – loss: 0.3391 – accuracy: 0.8503
Accuracy: 85.03

Wo hooo!!

Reply
- Jason Brownlee January 17, 2021 at 1:27 pm #
  
  Well done!
  
  Reply
Jeremy January 17, 2021 at 4:38 pm #

Dr. Brownlee,

Good morning, sir! Curious for your thoughts on something: is there value in running the algorithm, say, fifty times and averaging the accuracy? I’ve used that technique before to good effect, but since this is relatively new to me, having an experienced teacher of machines set me straight would be helpful.

If this is something you think is useful, I have one more question that comes from my still limited understanding of things: where would I start the ‘for’ loop? My first thought was starting it before ‘model = Sequential()’, but that would mean redefining the NN structure each time, which doesn’t make much sense. Second thought was starting it before ‘model.fit()’, in which case the model stays the same, and loss/optimization functions stay the same.

Thank you very much for your time!

V/r,
Jeremy

Reply
- Jason Brownlee January 18, 2021 at 6:05 am #
  
  Yes, it reduces the variance in the method and can be used for both evaluating model performance and making predictions.
  
  More details are here:
  https://machinelearningmastery.com/faq/single-faq/why-do-i-get-different-results-each-time-i-run-the-code
  
  The loop is around the definition, training and evaluation of the model.
  
  Reply
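The structure described in the reply above (a loop around the definition, training and evaluation of the model) can be sketched as follows. This is only an outline under stated assumptions: `repeated_evaluation` is a hypothetical helper, and `run_once` here replays fixed, made-up scores so that the sketch runs without training anything. In a real script `run_once` would contain the Sequential() definition, compile(), fit() and evaluate() calls and return the accuracy.

```python
from statistics import mean, stdev

def repeated_evaluation(run_once, n_repeats):
    """Call run_once() n_repeats times and summarise the accuracies.

    run_once is expected to define a fresh model, fit it, evaluate it and
    return the accuracy, so every repeat starts from new random weights.
    """
    scores = [run_once() for _ in range(n_repeats)]
    return mean(scores), stdev(scores), scores

# Placeholder for a real run: replays invented scores (not real results).
_fake_scores = iter([0.75, 0.77, 0.76, 0.78, 0.74])

def run_once():
    return next(_fake_scores)

avg, spread, scores = repeated_evaluation(run_once, n_repeats=5)
print("Accuracy: %.3f (+/- %.3f)" % (avg, spread))
```

Defining the model inside the loop matters: calling fit() repeatedly on one model object would continue training the same weights rather than produce independent runs.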
Tom Rauch January 18, 2021 at 6:33 am #

Hi Jason, any tuts on using your code in this posting in Google colabs? Not sure how to point to the csv using colabs.

Thanks, Tom

Reply
- Jason Brownlee January 18, 2021 at 8:58 am #
  
  This is a common question that I answer here:
  https://machinelearningmastery.com/faq/single-faq/do-code-examples-run-on-google-colab
  
  Reply
Anna January 21, 2021 at 8:53 am #

Hello Jason I have a question.

I want to create a model to predict the urban development. I started with your model above.
I use the information about the urban and the non-urban points for 4 years (2000,2006,2012,2018). I also use information about the slope and some distances for every point.
I have created a dataset which contains the information in columns like this.
2000-2006
2006-2012

After the train I have accuracy 94%
But when I give the model the year 2006, it doesn’t predict 2012 very well. There are many problems.
I thought that with this accuracy the model would predict 2012 very well.

I don’t know where the problem might be… In the training section, in the prediction, or somewhere else??
Please tell me your opinion, because I have been stuck on this for weeks and I have to find a solution quickly!

Reply
- Jason Brownlee January 22, 2021 at 7:13 am #
  
  It sounds like you’re working with a time series dataset.
  
  If so, it would not be valid to train the model on the future and predict the past.
  
  I recommend starting here:
  https://machinelearningmastery.com/start-here/#deep_learning_time_series
  
  Reply
James Parker January 22, 2021 at 8:43 pm #

Thank you for this great article, but I have a question: what does the _, before accuracy stand for?
I searched the internet but couldn’t find it.

Reply
- Jason Brownlee January 23, 2021 at 7:04 am #
  
  We use underscore (_) in python to eat up return values or variables we don’t care about. In this case the loss, as we only care about accuracy.
  
  Reply
FOGANG FOKOA January 24, 2021 at 12:43 pm #

Hello,

(Inputting an array of shape (50395, ), where each element is an array of shape (x, 129), into an MLP)

I want to input a 2D numpy array into an MLP, but I have an array of 50395 rows that contains many 2D arrays of shape (x, 129). I write x because the matrices have different numbers of rows. Here is an example:

train['spec'].shape
>>(50395,)
train['spec'][0].shape
>>(41, 129)
train['spec'][5].shape
>>(71, 129)

Here is a snippet of my code:

X_train = train['spec'].values; X_valid = valid['spec'].values
y_train = train['label'].values; y_valid = valid['label'].values
model.add(Dense(12, input_shape=(50395, ), activation='relu'));
model.fit(X_train, y_train, validation_data=(X_valid, y_valid), epochs=500, batch_size=1);

I get this error on last line (model.fit) :
ValueError: Error when checking input: expected dense_54_input to have shape (50395,) but got array with shape (1,)

How to fix this problem so that the network can take as input all 50395 matrices of shape (x, 129)?

Reply
- Jason Brownlee January 24, 2021 at 12:52 pm #
  
  Perhaps they are “time steps” and if so this may help:
  https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input
  
  And then pad all sequences to the same length:
  https://machinelearningmastery.com/data-preparation-variable-length-input-sequences-sequence-prediction/
  
  Reply
  - FOGANG FOKOA January 24, 2021 at 1:40 pm #
    
    In fact, I absolutely must use an MLP. I had 1-second sound clips sampled at 16000 Hz, so each audio clip gave me an array of 16000 values. After removing the silence from those clips, I ended up with arrays of different sizes.
    
    Then I transformed the audio into numpy matrices of numbers using the spectrogram algorithm, to feed them to the neural network.
    
    I ended up with 2-dimensional matrices with the same number of columns but different numbers of rows.
    
    Is it possible to pass them in, given that the matrices have different sizes?
    
    Reply
    - Jason Brownlee January 25, 2021 at 5:47 am #
      
      As a first step, perhaps try padding all inputs to the same size and use a masking input layer followed by dense/mlp architecture.
      
      Reply
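The padding step suggested above can be sketched with NumPy alone. This is a minimal illustration, assuming zero padding up to the largest row count in the batch; `pad_rows` is a hypothetical helper, the arrays are random placeholders with the row counts quoted in the question, and the masking layer mentioned in the reply is not covered here.

```python
import numpy as np

def pad_rows(matrices, target_rows=None, value=0.0):
    """Zero-pad 2D arrays with equal column counts to a common row count."""
    if target_rows is None:
        target_rows = max(m.shape[0] for m in matrices)
    n_cols = matrices[0].shape[1]
    batch = np.full((len(matrices), target_rows, n_cols), value, dtype="float32")
    for i, m in enumerate(matrices):
        batch[i, : m.shape[0], :] = m
    return batch

# Random placeholders with the row counts quoted in the question.
rng = np.random.default_rng(0)
spectrograms = [rng.random((41, 129)), rng.random((71, 129))]

padded = pad_rows(spectrograms)
flat = padded.reshape(len(padded), -1)  # one fixed-length vector per sample
print(padded.shape, flat.shape)
```

Flattening each padded matrix into a single vector is one way to feed an MLP; whether zero padding distorts the signal for a given dataset is something to check empirically.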
FOGANG FOKOA January 28, 2021 at 12:56 am #

I did as you advised, and I got past that difficulty! Now my code looks like this:

model = Sequential();

model.add(Dense(units =8, input_shape=(71, 129), activation='relu'));
model.add(Dense(units=8, activation='relu'));
model.add(Dense(units=11, activation='sigmoid'));

# Compile model
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy']);
#model = mpl_model();
X_train = list(train_df['spec']); X_valid = list(valid_df['spec']);
y_train = train_df['label']; y_valid = valid_df['label'];

#labels = ['yes', 'no', 'up', 'down', 'left','right', 'on', 'off', 'stop', 'go'];
encoder = LabelEncoder();
encoder.fit(y_train);
encoded_y_train = encoder.transform(y_train);

dummy_y_train = to_categorical(encoded_y_train);

# Fit model , validation_data=(np.array(X_valid), y_valid)
model.fit(np.array(X_train), np.array(list(dummy_y_train)), epochs=50, batch_size=50);

and I get this error :

ValueError: A target array with shape (50395, 11) was passed for an output of shape (None, 71, 11) while using as loss categorical_crossentropy. This loss expects targets to have the same shape as the output.

Reply
- Jason Brownlee January 28, 2021 at 6:01 am #
  
  Ouch, looks like the shape of the data does not match the expectations of the model.
  
  Perhaps focus on the prepared data and inspect it after each change – get that right, then focus on the modeling part.
  
  Reply
  - FOGANG FOKOA January 29, 2021 at 7:28 am #
    
    Okay. It’s done and It works well.. thank you
    
    Reply
    - Jason Brownlee January 29, 2021 at 7:40 am #
      
      Nice work!
      
      Reply
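For readers who hit the same shape error: the message in this thread reports a model output of shape (None, 71, 11), which is what happens when Dense layers receive a 2D sample of shape (71, 129), because a Dense layer acts on the last axis only. The commenter did not say which fix was used; flattening each sample first (for example with a Flatten layer or a reshape) is one common option. The NumPy sketch below mimics only the shape behaviour of a Dense layer; `dense` is a hypothetical helper and the data is random.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.random((4, 71, 129))  # 4 placeholder samples

def dense(inputs, units, rng):
    """Mimic the shape behaviour of a Dense layer: a matrix product over
    the last axis plus a bias (activation omitted)."""
    weights = rng.random((inputs.shape[-1], units))
    bias = np.zeros(units)
    return inputs @ weights + bias

# Applied directly, the 71 rows survive and the output is 3D.
unflattened_out = dense(batch, 11, rng)

# Flattening each sample to 71 * 129 values first gives one vector of 11
# outputs per sample, which matches one-hot targets of shape (n, 11).
flattened_out = dense(batch.reshape(len(batch), -1), 11, rng)

print(unflattened_out.shape, flattened_out.shape)
```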
Kinson VERNET January 29, 2021 at 1:06 am #

Hello, thank you for this tutorial.

For 100 times I got score = 76.82 for the accuracy.

Reply
- Jason Brownlee January 29, 2021 at 6:06 am #
  
  Well done!
  
  Reply
Kamal January 30, 2021 at 12:14 pm #

It’s a superb tutorial to implement your first deep neural network in Python. Thank you, dear Jason Brownlee.

Reply
- Jason Brownlee January 30, 2021 at 12:35 pm #
  
  Thanks, well done on your progress!
  
  Reply
Rob February 18, 2021 at 1:59 pm #

Hi there,
I’m currently stuck on fitting the model. Only thing I have done differently is use read_csv so I didn’t have to put anything locally. But I’ve validated the X/y outputs to be the same.

My error is:

ValueError: logits and labels must have the same shape ((None, 11) vs (None, 1))

Reply
- Jason Brownlee February 19, 2021 at 5:53 am #
  
  It suggests your data was not loaded correctly, perhaps this will help:
  https://machinelearningmastery.com/load-machine-learning-data-python/
  
  Reply
  - Rob March 1, 2021 at 11:59 pm #
    
    Ah thanks, it turns out it was an issue with the wrong number of nodes on the sigmoid layer.
    
    Reply
    - Jason Brownlee March 2, 2021 at 5:45 am #
      
      Happy to hear you solved your problem!
      
      Reply
Sofia February 24, 2021 at 3:53 am #

Another great tutorial!!

When I run the program it crashes with an error as seen below:

2021-02-23 18:50:50.497125: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library ‘cudart64_101.dll’; dlerror: cudart64_101.dll not found
2021-02-23 18:50:50.498601: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File “C:/Users/USER/PycharmProjects/Sofia/main.py”, line 26, in
X = dataset[:,0:8]
File “C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\frame.py”, line 3024, in __getitem__
indexer = self.columns.get_loc(key)
File “C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\indexes\base.py”, line 3080, in get_loc
return self._engine.get_loc(casted_key)
File “pandas\_libs\index.pyx”, line 70, in pandas._libs.index.IndexEngine.get_loc
File “pandas\_libs\index.pyx”, line 75, in pandas._libs.index.IndexEngine.get_loc
TypeError: ‘(slice(None, None, None), slice(0, 8, None))’ is an invalid key

How would I go about fixing this error? Thank you in advance!

Reply
- Jason Brownlee February 24, 2021 at 5:38 am #
  
  Thanks!
  
  Sorry to hear that, perhaps these tips will help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
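The traceback above passes through pandas, which suggests the data was loaded into a DataFrame rather than a NumPy array. NumPy-style slicing such as dataset[:,0:8] is not valid on a DataFrame, and X[i] looks up a column label rather than row i. A minimal sketch of one workaround, assuming the file was read with pandas.read_csv and has no header row; the two rows of CSV text are sample values in the nine-column layout the tutorial uses.

```python
import io
import pandas as pd

# Two sample rows in the nine-column layout of the tutorial's CSV file.
csv_text = "6,148,72,35,0,33.6,0.627,50,1\n1,85,66,29,0,26.6,0.351,31,0\n"
dataset = pd.read_csv(io.StringIO(csv_text), header=None)

# dataset[:, 0:8] fails on a DataFrame; use position-based .iloc instead,
# then convert to NumPy so that X[i] later returns row i.
X = dataset.iloc[:, 0:8].to_numpy()
y = dataset.iloc[:, 8].to_numpy()

print(X.shape, y.shape)
```

Loading the file with numpy.loadtxt, as the tutorial does, avoids the issue because it returns a NumPy array directly.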
Slava February 27, 2021 at 3:46 am #

It looks like the model.predict_classes() was deprecated on 2021-01-01.
Cheers,
Slava

Reply
- Jason Brownlee February 27, 2021 at 6:09 am #
  
  Thanks.
  
  Reply
- Atsushi Isobe March 3, 2021 at 11:26 pm #
  
  What is the new method to use? I can not run the predict method after finishing the training.
  
  Reply
  - Jason Brownlee March 4, 2021 at 5:50 am #
    
    This will help:
    https://machinelearningmastery.com/how-to-make-classification-and-regression-predictions-for-deep-learning-models-in-keras/
    
    Reply
Mitchell March 11, 2021 at 8:16 am #

Jason, I have a couple of questions regarding the layers and how they choose filters.

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

1)What is the filter size for each layer above ? 3×3 or 7×7.
2) Are there any pre-defined 3×3 filters, 7×7 filters?
3) In hidden layers, filters are used to produce next layer usually. How does the model choose filters? For example, if a layer has 16 nodes, and how would I choose 32 filters so that the next layer will have 32 nodes (neurons) ?

When you create a model, do you need to specify filters for each layer needed? like size of a filter and how many filters. .

Thanks!

Reply
- Jason Brownlee March 11, 2021 at 1:25 pm #
  
  There are no filters in a Dense layer, filters is something to do with convolutional layers:
  https://machinelearningmastery.com/convolutional-layers-for-deep-learning-neural-networks/
  
  Reply
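To make the contrast with convolutional filters concrete: a Dense layer has one weight per input per node plus one bias per node, so its size follows from the node counts alone. A short sketch for the three-layer model quoted in the question (`dense_param_count` is a hypothetical helper name):

```python
def dense_param_count(n_inputs, n_units):
    """Weights (one per input per unit) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer sizes from the model in the question: 8 inputs -> 12 -> 8 -> 1.
layer_sizes = [8, 12, 8, 1]
counts = [dense_param_count(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(counts, sum(counts))
```

So going from a layer of 16 nodes to a layer of 32 nodes only requires declaring Dense(32); the 16 × 32 weights are created and learned automatically.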
marineboy March 12, 2021 at 8:22 pm #

Hello Jason,
I have a problem! Can you help me?

When I call predict_classes(Z) with Z=[100,100,100,100,100,100,100,100], the input is, as you can see, very different from the training data, but the output is still 0 or 1. I want the output to be a “don’t know” label instead :( How can I do that? Please help me.

Thank you so much, sir.

Reply
- Jason Brownlee March 13, 2021 at 5:29 am #
  
  Sorry, I don’t understand.
  
  Perhaps you can rephrase the problem you’re having?
  
  Reply
Franklin March 17, 2021 at 3:00 pm #

It’s an awesome blog. Keep the good work.

Reply
- Jason Brownlee March 18, 2021 at 5:15 am #
  
  Thanks!
  
  Reply
Hamza March 19, 2021 at 12:38 am #

79.53 accuracy

Reply
- Jason Brownlee March 19, 2021 at 6:23 am #
  
  Well done!
  
  Reply
Oriyomi Raheem March 20, 2021 at 6:06 am #

I am trying to train a model on permeability data in a LAS file and then make predictions with it. Please help.

Reply
- Jason Brownlee March 21, 2021 at 6:00 am #
  
  Perhaps this process will help you to work through your project:
  https://machinelearningmastery.com/start-here/#process
  
  Reply
Bangash 李忠勇 March 31, 2021 at 6:41 pm #

accuracy: 0.7865
Accuracy: 78.65

Reply
- Jason Brownlee April 1, 2021 at 8:08 am #
  
  Well done!
  
  Reply
Pankaj April 23, 2021 at 7:19 am #

With categorical features, how would I prevent a Keras model from making a prediction on test samples that it has not seen in the training set, and instead either use another model or throw an exception?

Reply
- Jason Brownlee April 24, 2021 at 5:13 am #
  
  Sorry, I don’t understand. Perhaps you can elaborate?
  
  Reply
Luca April 26, 2021 at 8:31 pm #

All the content you create and offer is absolutely amazing.
Very informative, very up-to-date and crystal-clear.

THANK YOU!

Reply
- Jason Brownlee April 27, 2021 at 5:16 am #
  
  You’re welcome.
  
  Reply
Ronald Ssebadduka May 5, 2021 at 4:53 pm #

File “/Users/ronaldssebadduka/PycharmProjects/pythonProject1/venv/lib/python3.9/site-packages/numpy/lib/npyio.py”, line 1067, in read_data
items = [conv(val) for (conv, val) in zip(converters, vals)]
File “/Users/ronaldssebadduka/PycharmProjects/pythonProject1/venv/lib/python3.9/site-packages/numpy/lib/npyio.py”, line 1067, in
items = [conv(val) for (conv, val) in zip(converters, vals)]
File “/Users/ronaldssebadduka/PycharmProjects/pythonProject1/venv/lib/python3.9/site-packages/numpy/lib/npyio.py”, line 763, in floatconv
return float(x)
ValueError: could not convert string to float: ‘\ufeff”6’

I get this error when I run your code!
How can I fix it?

Reply
- Jason Brownlee May 6, 2021 at 5:42 am #
  
  Sorry to hear that, perhaps some of these tips will help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
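The value in the error message, '\ufeff"6', contains two clues: \ufeff is a UTF-8 byte order mark at the start of the file, and the quote character shows that the fields are quoted. Both commonly appear after a CSV has been re-saved by a spreadsheet or editor, though that is an assumption about this case. A minimal sketch of one way to read such a file; `load_numeric_csv` is a hypothetical helper, and the temporary file only recreates the problem with placeholder values.

```python
import csv
import os
import tempfile
import numpy as np

def load_numeric_csv(path):
    """Read a CSV whose fields may be quoted and whose first bytes may be
    a UTF-8 byte order mark, and return a float array."""
    # "utf-8-sig" drops a leading BOM if present; csv.reader strips quotes.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        rows = [[float(value) for value in row]
                for row in csv.reader(handle) if row]
    return np.array(rows)

# Recreate the problem with a small placeholder file: BOM plus quoted values.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False,
                                 encoding="utf-8-sig", newline="") as tmp:
    tmp.write('"6","148","72"\n"1","85","66"\n')
    tmp_path = tmp.name

dataset = load_numeric_csv(tmp_path)
os.remove(tmp_path)
print(dataset)
```

Downloading a fresh copy of the CSV file and leaving it unedited is the simpler route when that is an option.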
Shilpa May 28, 2021 at 4:43 am #

The contents are explained in a simple way and are so clear. Thanks, Jason.

Reply
- Jason Brownlee May 28, 2021 at 6:49 am #
  
  You’re welcome.
  
  Reply
Toni Nehme May 28, 2021 at 7:56 pm #

Please please help me to build a Multilayer Perceptron to use it for regression problem. Thank you

Reply
- Jason Brownlee May 29, 2021 at 6:50 am #
  
  Sure, see this:
  https://machinelearningmastery.com/regression-tutorial-keras-deep-learning-library-python/
  
  Reply
James Mayr May 29, 2021 at 11:02 pm #

Thank you sooo much for your tutorial! I struggled with the input layer, and the Keras help was not helpful. But your explanation gave me the insight and things became totally clear! That was great, thank you!

Reply
- Jason Brownlee May 30, 2021 at 5:50 am #
  
  You’re welcome!
  
  Reply
Meenakshi June 3, 2021 at 8:28 pm #

Great work Sir. Simple, detailed explanation of complex things.
I would like to learn modelling for DDoS attacks detection in Neural networks. Please suggest the way.
Thanks in advance.

Reply
- Jason Brownlee June 4, 2021 at 6:48 am #
  
  Perhaps the tutorials here will help if you are modeling your problem as a time series:
  https://machinelearningmastery.com/start-here/#deep_learning_time_series
  
  Reply
Meenakshi June 5, 2021 at 11:34 pm #

Thank you very much. I will go through it Sir.

Reply
- Jason Brownlee June 6, 2021 at 5:51 am #
  
  You’re welcome.
  
  Reply
JC June 24, 2021 at 4:13 am #

The following are the outcome of the first 10 consecutive executions on my 8GB RAM 64bit Windows 10 platform:

Accuracy: 65.49
Accuracy: 70.70
Accuracy: 75.91
Accuracy: 76.04
Accuracy: 78.26
Accuracy: 76.04
Accuracy: 77.86
Accuracy: 79.17
Accuracy: 78.52
Accuracy: 78.91

The computer does not have GPU. The script gives some warning messages. One of them is: “None of the MLIR Optimization Passes are enabled (registered 2)”

Reply
- Jason Brownlee June 24, 2021 at 6:06 am #
  
  Well done!
  
  Reply
Sneha July 2, 2021 at 8:31 am #

Hi,

I have a question regarding the input amount. I am attempting to fit a neural network for a classification model. However, the features in my model are categorical so I need to one-hot encode them. For instance, if a categorical variable has 3 values and I one-hot encode it, would that make ‘input_dim’ 1 or 3?

Reply
- Jason Brownlee July 3, 2021 at 6:05 am #
  
  Yes, categorical variables will need to be encoded.
  
  3 categories will become 3 binary input variables when using a one hot encoding.
  
  Reply
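A small sketch of what this means for input_dim: each one-hot encoded categorical variable contributes one input per category, and these are added to the numeric inputs. The data and the variable names `numeric` and `colour` below are hypothetical.

```python
import numpy as np

# Hypothetical data: one numeric feature plus one categorical feature
# that takes three values.
numeric = np.array([[0.5], [1.2], [0.9], [0.1]])
colour = np.array(["red", "green", "blue", "green"])

# Map each category to an integer code, then to a row of the identity matrix.
categories, codes = np.unique(colour, return_inverse=True)
one_hot = np.eye(len(categories))[codes]   # shape (4, 3)

X = np.hstack([numeric, one_hot])          # shape (4, 4)
input_dim = X.shape[1]
print(categories, input_dim)
```

Here the single three-valued variable adds 3 to input_dim, giving 4 inputs in total once the numeric column is included.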
Rohan July 3, 2021 at 10:15 am #

My results:
Accuracy:75.78
Accuracy:78.26
Accuracy:76.30
Accuracy:77.47
Accuracy:77.47

Reply
- Jason Brownlee July 4, 2021 at 5:58 am #
  
  Well done!
  
  Reply
Patrick July 10, 2021 at 8:32 pm #

Hi Jason,

Thank you for all of your content. All very insightful for someone new to Keras and machine learning. If you could offer any guidance/insight into the below problem I’m trying to tackle, then it would be much appreciated.

I am trying to replicate a similar Ball Prediction Model as discussed here:

https://towardsdatascience.com/predicting-t20-cricket-matches-with-a-ball-simulation-model-1e9cae5dea22

This is a multi-class classification problem (thank you for your article on this). There are 8 outputs that I am trying to predict (0, 1, 2, 3, 4, 6, Wide, Wicket), in column H of my dataset (https://i.stack.imgur.com/DmTNb.png).

This dataset is ball-by-ball (match) data of many cricket matches. Columns A-G are the input variables that should be used to predict the probability of each outcome (innings, over, batsman, bowler etc.)

Model:

X = my_data[:,0:7]
y = my_data[:,7]

model = Sequential()
model.add(Dense(12, input_dim=7, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=150, batch_size=10, verbose=0)
_, accuracy = model.evaluate(X, y, verbose=0)
print('Accuracy: %.2f' % (accuracy*100))

Running the above model on the ball-by-ball dataset gives an accuracy of 30%. As the article suggests, I want to include more data i.e. the historical probability of each individual batsman and bowler achieving each of the 8 outcomes.

This means I have 3 datasets which should be used to influence the probability of each outcome.

How and when should I be trying to introduce these 3 linked datasets? I presumably want the model to consider all this information at the same time and not in isolation.

Is it a case of trying to incorporate the batsman/bowler datasets into the match-by-match data? The only issue I have with this is that there are c. 200,000 rows of match data, whereas a player database will have c. 500 rows.

Maybe I am wrong, and I should be running the multiple datasets through the model individually and then somehow pooling the outcomes – is this even possible? Although I doubt that this is even recommended/worthwhile

If you have any suggestions on how to improve the above, or achieve the desired outcome, then it would be most welcomed.

Thanks again for all your hard work in maintaining a great data science site.

Reply
- Jason Brownlee July 11, 2021 at 5:39 am #
  
  Defining the data/problem for a model is the real work in applied machine learning.
  
  There is no good/best way, I recommend reading papers on or related to the topic to get ideas, prototype, experiment, etc.
  
  Also, this may also help on defining the problem:
  https://machinelearningmastery.com/how-to-define-your-machine-learning-problem/
  
  Also, more generally, these tutorials explain how to get better performance from neural nets:
  https://machinelearningmastery.com/start-here/#better
  
  Reply
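One point about the model posted above, independent of the question of extra datasets: it ends in a single sigmoid node with binary_crossentropy, which suits two classes, while the problem has eight outcomes. A common multi-class setup is an output layer with 8 nodes and a softmax activation, trained on one-hot targets with categorical_crossentropy. The sketch below shows only the target encoding and what a softmax output looks like; the ball outcomes and raw scores are placeholders, not data from the commenter's file.

```python
import numpy as np

# The eight outcomes named in the question, in a fixed order.
outcomes = ["0", "1", "2", "3", "4", "6", "Wide", "Wicket"]
index_of = {name: i for i, name in enumerate(outcomes)}

# Placeholder column H values for five balls.
y_raw = ["1", "0", "Wicket", "4", "Wide"]
y_int = np.array([index_of[v] for v in y_raw])
y_one_hot = np.eye(len(outcomes))[y_int]

def softmax(logits):
    """Row-wise softmax: turns raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Placeholder raw scores for one ball; a softmax output layer with 8 nodes
# would produce one probability per outcome in this way.
probs = softmax(np.array([[2.0, 1.5, 0.3, 0.1, 0.8, 0.2, 0.1, 0.4]]))
print(y_int, probs.round(3))
```

With this encoding the network yields a probability for each of the eight outcomes per ball, which is the quantity a ball simulation needs.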
Jolene Wang July 23, 2021 at 5:08 am #

Hi Jason!

Thank you for providing all of this content. I am trying to replicate this model by using my own csv file however it contains many NaN and thus can not be loaded through the loadtxt() function. As 0 is a very important number in my dataset, I cannot change my NAs to 0. What can I do?

Thank you again for all of your help.

Reply
- Jason Brownlee July 23, 2021 at 6:04 am #
  
  You must impute the missing values first, there are many methods:
  https://machinelearningmastery.com/?s=missing&post_type=post&submit=Search
  
  Reply
Jolene Wang July 23, 2021 at 5:13 am #

I forgot to mention but is there a way for me to keep the NaN in the dataset and have the model read it as just a missing value? It would be difficult for me to assign the NaNs a specific value as it could mess up the dataset.

Reply
- Jason Brownlee July 23, 2021 at 6:04 am #
  
  No. NaN will cause all computation to fail in an ML model, including a neural net.
  
  Reply
Isiyaku Saleh July 31, 2021 at 10:09 am #

Thank you very much, Dr. Jason, the tutorial has really served me well.

Reply
- Jason Brownlee August 1, 2021 at 4:49 am #
  
  You’re welcome!
  
  Reply
Tim Papa August 3, 2021 at 8:02 pm #

This tutorial builds a neural network, but what kind of neural network is it specifically? Is it an ANN, a CNN, or an RNN?

Reply
- Jason Brownlee August 4, 2021 at 5:13 am #
  
  It is a multi-layer perceptron (MLP) which is a type of feed-forward neural network. It is not a CNN or RNN.
  
  Reply
Edwin Brown August 13, 2021 at 7:26 am #

First and foremost, thank you Jason Brownlee for getting me started with my first deep learning project. I followed along step by step and got stuck for a while; however, after countless hours of research I got the code below to work with Python 3.8.10, TensorFlow 2.5.0, IPython 7.26.0, and Keras 2.6.0. I apologize if I over-commented; I was taking notes as I read through Jason’s source code and notes. I used Anaconda with Spyder, and I also wanted to see the results in Jupyter Notebook. I hope this helps:

import sys
import tensorflow as tf
from tensorflow import keras
from numpy import loadtxt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Load the data and split it into X (input) and y (output) variables
# Be sure your data file is in the same folder as the project
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]

# Create our sequential model

# input_dim sets the number of input variables
# This structure has three layers
# Fully connected layers are defined by the Dense class
# For more on the Dense class, see the Keras homepage
# ReLU on the first two layers and the sigmoid function on the output layer (third layer)
# Sigmoid keeps the output between 0 and 1, so it can be read as a probability
# with a default threshold of 0.5; ReLU gives better performance in the hidden layers
# The model expects rows of data with 8 variables (the input_dim=8 argument)
# The first hidden layer has 12 nodes and uses the relu activation function.
# The second hidden layer has 8 nodes and uses the relu activation function.
# The output layer has one node and uses the sigmoid activation function.

model = Sequential()

model.add(Dense(12, input_dim=8, kernel_initializer='normal', activation='relu'))
model.add(Dense(8, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))

# Compile the model

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model onto the dataset

# Epoch: One pass through all of the rows in the training dataset.
# Batch: One or more samples considered by the model within an epoch before weights are updated.
# The CPU or GPU handles it from here, usually, larger datasets need the GPU

model.fit(X, y, epochs=150, batch_size=10, verbose=0)

# Evaluate the model

_, accuracy = model.evaluate(X, y, verbose=0)
print('Accuracy: %.2f' % (accuracy*100))

# make probability predictions with the model
predictions = model.predict(X)
# round predictions
rounded = [round(x[0]) for x in predictions]

Reply
- Adrian Tam August 14, 2021 at 2:33 am #
  
  Good work!
  
  Reply
Bonjour20 August 15, 2021 at 9:43 pm #

I use Windows on my laptop, and I do not know whether I need a Linux distro. I am also confused about where I should download the dataset. He mentioned: “in the same place where Python is installed.” What does that mean?
It is a riddle for a beginner like me, coming from a non-technical background.

Reply
- Adrian Tam August 17, 2021 at 7:30 am #
  
  Usually that means you just need to place the data files and the Python code file together in the same folder.
  
  Reply
sama samaan August 30, 2021 at 6:19 am #

Hello
Thanks for this great tutorial 🙂

Question no. 1: can we apply deep learning in Apache Spark?

Question no. 2: I have the following dataset https://www.kaggle.com/leandroecomp/sdn-traffic
I tried the multi-class classification code but it stopped working. What could be the reason for that fault?

Thanks

Reply
- Adrian Tam September 1, 2021 at 7:39 am #
  
  (1) yes (2) what specifically stopped working?
  
  Reply
MALAVIKA September 23, 2021 at 11:17 pm #

First of all, I am overwhelmed by the number of comments and prompt replies by the author. You are really a lifesaver to many, Jason.

Now, I have a doubt. I have been searching for a simple feed-forward-back-propagation ANN code in python, and I could see only feed-forward neural networks everywhere. In your example, is backpropagation happening? Doesn’t ANN mean both the processes by default?

Shouldn’t we apply back propagation in ANN, normally?

Reply
- Adrian Tam September 24, 2021 at 4:41 am #
  
  Feed-forward happens when you give input to the ANN. Backpropagation happens when you calculate the gradient and update the weights in each neuron.
  
  Reply
MALAVIKA September 24, 2021 at 5:06 pm #

So, I suppose it’s (back-propagation) not happening in the above tutorial. Can you show us how to code the back-propagation in python, or direct me to any posts that show the same?

Thank You.

Reply
- Adrian Tam September 25, 2021 at 4:36 am #
  
  When you call the fit() function, backpropagation is used to update the model parameters. That’s part of the training process. We don’t normally do this explicitly. If you are interested, see a toy example here: https://machinelearningmastery.com/implement-backpropagation-algorithm-scratch-python/
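To make the two phases concrete, here is a toy sketch of a single sigmoid neuron trained by hand with NumPy. It is not what Keras runs internally, only an illustration of a feed-forward pass followed by a gradient-based weight update; the data, learning rate, and number of steps are arbitrary choices.

```python
import numpy as np

# Made-up, linearly separable data: label is 1 when the two inputs sum above 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)   # weights
b = 0.0           # bias
lr = 0.5          # learning rate (arbitrary)

def forward(X, w, b):
    # Feed-forward: weighted sum followed by a sigmoid activation.
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def loss(p, y):
    # Binary cross-entropy, with a small epsilon to avoid log(0).
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

initial_loss = loss(forward(X, w, b), y)
for _ in range(200):
    p = forward(X, w, b)            # forward pass
    grad_z = (p - y) / len(y)       # gradient of the loss w.r.t. the pre-activation
    w -= lr * (X.T @ grad_z)        # propagate back to the weights and update
    b -= lr * grad_z.sum()          # same for the bias
final_loss = loss(forward(X, w, b), y)

print(initial_loss, final_loss)
```

In a multi-layer network the same gradient is passed backwards through each layer in turn, which is the part fit() handles for you.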
  
  Reply
Elham October 8, 2021 at 1:11 am #

Hi, Thanks a lot for this awesome tutorial. I’m using tensorflow version 2.6 and in making class predictions with the model with these lines of code,

predict_x = model.predict(X)
classes_x = np.argmax(predict_x,axis=1)
for i in range(5):
    print('%s => %d (expected %d)' % (X[i].tolist(), classes_x[i], y[i]))

the output is:

[6.0, 148.0, 72.0, 35.0, 0.0, 33.6, 0.627, 50.0] => 0 (expected 1)
[1.0, 85.0, 66.0, 29.0, 0.0, 26.6, 0.351, 31.0] => 0 (expected 0)
[8.0, 183.0, 64.0, 0.0, 0.0, 23.3, 0.672, 32.0] => 0 (expected 1)
[1.0, 89.0, 66.0, 23.0, 94.0, 28.1, 0.167, 21.0] => 0 (expected 0)
[0.0, 137.0, 40.0, 35.0, 168.0, 43.1, 2.288, 33.0] => 0 (expected 1)

Why are all classes_x zero?

Reply
- Adrian Tam October 13, 2021 at 5:09 am #
  
  Because the prediction here is binary, predict_x is an Nx1 matrix, so argmax along axis 1 can only ever return 0. Your syntax is correct for multi-class problems, where the output layer of the network is Dense(n) with n>1.
  
  I’ve updated the sample code here to reflect what you should do. Thanks for alerting me.
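A small sketch of the difference, using a made-up array in place of the real model.predict(X) output (the values below are illustrative only): with a single sigmoid output column, argmax has nothing to compare, whereas thresholding at 0.5 gives the class label.

```python
import numpy as np

# Stand-in for model.predict(X) from a Dense(1, activation='sigmoid') output layer.
predict_x = np.array([[0.81], [0.12], [0.67], [0.30], [0.95]])

# argmax over a single column always returns index 0.
argmax_classes = np.argmax(predict_x, axis=1)

# Thresholding the probability at 0.5 gives the binary class.
threshold_classes = (predict_x > 0.5).astype(int).ravel()

print(argmax_classes)
print(threshold_classes)
```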
  
  Reply
christoper October 17, 2021 at 6:41 am #

Hello, this is helpful. I am studying neural networks and I’m just a beginner. You said this is an MLP type of neural network, right? I just want to ask about this one: what kind of neural network architecture is used here? Is it an RNN, an ANN, or an LSTM? Link below:

https://towardsdatascience.com/how-to-create-a-chatbot-with-python-deep-learning-in-less-than-an-hour-56a063bdfc44

Reply
- Adrian Tam October 20, 2021 at 8:52 am #
  
  MLP = Multilayer Perceptron, which usually means a neural network with 3 or more layers. The link you provided uses Dense(), which is a fully-connected layer. Hence it is also an MLP.
  
  Reply
Flo October 25, 2021 at 11:29 pm #

Hi Jason and Adrian, I came across your very nice tutorial, because I have a quite similar problem.

I have a couple of numerical process parameters from an engineering problem (similar to your input parameters here), which I want to map to an outcome value (which, unlike in your tutorial, is a numerical value, not a class). Can you tell me how I need to modify the code (or do you know of a similarly handy tutorial for this)?

Thanks a lot!

Reply
- Adrian Tam October 27, 2021 at 2:23 am #
  
  It sounds to me like a regression problem rather than a classification problem. In this case, there are two things you may consider changing:
  
  1. The last Dense() layer: you may want a different activation (e.g., linear) because sigmoid is bounded between 0 and 1
  2. model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']) should have its loss and metric changed. For example, you may consider using MSE, because cross-entropy and accuracy are measures specific to classification
  
  Reply
Dr Shazia Saqib October 28, 2021 at 3:14 am #

awesome, great service, very helpful, am sharing with my students, Lord Bless you ameen

Reply
veejay November 5, 2021 at 11:23 pm #

Awesome tutorial, very well-detailed. I have a question though,

How can I improve validation loss and validation accuracy? I am very new to neural networks and have only scratched the surface: weights, biases, activation functions, loss functions, architectures, how to build layers in Keras, and other fundamental terminology (thanks to you and the deeplizard tutorials on YouTube). I am studying and practicing, and I wanted to try to replicate a project. I came across this tutorial from DataFlair where he creates a chatbot, and I tried to imitate it. LINK: https://data-flair.training/blogs/python-chatbot-project/ .
So from what I have observed and based on my learnings, the model that he created is an ANN-MLP. My problem is, when I trained the model and set the validation_split = 0.3, the training loss and accuracy are good but the validation loss and accuracy do the opposite. I know that it may be an overfitting problem so…
Here’s what I did:
- Added regularization with L2
- Slowed the learning rate, and also tried speeding it up
- Dropout (0.2-0.5)
- Batch size
- Removing layers
- Adding layers
- Experimented with different activation and loss functions (sigmoid, softplus, binary_crossentropy)
- I even tried adding data to my dataset (from 320 to 796 inputs)
I tried all of this, but val_loss is still high and val_acc is still low.

(The best I got is loss: 0.1 / accuracy: 98 percent, val_loss: 1.9 / val_accuracy: 52 percent,

while the worst is val_loss: over 3.0 and val_accuracy: 35-40 percent.)

The dataset that i’m using is from dataflair but I expanded it. here’s my visualized model: https://i.stack.imgur.com/HE1jU.png

Reply
- Adrian Tam November 7, 2021 at 10:35 am #
  
  Can’t really tell what went wrong here. Did you check the validation loss as you trained it? At first, the training loss and validation loss should be equally bad. How did they progress in each training epoch? This may give you clues.
  
  Reply
Veejay November 9, 2021 at 6:54 am #

Yes, I both trained and validated them. They were equally bad at first, and as training progressed, the loss improved by miles but val_loss and val_accuracy improved by an inch. T_T

Reply
- Adrian Tam November 14, 2021 at 1:36 pm #
  
  That’s expected. Your model was looking at the training loss and trying to improve itself, but it was not able to see the validation data, so it is harder and slower to improve there.
  
  Reply
Mak November 17, 2021 at 6:24 pm #

Your books helped me understand LSTMs greatly, I am having trouble with developing an attention layer, please can you do a tutorial on using Attention/ MultiheadAttention
Thank you.

Reply
- Adrian Tam November 18, 2021 at 5:38 am #
  
  Please see the series: https://machinelearningmastery.com/category/attention/
  
  Reply
Nikhil Gupta November 25, 2021 at 5:47 pm #

The accuracy from the ANN for this dataset is between 70% and 78%. Using logistic regression, we are getting 78% accuracy on the same dataset. So, what’s the advantage of using an ANN?

Reply
- Adrian Tam November 26, 2021 at 2:09 am #
  
  An ANN is more flexible. Occam’s razor: you use the simplest model for the job. If logistic regression fits well, you have no reason to use an ANN. It uses more memory and runs slower.
  
  Reply
Flo December 3, 2021 at 8:38 pm #

Thanks for the Tutorial

I tried your approach and it worked nicely on my data. For a first shot I just used data, which is measured after the process (e.g. process time, temperature difference during the process, etc.). For a further, deeper investigation, I would like to use measured data curves, for example the development of the process temperature by time during the process itself. By use of these curves, I expect a higher degree of information.

Could you provide a hint on how to work with this data? For the first shot I simply generated a table with my process parameters in the first 6 columns and my output value in column 7, which could easily be fed into the model.

Thanks a lot!

Reply
- Adrian Tam December 8, 2021 at 6:59 am #
  
  Everything sounds straightforward to me. Did you try implementing this? Any errors?
  
  Reply
Flo December 10, 2021 at 6:32 pm #

To be honest, I have no clue how to provide the data. In the first case, I had a table with 7 columns: 6 input process parameters and one column with output values.

Now I would like to replace (or add to) some input columns with time-recorded data curves, which are themselves tables of a sort (first column the timestamp, second column the time-specific process parameter). How do I work with this?

Reply
- Adrian Tam December 15, 2021 at 5:38 am #
  
  Usually I would use pandas to process the data and convert it to a NumPy array before feeding it to the Keras model. Pandas lets you manipulate tables more easily.
  
  Reply
Rick December 28, 2021 at 7:46 am #

May need to adjust the import settings for compatibility with newer Tensorflow versions.

Instead of:
…
from keras.models import Sequential
from keras.layers import Dense

Use:
…
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

Solved my issues with Conda.

Thanks for the excellent tutorials and articles!!

Reply
- James Carmichael December 29, 2021 at 11:44 am #
  
  Thank you for the feedback Rick! I also often try to run code in both Anaconda and Google Colab to identify and correct compatibility issues.
  
  Reply
Preeti February 10, 2022 at 4:18 pm #

My Accuracy: 76.95

Thank you for the code and detailed explanation

Reply
- James Carmichael February 11, 2022 at 8:35 am #
  
  You are very welcome, Preeti! Keep up the great work!
  
  Reply
Alan March 9, 2022 at 8:46 pm #

Hi James

Great work

Never mind neural networks, this is causing me a lot of deep thinking.

I am running your tutorial on a pi 400 with 64bit OS on Thonny.

Works reasonably well on this machine.

However, I came across an error in one of your examples … the Keras neural network using ‘pima-indians-diabetes.csv’:

from tensorflow.python.eager.context import get_config
ImportError: cannot import name 'get_config' from 'tensorflow.python.eager.context' (/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/context.py)

So I discovered that the fault lay with keras.models and keras.layers, and have rejigged the sketch as follows:

# first neural network with keras tutorial
from numpy import loadtxt
from tensorflow.keras import models,layers #********************
#from keras.models import Sequential #******************

#from keras.layers import Dense #********************
# load the dataset
dataset = loadtxt('/home/pi/Documents/pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
# define the keras model
model = models.Sequential() #********************
model.add(layers.Dense(12, input_dim=8, activation='relu')) #********************
model.add(layers.Dense(8, activation='relu')) #********************
model.add(layers.Dense(1, activation='sigmoid')) #********************
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10)
# evaluate the keras model
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))

Now that produces
Accuracy: 74.35

Reply
- James Carmichael March 10, 2022 at 10:37 am #
  
  Hi Alan…Thank you for the feedback and support! Interesting application to the Raspberry Pi! Keep in mind that our implementations may not be fully compatible with the libraries that are developed for that platform. Keep up the great work!
  
  Reply
Nishanth March 14, 2022 at 3:38 am #

Hi,

Amazing tutorial! Simple and easy. I tried the same thing on my dataset but the last for loop does not seem to work. Could you please help me with it?

Here is the for loop:
for i in range(5):
print('%s => %d (expected %d)' % (X[i].tolist(), predictions[i], y[i]))

Thanks

Reply
- James Carmichael March 14, 2022 at 11:48 am #
  
  Hi Nishanth…are you copying and pasting the code or typing it in? Be careful regarding copying and pasting code and how it may affect the code layout as errors may be very difficult to spot visually.
  
  Reply
Nishanth March 14, 2022 at 3:40 am #

Hi, here in the comment the print statement looks un-indented, but in my code I indent it and it still does not work.

Reply
- James Carmichael March 14, 2022 at 11:52 am #
  
  Hi Nishanth…please see previous replies.
  
  Reply
Nishanth March 14, 2022 at 3:45 am #

Hi,

Amazing Tutorial! Simple and Easy to follow. I tried it on my dataset but the last for loop that prints first 5 examples does not work. It gives me KeyError: 0

Could you help me with it?

Thanks

Reply
- James Carmichael March 14, 2022 at 11:47 am #
  
  Hi Nishanth…please share the full error message so we can better assist you.
  
  Reply
Nishanth March 14, 2022 at 11:41 pm #

Found a way out. The thing is that here the dataset is a NumPy array and mine was a pandas.DataFrame. Thanks for the help.
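The exact fix was not posted, so the following is only a sketch of one plausible way to reproduce and avoid the KeyError, using a made-up DataFrame. Indexing a DataFrame with X[i] looks up a column named i rather than row i; converting to a NumPy array with to_numpy(), or indexing by position with iloc, restores the row access the tutorial's loop relies on.

```python
import pandas as pd

# Made-up frame standing in for a dataset loaded with pandas.read_csv().
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0], "label": [0, 1]})
X_df = df[["a", "b"]]

try:
    X_df[0]              # looks up a column named 0, not the first row
    raised = False
except KeyError:
    raised = True

# Option 1: convert to a NumPy array so that X[0] is the first row.
X = X_df.to_numpy()
first_row = X[0].tolist()

# Option 2: keep the DataFrame and index rows by position.
first_row_iloc = X_df.iloc[0].tolist()

print(raised, first_row, first_row_iloc)
```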

Reply
- MK November 6, 2022 at 6:02 pm #
  
  Hi Nishanth,
  
  Would you please share how you finally fixed the KeyError?
  
  Reply
N V Raman April 2, 2022 at 1:51 am #

Hello Jason,

Really wonderful tutorial

When I ran the code everything worked except while printing the predictions I get a key error.

Reply
- James Carmichael April 2, 2022 at 12:18 pm #
  
  Hi N V…Can you provide the exact error message so that we can better assist you?
  
  Reply
Susia April 9, 2022 at 1:30 am #

Hi, I followed the same tutorial for developing a first neural net in Keras in one of your mini-courses. I have tried to adapt this tutorial to develop my own model on my own dataset. The problem is that my target Y is count data (traffic flow counts, for example). In my case, how should I define the activation function for the output layer? Is it relu? How do I choose the loss function? I’ve tried MeanSquaredError, where the loss value is quite large, and categorical_crossentropy, where the loss value is nan. I am considering ordering the complete book Deep Learning With Python. What’s the difference between the tutorials in the book and the mini-course?

Reply
- James Carmichael April 9, 2022 at 8:34 am #
  
  Hi Susia…The following resource may add clarity in how to choose an activation function:
  
  https://machinelearningmastery.com/choose-an-activation-function-for-deep-learning/
  
  Reply
Nasrin April 23, 2022 at 4:32 am #

Hi sir, thanks a million for your awesome post.
Could you please explain how we can divide X and y into train and test samples in deep learning?
Is this code correct?

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

Reply
- James Carmichael April 24, 2022 at 3:25 am #
  
  Hi Nasrin…the sample code you provided looks accurate. Feel free to implement it and let us know if you encounter any issues.
  
  Reply
Shiva Manhar April 23, 2022 at 3:25 pm #

24/24 [==============================] – 0s 489us/step – loss: 0.4517 – accuracy: 0.7956
Accuracy: 79.56

Reply
Jack Sparrow June 3, 2022 at 5:38 am #

Deep Learning with keras mnist dataset:

import numpy as np
from keras.models import Sequential
from keras import layers
from keras.utils import to_categorical  # needed for converting the labels

from keras.datasets import mnist  # image data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

print("Before reshape", X_train.shape)

# add a channel dimension: (samples, 28, 28) -> (samples, 28, 28, 1)
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)

print("After reshape", X_train.shape)

# scale pixel values to the range 0-1
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

# one-hot encode the class labels
Y_train = to_categorical(y_train)
Y_test = to_categorical(y_test)

model = Sequential()

# Conv2D(32, (3, 3)) uses a 3x3 kernel; Conv2D(32, 3, 3) would set strides=3 in Keras 2
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28,28,1)))
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D(pool_size=(2,2)))
model.add(layers.Dropout(0.25))

model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(10, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.fit(X_train, Y_train,
          batch_size=32, epochs=10, verbose=1)

test_loss, test_acc = model.evaluate(X_test, Y_test, verbose=0)
print("Test Loss", test_loss)
print("Test Accuracy", test_acc)

Deep Learning with data_diagnosis dataset:

import pandas as pd
import numpy as np

dataSet = pd.read_csv("./data_diagnosis.csv")
dataSet.drop(["id","Unnamed: 32"],axis=1,inplace=True)

# encode the label: "M" as 1, everything else as 0
dataSet.diagnosis = [1 if each == "M" else 0 for each in dataSet.diagnosis]
y=dataSet.diagnosis.values
x_data=dataSet.drop(["diagnosis"],axis=1)

# standardize the input features
from sklearn.preprocessing import StandardScaler
scaler=StandardScaler()
x=scaler.fit_transform(x_data)

# one-hot encode the labels
from keras.utils import to_categorical
Y=to_categorical(y)

from sklearn.model_selection import train_test_split
trainX,testX,trainy,testy=train_test_split(x,Y,test_size=0.2,random_state=42)

# Conv1D expects input of shape (samples, steps, channels)
trainX=trainX.reshape(trainX.shape[0],trainX.shape[1],1)
testX=testX.reshape(testX.shape[0],testX.shape[1],1)

from keras import layers
from keras import Sequential

verbose,epochs,batch_size=0,10,8
n_features,n_outputs=trainX.shape[1],trainy.shape[1]

model= Sequential()
input_shape=(trainX.shape[1],1)
model.add(layers.Conv1D(filters=8,kernel_size=5,activation='relu',input_shape=input_shape))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling1D(pool_size=3))
model.add(layers.Conv1D(filters=16,kernel_size=5,activation='relu'))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling1D(pool_size=2))
model.add(layers.Flatten())
model.add(layers.Dense(200,activation='relu'))
model.add(layers.Dense(n_outputs,activation='softmax'))
model.summary()
print('started')

import keras
import tensorflow
#model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
model.compile(loss='binary_crossentropy',
              optimizer=tensorflow.keras.optimizers.Adam(),
              metrics=['accuracy']) # compile
dataSet.info()
model.fit(trainX,trainy,epochs=epochs,verbose=1)
_,accuracy=model.evaluate(testX,testy,verbose=0)

print(accuracy)

Reply
- James Carmichael June 3, 2022 at 9:12 am #
  
  Thank you for the feedback Jack! Keep up the great work!
  
  Reply
Jack June 17, 2022 at 5:25 am #

24/24 [==============================] – 0s 1ms/step
[6.0, 148.0, 72.0, 35.0, 0.0, 33.6, 0.627, 50.0] => 1 (expected 1)
[1.0, 85.0, 66.0, 29.0, 0.0, 26.6, 0.351, 31.0] => 0 (expected 0)
[8.0, 183.0, 64.0, 0.0, 0.0, 23.3, 0.672, 32.0] => 1 (expected 1)
[1.0, 89.0, 66.0, 23.0, 94.0, 28.1, 0.167, 21.0] => 0 (expected 0)
[0.0, 137.0, 40.0, 35.0, 168.0, 43.1, 2.288, 33.0] => 1 (expected 1)

My accuracy is 77.99, but these five predictions are all correct, which looks like 100. Is this right?

Reply
- James Carmichael June 17, 2022 at 9:28 am #
  
  Thank you for the feedback Jack!
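On the question above: the five printed rows are only the first few samples, so all five being correct is consistent with an overall accuracy below 100%. A small sketch with made-up labels and predictions (not the Pima data) shows the same effect.

```python
import numpy as np

# Made-up labels and predictions, for illustration only.
y_true = np.array([1, 0, 1, 0, 1, 0, 0, 1, 1, 0])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 0, 1, 0])

# Accuracy on the first five rows versus accuracy on every row.
first_five_acc = np.mean(y_pred[:5] == y_true[:5])
overall_acc = np.mean(y_pred == y_true)

print(first_five_acc, overall_acc)
```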
  
  Reply
Nicola Menga June 22, 2022 at 5:53 pm #

Hi.
Thank you for this tutorial. It is very useful.
I have a question. This is a tutorial for a binary classification purpose.
However, I want to build a feed-forward neural network that predicts more than one variable (more than one neuron in the output layer), each with a value between 0 and 1 (for example 0.956, 0.878, 0.897 and so on), unlike in this tutorial, where the variable to be predicted takes only the values 0 or 1.
I tried to apply the network developed in this tutorial for this purpose, but the results are bad.
My test dataset has 257 observations. If I apply this network, the prediction array consists of 257 values (one for each observation), but these values are all the same (for example 1: 0.985; 2: 0.985; 3: 0.985; …; 256: 0.985; 257: 0.985). I hope that is clear.

Is there a Keras model/function suitable for my problem (i.e. the prediction of a variable that is not just 0 or 1)?

Thank you for your help.

Nicola Menga.

Reply
- James Carmichael June 23, 2022 at 10:59 am #
  
  Hi Nicola…Please clarify and/or elaborate on your question so that we may better assist you.
  
  Reply
Sadegh July 7, 2022 at 3:49 am #

Hi there,
I always get a warning when I use an NN model built with Keras in Anaconda’s Spyder console.
The warning is as follows:

WARNING: AutoGraph could not transform <function Model.make_test_function.<locals>.test_function at 0x0000011A030555E0> and will run it as-is.
Cause: Unable to locate the source code of <function Model.make_test_function.<locals>.test_function at 0x0000011A030555E0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: lineno is out of bounds
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert

I would really appreciate it if you could help me with this.

Reply
sukh August 12, 2022 at 6:22 pm #

hello James Carmichael,

Thanks for all your effort. As a beginner, I managed to run your example code and read, step by step, what each line of code does. A very exciting journey has started. My query: I am feeding in different data in which the first row has 12 variables, with the 12th being the output result, but the 5th or 6th column has further values running down the rows below it. How do I handle this type of input in the dataset? I get an error when reading my dataset.

19 2 49 156 782 394 296.4 723.7 809.4 29.87 53.78 86
740 366
728 398
659 161
704 220
795 173
784 385
732 282
18 1 60 172 850 1455 794 670 28.44 80.74 90
873
842
817
749
797
849
850
847
842

Reply
- James Carmichael August 13, 2022 at 6:18 am #
  
  Hi sukh…You are very welcome! Are you receiving an error message that you can share? This will allow us to better assist you.
  
  Reply
  - sukh August 13, 2022 at 10:34 pm #
    
    OK, thanks. Actually, my data file is in CSV format. I am able to read it, but I am facing a problem turning it into an array. Each input spans multiple rows, with values running down one column, so the data is not in a single row. Every input is laid out the same way. Kindly suggest whether it is possible to build an array from data like this, or whether I need to put the column E data into one cell so that each input sits on one row. The last column, K, is the output result, and my second input starts at row no. 411. I hope you understand how my inputs relate to the data. The code and data are below.
    
    My query is: can we feed data in this manner? And if yes, how do I declare my dataset for further processing?
    
    from numpy import loadtxt
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense
    from google.colab import files
    uploaded = files.upload()
    
    import csv
    
    # opening the CSV file
    with open('dataread.csv', mode='r') as file:
    
        # reading the CSV file
        csvFile = csv.reader(file)
        print(csvFile)
        # displaying the contents of the CSV file
        for lines in csvFile:
            print(lines)
    
    19 2 49 156 782 296.4 723.7 809.4 29.87 53.78 86
    740
    728
    659
    704
    795
    784
    732
    744
    764
    777
    749
    700
    729
    722
    741
    790
    783
    736
    744
    781
    810
    745
    722
    734
    736
    750
    706
    744
    789
    851
    813
    750
    783
    786
    758
    731
    742
    708
    733
    720
    673
    689
    729
    700
    781
    786
    758
    717
    773
    802
    726
    719
    734
    707
    678
    754
    747
    715
    771
    830
    786
    751
    773
    811
    824
    820
    772
    760
    814
    735
    687
    726
    771
    733
    773
    822
    858
    806
    756
    783
    775
    776
    739
    730
    796
    775
    754
    721
    744
    764
    793
    742
    734
    774
    802
    759
    735
    744
    767
    735
    723
    691
    748
    719
    749
    846
    822
    749
    753
    825
    854
    817
    754
    737
    785
    803
    785
    746
    736
    783
    741
    737
    694
    814
    754
    761
    814
    823
    785
    733
    759
    786
    814
    763
    792
    851
    813
    795
    751
    759
    780
    760
    738
    760
    801
    767
    738
    673
    697
    673
    664
    691
    783
    821
    823
    807
    746
    775
    822
    827
    763
    732
    756
    750
    814
    766
    733
    772
    813
    792
    722
    777
    793
    813
    757
    747
    817
    805
    788
    802
    754
    772
    788
    847
    781
    749
    763
    814
    838
    748
    749
    760
    788
    720
    685
    697
    658
    684
    807
    843
    759
    730
    750
    807
    774
    748
    715
    779
    803
    818
    755
    768
    800
    787
    759
    798
    838
    843
    775
    801
    814
    750
    716
    745
    758
    779
    721
    717
    768
    744
    773
    758
    724
    730
    774
    744
    772
    733
    663
    671
    654
    762
    820
    818
    797
    770
    847
    827
    818
    751
    726
    760
    779
    804
    790
    755
    768
    820
    812
    852
    759
    787
    825
    782
    766
    746
    808
    793
    791
    745
    787
    800
    844
    733
    739
    780
    783
    739
    726
    745
    796
    800
    752
    796
    804
    813
    735
    726
    739
    699
    665
    648
    678
    779
    801
    798
    822
    772
    824
    837
    795
    739
    714
    771
    802
    761
    727
    773
    789
    917
    876
    788
    788
    810
    790
    770
    789
    787
    771
    743
    796
    848
    853
    769
    807
    817
    831
    817
    766
    817
    766
    707
    668
    702
    821
    817
    828
    799
    765
    795
    817
    798
    751
    792
    832
    831
    776
    764
    806
    811
    760
    747
    802
    823
    755
    754
    800
    823
    792
    750
    805
    818
    793
    752
    748
    741
    736
    736
    685
    749
    719
    766
    905
    857
    760
    741
    774
    815
    773
    746
    778
    846
    825
    775
    800
    819
    767
    780
    804
    896
    812
    757
    811
    819
    817
    779
    774
    791
    818
    770
    754
    771
    786
    753
    744
    793
    805
    799
    
    18 1 79 159 532 1182 1486 1744 51.75 83.64 76
    354
    831
    848
    466
    442
    837
    842
    401
    347
    721
    699
    945
    1001
    869
    837
    889
    935
    823
    876
    817
    821
    951
    878
    929
    799
    790
    849
    838
    822
    957
    933
    803
    767
    840
    905
    794
    710
    756
    1004
    966
    858
    809
    955
    930
    944
    820
    809
    823
    821
    905
    894
    890
    869
    856
    819
    762
    724
    695
    797
    794
    745
    894
    966
    923
    875
    896
    911
    859
    925
    863
    862
    884
    900
    827
    937
    936
    912
    932
    819
    800
    770
    1008
    921
    806
    924
    881
    848
    953
    893
    871
    926
    991
    889
    867
    913
    815
    901
    888
    815
    834
    876
    899
    849
    982
    886
    883
    867
    914
    928
    986
    868
    888
    957
    922
    895
    861
    828
    874
    834
    798
    862
    1016
    864
    904
    926
    838
    939
    924
    885
    890
    941
    897
    863
    1034
    906
    842
    866
    862
    832
    896
    913
    881
    875
    916
    914
    878
    957
    890
    793
    759
    804
    1003
    786
    868
    955
    840
    848
    938
    884
    886
    928
    889
    873
    966
    927
    913
    884
    868
    846
    900
    882
    836
    847
    910
    901
    874
    835
    870
    882
    814
    761
    857
    742
    719
    729
    947
    823
    822
    782
    914
    858
    850
    891
    1003
    836
    1034
    873
    867
    846
    799
    860
    772
    784
    787
    991
    936
    909
    1071
    1039
    1037
    1065
    966
    1022
    1023
    963
    959
    897
    870
    886
    881
    854
    943
    975
    869
    918
    900
    890
    960
    995
    853
    927
    926
    892
    970
    956
    881
    901
    997
    858
    924
    840
    852
    995
    1076
    896
    967
    942
    910
    1050
    994
    993
    1024
    915
    972
    942
    866
    866
    854
    837
    945
    955
    912
    930
    914
    927
    995
    987
    850
    838
    757
    727
    705
    744
    962
    859
    854
    919
    905
    900
    1002
    868
    858
    945
    890
    831
    863
    854
    901
    980
    917
    886
    944
    898
    977
    817
    747
    728
    777
    834
    908
    850
    792
    811
    964
    872
    834
    870
    937
    849
    910
    858
    834
    874
    936
    867
    825
    831
    891
    890
    912
    907
    938
    873
    873
    893
    891
    875
    959
    914
    872
    946
    875
    797
    888
    893
    810
    1069
    977
    925
    900
    874
    
    18 1 60 172 850 1455 794 670 28.44 80.74 90
    873
    842
    817
    749
    797
    849
    850
    847
    842
    809
    779
    739
    737
    746
    763
    854
    935
    911
    863
    832
    820
    775
    756
    819
    820
    810
    787
    766
    837
    843
    867
    820
    749
    726
    759
    823
    763
    761
    769
    767
    767
    736
    796
    864
    871
    833
    780
    785
    741
    697
    659
    659
    696
    794
    975
    866
    784
    820
    825
    800
    780
    752
    812
    775
    741
    709
    676
    675
    656
    674
    686
    691
    694
    714
    707
    743
    753
    741
    712
    717
    733
    730
    735
    735
    759
    750
    746
    750
    739
    775
    757
    715
    703
    730
    831
    844
    811
    749
    775
    795
    826
    819
    812
    820
    878
    925
    885
    840
    796
    794
    830
    870
    876
    863
    846
    815
    825
    919
    910
    859
    803
    795
    839
    887
    844
    813
    841
    891
    854
    836
    806
    785
    813
    855
    880
    816
    854
    886
    897
    811
    811
    847
    873
    841
    774
    735
    750
    820
    805
    824
    832
    828
    832
    916
    903
    894
    854
    817
    846
    859
    891
    891
    852
    836
    841
    840
    820
    839
    845
    871
    894
    856
    850
    869
    876
    859
    858
    812
    738
    745
    843
    860
    836
    847
    841
    845
    856
    910
    969
    953
    923
    860
    835
    821
    814
    844
    895
    936
    914
    866
    841
    824
    804
    844
    921
    935
    915
    855
    860
    884
    881
    850
    824
    821
    861
    941
    869
    825
    852
    868
    865
    854
    872
    898
    888
    868
    839
    835
    841
    822
    792
    825
    829
    806
    757
    763
    790
    868
    782
    776
    785
    729
    719
    716
    805
    761
    754
    825
    755
    724
    742
    766
    763
    743
    823
    889
    851
    825
    873
    837
    790
    813
    822
    869
    871
    824
    825
    893
    859
    881
    853
    810
    824
    835
    835
    851
    843
    806
    746
    730
    716
    753
    885
    886
    829
    795
    816
    849
    831
    870
    854
    808
    754
    783
    820
    740
    770
    787
    830
    858
    820
    805
    820
    847
    834
    855
    862
    837
    841
    824
    799
    751
    770
    773
    774
    865
    1019
    1005
    1028
    993
    939
    900
    897
    873
    829
    836
    875
    884
    916
    937
    892
    829
    812
    825
    801
    824
    1010
    924
    905
    877
    865
    968
    934
    843
    862
    846
    855
    847
    848
    825
    821
    821
    805
    814
    879
    847
    814
    766
    853
    850
    826
    780
    831
    795
    874
    845
    814
    850
    895
    886
    892
    843
    800
    819
    836
    833
    786
    832
    880
    863
    828
    836
    887
    918
    19 2 67 161 837 380.5 385.9 314.9 86
    825
    800
    745
    749
    819
    856
    818
    800
    816
    796
    747
    716
    674
    702
    776
    788
    724
    740
    768
    751
    715
    712
    722
    717
    721
    717
    747
    793
    745
    743
    776
    755
    724
    740
    750
    736
    740
    756
    761
    727
    729
    741
    764
    733
    761
    798
    765
    730
    726
    761
    779
    737
    713
    762
    781
    757
    739
    726
    737
    740
    728
    706
    720
    736
    754
    752
    766
    752
    743
    708
    717
    717
    723
    714
    718
    770
    797
    774
    774
    806
    782
    740
    734
    740
    736
    723
    751
    774
    740
    720
    720
    740
    715
    705
    728
    742
    725
    712
    753
    765
    728
    721
    743
    712
    700
    704
    734
    746
    703
    708
    727
    736
    702
    698
    730
    728
    700
    701
    731
    720
    704
    709
    730
    730
    698
    712
    716
    660
    643
    648
    656
    667
    689
    844
    881
    848
    853
    832
    794
    761
    753
    719
    721
    762
    788
    806
    830
    776
    734
    730
    746
    790
    785
    766
    771
    795
    769
    771
    735
    745
    790
    832
    823
    748
    746
    786
    788
    779
    756
    772
    761
    785
    755
    765
    795
    806
    798
    759
    793
    805
    777
    749
    774
    800
    797
    762
    773
    777
    727
    735
    772
    773
    732
    783
    810
    828
    745
    738
    735
    726
    734
    757
    756
    761
    750
    739
    755
    751
    729
    750
    760
    742
    733
    803
    829
    764
    753
    773
    756
    736
    730
    742
    758
    756
    759
    764
    777
    728
    757
    771
    759
    737
    767
    784
    765
    786
    801
    750
    744
    798
    762
    733
    760
    778
    750
    743
    774
    779
    747
    794
    780
    752
    784
    799
    752
    733
    766
    769
    727
    734
    757
    726
    713
    739
    764
    751
    712
    713
    745
    755
    717
    713
    753
    760
    736
    761
    776
    765
    733
    742
    777
    758
    714
    732
    750
    736
    724
    720
    747
    784
    763
    732
    738
    737
    723
    706
    720
    750
    753
    722
    723
    730
    733
    712
    712
    719
    733
    704
    701
    743
    765
    744
    725
    735
    725
    747
    703
    687
    686
    651
    650
    670
    709
    721
    775
    748
    730
    727
    769
    781
    750
    723
    736
    762
    740
    766
    789
    752
    726
    747
    797
    761
    746
    778
    760
    747
    777
    784
    808
    769
    773
    753
    737
    747
    775
    761
    739
    743
    760
    737
    714
    724
    739
    725
    707
    704
    740
    773
    727
    743
    761
    825
    742
    736
    756
    712
    716
    746
    737
    720
    761
    785
    744
    716
    725
    755
    728
    700
    704
    717
    740
    716
    732
    763
    756
    746
    746
    757
    750
    721
    721
    735
    769
    780
    794
    802
    815
    749
    746
    783
    791
    745
    760
    796
    761
    745
    766
    788
    742
    735
    743
    784
    750
    735
    775
    781
    751
    742
    
    Reply
sukh August 19, 2022 at 9:38 pm #

Hello James Carmichael,

I pasted a long block of data into this comment and it does not look nice. I apologize for this and will take more care in the future.

I have since studied NumPy arrays and now understand them.

My query is this: my output is not 0 or 1 as in your example program. My output variable takes values like 90, 110, 112, … and I want to train my model on these outputs and later predict such outputs. Which model would you suggest for this type of problem?

Reply
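
A note on the question above: when the target is a number such as 90, 110 or 112 rather than a 0/1 class, the task is usually framed as regression. The sketch below assumes that framing. The helper names, the layer sizes (12 and 8) and the sample values are illustrative assumptions, not part of the tutorial; the Keras calls used (Sequential, Dense, compile) are standard ones.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def build_regression_model(n_features):
    """Build a small Keras network with a single linear output unit.

    TensorFlow is imported inside the function so the rest of this
    sketch runs even where TensorFlow is not installed.
    """
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    model = Sequential()
    # Hidden layer sizes are arbitrary choices for illustration.
    model.add(Dense(12, input_shape=(n_features,), activation="relu"))
    model.add(Dense(8, activation="relu"))
    # No activation on the output: it can take any real value.
    model.add(Dense(1))
    # Mean squared error replaces the binary cross-entropy used for 0/1 labels.
    model.compile(loss="mean_squared_error", optimizer="adam")
    return model


# Illustrative targets and predictions (made-up values).
targets = [90.0, 110.0, 112.0]
predictions = [92.0, 108.0, 112.0]
print(mean_squared_error(targets, predictions))
```

The two changes from the binary classifier are the output layer (one unit, no sigmoid) and the loss (mean squared error). Accuracy is not a meaningful metric here, so the error itself is what you would monitor.
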
J Jara October 9, 2022 at 4:58 pm #

This is a binary classifier. How to create a classifier for data with several classes?

Obviously, I could use one-hot encoding for the classes, and create as many binary classifiers as there are classes, but is there any better alternative?

Reply
- James Carmichael October 10, 2022 at 11:09 am #
  
  Hi J Jara…The following resource may be of interest:
  
  https://machinelearningmastery.com/multi-label-classification-with-deep-learning/
  
  Reply
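
For the multi-class question above, a common alternative to training one binary classifier per class is a single network with one output unit per class and a softmax activation, trained on one-hot labels with categorical cross-entropy. In Keras that corresponds to an output layer such as Dense(num_classes, activation='softmax'). The plain-Python sketch below only illustrates what softmax does to the raw output scores; the function names and the example scores are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(scores)
    # Subtracting the largest score keeps exp() numerically stable.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(scores):
    """Return the index of the most probable class."""
    probs = softmax(scores)
    return probs.index(max(probs))


# Illustrative raw scores for a three-class problem.
scores = [1.0, 3.0, 0.5]
probs = softmax(scores)
print(probs, predict_class(scores))
```

Because the probabilities compete with each other, one model covers all classes at once, and the predicted class is simply the index of the largest probability.
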
El November 25, 2022 at 12:47 am #

Hello
I can’t download the dataset. It’s a lot of numbers, but I don’t understand how I can download them.

Reply
- James Carmichael November 25, 2022 at 9:18 am #
  
  Hi El…Please clarify what you have done to download the dataset so that we may better assist you.
  
  The following link may be helpful:
  
  https://www.kaggle.com/datasets/kumargh/pimaindiansdiabetescsv
  
  Reply
sura December 10, 2022 at 7:50 pm #

Hi,

I use a Keras Conv1D model on a raw dataset with X_train of shape (142315, 23) and Y_train of shape (142315,).
My code:

import tensorflow

n_timesteps = X_train.shape[1]  # 23

# Input: one channel per timestep
input_layer = tensorflow.keras.layers.Input(shape=(n_timesteps, 1))
conv_layer1 = tensorflow.keras.layers.Conv1D(filters=5,
                                             kernel_size=7,
                                             activation="relu")(input_layer)
max_pool1 = tensorflow.keras.layers.MaxPooling1D(pool_size=2, strides=5)(conv_layer1)

conv_layer2 = tensorflow.keras.layers.Conv1D(filters=3,
                                             kernel_size=3,
                                             activation="relu")(max_pool1)
flatten_layer = tensorflow.keras.layers.Flatten()(conv_layer2)
dense_layer = tensorflow.keras.layers.Dense(15, activation="relu")(flatten_layer)
output_layer = tensorflow.keras.layers.Dense(6, activation="softmax")(dense_layer)

model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
# Prints a string summary of the network.
model.summary()

After that I use an optimization technique for the hyperparameters, and at the step "# Returning the details of the best solution." this error is printed. Can you help me?

Error:

5121 # Use logits whenever they are available. softmax and sigmoid

ValueError: Shapes (142315,) and (142315, 2) are incompatible

Reply
- James Carmichael December 11, 2022 at 9:35 am #
  
  Hi sura…Thanks for asking.
  
  I’m eager to help, but I just don’t have the capacity to debug code for you.
  
  I am happy to make some suggestions:
  
  Consider aggressively cutting the code back to the minimum required. This will help you isolate the problem and focus on it.
  Consider cutting the problem back to just one or a few simple examples.
  Consider finding other similar code examples that do work and slowly modify them to meet your needs. This might expose your misstep.
  Consider posting your question and code to StackOverflow.
  
  Reply
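
On the shape error reported above: the message says the labels have shape (142315,) while the model output has shape (142315, 2). One common cause, offered here as a guess because the compile step is not shown, is a loss such as categorical cross-entropy, which expects one-hot labels with one column per output unit. The sketch below shows in plain Python how one-hot encoding changes the label shape; one_hot and shape are hypothetical helpers, and the labels are made up.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 per row."""
    rows = []
    for label in labels:
        row = [0] * num_classes
        row[label] = 1
        rows.append(row)
    return rows


def shape(values):
    """Return (rows,) for a flat list or (rows, cols) for a list of lists."""
    if values and isinstance(values[0], list):
        return (len(values), len(values[0]))
    return (len(values),)


# Illustrative integer labels for a two-class problem.
labels = [0, 1, 1, 0]
encoded = one_hot(labels, 2)
print(shape(labels), shape(encoded))  # (4,) (4, 2)
```

In Keras, tensorflow.keras.utils.to_categorical performs this encoding, and the sparse_categorical_crossentropy loss accepts integer labels directly. Note also that the posted model has 6 output units while the error mentions 2, so the model that failed may differ from the one shown.
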
Niall January 5, 2023 at 3:45 am #

Accuracy: 86% if I apply a scaler as a preprocessing transformation and use the full dataset for both training and prediction.
Accuracy: 84% on train and 81% on test using a train/test split (it only gets above 77% for me with a scaler on the input data).

Great article, with a clear and concise explanation of every line of code. I found the extension tips at the end of the article really helpful, and you link to a tutorial for each suggested extension. I love the comprehensive approach taken on this site.

Reply
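
The comment above does not say which scaler was used. Assuming a standard (z-score) scaler, the sketch below shows the idea on a single feature column, with the mean and standard deviation computed on the training split only and then reused on the test split. The function names and the values are illustrative.

```python
import statistics


def fit_scaler(column):
    """Return the mean and population standard deviation of one feature."""
    return statistics.mean(column), statistics.pstdev(column)


def transform(column, mean, std):
    """Standardize values using previously fitted statistics."""
    return [(value - mean) / std for value in column]


# Illustrative values for one feature, already split into train and test.
train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 10.0]

# Fit on the training data only, then apply the same statistics to both.
mean, std = fit_scaler(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)
print(train_scaled, test_scaled)
```

Fitting the scaler on the training split alone keeps information from the test set out of training, which is one reason a train/test split tends to report lower accuracy than evaluating on the full training data.
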
Jun Ho January 19, 2023 at 5:24 pm #

Hi Jason, may I know what type of neural network this is? Is it a feedforward network, a multilayer perceptron, or something else? I feel like it could be feedforward.

Reply
- Adrian Tam January 20, 2023 at 6:37 am #
  
  This is a multilayer perceptron network. It is also a feedforward network, because data always moves in the forward direction, from input to output. Sometimes we use different names to mean the same thing.
  
  Reply
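
To make "always moving in the forward direction" concrete, here is a minimal plain-Python sketch of a forward pass through a tiny multilayer perceptron: each layer's output feeds the next layer and nothing loops back. The weights, biases and layer sizes are made-up values, not those of the tutorial's model.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each unit."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
        for unit_weights, bias in zip(weights, biases)
    ]


def forward(inputs):
    """Pass inputs through a hidden layer, then a single sigmoid output."""
    hidden = dense(inputs, [[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1], relu)
    output = dense(hidden, [[1.0, -1.0]], [0.0], sigmoid)
    return output[0]


print(forward([1.0, 2.0]))
```

The sigmoid on the last layer keeps the result between 0 and 1, which is why it can be read as a class probability in a binary classifier.
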
Abdullah February 22, 2023 at 6:10 pm #

In “Load Data” you should import “loadtxt” from “numpy”,
because beginners like me are used to running every piece of code one by one.

Reply
- James Carmichael February 23, 2023 at 8:24 am #
  
  Thank you for your feedback and suggestions Abdullah!
  
  Reply
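
Following the suggestion above, a self-contained version of the loading step could look like the sketch below. It assumes a CSV layout of eight numeric input columns followed by one 0/1 label column, and it reads from an in-memory string with two illustrative rows so that it runs without the dataset file.

```python
from io import StringIO

from numpy import loadtxt

# Two illustrative rows standing in for the CSV file on disk.
csv_text = StringIO(
    "6,148,72,35,0,33.6,0.627,50,1\n"
    "1,85,66,29,0,26.6,0.351,31,0\n"
)

# loadtxt accepts a file name or a file-like object.
dataset = loadtxt(csv_text, delimiter=",")

# Split into input columns (X) and the label column (y).
X = dataset[:, 0:8]
y = dataset[:, 8]
print(X.shape, y.shape)
```

With the real file, the StringIO object would be replaced by the path to the downloaded CSV.
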
DEEP HAZRA August 14, 2023 at 11:11 pm #

Thanks for sharing your knowledge.

Reply
- James Carmichael August 15, 2023 at 10:20 am #
  
  Thank you for your feedback and support Deep Hazra! We appreciate it.
  
  Reply
Sharon Mano September 12, 2023 at 12:34 am #

Hi Jason,

It is a great tutorial. I appreciate the way you have put it together.
Do you have a post on how to couple the trained network to an optimization algorithm, so that the network can be used to find the input parameters that maximize the output value?

Reply
- James Carmichael September 12, 2023 at 10:32 am #
  
  Hi Sharon…The following course may be of interest to you:
  
  https://machinelearningmastery.com/optimization-for-machine-learning-crash-course/
  
  Reply
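
As a rough illustration of the idea in the question above, the sketch below treats the trained network as a black-box function and searches a grid of candidate inputs for the one with the largest output. The predict function is a stand-in written for this example, not a real model, and grid_search_max is a hypothetical helper; in practice predict would wrap a call to the trained model.

```python
def predict(x):
    """Stand-in for a trained model's prediction on a single input value."""
    return -(x - 3.0) ** 2 + 5.0


def grid_search_max(func, low, high, steps):
    """Evaluate func on an evenly spaced grid and keep the best input."""
    best_x, best_y = low, func(low)
    for i in range(1, steps + 1):
        x = low + (high - low) * i / steps
        y = func(x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y


best_x, best_y = grid_search_max(predict, 0.0, 6.0, 60)
print(best_x, best_y)
```

A grid is only practical for one or two inputs. With more inputs, a random search or a dedicated optimizer would take its place, but the structure stays the same: propose inputs, score them with the model, keep the best.
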
Alex September 12, 2023 at 10:42 am #

I read the publication by Smith, 1988, titled ‘Using the ADAP learning algorithm to forecast the onset of diabetes mellitus,’ where ‘The diabetes pedigree function’ is used as part of the neural network training. Can you explain the relationship of this function in training deep learning models using Keras?

Reply
- James Carmichael September 14, 2023 at 9:30 am #
  
  Hi Alex…That is a great question! The following resource may be of interest:
  
  https://www.analyticsvidhya.com/blog/2021/07/diabetes-prediction-with-pycaret/
  
  Reply
Rob February 10, 2024 at 4:38 pm #

Hi

I’ve been getting an error when importing from tensorflow.keras.model:
from tensorflow.keras.model import Sequential
gives me the error "No module named 'tensorflow.keras.model'".

I’ve had to change the imports to:

from keras.models import Sequential
from keras.layers import Dense

but now I’m not sure if what I’m doing is equivalent. I should note that I am not at the end of the tutorial yet.

I have installed TensorFlow 2.15 and Keras 2.15. Maybe this is a version mismatch? I tried it with 2.12 and 2.12 but had the same problem, and I couldn’t go back any further without downgrading pip.

Reply
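
A likely cause of the error above is the module name rather than a version mismatch: the Keras submodule is called models (plural), so the import would be "from tensorflow.keras.models import Sequential". The sketch below is a small diagnostic that reports which module paths can be imported in the current environment; module_exists is a hypothetical helper written for this example.

```python
import importlib


def module_exists(name):
    """Return True if the named module can be imported, else False."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError.
        return False


# "model" (singular) is the misspelled path from the comment above.
for name in ("tensorflow.keras.model", "tensorflow.keras.models", "keras.models"):
    print(name, module_exists(name))
```

With matching TensorFlow and Keras 2.x versions installed, importing Sequential and Dense from keras.models and keras.layers should behave the same as the tensorflow.keras imports for the purposes of this tutorial.
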
