Keras is a powerful and easy-to-use free open source Python library for developing and evaluating deep learning models.
It is part of the TensorFlow library and allows you to define and train neural network models in just a few lines of code.
In this tutorial, you will discover how to create your first deep learning neural network model in Python using Keras.
Let’s get started.
- Update Feb/2017: Updated prediction example, so rounding works in Python 2 and 3.
- Update Mar/2017: Updated example for the latest versions of Keras and TensorFlow.
- Update Mar/2018: Added alternate link to download the dataset.
- Update Jul/2019: Expanded and added more useful resources.
- Update Sep/2019: Updated for Keras v2.2.5 API.
- Update Oct/2019: Updated for Keras v2.3.0 API and TensorFlow v2.0.0.
- Update Aug/2020: Updated for Keras v2.4.3 and TensorFlow v2.3.
- Update Oct/2021: Deprecated predict_class syntax
- Update Jun/2022: Updated to modern TensorFlow syntax

Develop your first neural network in Python with Keras step-by-step
Photo by Phil Whitehouse, some rights reserved.
Keras Tutorial Overview
There is not a lot of code required, but we will go over it slowly so that you will know how to create your own models in the future.
The steps you will learn in this tutorial are as follows:
- Load Data
- Define Keras Model
- Compile Keras Model
- Fit Keras Model
- Evaluate Keras Model
- Tie It All Together
- Make Predictions
This Keras tutorial makes a few assumptions. You will need to have:
- Python 3 installed and configured
- SciPy (including NumPy) installed and configured
- TensorFlow 2 (which includes Keras) installed and configured
If you need help with your environment, see the tutorial:
Create a new file called keras_first_network.py and type or copy-and-paste the code into the file as you go.
1. Load Data
The first step is to import the functions and classes you intend to use in this tutorial.
You will use the NumPy library to load your dataset and two classes from the Keras library to define your model.
The imports required are listed below.
# first neural network with keras tutorial
from numpy import loadtxt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
...
You can now load our dataset.
In this Keras tutorial, you will use the Pima Indians onset of diabetes dataset. This is a standard machine learning dataset from the UCI Machine Learning repository. It describes patient medical record data for Pima Indians and whether they had an onset of diabetes within five years.
As such, it is a binary classification problem (onset of diabetes as 1 or not as 0). All of the input variables that describe each patient are numerical. This makes it easy to use directly with neural networks that expect numerical input and output values and is an ideal choice for our first neural network in Keras.
The dataset is available here:
Download the dataset and place it in your local working directory, the same location as your Python file.
Save it with the filename:
pima-indians-diabetes.csv
Take a look inside the file; you should see rows of data like the following:
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
1,89,66,23,94,28.1,0.167,21,0
0,137,40,35,168,43.1,2.288,33,1
...
You can now load the file as a matrix of numbers using the NumPy function loadtxt().
There are eight input variables and one output variable (the last column). You will be learning a model to map rows of input variables (X) to an output variable (y), which is often summarized as y = f(X).
The variables can be summarized as follows:
Input Variables (X):
- Number of times pregnant
- Plasma glucose concentration at 2 hours in an oral glucose tolerance test
- Diastolic blood pressure (mm Hg)
- Triceps skin fold thickness (mm)
- 2-hour serum insulin (mu U/ml)
- Body mass index (weight in kg/(height in m)^2)
- Diabetes pedigree function
- Age (years)
Output Variables (y):
- Class variable (0 or 1)
Once the CSV file is loaded into memory, you can split the columns of data into input and output variables.
The data will be stored in a 2D array where the first dimension is rows and the second dimension is columns, e.g., [rows, columns].
You can split the array into two arrays by selecting subsets of columns using the standard NumPy slice operator or “:”. You can select the first eight columns from index 0 to index 7 via the slice 0:8. We can then select the output column (the 9th variable) via index 8.
...
# load the dataset
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
...
You are now ready to define your neural network model.
Note: The dataset has nine columns, and the range 0:8 will select columns from 0 to 7, stopping before index 8. If this is new to you, then you can learn more about array slicing and ranges in this post:
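To see the slicing in action without the full file, here is a minimal sketch that loads only the five sample rows shown earlier from an in-memory string and checks the shapes of the resulting arrays. The `sample_csv` variable is an illustrative stand-in for the CSV file and is not part of the tutorial code.

```python
from io import StringIO
from numpy import loadtxt

# the five sample rows shown earlier, held in memory instead of a file
# (sample_csv is a stand-in for pima-indians-diabetes.csv)
sample_csv = StringIO(
    "6,148,72,35,0,33.6,0.627,50,1\n"
    "1,85,66,29,0,26.6,0.351,31,0\n"
    "8,183,64,0,0,23.3,0.672,32,1\n"
    "1,89,66,23,94,28.1,0.167,21,0\n"
    "0,137,40,35,168,43.1,2.288,33,1\n"
)

# loadtxt accepts a file-like object as well as a filename
dataset = loadtxt(sample_csv, delimiter=',')

# columns 0..7 are the inputs, column 8 is the class label
X = dataset[:, 0:8]
y = dataset[:, 8]

print(dataset.shape)  # (5, 9)
print(X.shape)        # (5, 8)
print(y.shape)        # (5,)
print(y)              # [1. 0. 1. 0. 1.]
```

Note that X keeps two dimensions (rows by eight columns), while selecting a single column index for y gives a one-dimensional array.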
2. Define Keras Model
Models in Keras are defined as a sequence of layers.
We create a Sequential model and add layers one at a time until we are happy with our network architecture.
The first thing to get right is to ensure the input layer has the correct number of input features. This can be specified when creating the first layer with the input_shape argument and setting it to (8,) for presenting the eight input variables as a vector.
How do we know the number of layers and their types?
This is a tricky question. There are heuristics that you can use, and often the best network structure is found through a process of trial and error experimentation (I explain more about this here). Generally, you need a network large enough to capture the structure of the problem.
In this example, let’s use a fully-connected network structure with three layers.
Fully connected layers are defined using the Dense class. You can specify the number of neurons or nodes in the layer as the first argument and the activation function using the activation argument.
Also, you will use the rectified linear unit activation function referred to as ReLU on the first two layers and the Sigmoid function in the output layer.
It used to be the case that Sigmoid and Tanh activation functions were preferred for all layers. These days, better performance is achieved using the ReLU activation function. Using a sigmoid on the output layer ensures your network output is between 0 and 1 and is easy to map to either a probability of class 1 or snap to a hard classification of either class with a default threshold of 0.5.
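To make the two activation functions concrete, here is a small NumPy sketch of ReLU and sigmoid applied to a few hand-picked values, followed by the 0.5 threshold mentioned above. The helper names `relu` and `sigmoid` are defined here for illustration only; Keras supplies its own implementations when you pass activation='relu' or activation='sigmoid'.

```python
import numpy as np

def relu(z):
    # rectified linear unit: negative values become 0, positive values pass through
    return np.maximum(0.0, z)

def sigmoid(z):
    # squashes any real value into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 3.0])

relu_out = relu(z)
sigmoid_out = sigmoid(z)

# snap the sigmoid outputs to hard classes with the default 0.5 threshold
classes = (sigmoid_out > 0.5).astype(int)

print(relu_out)     # [0. 0. 3.]
print(sigmoid_out)  # roughly [0.119 0.5 0.953]
print(classes)      # [0 0 1]
```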
You can piece it all together by adding each layer:
- The model expects rows of data with 8 variables (the input_shape=(8,) argument).
- The first hidden layer has 12 nodes and uses the relu activation function.
- The second hidden layer has 8 nodes and uses the relu activation function.
- The output layer has one node and uses the sigmoid activation function.
...
# define the keras model
model = Sequential()
model.add(Dense(12, input_shape=(8,), activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
...
Note: The most confusing thing here is that the shape of the input to the model is defined as an argument on the first hidden layer. This means that the line of code that adds the first Dense layer is doing two things, defining the input or visible layer and the first hidden layer.
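If it helps to see what the 8, 12, 8, 1 structure amounts to, the sketch below rebuilds it in plain NumPy with random, untrained weights. It is only an illustration of the shapes involved, not how Keras is implemented internally; the names `layer_sizes`, `forward`, and `X_demo` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# 8 inputs -> 12 hidden -> 8 hidden -> 1 output, as in the model above
layer_sizes = [8, 12, 8, 1]

# one weight matrix and one bias vector per Dense layer (random, untrained)
weights, biases = [], []
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    weights.append(rng.normal(scale=0.1, size=(n_in, n_out)))
    biases.append(np.zeros(n_out))

# total trainable parameters: (8*12+12) + (12*8+8) + (8*1+1)
n_params = sum(w.size + b.size for w, b in zip(weights, biases))

def forward(X):
    # relu on the two hidden layers, sigmoid on the output layer
    a = X
    for i, (w, b) in enumerate(zip(weights, biases)):
        z = a @ w + b
        is_output = i == len(weights) - 1
        a = 1.0 / (1.0 + np.exp(-z)) if is_output else np.maximum(0.0, z)
    return a

# four made-up rows with eight features each
X_demo = rng.random((4, 8))
out = forward(X_demo)

print(n_params)   # 221
print(out.shape)  # (4, 1)
```

The first weight matrix has shape (8, 12), which is why a single Dense(12, input_shape=(8,)) line is enough to describe both the input size and the first hidden layer.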
3. Compile Keras Model
Now that the model is defined, you can compile it.
Compiling the model uses the efficient numerical libraries under the covers (the so-called backend). With the tensorflow.keras imports used in this tutorial, that backend is TensorFlow. The backend automatically chooses the best way to represent the network for training and making predictions to run on your hardware, such as CPU, GPU, or even distributed.
When compiling, you must specify some additional properties required when training the network. Remember training a network means finding the best set of weights to map inputs to outputs in your dataset.
You must specify the loss function to use to evaluate a set of weights, the optimizer used to search through different weights for the network, and any optional metrics you want to collect and report during training.
In this case, use cross entropy as the loss argument. This loss is for binary classification problems and is defined in Keras as “binary_crossentropy”. You can learn more about choosing loss functions based on your problem here:
We will define the optimizer as the efficient stochastic gradient descent algorithm “adam”. This is a popular version of gradient descent because it adapts the step size for each weight automatically and gives good results on a wide range of problems. To learn more about the Adam version of stochastic gradient descent, see the post:
Finally, because it is a classification problem, you will collect and report the classification accuracy defined via the metrics argument.
...
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
...
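For intuition about what the cross-entropy loss measures, here is a hand-rolled sketch of the standard binary cross-entropy formula averaged over a few predictions. The function below is written for this illustration and is not the Keras implementation; the `eps` clipping that avoids log(0) is an assumption of the sketch.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # mean of -(t*log(p) + (1-t)*log(1-p)) over all examples
    total = 0.0
    for t, p in zip(y_true, y_prob):
        # clip probabilities so log() never receives exactly 0 (sketch assumption)
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

# confident and correct predictions give a small loss
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])

# confident but wrong predictions give a large loss
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print('%.4f' % confident_right)  # 0.1054
print('%.4f' % confident_wrong)  # 2.3026
```

The loss punishes confident mistakes much more heavily than it rewards confident correct answers, which is the signal the optimizer uses to adjust the weights.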
4. Fit Keras Model
You have defined your model and compiled it to get ready for efficient computation.
Now it is time to execute the model on some data.
You can train or fit your model on your loaded data by calling the fit() function on the model.
Training occurs over epochs, and each epoch is split into batches.
- Epoch: One pass through all of the rows in the training dataset
- Batch: One or more samples considered by the model within an epoch before weights are updated
One epoch comprises one or more batches, based on the chosen batch size, and the model is fit for many epochs. For more on the difference between epochs and batches, see the post:
The training process will run for a fixed number of epochs (iterations) through the dataset that you must specify using the epochs argument. You must also set the number of dataset rows that are considered before the model weights are updated within each epoch, called the batch size, and set using the batch_size argument.
This problem will run for a small number of epochs (150) and use a relatively small batch size of 10.
These configurations can be chosen experimentally by trial and error. You want to train the model enough so that it learns a good (or good enough) mapping of rows of input data to the output classification. The model will always have some error, but the amount of error will level out after some point for a given model configuration. This is called model convergence.
...
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10)
...
This is where the work happens on your CPU or GPU.
No GPU is required for this example, but if you’re interested in how to run large models on GPU hardware cheaply in the cloud, see this post:
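As a rough check of what the epochs and batch_size settings imply, the arithmetic below counts the weight updates performed during training. It assumes the dataset has 768 rows (the count reported in the training log later in this tutorial) and that the final, smaller batch of each epoch is still used for an update.

```python
import math

n_rows = 768      # assumed row count, taken from the training log shown later
epochs = 150
batch_size = 10

# batches (and therefore weight updates) per epoch; the last batch is partial
updates_per_epoch = math.ceil(n_rows / batch_size)
last_batch_rows = n_rows - (updates_per_epoch - 1) * batch_size
total_updates = updates_per_epoch * epochs

print(updates_per_epoch)  # 77
print(last_batch_rows)    # 8
print(total_updates)      # 11550
```

A larger batch size would mean fewer, smoother updates per epoch, while a smaller one would mean more, noisier updates.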
5. Evaluate Keras Model
You have trained your neural network on the entire dataset, and you can evaluate the performance of the network on the same dataset.
This will only give you an idea of how well you have modeled the dataset (e.g., train accuracy), but no idea of how well the algorithm might perform on new data. This was done for simplicity, but ideally, you should separate your data into train and test datasets for training and evaluation of your model.
You can evaluate your model on your training dataset using the evaluate() function and pass it the same input and output used to train the model.
This will generate a prediction for each input and output pair and collect scores, including the average loss and any metrics you have configured, such as accuracy.
The evaluate() function will return a list with two values. The first will be the loss of the model on the dataset, and the second will be the accuracy of the model on the dataset. You are only interested in reporting the accuracy so ignore the loss value.
...
# evaluate the keras model
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))
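The train and test separation mentioned above can be sketched with NumPy alone. The example below uses a random placeholder array with the same shape as the loaded dataset (768 rows is an assumption based on the training log later in this tutorial), and the 20% test fraction is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# placeholder with the same layout as the loaded dataset: 8 inputs + 1 label
dataset = rng.random((768, 9))

# shuffle the row indices, then hold out 20% of the rows for testing
indices = rng.permutation(len(dataset))
n_test = int(len(dataset) * 0.2)
test_idx, train_idx = indices[:n_test], indices[n_test:]

X_train, y_train = dataset[train_idx, 0:8], dataset[train_idx, 8]
X_test, y_test = dataset[test_idx, 0:8], dataset[test_idx, 8]

print(X_train.shape, X_test.shape)  # (615, 8) (153, 8)
```

With a split like this, you would pass X_train and y_train to fit() and X_test and y_test to evaluate(), so the reported accuracy reflects rows the model did not see during training.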
6. Tie It All Together
You have just seen how you can easily create your first neural network model in Keras.
Let’s tie it all together into a complete code example.
# first neural network with keras tutorial
from numpy import loadtxt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# load the dataset
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
# define the keras model
model = Sequential()
model.add(Dense(12, input_shape=(8,), activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10)
# evaluate the keras model
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))
You can copy all the code into your Python file and save it as “keras_first_network.py” in the same directory as your data file “pima-indians-diabetes.csv”. You can then run the Python file as a script from your command line (command prompt) as follows:
python keras_first_network.py
Running this example, you should see a message for each of the 150 epochs, printing the loss and accuracy, followed by the final evaluation of the trained model on the training dataset.
It takes about 10 seconds to execute on my workstation running on the CPU.
Ideally, you would like the loss to go to zero and the accuracy to go to 1.0 (e.g., 100%). This is not possible for any but the most trivial machine learning problems. Instead, you will always have some error in your model. The goal is to choose a model configuration and training configuration that achieve the lowest loss and highest accuracy possible for a given dataset.
...
768/768 [==============================] - 0s 63us/step - loss: 0.4817 - acc: 0.7708
Epoch 147/150
768/768 [==============================] - 0s 63us/step - loss: 0.4764 - acc: 0.7747
Epoch 148/150
768/768 [==============================] - 0s 63us/step - loss: 0.4737 - acc: 0.7682
Epoch 149/150
768/768 [==============================] - 0s 64us/step - loss: 0.4730 - acc: 0.7747
Epoch 150/150
768/768 [==============================] - 0s 63us/step - loss: 0.4754 - acc: 0.7799
768/768 [==============================] - 0s 38us/step
Accuracy: 76.56
Note: If you try running this example in an IPython or Jupyter notebook, you may get an error.
The reason is the output progress bars during training. You can easily turn these off by setting verbose=0 in the call to the fit() and evaluate() functions; for example:
...
# fit the keras model on the dataset without progress bars
model.fit(X, y, epochs=150, batch_size=10, verbose=0)
# evaluate the keras model
_, accuracy = model.evaluate(X, y, verbose=0)
...
Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.
What score did you get?
Post your results in the comments below.
Neural networks are stochastic algorithms, meaning that the same algorithm on the same data can train a different model with different skill each time the code is run. This is a feature, not a bug. You can learn more about this in the post:
The variance in the performance of the model means that to get a reasonable approximation of how well your model is performing, you may need to fit it many times and calculate the average of the accuracy scores. For more on this approach to evaluating neural networks, see the post:
For example, below are the accuracy scores from re-running the example five times:
Accuracy: 75.00
Accuracy: 77.73
Accuracy: 77.60
Accuracy: 78.12
Accuracy: 76.17
You can see that all accuracy scores are around 77%, and the average is 76.924%.
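The average can be reproduced with the standard library, and the same few lines also give the spread of the scores, which is a useful companion to the mean when comparing runs. The `scores` list simply copies the five accuracies listed above.

```python
from statistics import mean, stdev

# the five accuracy scores reported above
scores = [75.00, 77.73, 77.60, 78.12, 76.17]

avg = mean(scores)
spread = stdev(scores)  # sample standard deviation across the five runs

print('Mean accuracy: %.3f' % avg)  # Mean accuracy: 76.924
print('Standard deviation: %.3f' % spread)
```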
7. Make Predictions
The number one question I get asked is:
“After I train my model, how can I use it to make predictions on new data?”
Great question.
You can adapt the above example and use it to generate predictions on the training dataset, pretending it is a new dataset you have not seen before.
Making predictions is as easy as calling the predict() function on the model. You are using a sigmoid activation function on the output layer, so the predictions will be a probability in the range between 0 and 1. You can easily convert them into a crisp binary prediction for this classification task by rounding them.
For example:
...
# make probability predictions with the model
predictions = model.predict(X)
# round predictions
rounded = [round(x[0]) for x in predictions]
Alternatively, you can convert the probability into 0 or 1 by thresholding at 0.5 to predict crisp classes directly; for example:
...
# make class predictions with the model
predictions = (model.predict(X) > 0.5).astype(int)
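To see what the thresholding step does without training a model, the sketch below applies it to a small hand-written array shaped like the output of predict(), one probability per row. The `probabilities` and `expected` values are made up for illustration.

```python
import numpy as np

# stand-in for model.predict(X): shape (n_rows, 1), values between 0 and 1
probabilities = np.array([[0.10], [0.70], [0.49], [0.93]])

# made-up expected classes for the same four rows
expected = np.array([0, 1, 1, 1])

# threshold at 0.5, convert booleans to integers, and flatten to one dimension
predicted = (probabilities > 0.5).astype(int).ravel()

# fraction of rows where the predicted class matches the expected class
accuracy = np.mean(predicted == expected)

print(predicted)  # [0 1 0 1]
print(accuracy)   # 0.75
```

Flattening with ravel() is optional, but it makes the predictions line up with a one-dimensional y array when comparing them element by element.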
The complete example below makes predictions for each example in the dataset, then prints the input data, predicted class, and expected class for the first five examples in the dataset.
# first neural network with keras make predictions
from numpy import loadtxt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# load the dataset
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
# define the keras model
model = Sequential()
model.add(Dense(12, input_shape=(8,), activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10, verbose=0)
# make class predictions with the model
predictions = (model.predict(X) > 0.5).astype(int)
# summarize the first 5 cases (each prediction is a one-element array, so take element 0)
for i in range(5):
    print('%s => %d (expected %d)' % (X[i].tolist(), predictions[i][0], y[i]))
Running the example does not show the progress bar as before, as the verbose argument has been set to 0.
After the model is fit, predictions are made for all examples in the dataset, and the input rows and predicted class values for the first five examples are printed and compared to the expected class values.
You can see that most rows are correctly predicted. In fact, you can expect about 76.9% of the rows to be correctly predicted based on your estimated performance of the model in the previous section.
[6.0, 148.0, 72.0, 35.0, 0.0, 33.6, 0.627, 50.0] => 0 (expected 1)
[1.0, 85.0, 66.0, 29.0, 0.0, 26.6, 0.351, 31.0] => 0 (expected 0)
[8.0, 183.0, 64.0, 0.0, 0.0, 23.3, 0.672, 32.0] => 1 (expected 1)
[1.0, 89.0, 66.0, 23.0, 94.0, 28.1, 0.167, 21.0] => 0 (expected 0)
[0.0, 137.0, 40.0, 35.0, 168.0, 43.1, 2.288, 33.0] => 1 (expected 1)
If you would like to know more about how to make predictions with Keras models, see the post:
Keras Tutorial Summary
In this post, you discovered how to create your first neural network model using the powerful Keras Python library for deep learning.
Specifically, you learned the six key steps in using Keras to create a neural network or deep learning model step-by-step, including:
- How to load data
- How to define a neural network in Keras
- How to compile a Keras model using the efficient numerical backend
- How to train a model on data
- How to evaluate a model on data
- How to make predictions with the model
Do you have any questions about Keras or about this tutorial?
Ask your question in the comments, and I will do my best to answer.
Keras Tutorial Extensions
Well done, you have successfully developed your first neural network using the Keras deep learning library in Python.
This section provides some extensions to this tutorial that you might want to explore.
- Tune the Model. Change the configuration of the model or training process and see if you can improve the performance of the model, e.g., achieve better than 76% accuracy.
- Save the Model. Update the tutorial to save the model to a file, then load it later and use it to make predictions (see this tutorial).
- Summarize the Model. Update the tutorial to summarize the model and create a plot of model layers (see this tutorial).
- Separate Train and Test Datasets. Split the loaded dataset into a training and test set (split based on rows) and use one set to train the model and the other set to estimate the performance of the model on new data.
- Plot Learning Curves. The fit() function returns a history object that summarizes the loss and accuracy at the end of each epoch. Create line plots of this data, called learning curves (see this tutorial).
- Learn a New Dataset. Update the tutorial to use a different tabular dataset, perhaps from the UCI Machine Learning Repository.
- Use Functional API. Update the tutorial to use the Keras Functional API for defining the model (see this tutorial).
Further Reading
Are you looking for some more Deep Learning tutorials with Python and Keras?
Take a look at some of these:
Related Tutorials
- 5 Step Life-Cycle for Neural Network Models in Keras
- Multi-Class Classification Tutorial with the Keras Deep Learning Library
- Regression Tutorial with the Keras Deep Learning Library in Python
- How to Grid Search Hyperparameters for Deep Learning Models in Python With Keras
Books
- Deep Learning (Textbook), 2016.
- Deep Learning with Python (my book).
APIs
How did you go? Do you have any questions about deep learning?
Post your questions in the comments below, and I will do my best to help.
The input layer doesn’t have any activation function, but still activation=”relu” is mentioned in the first layer of the model. Why?
Hi Saurav,
The first layer in the network here is technically a hidden layer, hence it has an activation function.
Why have you made it a hidden layer though? the input layer is not usually represented as a hidden layer?
Hi sam,
Note this line:
It does a few things.
Does that help?
Hi Jason,
U have used two different activation functions so how can we know which activation function fit the model?
Sorry, I don’t understand the question.
Hi Jason,
I am interested in deep learning and machine learning. You mentioned “It defines a hidden layer with 12 neurons, connected to the input layer that use relu activation function.” I wonder how can we determine the number of neurons in order to achieve a high accuracy rate of the model?
Thanks a lot!!!
Use trial and error. We cannot specify the “best” number of neurons analytically. We must test.
Sir, thanks for your tutorial. Would you like to make tutorial on stock Data Prediction through Neural Network Model and training this on any stock data. If you have on this so please share the link. Thanks
I am reticent to post tutorials on stock market prediction given the random walk hypothesis of security prices:
https://machinelearningmastery.com/gentle-introduction-random-walk-times-series-forecasting-python/
Hi,
I would like to know more about activation function. How it is working? How many activation functions? Using different activation function How much affect the output of the model?
I would like to also know about the Hidden Layer. How the size of the hidden layer affect the model?
In this tutorial, we use relu in the hidden layers, learn more here:
https://machinelearningmastery.com/rectified-linear-activation-function-for-deep-learning-neural-networks/
The size of the layer impacts the capacity of the model, learn more here:
https://machinelearningmastery.com/how-to-control-neural-network-model-capacity-with-nodes-and-layers/
hi how use cnn for pixel classification on mhd images
What is pixel classification? What are mhd images?
Hello! I want to know if there’s a way to know the values of all weights after each updation?
Yes, you can save them to file or review them manually.
Often saving is achieved using a checkpoint:
https://machinelearningmastery.com/check-point-deep-learning-models-keras/
runfile(‘C:/Users/Owner/Documents/untitled1.py’, wdir=’C:/Users/Owner/Documents’)
Traceback (most recent call last):
File “”, line 1, in
runfile(‘C:/Users/Owner/Documents/untitled1.py’, wdir=’C:/Users/Owner/Documents’)
File “C:\Users\Owner\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py”, line 705, in runfile
execfile(filename, namespace)
File “C:\Users\Owner\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py”, line 102, in execfile
exec(compile(f.read(), filename, ‘exec’), namespace)
File “C:/Users/Owner/Documents/untitled1.py”, line 13, in
model.add(Dense(12, input_dim=8, activation=’relu’))
File “C:\Users\Owner\Anaconda3\lib\site-packages\keras\engine\sequential.py”, line 160, in add
name=layer.name + ‘_input’)
File “C:\Users\Owner\Anaconda3\lib\site-packages\keras\engine\input_layer.py”, line 177, in Input
input_tensor=tensor)
File “C:\Users\Owner\Anaconda3\lib\site-packages\keras\legacy\interfaces.py”, line 91, in wrapper
return func(*args, **kwargs)
File “C:\Users\Owner\Anaconda3\lib\site-packages\keras\engine\input_layer.py”, line 86, in __init__
name=self.name)
File “C:\Users\Owner\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py”, line 515, in placeholder
x = tf.placeholder(dtype, shape=shape, name=name)
File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\array_ops.py”, line 1530, in placeholder
return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_array_ops.py”, line 1954, in _placeholder
name=name)
File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py”, line 767, in apply_op
op_def=op_def)
File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py”, line 2508, in create_op
set_shapes_for_outputs(ret)
File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py”, line 1894, in set_shapes_for_outputs
output.set_shape(s)
File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py”, line 443, in set_shape
self._shape = self._shape.merge_with(shape)
File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\tensor_shape.py”, line 550, in merge_with
stop = key.stop
File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\tensor_shape.py”, line 798, in as_shape
“””Returns this shape as a
TensorShapeProto
.”””File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\tensor_shape.py”, line 431, in __init__
size for one or more dimension. e.g.
TensorShape([None, 256])
File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\tensor_shape.py”, line 376, in as_dimension
other = as_dimension(other)
File “C:\Users\Owner\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\tensor_shape.py”, line 32, in __init__
if value is None:
TypeError: int() argument must be a string, a bytes-like object or a number, not ‘TensorShapeProto’
this error occurs when {model.add(Dense(12, input_dim=8, activation=’relu’))} this command is run
any help?
Save all code into a file and run it as follows:
https://machinelearningmastery.com/faq/single-faq/how-do-i-run-a-script-from-the-command-line
Fantastic tutorial. The explanation is simple and precise. Thanks a lot
Thanks!
great arttist
Can you explain how to implement weight regularization into the layers?
Yep, see here:
http://keras.io/regularizers/
hey yo!!! how u r start coding in python
Start here:
https://machinelearningmastery.com/faq/single-faq/how-do-i-get-started-with-python-programming
Import statements if others need them:
from keras.models import Sequential
from keras.layers import Dense, Activation
Thanks.
I had them in Part 6, but I have also added them to Part 1.
Great post!
Is it possible to train a neural network that receives as input a vector x and tries to predict another vector y where both x and y are floats?
Yes, this is called regression:
https://machinelearningmastery.com/regression-tutorial-keras-deep-learning-library-python/
If there are 8 inputs for the first layer then why we have taken them as ’12’ in the following line :
model.add(Dense(12, input_dim=8, init=’uniform’, activation=’relu’))
Hi Aakash.
The input layer is defined by the input_dim parameter, here set to 8.
The first hidden layer has 12 neurons.
I ran your program and i have an error:
ValueError: could not convert string to float:
what could be the reason for this, and how may I solve it.
thanks.
great post by the way.
It might be a copy-paste error. Perhaps try to copy and run the whole example listed in section 6?
Hello sir, I am facing the same problem valueError: could not convert string to float: ‘”6’
also I am running the example from section 6.
I have some suggestions here:
https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
jason can u plzz help me how to code
Sorry, I cannot help you to write code.
Maybe this happens when all the parameters end up in a single column in your *.csv file. Then you should replace the delimiter from , to ; like:
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=";")
This solved the Problem for me.
Thanks for sharing.
thank you for your simple and useful example.
You’re welcome cheikh.
Hello Sir, I am trying to use Keras for NLP , specifically sentence classification. I have given the model building part below. It’s taking quite a lot time to execute. I am using Pycharm IDE.
batch_size = 32
nb_filter = 250
filter_length = 3
nb_epoch = 2
pool_length = 2
output_dim = 5
hidden_dims = 250
# Build the model
model1 = Sequential()
model1.add(Convolution1D(nb_filter, filter_length ,activation='relu',border_mode='valid',
input_shape=(len(embb_weights),dim), weights=[embb_weights]))
model1.add(Dense(hidden_dims))
model1.add(Dropout(0.2))
model1.add(Activation('relu'))
model1.add(MaxPooling1D(pool_length=pool_length))
model1.add(Dense(output_dim, activation='sigmoid'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model1.compile(loss='mean_squared_error',
optimizer=sgd,
metrics=['accuracy'])
You may want a larger network. You may also want to use a standard repeating structure like CNN->CNN->Pool->Dense.
See this post on using a CNN:
https://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/
Later, you may also want to try some stacked LSTMs.
Hi Jason, thanks for the awesome example. Given that the accuracy of this model is 79.56%. From here on, what steps would you take to improve the accuracy?
Given my nascent understanding of Machine Learning, my initial approach would have been:
Implement forward propagation, then compute the cost function, then implement back propagation, use gradient checking to evaluate my network (disable after use), then use gradient descent.
However, this approach seems arduous compared to using Keras. Thanks for your response.
Hi Andre, indeed Keras makes working with neural nets so much easier. Fun even!
We may be maxing out on this problem, but here is some general advice for lifting performance.
– data prep – try lots of different views of the problem and see which is best at exposing the structure of the problem to the learning algorithm (data transforms, feature engineering, etc.)
– algorithm selection – try lots of algorithms and see which one or few are best on the problem (try on all views)
– algorithm tuning – tune well performing algorithms to get the most out of them (grid search or random search hyperparameter tuning)
– ensembles – combine predictions from multiple algorithms (stacking, boosting, bagging, etc.)
For neural nets, there are a lot of things to tune. I think there are big gains in trying different network topologies (layers and number of neurons per layer) in concert with training epochs and learning rate (bigger nets need more training).
I hope that helps as a start.
Awesome! Thanks Jason =)
You’re welcome Andre.
Some interesting stuff here
https://youtu.be/vq2nnJ4g6N0
Thanks for sharing. What did you like about it?
Hi Jason, it’s a great example but if anyone runs it in an IPython/Jupyter notebook they are likely to encounter an I/O error when running the fit step. This is due to a known bug in IPython.
The solution is to set verbose=0 like this
# Fit the model
model.fit(X, Y, nb_epoch=40, batch_size=10, verbose=0)
Great, thanks for sharing Romilly.
Great example. I have a query, though: how do I now give an input and get the output (0 or 1)? Can you please give the command for that?
Thanks
You can call model.predict() to get predictions and round on each value to snap to a binary value.
For example, below is a complete example showing you how to round the predictions and print them to console.
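A minimal sketch of that rounding step, assuming `predictions` has the shape that `model.predict(X)` returns for a single sigmoid output (one row per sample, one column). The values below are made-up stand-ins, not real model output:

```python
import numpy as np

# Stand-in for `predictions = model.predict(X)`: one probability per row.
# These numbers are illustrative only, not output from the tutorial model.
predictions = np.array([[0.79], [0.10], [0.91], [0.34]])

# Each row is a length-1 array, so take element 0 before rounding.
rounded = [int(round(float(row[0]))) for row in predictions]
print(rounded)
```

Taking `row[0]` first means the built-in `round()` receives a plain number rather than an array, which behaves the same way in Python 2 and Python 3.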
Hi, why are you not using a test set? You are predicting from the training set, I think.
Correct, it is just an example to get you started with Keras.
Jason, I’m not quite understanding how the predicted values ([1.0, 0.0, 1.0, 0.0, 1.0,…) map to the real world problem. For instance, what does that first “1.0” in the results indicate?
I get that it’s a prediction of ‘true’ for diabetes…but to which patient is it predicting that—the first in the list? So then the second result, “0.0,” is the prediction for the second patient/row in the dataset?
Remember the original file has 0 and 1 values in the final class column where 0 is no onset of diabetes and 1 is an onset of diabetes.
We are predicting new values in this column.
We are making predictions for specific rows: we pass in their medical info and predict the onset of diabetes. We just happen to do this for a number of rows at a time.
hello jason
i am getting this error while calculating the predictions.
#calculate predictions
predictions = model.predict(X)
#round predictions
rounded = [round(x) for x in predictions]
print(rounded)
—————————————————————————
TypeError Traceback (most recent call last)
in ()
2 predictions = model.predict(X)
3 #round predictions
—-> 4 rounded = [round(x) for x in predictions]
5 print(rounded)
in (.0)
2 predictions = model.predict(X)
3 #round predictions
—-> 4 rounded = [round(x) for x in predictions]
5 print(rounded)
TypeError: type numpy.ndarray doesn’t define __round__ method
Try removing the call to round().
Hi Jason,
Can I ask why you use the same data X you fit the model to do the prediction?
# Fit the model
model.fit(X, Y, epochs = 150, batch_size = 10, verbose = 2)
# calculate predictions
predictions = model.predict(X)
Rachel
It is all I have at hand. X means data matrix.
Replace X in predict() with Xprime or whatever you like.
Hi, how do I feed the input (8,125,96,0,0,0.0,0.232,54) to get the output?
predictions = model.predict(X)
I mean, instead of X I want to get the output for 8,125,96,0,0,0.0,0.232,54.
Wrap your input in an array, n-columns with one row, then pass that to the model.
Does that help?
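As a rough sketch of what "one row, n columns" looks like in NumPy (the variable names are illustrative, and the final `model.predict` call is shown only as a comment because it assumes a trained model is available):

```python
import numpy as np

# One patient's eight input values, taken from the question above.
row = [8, 125, 96, 0, 0, 0.0, 0.232, 54]

# Wrap the row in an outer list so the array has shape (1, 8):
# one sample (row) by eight features (columns).
X_new = np.array([row], dtype=float)
print(X_new.shape)

# With a trained model this would then be passed as: model.predict(X_new)
```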
Hello, I am trying to make predictions with a similar neural network but keep getting errors that the input has the wrong shape.
Can you say what the array must look like for the example neural network?
For an MLP, data must be organized into a 2D array of samples x features.
I am not able to get to the last epoch. Getting error before that:
Epoch 11/150
390/768 [==============>……………]Traceback (most recent call last):.6921
ValueError: I/O operation on closed file
I could resolve this by varying the epoch and batch size.
Now, to predict an unknown value, I loaded a new dataset and used the predict command as below:
dataset_test = numpy.loadtxt("pima-indians-diabetes_test.csv", delimiter=",")  # has only one row
X = dataset_test[:,0:8]
model.predict(X)
But I am getting this error:
X = dataset_test[:,0:8]
IndexError: too many indices for array
Can you please help?
Thanks
I see problems like this when you run from a notebook or from an IDE.
Consider running examples from the console to ensure they work.
Consider turning off verbose output (verbose=0 in the call to fit()) to disable the progress bar.
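On the IndexError in the question: it is consistent with `numpy.loadtxt` returning a one-dimensional array when the file holds a single row, so a two-index slice fails. A minimal sketch of one possible workaround, using an in-memory stand-in for the file (the file contents here are assumed, not taken from the commenter's data):

```python
import io
import numpy as np

# Stand-in for a CSV file that holds a single row of eight inputs
# (the values are borrowed from an earlier comment; the file is hypothetical).
single_row_csv = io.StringIO("8,125,96,0,0,0.0,0.232,54")

# ndmin=2 keeps the result two-dimensional even when there is only one row.
dataset_test = np.loadtxt(single_row_csv, delimiter=",", ndmin=2)
X = dataset_test[:, 0:8]
print(dataset_test.shape, X.shape)
```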
Hi Jason!
Loved the tutorial! I have a question however.
Is there a way to save the weights to a file after the model is trained for uses, such as kaggle?
Thanks,
David
Thanks David.
You can save the network weights to file by calling model.save_weights("model.h5").
You can learn more in this post:
https://machinelearningmastery.com/save-load-keras-deep-learning-models/
Hey, Jason! Thank you for the awesome tutorial! I’ve used your tutorial to learn about CNNs. I have one question for you. Suppose I want to use Keras to classify images and I have 3 or more classes; how does my algorithm know about these classes? That is, I have to encode what a cat, a dog and a horse are. Is there any way to code this? I’ve tried this:
target_names = ['class 0(Cats)', 'class 1(Dogs)', 'class 2(Horse)']
print(classification_report(np.argmax(Y_test, axis=1), y_pred, target_names=target_names))
But my results are not classifying correctly.
                precision    recall  f1-score   support
  class 0(Cat)       0.00      0.00      0.00        17
  class 1(Dog)       0.00      0.00      0.00        14
class 2(Horse)       0.99      1.00      0.99      2526
   avg / total       0.98      0.99      0.98      2557
Great question Alex.
This is an example of a multi-class classification problem. You must use a one hot encoding on the output variable to be able to model it with a neural network and specify the number of classes as the number of outputs on the final layer of your network.
I provide a tutorial with the famous iris dataset that has 3 output classes here:
https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
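To make the one hot encoding idea concrete, here is a small NumPy-only sketch. The labels and the class order (0 = cat, 1 = dog, 2 = horse) are assumptions chosen to match the question:

```python
import numpy as np

# Hypothetical integer class labels: 0 = cat, 1 = dog, 2 = horse.
labels = np.array([0, 2, 1, 2])
num_classes = 3

# Row i of the identity matrix is the one hot vector for class i.
one_hot = np.eye(num_classes)[labels]
print(one_hot)

# argmax along each row recovers the integer class, which is also how
# per-class network outputs are turned back into a predicted label.
decoded = np.argmax(one_hot, axis=1)
print(decoded)
```

The final layer of the network would then have `num_classes` outputs, one per column of `one_hot`.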
Thank you.
I’ll check it.
No problem Alex.
This was really useful, thank you
I’m using keras (with CNNs) for sentiment classification of documents and I’d like to improve the performance, but I’m completely at a loss when it comes to tuning the parameters in a non-arbitrary way. Could you maybe point me somewhere that will help me go about this in a more systematic fashion? There must be some heuristics or rules-of-thumb that could guide me.
I have a tutorial coming out soon (next week) that provides lots of examples of tuning the hyperparameters of a neural network in Keras, but limited to MLPs.
For CNNs, I would advise tuning the number of repeating layers (conv + max pool), the number of filters in repeating block, and the number and size of dense layers at the predicting part of your network. Also consider using some fixed layers from pre-trained models as the start of your network (e.g. VGG) and try just training some input and output layers around it for your problem.
I hope that helps as a start.
Hello Jason, my accuracy is 0.0104, but yours is 0.7879, and my loss is -9.5414. Is there a problem with the dataset? I downloaded it from a different site.
I think there might be something wrong with your implementation or your dataset. Your numbers are way out.
After training, how can I use the trained model on a new sample?
You can call model.predict()
See an above comment for a specific code example.
Hi Jason,
I’m a student conducting research on how to use artificial neural networks to predict the business viability of potential software projects.
I intend to use Python as the programming language. The application of ANNs fascinates me, but I’m new to machine learning and Python. Can you suggest how to go about this?
Many thanks
Consider getting a good grounding in how to work through a machine learning problem end to end in python first.
Here is a good tutorial to get you started:
https://machinelearningmastery.com/machine-learning-in-python-step-by-step/
Dear Jason, this is a great tutorial for beginners. It will satisfy the needs of many students who are looking for initial help. But I have a question. Could you please shed light on a few things: i) how to test the trained model using a test dataset (i.e., load the test dataset and apply the model; suppose the test file name is test.csv), ii) how to print the accuracy obtained on the test dataset, and iii) how to handle an output with more than 2 classes (suppose a 4-class classification problem)?
Please show the whole program to overcome any confusion.
Thanks a lot.
I provide an example elsewhere in the comments, you can also see how to make predictions on new data in this post:
https://machinelearningmastery.com/5-step-life-cycle-neural-network-models-keras/
For an example of multi-class classification, you can see this tutorial:
https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
I am trying to build a neural network with some recursive connections but not a full recursive layer. How do I do this in Keras?
I could print a diagram of the network, but basically what I want is for each neuron in the current time frame to know only its own previous output and not the output of all the neurons in the output layer.
I don’t know off hand Doron.
Thanks for replying though, have a good day.
Hello Jason,
This is a great tutorial . Thanks for sharing.
I have a dataset of 100 fingerprints and I want to extract the minutiae of those fingerprints using Python (Keras). Can you please advise where to start? I am really confused.
If your fingerprints are images, you may want to consider using convolutional neural networks (CNNs), which are much better at working with image data.
See this tutorial on digit recognition for a start:
https://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/
Hi Jason
Thanks for this great tutorial. I am new to machine learning; I went through your basic tutorial on Keras and also the one on handwritten digit recognition. I would like to understand how I can train on a set of image data, e.g. images of shapes such as a square, circle or pyramid.
Please let me know how the input data needs to be fed to the program and how we need to export the model.
Start by preparing a high-quality dataset.
Hi Jason,
Thanks for the great article. But I had 1 query.
Are there any inbuilt functions in keras that can give me the feature importance for the ANN model?
If not, can you suggest a technique I can use to extract variable importance from the loss function? I am considering an approach similar to that used in RF which involves permuting the values of the selected variable and calculating the relative increase in loss.
Regards,
CM
I don’t believe so CM.
I would suggest using a wrapper method and evaluate subsets of features to develop a feature importance/feature selection report.
I talk a lot more about feature selection in this post:
https://machinelearningmastery.com/an-introduction-to-feature-selection/
I provide an example of feature selection in scikit-learn here:
https://machinelearningmastery.com/feature-selection-machine-learning-python/
I hope that helps as a start.
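The permutation idea CM describes in the question can be sketched without any Keras-specific support. The code below is an illustration under stated assumptions, not a built-in Keras feature: `permutation_importance` is a hypothetical helper, squared error stands in for the loss, and a toy function plays the role of the trained model.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in squared-error loss when each column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for col in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle one column to break its relationship with the target.
            X_perm[:, col] = rng.permutation(X_perm[:, col])
            loss = np.mean((predict(X_perm) - y) ** 2)
            increases.append(loss - baseline)
        importances[col] = np.mean(increases)
    return importances

# Toy stand-in for a trained model: the output depends only on column 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0]
predict = lambda data: 2.0 * data[:, 0]

scores = permutation_importance(predict, X, y)
print(scores)
```

With a Keras model, something like `lambda data: model.predict(data).ravel()` could be passed as `predict`, assuming a single output column.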
Have you made any progress with this approach? I have the same problem.
Dear Jason, I am new to deep learning. Being a novice, I am asking you a technical question which may seem silly. My question is: can we use features of a sentence (for example, its length) when classifying it (suppose the outputs are positive sentence and negative sentence) using a deep neural network?
Great question Kamal, yes you can. I would encourage you to include all such features and see which give you a bump in performance.
Hi, How would I use this on a dataset that has multiple outputs? For example a dataset with output A and B where A could be 0 or 1 and B could be 3 or 4 ?
You could use two neurons in the output layer and normalize the output variables to both be in the range of 0 to 1.
This tutorial on multi-class classification might give you some ideas:
https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
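For the two-output case in the question, normalizing both output columns to the range 0 to 1 might look like the sketch below. The target values are made up to match the question (A is 0 or 1, B is 3 or 4), and min-max scaling is one reasonable choice rather than the only one:

```python
import numpy as np

# Hypothetical targets: column A takes values 0 or 1, column B takes 3 or 4.
Y = np.array([[0, 3], [1, 4], [1, 3], [0, 4]], dtype=float)

# Min-max scale each output column to the range 0 to 1.
y_min = Y.min(axis=0)
y_max = Y.max(axis=0)
Y_scaled = (Y - y_min) / (y_max - y_min)
print(Y_scaled)

# Invert the scaling to map (rounded) network outputs back to original values.
Y_restored = Y_scaled * (y_max - y_min) + y_min
print(Y_restored)
```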
Hi Jason,
The tutorial looks really good but unfortunately I keep getting an error when importing Dense from keras.layers, I get the error : AttributeError: module ‘theano’ has no attribute ‘gof’
I have tried reinstalling Theano but it has not fixed the issue.
Best wishes
Tom
Hi Tom, sorry to hear that. I have not seen this problem before.
Have you searched google? I can see a few posts and it might be related to your version of scipy or similar.
Let me know how you go.
Hey Jason,
Can you please make a tutorial on how to add additional train data into the already trained model? This will be helpful for the bigger data sets. I read that warm start is used for random forest. But not sure how to implement as algorithm. A generalised version of how to implement would be good. Thank You!
Great question Shudhan!
Yes, you could save your weights, load them later into a new network topology and start training on new data again.
I’ll work out an example in coming weeks, time permitting.
Hi Jason,
first of all congratulations for this amazing work that you have done!
Here is my question:
What if my .csv file includes both nominal and numerical attributes?
Should I change my nominal values to numerical?
Thank you in advance
Hi Joanna, yes.
You can use a label encoder to convert nominal to integer, and then even convert the integer to one hot encoding.
This post will give you code you can use:
https://machinelearningmastery.com/data-preparation-gradient-boosting-xgboost-python/
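As a quick illustration of the nominal-to-integer step, here is a NumPy-only sketch. The linked post uses its own tooling; `np.unique` is swapped in here so the example has no extra dependencies, and the column values are invented:

```python
import numpy as np

# Hypothetical nominal column from a CSV file.
colors = np.array(["red", "green", "blue", "green", "red"])

# np.unique returns the sorted distinct categories and, with
# return_inverse=True, the integer code of each original value.
categories, codes = np.unique(colors, return_inverse=True)
print(categories)
print(codes)

# Indexing the categories with the codes reverses the mapping.
restored = categories[codes]
print(restored)
```

The integer codes can then be one hot encoded if the categories have no natural order.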
A small bug:-
Line 25 : rounded = [round(x) for x in predictions]
should have numpy.round instead, for the code to run!
Great tutorial, regardless. The best i’ve seen for intro to ANN in python. Thanks!
Perhaps it’s your version of Python or environment?
In Python 2.7 the round() function is built-in.
It would be better if there were a comment for Python 3:
# use numpy.round instead, if using Python 3
Thanks for the note AC.
This is simple to grasp! Great post! How can we perform dropout in keras?
Thanks Ash.
You can learn about drop out with Keras here:
https://machinelearningmastery.com/dropout-regularization-deep-learning-models-keras/
Hello Jason,
You are using model.predict at the end to predict the results. Is it possible to save the model to the hard disk, transfer it to another machine (in my case, a TurtleBot running ROS) and then use the model directly on the TurtleBot to predict results?
Please tell me how
Thanking you
Homagni Saha
Hi Homagni, great question.
Absolutely!
Learn exactly how in this tutorial I wrote:
https://machinelearningmastery.com/save-load-keras-deep-learning-models/
Hi Jason,
I implemented your code to begin with, but I am getting an accuracy of 45.18% with the same parameters and everything.
Can’t figure out why.
Thanks
That does sound like a problem, Rimi.
Confirm the code and data match exactly.
Hi Jason,
I am a little confused by the first layer’s parameters. You said that the first layer has 12 neurons and expects 8 input variables.
Why is there a difference between the number of neurons and input_dim for the first layer?
Regards,
Ankit
Hi Ankit,
The problem has 8 input variables and the first hidden layer has 12 neurons. Inputs are the columns of data, these are fixed. The Hidden layers in general are whatever we design based on whatever capacity we think we need to represent the complexity of the problem. In this case, we have chosen 12 neurons for the first hidden layer.
I hope that is clearer.
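The relationship between the 8 inputs and the 12 hidden neurons can be seen in the shape of the weight matrix. This is a NumPy sketch of a single fully connected layer with random stand-in weights and data, assuming a ReLU activation; it is not the Keras internals:

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs = 8    # columns in the dataset (input_dim)
n_hidden = 12   # neurons chosen for the first hidden layer

# Every hidden neuron has one weight per input, plus one bias.
W = rng.normal(size=(n_inputs, n_hidden))
b = np.zeros(n_hidden)

# A batch of 5 rows with 8 inputs each (random stand-in data).
X = rng.normal(size=(5, n_inputs))

# Weighted sum followed by a ReLU activation.
hidden = np.maximum(0.0, X @ W + b)
print(hidden.shape)
print(W.size + b.size)
```

Each of the 5 rows goes in with 8 values and comes out with 12, and the layer holds 8 x 12 weights plus 12 biases.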
Hi,
I have data like the IRIS dataset but with more columns.
I want to use an MLP and a DBN/CNN classifier (or any other deep learning classification algorithm) on my data to see how correctly it is classified into 6 groups.
Previously I used Deep Learning for J; today is the first time I have seen Keras.
Does Keras have examples (code examples) of deep learning classification algorithms?
Kindly,
Tom
Yes Tom, the example in this post is an example of a neural network (deep learning) applied to a classification problem.
I have installed Theano but it gives me a TensorFlow error. Is it mandatory to install both packages? TensorFlow is not supported on Windows; the only way to get it on Windows is to install a virtual machine.
Keras will work just fine with Theano.
Just install Theano, and configure Keras to use the Theano backend.
More information about configuring the Keras backend here:
https://machinelearningmastery.com/introduction-python-deep-learning-library-keras/
Hey Jason, I have run your code but got the following error, although I have already installed the Theano backend. Please help me out, I am stuck.
Using TensorFlow backend.
Traceback (most recent call last):
  File "C:\Users\pc\Desktop\first.py", line 2, in <module>
    from keras.models import Sequential
  File "C:\Users\pc\Anaconda3\lib\site-packages\keras\__init__.py", line 2, in <module>
    from . import backend
  File "C:\Users\pc\Anaconda3\lib\site-packages\keras\backend\__init__.py", line 64, in <module>
    from .tensorflow_backend import *
  File "C:\Users\pc\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 1, in <module>
    import tensorflow as tf
ImportError: No module named 'tensorflow'
>>>
Change the backend used by Keras from TensorFlow to Theano.
You can do this either by using the command line switch or changing the Keras config file.
See the link I posted in the previous post for instructions.
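As a rough illustration of the environment-variable route, assuming an older multi-backend Keras release that still supports Theano: the `KERAS_BACKEND` variable can be set from Python itself, as long as that happens before Keras is imported. The import is left commented out so the snippet runs without Keras installed.

```python
import os

# Must be set before the first `import keras`, because the backend is
# chosen once, at import time.
os.environ["KERAS_BACKEND"] = "theano"

# import keras  # uncomment where a multi-backend Keras is installed
print(os.environ["KERAS_BACKEND"])
```

The alternative is the `backend` field in the keras.json config file, which is read only when the environment variable is not set.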
Hello Rumesa!
Have you solved your problem? I have the same one. Everywhere I look, the answer is the same (the keras.json file or an environment variable), but it doesn’t work. Can you tell me what worked for you?
Interesting.
Maybe there is an issue with the latest version and a tight coupling to tensorflow? I have not seen this myself.
Perhaps it might be worth testing prior versions of Keras, such as 1.1.0?
Try this:
Hi Jason,
First off, thanks so much for creating these resources, I have been keeping an eye on your newsletter for a while now, and I finally have the free time to start learning more about it myself, so your work has been really appreciated.
My question is: How can I set/get the weights of each hidden node?
I am planning to create several arrays of randomized weights, then use a genetic algorithm to see which weight array performs best and improve over generations. What would be the best way to go about this? And if I use a “relu” activation function, am I right in thinking these randomly generated weights should be between 0 and 0.05?
Many thanks for your help 🙂
Alexon
Thanks Alexon,
You can get and set the weights from a network.
You can learn more about how to do this in the context of saving the weights to file here:
https://machinelearningmastery.com/save-load-keras-deep-learning-models/
I hope that helps as a start, I’d love to hear how you go.
That’s great, thanks for pointing me in the right direction.
I’d be happy to let you know how it goes, but might take a while as this is very much a “when I can find the time” project between jobs 🙂
Cheers!
Nice introduction, thanks!
I’m glad you found it useful Arnaldo.
Good day
I have a question: how can I represent a character as a vector that could be an input for a neural network to predict the word meaning, trained using an LSTM?
For instance, I have bf to predict boy friend or best friend, and similarly I have 2mor to predict tomorrow. I need to encode all the input characters as vectors so that they can be used to train an RNN/LSTM to predict the output.
Thank you.
Kind Regards
Hi Abbey, You can map characters to integers to get integer vectors.
Thank you Jason. Suppose I map characters to integer values to get vectors using English letters, numbers and special characters.
The question is how the LSTM will predict the character. Please explain in more detail for me.
Regards
Hi Abbey,
If your output values are also characters, you can map them onto integers, and reverse the mapping to convert the predictions back to text.
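A minimal sketch of that mapping and its reverse in plain Python. The sample text is built from the abbreviations in the question, and a real vocabulary would be built from the whole training corpus:

```python
# Build the vocabulary from a small sample of informal text (illustrative).
text = "bf 2mor"
chars = sorted(set(text))

# Forward and reverse lookup tables between characters and integers.
char_to_int = {c: i for i, c in enumerate(chars)}
int_to_char = {i: c for c, i in char_to_int.items()}

# Encode the text as integers, then decode the integers back to text.
encoded = [char_to_int[c] for c in text]
decoded = "".join(int_to_char[i] for i in encoded)
print(encoded)
print(decoded)
```

The same reverse table is what turns integer predictions from the network back into characters.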
The output value of the characters encoding will be text
Thank you, Jason. Suppose I map characters to integer values to get a vector representation of the informal text using English letters, numbers and special characters.
The question is how the LSTM will predict the characters or words that are close in meaning to the input value. Please explain in more detail for me. I understand how an RNN/LSTM works based on your tutorial example, but the logic of designing the processing is what I am struggling with.
Regards
hi Jason,
I am trying to implement a one-dimensional CNN on my data, so I built my network.
the issue is:
def train_model(model, X_train, y_train, X_test, y_test):
    X_train = X_train.reshape(-1, 1, 41)
    X_test = X_test.reshape(-1, 1, 41)
    numpy.random.seed(seed)
    model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=100, batch_size=64)
    # Final evaluation of the model
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Accuracy: %.2f%%" % (scores[1] * 100))
The method above does not work and does not give me any error message.
Could you help me with this, please?
Hi Ammar, I’m surprised that there is no error message.
Perhaps run from the command line and add some print() statements to see exactly where it stops.
Hi Jason
Great work. I have another question. How can we apply this to text mining? I have a CSV file containing review documents and labels. I want to classify the documents based on the text available. Could you help with this?
I would recommend converting the chars to ints and then using an Embedding layer.
Mr Jason, this is a great tutorial but I am stuck with some errors.
First, I can’t load the dataset correctly; I tried to correct the error but couldn’t (FileNotFoundError: [Errno 2] No such file or directory: 'pima-indians-diabetes.csv').
Second, while trying to evaluate the model it says X is not defined. Maybe this is because loading the file failed.
Thanks!
You need to download the file and place it in your current working directory Alex.
Does that help?
Sir, it is now successful….
Thanks!
Glad to hear it Alex.
Hi Jason,
First of all, special thanks to you for providing such a great tutorial. I am very new to machine learning and, truly speaking, I have no background in data science. The concept of ML overwhelmed me and now I have a desire to become an expert in this field. I need your advice on starting from scratch. Also, I am a PhD student in Computer Engineering (computer hardware) and I want to apply ML as a tool for fault detection and testing of ICs. Can you provide me with some references in this field?
Hi Bappaditya,
My best advice for getting started is here:
https://machinelearningmastery.com/start-here/#getstarted
I believe machine learning and deep learning are good tools for use on problems in fault detection. A good place to find references is here http://scholar.google.com
Best of luck with your project.
Well as usual in our daily coding life errors happen, now I have this error how can I correct it? Thanks!
” —————————————————————————
NoBackendError Traceback (most recent call last)
in ()
16 import librosa.display
17 audio_path = (‘/Users/MA/Python Notebook/OK.mp3’)
—> 18 y, sr = librosa.load(audio_path)
C:\Users\MA\Anaconda3\lib\site-packages\librosa\core\audio.py in load(path, sr, mono, offset, duration, dtype)
107
108 y = []
–> 109 with audioread.audio_open(os.path.realpath(path)) as input_file:
110 sr_native = input_file.samplerate
111 n_channels = input_file.channels
C:\Users\MA\Anaconda3\lib\site-packages\audioread\__init__.py in audio_open(path)
112
113 # All backends failed!
–> 114 raise NoBackendError()
NoBackendError:
”
That is the error I am getting just when trying to load a song into librosa…
Thanks!! @Jason Brownlee
Sorry, this looks like an issue with your librosa library, not a machine learning issue. I can’t give you expert advice, sorry.
Thanks I have managed to correct the error…
Happy Sunday to you all……
Glad to hear it Alex.
How did you solve the problem?
Hi, Jason, thank you for your amazing examples.
I run the same code on my laptop. But I did not get the same results. What could be the possible reasons?
I am using windows 8.1 64bit+eclipse+anaconda 4.2+theano 0.9.4+CUDA7.5
I got results like follows.
… …
Epoch 145/150
10/768 […………………………] – ETA: 0s – loss: 0.3634 – acc: 0.8000
80/768 [==>………………………] – ETA: 0s – loss: 0.4066 – acc: 0.7750
150/768 [====>…………………….] – ETA: 0s – loss: 0.4059 – acc: 0.8067
220/768 [=======>………………….] – ETA: 0s – loss: 0.4047 – acc: 0.8091
300/768 [==========>……………….] – ETA: 0s – loss: 0.4498 – acc: 0.7867
380/768 [=============>…………….] – ETA: 0s – loss: 0.4595 – acc: 0.7895
450/768 [================>………….] – ETA: 0s – loss: 0.4568 – acc: 0.7911
510/768 [==================>………..] – ETA: 0s – loss: 0.4553 – acc: 0.7882
580/768 [=====================>……..] – ETA: 0s – loss: 0.4677 – acc: 0.7776
660/768 [========================>…..] – ETA: 0s – loss: 0.4697 – acc: 0.7788
740/768 [===========================>..] – ETA: 0s – loss: 0.4611 – acc: 0.7838
768/768 [==============================] – 0s – loss: 0.4614 – acc: 0.7799
Epoch 146/150
10/768 […………………………] – ETA: 0s – loss: 0.3846 – acc: 0.8000
90/768 [==>………………………] – ETA: 0s – loss: 0.5079 – acc: 0.7444
170/768 [=====>……………………] – ETA: 0s – loss: 0.4500 – acc: 0.7882
250/768 [========>…………………] – ETA: 0s – loss: 0.4594 – acc: 0.7840
330/768 [===========>………………] – ETA: 0s – loss: 0.4574 – acc: 0.7818
400/768 [==============>……………] – ETA: 0s – loss: 0.4563 – acc: 0.7775
470/768 [=================>…………] – ETA: 0s – loss: 0.4654 – acc: 0.7723
540/768 [====================>………] – ETA: 0s – loss: 0.4537 – acc: 0.7870
620/768 [=======================>……] – ETA: 0s – loss: 0.4615 – acc: 0.7806
690/768 [=========================>….] – ETA: 0s – loss: 0.4631 – acc: 0.7739
750/768 [============================>.] – ETA: 0s – loss: 0.4649 – acc: 0.7733
768/768 [==============================] – 0s – loss: 0.4636 – acc: 0.7734
Epoch 147/150
10/768 […………………………] – ETA: 0s – loss: 0.3561 – acc: 0.9000
90/768 [==>………………………] – ETA: 0s – loss: 0.4167 – acc: 0.8556
170/768 [=====>……………………] – ETA: 0s – loss: 0.4824 – acc: 0.8059
250/768 [========>…………………] – ETA: 0s – loss: 0.4534 – acc: 0.8080
330/768 [===========>………………] – ETA: 0s – loss: 0.4679 – acc: 0.7848
400/768 [==============>……………] – ETA: 0s – loss: 0.4590 – acc: 0.7950
460/768 [================>………….] – ETA: 0s – loss: 0.4619 – acc: 0.7913
530/768 [===================>……….] – ETA: 0s – loss: 0.4562 – acc: 0.7868
600/768 [======================>…….] – ETA: 0s – loss: 0.4497 – acc: 0.7883
680/768 [=========================>….] – ETA: 0s – loss: 0.4525 – acc: 0.7853
760/768 [============================>.] – ETA: 0s – loss: 0.4568 – acc: 0.7803
768/768 [==============================] – 0s – loss: 0.4561 – acc: 0.7812
Epoch 148/150
10/768 […………………………] – ETA: 0s – loss: 0.4183 – acc: 0.9000
80/768 [==>………………………] – ETA: 0s – loss: 0.3674 – acc: 0.8750
160/768 [=====>……………………] – ETA: 0s – loss: 0.4340 – acc: 0.8250
240/768 [========>…………………] – ETA: 0s – loss: 0.4799 – acc: 0.7583
320/768 [===========>………………] – ETA: 0s – loss: 0.4648 – acc: 0.7719
400/768 [==============>……………] – ETA: 0s – loss: 0.4596 – acc: 0.7775
470/768 [=================>…………] – ETA: 0s – loss: 0.4475 – acc: 0.7809
540/768 [====================>………] – ETA: 0s – loss: 0.4545 – acc: 0.7778
620/768 [=======================>……] – ETA: 0s – loss: 0.4590 – acc: 0.7742
690/768 [=========================>….] – ETA: 0s – loss: 0.4769 – acc: 0.7652
760/768 [============================>.] – ETA: 0s – loss: 0.4748 – acc: 0.7658
768/768 [==============================] – 0s – loss: 0.4734 – acc: 0.7669
Epoch 149/150
10/768 […………………………] – ETA: 0s – loss: 0.3043 – acc: 0.9000
90/768 [==>………………………] – ETA: 0s – loss: 0.4913 – acc: 0.7111
170/768 [=====>……………………] – ETA: 0s – loss: 0.4779 – acc: 0.7588
250/768 [========>…………………] – ETA: 0s – loss: 0.4794 – acc: 0.7640
320/768 [===========>………………] – ETA: 0s – loss: 0.4957 – acc: 0.7562
370/768 [=============>…………….] – ETA: 0s – loss: 0.4891 – acc: 0.7703
450/768 [================>………….] – ETA: 0s – loss: 0.4737 – acc: 0.7867
520/768 [===================>……….] – ETA: 0s – loss: 0.4675 – acc: 0.7865
600/768 [======================>…….] – ETA: 0s – loss: 0.4668 – acc: 0.7833
680/768 [=========================>….] – ETA: 0s – loss: 0.4677 – acc: 0.7809
760/768 [============================>.] – ETA: 0s – loss: 0.4648 – acc: 0.7803
768/768 [==============================] – 0s – loss: 0.4625 – acc: 0.7826
Epoch 150/150
10/768 […………………………] – ETA: 0s – loss: 0.2751 – acc: 1.0000
100/768 [==>………………………] – ETA: 0s – loss: 0.4501 – acc: 0.8100
170/768 [=====>……………………] – ETA: 0s – loss: 0.4588 – acc: 0.8059
250/768 [========>…………………] – ETA: 0s – loss: 0.4299 – acc: 0.8200
310/768 [===========>………………] – ETA: 0s – loss: 0.4298 – acc: 0.8129
380/768 [=============>…………….] – ETA: 0s – loss: 0.4365 – acc: 0.8053
460/768 [================>………….] – ETA: 0s – loss: 0.4469 – acc: 0.7957
540/768 [====================>………] – ETA: 0s – loss: 0.4436 – acc: 0.8000
620/768 [=======================>……] – ETA: 0s – loss: 0.4570 – acc: 0.7871
690/768 [=========================>….] – ETA: 0s – loss: 0.4664 – acc: 0.7783
760/768 [============================>.] – ETA: 0s – loss: 0.4617 – acc: 0.7789
768/768 [==============================] – 0s – loss: 0.4638 – acc: 0.7773
32/768 [>………………………..] – ETA: 0s
448/768 [================>………….] – ETA: 0s
acc: 79.69%
There is randomness in the learning process that we cannot control for yet.
See this post:
https://machinelearningmastery.com/randomness-in-machine-learning/
Hello Jason Brownlee,Thx for sharing~
I’m new to deep learning, and I am wondering whether what you discussed here, Keras, can be used to build a CNN in TensorFlow and train it on some CSV files for classification. Maybe this is a stupid question, but I am waiting for your reply. I’m working on my graduation project on word sense disambiguation with a CNN, and just can’t move on. Hoping for your help. Best wishes!
Sorry Nanya, I’m not sure I understand your question. Are you able to rephrase it?
I’ve just installed Anaconda with Keras and am using python 3.5.
It seems there’s an error with the rounding using Py3 as opposed to Py2. I think it’s because of this change: https://github.com/numpy/numpy/issues/5700
I removed the rounding and just used print(predictions) and it seemed to work outputting floats instead.
Does this look correct?
…
Epoch 150/150
0s – loss: 0.4593 – acc: 0.7839
[[ 0.79361773]
[ 0.10443526]
[ 0.90862554]
…,
[ 0.33652252]
[ 0.63745886]
[ 0.11704451]]
Nice, it does look good!
Hi Jason Brownlee
I tried to modify your example for my problem (Letter Recognition, http://archive.ics.uci.edu/ml/datasets/Letter+Recognition).
My dataset looks like http://archive.ics.uci.edu/ml/machine-learning-databases/letter-recognition/letter-recognition.data (T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8). I tried to split the data into input and output like this:
X = dataset[:,1:17]
Y = dataset[:,0]
but I get an error (something about strings not being recognized).
I tried replacing each letter with its ASCII code (A became 65 and so on). The string error disappeared.
The program runs now but the output looks like this:
17445/20000 [=========================>….] – ETA: 0s – loss: -1219.4768 – acc:0.0000e+00
17605/20000 [=========================>….] – ETA: 0s – loss: -1219.4706 – acc:0.0000e+00
17730/20000 [=========================>….] – ETA: 0s – loss: -1219.4566 – acc:0.0000e+00
17890/20000 [=========================>….] – ETA: 0s – loss: -1219.4071 – acc:0.0000e+00
18050/20000 [==========================>…] – ETA: 0s – loss: -1219.4599 – acc:0.0000e+00
18175/20000 [==========================>…] – ETA: 0s – loss: -1219.3972 – acc:0.0000e+00
18335/20000 [==========================>…] – ETA: 0s – loss: -1219.4642 – acc:0.0000e+00
18495/20000 [==========================>…] – ETA: 0s – loss: -1219.5032 – acc:0.0000e+00
18620/20000 [==========================>…] – ETA: 0s – loss: -1219.4391 – acc:0.0000e+00
18780/20000 [===========================>..] – ETA: 0s – loss: -1219.5652 – acc:0.0000e+00
18940/20000 [===========================>..] – ETA: 0s – loss: -1219.5520 – acc:0.0000e+00
19080/20000 [===========================>..] – ETA: 0s – loss: -1219.5381 – acc:0.0000e+00
19225/20000 [===========================>..] – ETA: 0s – loss: -1219.5182 – acc:0.0000e+00
19385/20000 [============================>.] – ETA: 0s – loss: -1219.6742 – acc:0.0000e+00
19535/20000 [============================>.] – ETA: 0s – loss: -1219.7030 – acc:0.0000e+00
19670/20000 [============================>.] – ETA: 0s – loss: -1219.7634 – acc:0.0000e+00
19830/20000 [============================>.] – ETA: 0s – loss: -1219.8336 – acc:0.0000e+00
19990/20000 [============================>.] – ETA: 0s – loss: -1219.8532 – acc:0.0000e+00
20000/20000 [==============================] – 1s – loss: -1219.8594 – acc: 0.0000e+00
18880/20000 [===========================>..] – ETA: 0sacc: 0.00%
I do not understand why. Can you please help me
What version of Python are you running?
Hi Jason,
Since the epoch is set to 150 and batch size is 10, does the training algorithm pick 10 training examples at random in each iteration, given that we had only 768 total in X. Or does it sample randomly after it has finished covering all.
Thanks
Good question,
It iterates over the dataset 150 times and within one epoch it works through 10 rows at a time before doing an update to the weights. The patterns are shuffled before each epoch.
I hope that helps.
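To put numbers on that, here is a small arithmetic sketch using the figures from the question (768 rows, batch size 10, 150 epochs). It only counts weight updates; it does not train anything.

```python
import math

n_rows = 768      # rows in the Pima Indians dataset
batch_size = 10
epochs = 150

# One weight update per batch; the last batch of each epoch is smaller.
updates_per_epoch = int(math.ceil(n_rows / float(batch_size)))
last_batch_rows = n_rows - (updates_per_epoch - 1) * batch_size
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, last_batch_rows, total_updates)
```

So each epoch covers every row once, in 76 full batches plus one short batch, before the rows are shuffled again for the next epoch.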
Hi Jason
Thanks a lot for this blog. It really helps me start learning deep learning, which had been in the planning stage for the last few months. Your simple, enriching blog posts are awesome. No questions from my side before completing all the tutorials.
One question regarding availability of your book. How can I buy those books from India ?
All my books and training are digital, you can purchase them from here:
https://machinelearningmastery.com/products
Hi Jason, firstly your work here is a fantastic resource and I am very thankful for the effort you put in.
I am a slightly-better-than-beginner at python and an absolute novice at ML, I wonder if you could help me classify my problem and find an angle to work at it from.
My data is thus:
Column Names: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, Result
Values: 4, 4, 6, 6, 3, 2, 5, 5, 0, 0, 0, 0, 0, 0, 0, 4
I want to find the percentage chance of each Column Names category being the Result based off the configuration of all the values present from 1-15. Then if need be compare the configuration of Values with another row of values to find the same, Resulting in the total needed calculation as:
Column Names: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, Result
Values: 4, 4, 6, 6, 3, 2, 5, 5, 0, 0, 0, 0, 0, 0, 0, 4
Values2: 7, 3, 5, 1, 4, 8, 6, 2, 9, 9, 9, 9, 9, 9, 9
I apologize if my explanation is not clear, and appreciate any help you can give me thank you.
Hi Stephen,
This process might help you work through your problem:
https://machinelearningmastery.com/start-here/#process
Specifically the first step in defining your problem.
Let me know how you go.
Thanks Jason for such a nice and concise example.
Just wanted to ask if it is possible to save this model in a file and port it to may be an Android or iOS device? If so, what are the libraries available for the same?
Thanks
Rohit
Thanks Rohit,
Here’s an example of saving a Keras model to file:
https://machinelearningmastery.com/save-load-keras-deep-learning-models/
I don’t know about running Keras on an Android or iOS device. Let me know how you go.
Dear Jason, thanks for sharing this article.
I am a novice in deep learning, and my apologies if my question is not clear. My question is: could we call all of these functions and the program from a .php, .aspx, or .html web page? I mean, I would load the variables and the user’s file selection from the user interface and then pass them as input to these functions.
I will be waiting for your kind reply.
Thanks in advance.
zaheer
Perhaps, this sounds like a systems design question, not really machine learning.
I would suggest you gather requirements, assess risks like any software engineering project.
Hi, Jason
Thank you for your blog! It is wonderful!
I used tensorflow as backend, and implemented the procedures using Jupyter.
I did “source activate tensorflow” -> “ipython notebook”.
I can successfully use Keras and import tensorflow.
However, it seems that such environment doesn’t support pandas and sklearn.
Do you have any way to incorporate pandas, sklearn and keras?
(I wish to use sklearn to revisit the classification problem and compare the accuracy with the deep learning method. But I also wish to put the works together in the same interface.)
Thanks!
Sorry, I do not use notebooks myself. I cannot offer you good advice.
Thanks, Jason!
Actually the problem is not on notebooks. Even I used the terminal mode, i.e. doing “source activate tensorflow” only. It failed to import sklearn. Does that mean tensorflow library is not compatible with sklearn? Thanks again!
Sorry Hsiang, I don’t have experience using sklearn and tensorflow with virtual environments.
Thank you!
You’re welcome Hsiang.
hello sir,
A very informative post indeed . I know my question is a very trivial one but can you please show me how to predict on a explicitly mentioned data tuple say v=[6,148,72,35,0,33.6,0.627,50]
thanks for the tutorial anyway
Hi keshav,
You can make predictions by calling model.predict()
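model.predict() expects a 2D array with one row per sample, so a single record needs to be wrapped in an extra dimension first. A minimal sketch of that reshaping, using only NumPy; the call to model.predict() is left as a comment because it assumes the fitted model from the tutorial is in scope, and x_new is just an illustrative name.

```python
import numpy

v = [6, 148, 72, 35, 0, 33.6, 0.627, 50]

# Wrap the single record so it has shape (1, 8): one sample, eight input features.
x_new = numpy.array([v], dtype=float)
print(x_new.shape)

# With the fitted model from the tutorial in scope (assumed here), the call would be:
# prediction = model.predict(x_new)
```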
When I rerun the file (without predictions) does it reset the model and weights?
Excuse me sir, I want to ask a question about this line: dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=","). I use a Mac; I downloaded the dataset and then converted the text into a CSV file. Running the program,
I then got: {Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 12:39:47)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type “copyright”, “credits” or “license()” for more information.
>>>
============ RESTART: /Users/luowenbin/Documents/database_test.py ============
Using TensorFlow backend.
Traceback (most recent call last):
File “/Users/luowenbin/Documents/database_test.py”, line 9, in
dataset = numpy.loadtxt(“pima-indians-diabetes.csv”,delimiter=’,’)
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/npyio.py”, line 985, in loadtxt
items = [conv(val) for (conv, val) in zip(converters, vals)]
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/npyio.py”, line 687, in floatconv
return float(x)
ValueError: could not convert string to float: book
>>> }
How can i solve this problem? give me a hand thank you!
Hi Ericson,
Confirm that the contents of “pima-indians-diabetes.csv” meet your expectation of a list of CSV lines.
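One way to do that check is to scan the file and report any line that does not parse as numbers. This is a rough sketch, not part of the tutorial: find_bad_lines is a hypothetical helper, and the demo file is written on the fly with a stray "book" line to mimic the error above.

```python
import os
import tempfile

def find_bad_lines(path, delimiter=","):
    """Return (line_number, text) pairs for lines that are not all numeric."""
    bad = []
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            fields = line.strip().split(delimiter)
            try:
                [float(field) for field in fields]
            except ValueError:
                bad.append((number, line.strip()))
    return bad

# Demo on a small temporary file: the second line holds stray text.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(demo_path, "w") as handle:
    handle.write("6,148,72,35,0,33.6,0.627,50,1\n")
    handle.write("book\n")
    handle.write("1,85,66,29,0,26.6,0.351,31,0\n")

bad_lines = find_bad_lines(demo_path)
print(bad_lines)
```

Any line reported here would also make numpy.loadtxt() fail with a "could not convert string to float" error.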
excuse me sir,when i run this code for my data set ,I encounter this problem…please help me finding solution to this problem
runfile(‘C:/Users/sukhpal/.spyder/temp.py’, wdir=’C:/Users/sukhpal/.spyder’)
Using TensorFlow backend.
Traceback (most recent call last):
File “”, line 1, in
runfile(‘C:/Users/sukhpal/.spyder/temp.py’, wdir=’C:/Users/sukhpal/.spyder’)
File “C:\Users\sukhpal\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py”, line 866, in runfile
execfile(filename, namespace)
File “C:\Users\sukhpal\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py”, line 87, in execfile
exec(compile(scripttext, filename, ‘exec’), glob, loc)
File “C:/Users/sukhpal/.spyder/temp.py”, line 1, in
from keras.models import Sequential
File “C:\Users\sukhpal\Anaconda2\lib\site-packages\keras\__init__.py”, line 2, in
from . import backend
File “C:\Users\sukhpal\Anaconda2\lib\site-packages\keras\backend\__init__.py”, line 67, in
from .tensorflow_backend import *
File “C:\Users\sukhpal\Anaconda2\lib\site-packages\keras\backend\tensorflow_backend.py”, line 1, in
import tensorflow as tf
ImportError: No module named tensorflow
This is a change with the most recent version of tensorflow, I will investigate and change the example.
For now, consider installing and using an older version of tensorflow.
Great tutorial! Amazing amount of work you’ve put in and great marketing skills (I also have an email list, ebooks and sequence, etc). I ran this in Jupyter notebook… I noticed the 144th epoch (acc .7982) had more accuracy than at 150. Why is that?
P.S. i did this for the print: print(numpy.round(predictions))
It seems to avoid a list of arrays which when printing includes the dtype (messy)
Thanks Will.
The model will fluctuate in performance while learning. You can configure checkpoints that save the model when a condition is met, such as an improvement in train/validation performance. Here’s an example:
https://machinelearningmastery.com/check-point-deep-learning-models-keras/
Please help me to find out this error
runfile(‘C:/Users/sukhpal/.spyder/temp.py’, wdir=’C:/Users/sukhpal/.spyder’)ERROR: execution aborted
I’m not sure Sukhpal.
Consider getting code working from the command line, I don’t use IDEs myself.
Please help me to find this error:
Epoch 194/195
195/195 [==============================] – 0s – loss: 0.2692 – acc: 0.8667
Epoch 195/195
195/195 [==============================] – 0s – loss: 0.2586 – acc: 0.8667
195/195 [==============================] – 0s
Traceback (most recent call last):
What was the error exactly Kamal?
sir when i run the code on my data set
then it doesnot show overall accuracy although it shows the accuracy and loss for the whole iterations
I’m not sure I understand your question Kamal, could you please restate it?
Hi Jason, im just starting deep learning in python using keras and theano. I have followed the installation instructions without a hitch. Tested some examples but when i run this one line by line i get a lot of exceptions and errors once i run the “model.fit(X,Y, nb_epochs=150, batch_size=10”
What errors are you getting?
Hi, how do I know what number to use for random.seed() ? I mean you use 7, is there any reason for that? Also is it enough to use it only once, in the beginning of the code?
You can use any number CrisH. The fixed random seed makes the example reproducible.
You can learn more about randomness and random seeds in this post:
https://machinelearningmastery.com/randomness-in-machine-learning/
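A quick way to see what the seed does is to draw a few random numbers, re-seed, and draw again. The sketch below only covers NumPy's generator; a backend such as TensorFlow may have its own sources of randomness that this call does not control.

```python
import numpy

numpy.random.seed(7)
first = numpy.random.rand(3)

numpy.random.seed(7)   # re-seeding restarts the same sequence
second = numpy.random.rand(3)

numpy.random.seed(8)   # a different seed gives a different sequence
third = numpy.random.rand(3)

print(numpy.array_equal(first, second))
print(numpy.array_equal(first, third))
```

The particular number does not matter; what matters is that it is the same on every run.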
am new to deep learning and found this great tutorial. keep it up and look forward!!
Thanks!
HI, I have a problem in execution the above example as it. It seems that it’s not running properly and stops at Using TensorFlow backend.
Epoch 147/150
768/768 [==============================] – 0s – loss: 0.4709 – acc: 0.7878
Epoch 148/150
768/768 [==============================] – 0s – loss: 0.4690 – acc: 0.7812
Epoch 149/150
768/768 [==============================] – 0s – loss: 0.4711 – acc: 0.7721
Epoch 150/150
768/768 [==============================] – 0s – loss: 0.4731 – acc: 0.7747
32/768 [>………………………..] – ETA: 0sacc: 76.43%
I am new in this field, could you please guide me about this error.
I also executed on another data set, it stops with the same behavior.
What is the error exactly? The example hangs?
Maybe try the Theano backend and see if that makes a difference. Also make sure all of your libraries are up to date.
Dear Jason,
Thank you so much for your valuable suggestions. I tried Theano backend and also updated all my libraries, but again it hanged at:
768/768 [==============================] – 0s – loss: 0.4656 – acc: 0.7799
Epoch 149/150
768/768 [==============================] – 0s – loss: 0.4589 – acc: 0.7826
Epoch 150/150
768/768 [==============================] – 0s – loss: 0.4611 – acc: 0.7773
32/768 [>………………………..] – ETA: 0sacc: 78.91%
I’m sorry to hear that, I have not seen this issue before.
Perhaps a RAM issue or a CPU overheating issue? Are you able to try different hardware?
Hi!
Were you able to find a solution for that?
I’m having exactly the same problem
( … )
Epoch 149/150
768/768 [==============================] – 0s – loss: 0.4593 – acc: 0.7773
Epoch 150/150
768/768 [==============================] – 0s – loss: 0.4586 – acc: 0.7891
32/768 [>………………………..] – ETA: 0sacc: 76.69%
Hello sir,
I want to ask whether we can convert this code to deep learning by increasing the number of layers.
Sure you can increase the number of layers, try it and see.
hello sir,
Could you please tell me how to determine the number of neurons in each layer? I am using a different dataset and don’t know how many neurons each layer should have.
Hi Ananya, great question.
Sorry, there is no good theory on how to configure a neural net.
You can configure the number of neurons in a layer by trial and error. Also consider tuning the number of epochs and batch size at the same time.
thank you so much sir. It worked ! 🙂
Glad to hear it Ananya.
Hi Jason,
really helpful blog. I have a question about how much time does it take to converge?
I have a dataset with around 4000 records, 3 input columns and 1 output column. I came up with the following model
def create_model(dropout_rate=0.0, weight_constraint=0, learning_rate=0.001, activation='linear'):
    # create model
    model = Sequential()
    model.add(Dense(6, input_dim=3, init='uniform', activation=activation, W_constraint=maxnorm(weight_constraint)))
    model.add(Dropout(dropout_rate))
    model.add(Dense(1, init='uniform', activation='sigmoid'))
    # Optimizer
    optimizer = Adam(lr=learning_rate)
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
# create model
model = KerasRegressor(build_fn=create_model, verbose=0)
# define the grid search parameters
batch_size = [10]
epochs = [100]
weight_constraint = [3]
dropout_rate = [0.9]
learning_rate = [0.01]
activation = ['linear']
param_grid = dict(batch_size=batch_size, nb_epoch=epochs, dropout_rate=dropout_rate,
                  weight_constraint=weight_constraint, learning_rate=learning_rate, activation=activation)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=5)
grid_result = grid.fit(X_train, Y_train)
I have a 32 core machine with 64 GB RAM and it does not converge even in more than an hour. I can see all the cores busy, so it is using all the cores for training. However, if I change the input neurons to 3 then it converges in around 2 minutes.
Keras version: 1.1.1
Tensorflow version: 0.10.0rc0
theano version: 0.8.2.dev-901275534cbfe3fbbe290ce85d1abf8bb9a5b203
It’s using Tensorflow backend. Can you help me understand what is going on or point me in the right direction? Do you think switching to theano will help?
Best,
Jayant
This post might help you tune your deep learning model:
https://machinelearningmastery.com/improve-deep-learning-performance/
I hope that helps as a start.
hello sir,
could you please tell me how can i plot the results of the code on a graph . I made a few adjustments to the code so as to run it on a different dataset.
What do you want to plot exactly Animesh?
Accuracy vs no.of neurons in the input layer and the no.of neurons in the hidden layer
sir can u plz explain
the different attributes used in this statement
print(“%s: %.2f%%” % (model.metrics_names[1], scores[1]*100))
precisely,what is model.metrics_names
model.metrics_names is a list of names of the metrics collected during training.
More details here:
https://keras.io/models/sequential/
Hi param,
It is using string formatting. %s formats a string, %.2f formats a floating point value with 2 decimal places, %% includes a percent symbol.
You can learn more about the print function here:
https://docs.python.org/3/library/functions.html#print
More info on string formatting here:
https://pyformat.info/
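As a small illustration of that format string, the sketch below uses made-up stand-ins for model.metrics_names and for the scores returned by model.evaluate(), so it runs without a trained model.

```python
# Stand-ins for model.metrics_names and the scores from model.evaluate();
# the values are illustrative only.
metrics_names = ["loss", "acc"]
scores = [0.4593, 0.7839]

# %s inserts the metric name, %.2f the value with two decimals, %% a literal percent sign.
line = "%s: %.2f%%" % (metrics_names[1], scores[1] * 100)
print(line)
```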
Hi Jason,
It was an awesome post. Could you please tell me how to we decide the following in a DNN 1. number of neurons in the hidden layers
2. number of hidden layers
Thanks.
Vijin
Great question Vijin.
Generally, trial and error. There are no good theories on how to configure a neural network.
We do cross validation, grid search etc to find the hyper parameters in machine algorithms. Similarly can we do anything to identify the above parameters??
Yes, we can use grid search and tuning for neural nets.
The stochastic nature of neural nets means that each experiment (set of configs) will have to be run many times (30? 100?) so that you can take the mean performance.
More general info on tuning neural nets here:
https://machinelearningmastery.com/improve-deep-learning-performance/
More on randomness and stochastic algorithms here:
https://machinelearningmastery.com/randomness-in-machine-learning/
Jason, Please tell me about these lines in your code:
seed = 7
numpy.random.seed(seed)
What do they do? And why do they do it?
One more question is why do you call the last section Bonus:Make a prediction?
I thought this what ANN was created for. What the point if your network’s output is just what you have already know?
They seed the random number generator so that it produces the same sequence of random numbers each time the code is run. This is to ensure you get the same result as me.
I’m not convinced it works with Keras though.
More on randomness in machine learning here:
https://machinelearningmastery.com/randomness-in-machine-learning/
I was showing how to build and evaluate the model in this tutorial. The part about standalone prediction was an add-on.
what exactly is the work of “seed” in the neural network code? what does it do?
Seed refers to seeding the random number generator so that the same sequence of random numbers is generated each time the example is run.
The aim is to make the examples 100% reproducible, but this is hard with symbolic math libs like Theano and TensorFlow backends.
For more on randomness in machine learning, see this post:
https://machinelearningmastery.com/randomness-in-machine-learning/
hello sir
could you plz tell me what is the role of optimizer and binary_crossentropy exactly? it is written that optimizer is used to search through the weights of the network which weights are we talking about exactly?
Hi Priya,
You can learn more about the fundamentals of neural nets here:
https://machinelearningmastery.com/neural-networks-crash-course/
If I am not mistaken, those lines I commented about used when we write
init = ‘uniform’
?
Could you explain in more details what is the batch size?
Hi Bogdan,
Batch size is how many patterns to show to the network before the weights are updated with the accumulated errors. The smaller the batch, the faster the learning, but also the more noisy the learning (higher variance).
Try exploring different batch sizes and see the effect on the train and test performance over each epoch.
Dear Jason
Firstly, thanks for your great tutorials.
I am trying to classify computer network packets using the first 500 bytes of every packet to identify its protocol. I am trying to use 1D convolution. As a simpler task, I just want to do binary classification first and then tackle multilabel classification for 10 protocols. Here is my code, but the accuracy is only about 0.63. How can I improve the performance? Should I use RNNs?
########
model=Sequential()
model.add(Convolution1D(64,10,border_mode='valid',
                        activation='relu',subsample_length=1, input_shape=(500, 1)))
#model.add(Convolution2D(32,5,5,border_mode='valid',input_shape=(1,28,28),))
model.add(MaxPooling1D(2))
model.add(Flatten())
model.add(Dense(200,activation='relu'))
model.add(Dense(1,activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',metrics=['accuracy'])
model.fit(train_set, y_train,
          batch_size=250,
          nb_epoch=30,
          show_accuracy=True)
#x2= get_activations(model, 0,xprim )
#score = model.evaluate(t, y_test, show_accuracy = True, verbose = 0)
#print(score[0])
This post lists some ideas to try to lift performance:
https://machinelearningmastery.com/improve-deep-learning-performance/
Hi Jason, thank you so much for this awesome tutorial. I have just started with python and machine learning.
I am playing with the code and making a few changes. For example, I have changed..
this:
# create model
model = Sequential()
model.add(Dense(250, input_dim=8, init='uniform', activation='relu'))
model.add(Dense(200, init='uniform', activation='relu'))
model.add(Dense(200, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
and this:
model.fit(X, Y, nb_epoch=250, batch_size=10)
then i would like to pass some arrays for prediction so…
new_input = numpy.array([[3,88,58,11,54,24.8,267,22],[6,92,92,0,0,19.9,188,28], [10,101,76,48,180,32.9,171,63], [2,122,70,27,0,36.8,0.34,27], [5,121,72,23,112,26.2,245,30]])
predictions = model.predict(new_input)
print predictions # [1.0, 1.0, 1.0, 0.0, 1.0]
Is this correct? In this example I used rows from the same training series (which have class 0), but I am getting wrong results. Only one array is correctly predicted.
Thank you so much!
Looks good. Perhaps you could try changing the configuration of your model to make it more skillful?
See this post:
https://machinelearningmastery.com/improve-deep-learning-performance/
hello sir,
Could you please tell me how to fix the error below? It is raised while the model is training:
str(array.shape))
ValueError: Error when checking model input: expected convolution2d_input_1 to have 4 dimensions, but got array with shape (68, 28, 28).
It looks like you are working with CNN, not related to this tutorial.
Consider trying this tutorial to get familiar with CNNs:
https://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/
I want a neural network that can predict sin values. Further, from a given dataset I need to determine the function (for example, if the data comes from tan or cos, how do I determine that it is tan only or cos only?).
Thanks in advance
Keras just updated to Keras 2.0. I have an updated version of this code here: https://github.com/sudarshan85/keras-projects/tree/master/mlm/pima_indians
Nice work.
hello sir,
can we use PSO (particle swarm optimisation) in this? if so can you tell how?
Sorry, I don’t have an example of PSO for fitting neural network weights.
hello sir,
What type of neural network is used in this code? There are 3 types of neural network: feedforward, radial basis function, and recurrent neural network.
A multilayer perceptron (MLP) neural network. A classic type from the 1980s.
got this error while compiling..
sigmoid_cross_entropy_with_logits() got an unexpected keyword argument ‘labels’
Perhaps confirm that your libraries are all up to date (Keras, Theano or TensorFlow)?
Hi Jason!
I am trying to use two odd frames of a video to predict the even one. Thus I need to give two images as input to the network and get one image as output. Can you help me with the syntax for the first model.add()? I have X_train of dimension (190, 2, 240, 320, 3) where 190 are the number of odd pairs, 2 are the two odd images, and (240,320,3) are the (height, width, depth) of each image.
Hello, Jason,
Thanks for your good tutorial. However i found some issues:
Warnings like these:
1 – Warning (from warnings module):
File "/usr/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 86
'` call to the Keras 2 API: ' + signature)
UserWarning: Update your `Dense` call to the Keras 2 API: `Dense(12, activation="relu", kernel_initializer="uniform", input_dim=8)`
2 – Warning (from warnings module):
File "/usr/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 86
'` call to the Keras 2 API: ' + signature)
UserWarning: Update your `Dense` call to the Keras 2 API: `Dense(8, activation="relu", kernel_initializer="uniform")`
3 – Warning (from warnings module):
File "/usr/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 86
'` call to the Keras 2 API: ' + signature)
UserWarning: Update your `Dense` call to the Keras 2 API: `Dense(1, activation="sigmoid", kernel_initializer="uniform")`
4 – Warning (from warnings module):
File "/usr/lib/python2.7/site-packages/keras/models.py", line 826
warnings.warn('The `nb_epoch` argument in `fit` '
UserWarning: The `nb_epoch` argument in `fit` has been renamed `epochs`.
I think these are due to some package update..
But, the output of predictions was an array of zeros…
such as: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ….0.0]
I am running in a Linux Machine, Fedora 24,
Python 2.7.13 (default, Jan 12 2017, 17:59:37)
[GCC 6.3.1 20161221 (Red Hat 6.3.1-1)] on linux2
Why?
Thank you!
These look like warnings related to the recent Keras 2.0 release.
They are just warnings, and you can still run the example.
I do not know why you are getting all zeros. I will investigate.
hello sir,
can you please help me build a recurrent neural network with the above given dataset. i am having a bit trouble in building the layers…
Hi Ananya ,
The Pima Indian diabetes dataset is a binary classification problem. It is not appropriate for a Recurrent Neural Network as there is no sequence information to learn.
sir so could you tell on which type of dataset would the recurrent neural network accurately work? i have the dataset of EEG signals of epileptic patients…will recurrent network work on this?
It may if it is regular enough.
LSTMs are excellent at sequence problems that have regularity or clear signals to detect.
Hi Jason, I have a quick question related to an error I am receiving when running the code in the tutorial…
When I run
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
Python returns the following error:
sigmoid_cross_entropy_with_logits() got an unexpected keyword argument ‘labels’
Sorry, I have not seen this error Shane.
Perhaps check that your environment is up to date with the latest versions of the deep learning libraries?
Hi Jason,
Thanks for this awesome post.
I ran your code with tensorflow back end, just out of curiosity. The accuracy returned was different every time I ran the code. That didn’t happen with Theano. Can you tell me why?
Thanks in advance!
You will get different accuracy each time you run the code because neural networks are stochastic.
This is not related to the backend (I expect).
More on randomness in machine learning here:
https://machinelearningmastery.com/randomness-in-machine-learning/
Hi Jason,
I’m new to deep learning and learning it from your tutorials, which previously helped me understand Machine Learning very well.
In the following code, I want to know why the number of neurons differs from input_dim in the first layer of the neural net.
# create model
model = Sequential()
model.add(Dense(12, input_dim=8, init='uniform', activation='relu'))
model.add(Dense(8, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
You can specify the number of inputs via “input_dim”, you can specify the number of neurons in the first hidden layer as the first parameter to Dense().
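To make the difference concrete, here is a NumPy-only sketch of what a layer like Dense(12, input_dim=8) computes. It is not Keras code, and the weight range and variable names are assumptions chosen for illustration; the point is that the weight matrix has one row per input and one column per neuron.

```python
import numpy

n_inputs = 8     # input_dim: columns in each row of X
n_hidden = 12    # first argument to Dense(): neurons in the first hidden layer

rng = numpy.random.RandomState(7)
weights = rng.uniform(-0.05, 0.05, size=(n_inputs, n_hidden))  # one weight per input per neuron
biases = numpy.zeros(n_hidden)

batch = rng.rand(10, n_inputs)                             # 10 rows of 8 features
hidden = numpy.maximum(0.0, batch.dot(weights) + biases)   # relu activation

n_params = weights.size + biases.size   # trainable parameters in this layer
print(n_params, hidden.shape)
```

The layer takes 8 values per row and produces 12, so the two numbers are free to differ.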
Thanx a lot.
You’re welcome.
Hi Jason
When I run this code with k-fold cross-validation, it does not work. Please give the code for k-fold cross-validation on a binary classification problem.
Generally neural nets are too slow/large for k-fold cross validation.
Nevertheless, you can use a sklearn wrapper for a keras model and use it with any sklearn resampling method:
https://machinelearningmastery.com/evaluate-performance-machine-learning-algorithms-python-using-resampling/
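For readers who want to see what the resampling itself involves, below is a hand-rolled sketch of k-fold index splitting with NumPy. It is not scikit-learn's implementation, and k_fold_indices is a hypothetical helper; in practice a class such as scikit-learn's StratifiedKFold is the usual choice.

```python
import numpy

def k_fold_indices(n_rows, k, seed=7):
    """Yield (train_idx, test_idx) pairs; every row is used as test data exactly once."""
    rng = numpy.random.RandomState(seed)
    shuffled = rng.permutation(n_rows)
    folds = numpy.array_split(shuffled, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = numpy.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

splits = list(k_fold_indices(n_rows=768, k=10))
for train_idx, test_idx in splits:
    # A fresh model would be built, fit on X[train_idx], and evaluated on X[test_idx] here.
    pass

print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Building a new model inside the loop matters: reusing a fitted model across folds would leak information from earlier folds into later ones.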
Hi Jason, when I use the evaluate function to get the accuracy score of my model on the test dataset, it returns a result > 1. Why is that? I can’t understand it.
Hey Jason, thanks for this great article! I get the following error when running the code above:
TypeError: Received unknown keyword arguments: {‘epochs’: 150}
Any ideas on why that might be? I can’t get ‘epochs’, nb_epochs, etc to work…
You need to update to Keras version 2.0 or higher.
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(10, input_dim=25, init='normal', activation='softplus'))
    model.add(Dense(3, init='normal', activation='softmax'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
    return model
Sir, here mean_squared_error has been used for the loss calculation. Is it the same as the LMS algorithm? If not, can we use LMS, NLMS or RLS to calculate the loss?
Hello Jason, thank you a lot for this example.
My question is, after I trained the model and an accuracy of 79.2% for example is obtained successfully, how can I test this model on new data?
for example if a new patient with new records appear, I want to guess the result (0 or 1) for him, how can I do that in the code?
You can fit your model on all available training data, then make predictions on new data by calling model.predict() on the new input rows.
Thanks Jason, how can we test if new patient will be diabetic or no (0 or 1) ?
Fit the model on all training data and call model.predict() on the new patient’s record.
Dr Jason,
In compiling the model i got below error
TypeError: compile() got an unexpected keyword argument ‘metrics’
unable to resolve the below error
Ensure you have the latest version of Keras, v2.0 or higher.
Hello sir,
Thank you for the post. A quick question: my dataset has 24 inputs and 1 binary output (170 instances, 100 epochs, 6 hidden neurons, batch size 10, kernel_initializer='normal'). I adapted your code using TensorFlow and Keras. I am getting an accuracy of 98 to 100 percent. I am worried about over-fitting in my model. I need your candid advice. Kind regards sir
Yes, evaluate your model using k-fold cross-validation to ensure you are not tricking yourself.
Thank you sir
Hi Jason,
If I want to use the diabetes dataset (NOT Pima) https://archive.ics.uci.edu/ml/datasets/Diabetes to predict Blood Glucose which tutorials and e-books of yours would I need to start with…. Also, the data in its current format with time, code and value is it usable as is or do I need to convert the data in another format to be able to use it.
Thanks for your help
This process will help you frame and work through your dataset:
https://machinelearningmastery.com/start-here/#process
I hope that helps as a start.
Dr. Jason,
The data is time series(time based data) with categorical(20) with two numbers one for insulin level and another for blood sugar level… Each time series data does not have every categorical data… For example one category is blood sugar before breakfast, another category is blood sugar after breakfast, before lunch and after lunch… Some times some of these category data is missing… I read through the above link, but does not talk about time series, categorical data with some category of data missing what to do in those cases…. Please let me know if any of your books will help clarify these points?
Hi Sethu,
I have many posts on time series that will help. Get started here:
https://machinelearningmastery.com/start-here/#timeseries
With categorical data, I would recommend an integer encoding perhaps followed by a one-hot encoding. You can learn more about these encodings here:
https://machinelearningmastery.com/data-preparation-gradient-boosting-xgboost-python/
I hope that helps.
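As a rough illustration of those two encodings, the sketch below integer-encodes a short list of category labels and then one-hot encodes the result using only NumPy. The label names are hypothetical and stand in for whatever categories the data actually contains.

```python
import numpy

# Hypothetical category labels, for illustration only.
labels = ["before_breakfast", "after_breakfast", "before_lunch", "before_breakfast"]

# Integer encoding: numpy.unique sorts the distinct labels and returns each row's index.
classes, integer_encoded = numpy.unique(labels, return_inverse=True)

# One-hot encoding: one column per class, with a single 1 in each row.
one_hot = numpy.eye(len(classes))[integer_encoded]

print(list(classes))
print(integer_encoded.tolist())
print(one_hot)
```

The one-hot step avoids implying an order between categories, which a bare integer code would do.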
Hello sir,
Is it compulsory to normalize the data before using an ANN model? I read somewhere that each attribute should be comparable on the scale of [0,1] for a meaningful model. What is your take on that, sir? Kind regards.
Yes. You must scale your data to the bounds of the activation used.
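One common way to do this is min-max scaling of each column to [0, 1]. The sketch below is a minimal NumPy version, assuming the first four columns of three rows from the Pima data as sample input; the variable names are illustrative.

```python
import numpy

# Three rows from the Pima data, first four columns only, for illustration.
X = numpy.array([[6.0, 148.0, 72.0, 35.0],
                 [1.0, 85.0, 66.0, 29.0],
                 [8.0, 183.0, 64.0, 0.0]])

col_min = X.min(axis=0)
col_max = X.max(axis=0)
# Guard against constant columns, which would otherwise divide by zero.
span = numpy.where(col_max > col_min, col_max - col_min, 1.0)

X_scaled = (X - col_min) / span
print(X_scaled.min(axis=0), X_scaled.max(axis=0))
```

When there is a separate test set, the minimum and maximum should be computed on the training data only and then reused to scale the test data.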
Hi Jason, You are simply awesome. I’m one of the many who got benefited from your book “machine learning mastery with python”. I’m working with a medical image classification problem. I have two classes of medical images (each class having 1000 images of 32*32) to be worked upon by the convolutional neural networks. Could you guide me how to load this data to the keras dataset? Or how to use my data while following your simple steps? kindly help.
Load the data as numpy arrays and then you can use it with Keras.
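A sketch of the array shapes involved, using placeholder pixel values instead of real images. It assumes grayscale images and a channels-last layout, neither of which is stated in the question:

```python
import numpy as np

n_per_class, height, width = 1000, 32, 32

# Placeholder pixel data standing in for the two classes of real images.
class_a = np.zeros((n_per_class, height, width), dtype="float32")
class_b = np.ones((n_per_class, height, width), dtype="float32")

# Stack both classes and add a trailing channel axis: (2000, 32, 32, 1).
X = np.concatenate([class_a, class_b])[..., np.newaxis]

# Labels: 0 for the first class, 1 for the second.
y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])

print(X.shape, y.shape)
```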
Hello sir,
I adapted your code with the cross validation pipelined with ANN (Keras) for my model. It gave me 100% still. I got the data from UCI ( Chronic Kidney Disease). It was 400 instances, 24 input attributes and 1 binary attribute. When I removed the rows with missing data I was left with 170 instances. Is my dataset too small for (24 input layer, 24 hidden layer and 1 output layer ANN, using adam and kernel initializer as uniform )?
It is not too small.
Generally, the size of the training dataset really depends on how you intend to use the model.
Thank you sir for the response, I guess I have to contend with the over-fitting of my model.
Hi Jason,
Great tutorial. Love the site 🙂
Just a quick query : why have you used adam as an optimizer over sgd? Moreover, when do we use sgd optimization, and what exactly does it involve?
Thanks
ADAM seems to consistently work well with little or no customization.
SGD requires configuration of at least the learning rate and momentum.
Try a few methods and use the one that works best for your problem.
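To give a feel for what SGD involves, here is a toy sketch of gradient descent with momentum on a one-dimensional function. It is not how Keras implements its optimizer; the function sgd_momentum and its default values are assumptions for illustration only:

```python
def sgd_momentum(grad, w0, lr=0.1, momentum=0.9, steps=500):
    """Minimise a 1-D function given its gradient, using SGD with momentum."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        # The velocity accumulates past gradients; lr scales the current gradient.
        velocity = momentum * velocity - lr * grad(w)
        w = w + velocity
    return w

# Minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = sgd_momentum(lambda w: 2 * (w - 3), w0=0.0)
print(w)
```

The learning rate and momentum are the two values that have to be chosen by hand here, which is the configuration burden mentioned above.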
Thanks 🙂
Hello sir,
Good day sir, how can I get all the weights and biases of the keras ANN. Kind regards.
You can save the network weights, see this post:
https://machinelearningmastery.com/save-load-keras-deep-learning-models/
You can also use the API to access the weights directly.
Hi Jason,
I am currently working with the IMDB sentiment analysis problem as mentioned in your book. Am using Anaconda 3 with Python 3.5.2. In an attempt to summarize the review length as you have mentioned in your book, When i try to execute the command:
result = map(len, X)
print("Mean %.2f words (%f)" % (numpy.mean(result), numpy.std(result)))
it returns the error: unsupported operand type(s) for /: ‘map’ and ‘int’
kindly help with the modified syntax. looking forward…
I’m sorry to hear that. Perhaps comment out that line?
Or change it to remove the formatting and just print the raw mean and stdev values for you to review?
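Another possible fix: in Python 3, map() returns a lazy iterator rather than a list, and NumPy cannot average it directly. Building a list first avoids the error. The small X below is a stand-in for the list of reviews:

```python
import numpy

# Stand-in for the IMDB reviews: each review is a list of word indices.
X = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# A list comprehension (or list(map(len, X))) gives NumPy a real sequence.
result = [len(x) for x in X]

summary = "Mean %.2f words (%f)" % (numpy.mean(result), numpy.std(result))
print(summary)
```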
Hello, quite new to Python, Numpy and Keras(background in PHP, MYSQL etc). If there are 8 input variables and 1 output varable(9 total), and the Array indexing starts from zero(from what I’ve gathered it’s a Numpy Array, which is built on Python lists) and the order is [rows, columns], then shouldn’t our input variable(X) be X = dataset[:,0:7] (where we select from the 1st to 8th columns, ie. 0th to 7th indices) and output variable(Y) be Y = dataset[:,8] (where we the 9th column, ie. 8th index)?
You can learn more about array indexing in numpy here:
https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
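In short, the end index of a slice is exclusive, so 0:8 selects indices 0 to 7 (eight columns). A small sketch with a made-up array of 9 columns:

```python
import numpy as np

# Made-up dataset: 3 rows, 9 columns (8 inputs + 1 output).
dataset = np.arange(27).reshape(3, 9)

X = dataset[:, 0:8]       # all rows, columns 0..7 (the stop index 8 is excluded)
Y = dataset[:, 8]         # all rows, column 8 only
too_few = dataset[:, 0:7] # would drop the 8th input column

print(X.shape, Y.shape, too_few.shape)
```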
I’m having trouble with the predictions part. It says ValueError: Error when checking model input: expected dense_1_input to have shape (None, 502) but got array with shape (170464, 502)
### MAKE PREDICTIONS ###
testset = numpy.loadtxt("right_stim_FD1.csv", delimiter=",")
A = testset[:,0:502]
B = testset[:,502]
probabilities = model.predict(A, batch_size=10, verbose=1)
# round each predicted probability to a 0/1 class label
predictions = numpy.array([round(a[0]) for a in probabilities])
accuracy = numpy.mean(predictions == B)
print(predictions)
print("Prediction Accuracy: %.2f%%" % (accuracy*100))
It looks like you might be giving the entire dataset as the output (y) rather than just the output variable.
Hi there,
I have a question regarding deep learning. In this tutorial we build a MLP with Keras. Is this Deep Learning or is it just a MLP Backpropagation ?
Deep learning is MLP backprop these days:
https://machinelearningmastery.com/what-is-deep-learning/
Generally, deep learning refers to MLPs with lots of layers.
Hi,
Would you mind if I use this code as an example of a simple network in a school project of mine?
Need to ask before using it, since I cannot find anywhere in this tutorial that you are OK with anyone using the code, and the ethics moment of my course requires me to ask (and of course give credit where credit is due).
Kind regards
Eric T
Yes it’s fine but I take no responsibility and you must credit the source.
I answer this question in my FAQ:
https://machinelearningmastery.com/start-here/#faq
Hi Jason
I have a problem
My dataset has 500 records, but my teacher wants it to have 100,000 records. I need an algorithm for data generation. Please help me.
Can you give me deep CNN code that includes 25 layers? In the first conv layer the filter size should be 39×39 with a total of 64 filters, in the 2nd conv layer 21×21 with 32 filters, in the 3rd conv layer 11×11 with 64 filters, and in the 4th conv layer 7×7 with 32 layers, for an input image size of 256×256. I’m completely new to this deep learning thing, but if you can code that for me it would be a great help. Thanks
Consider using an off-the-shelf model like VGG:
https://keras.io/applications/
I have to follow with the facebook metrics. But the result is very low. Help me.
I changed the input but did not improve
http://archive.ics.uci.edu/ml/datasets/Facebook+metrics
I have a list of suggestions that may help as a start:
https://machinelearningmastery.com/improve-deep-learning-performance/
Hi Jason,
Great Tutorial and thanks for your effort.
I have a question, since I am beginner with keras and tensorflow.
I have installed both of them, keras and tensorflow, the latest version and I have run your example but I get always the same error:
Traceback (most recent call last):
File “CNN.py”, line 18, in
model.compile(loss=’binary_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])
File “/Users/MacBookPro1/.virtualenvs/keras_tf/lib/python2.7/site-packages/keras/models.py”, line 777, in compile
**kwargs)
File “/Users/MacBookPro1/.virtualenvs/keras_tf/lib/python2.7/site-packages/keras/engine/training.py”, line 910, in compile
sample_weight, mask)
File “/Users/MacBookPro1/.virtualenvs/keras_tf/lib/python2.7/site-packages/keras/engine/training.py”, line 436, in weighted
score_array = fn(y_true, y_pred)
File “/Users/MacBookPro1/.virtualenvs/keras_tf/lib/python2.7/site-packages/keras/losses.py”, line 51, in binary_crossentropy
return K.mean(K.binary_crossentropy(y_pred, y_true), axis=-1)
File “/Users/MacBookPro1/.virtualenvs/keras_tf/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py”, line 2771, in binary_crossentropy
logits=output)
TypeError: sigmoid_cross_entropy_with_logits() got an unexpected keyword argument ‘labels’
Could you help? Thanks
Alessandro
Ouch, I have not seen this error before.
Some ideas:
– Consider trying the theano backend and see if that makes a difference.
– Try searching/posting on the keras user group and slack channel.
– Try searching/posting on stackoverflow or cross validated.
Let me know how you go.
Hi Jason,
I found the issue. The tensorflow installation was outdated; so I have updated it and everything
is working nicely.
Good night,
Alessandro
I’m glad to hear it Alessandro.
Thank you Mr. Brownlee for your wonderful easy to understand explanation
Thanks.
Hi Jason,
Thank you very much for your wonderful tutorial. I have a question regarding the metrics. Is there a default way to declare the metrics “Precision” and “Recall” in addition to “Accuracy”?
Br
WAZED
Yes, see here:
https://keras.io/metrics/
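Independently of what Keras offers, precision and recall can be computed from rounded predictions with plain NumPy. This is a sketch with made-up labels, not the Keras metrics API:

```python
import numpy as np

# Made-up true labels and rounded model predictions.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])

true_pos = np.sum((y_pred == 1) & (y_true == 1))
false_pos = np.sum((y_pred == 1) & (y_true == 0))
false_neg = np.sum((y_pred == 0) & (y_true == 1))

# Precision: how many predicted positives were right.
precision = true_pos / (true_pos + false_pos)
# Recall: how many actual positives were found.
recall = true_pos / (true_pos + false_neg)

print(precision, recall)
```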
Hi Jason,
please send me a small note containing resources from where i can learn deep learning from scratch. thanks for the wonderful read you had prepared.
Thanks in advance
yes, my email id is chiranjib.konwar@gmail.com
Here:
https://machinelearningmastery.com/start-here/#deeplearning
Why the NN have mistakes many times?
What do you mean exactly?
Hi Jason,
I seem to be getting an error when applying the fit method:
ValueError: Error when checking input: expected dense_1_input to have shape (None, 12) but got array with shape (767, 8)
I looked this up and the most prominent suggestion seemed to be to upgrade Keras and Theano, which I did, but that didn’t resolve the problem.
Ensure you have copied the code exactly from the post.
hi Jason,
I am stuck with an error
TypeError: sigmoid_cross_entropy_with_logits() got an unexpected keyword argument ‘labels’
My TensorFlow and Keras versions are:
keras: 2.0.4
Tensorflow: 0.12
I’m sorry to hear that, I have not seen that error before. Perhaps you could post a question to stackoverflow or the keras user group?
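An earlier commenter in this thread hit the same TypeError and reported that updating an outdated TensorFlow installation fixed it, so a version mismatch between Keras and TensorFlow is a likely cause here as well. A small sketch for checking what is installed (installed_version is a hypothetical helper built on the standard library, Python 3.8+):

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

for package in ("keras", "tensorflow"):
    print(package, installed_version(package))
```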
can anyone tell me which neural network is being used here? Is it MLP??
Yes, it is a multilayer perceptron (MLP) feedforward neural network.
Hi Jason,
I have run this code successfully on PC with CPU.
If I have to run the same code on another PC which contains a GPU, what line should I add to make sure that it runs on the GPU?
The code would stay the same, your configuration of the Keras backend would change.
Please refer to TensorFlow or Theano documentation.
What if I want to train my neural which should detect whether the luggage is abandoned or not ? How do i proceed for it ?
This process will help you work through your predictive modeling problem end to end:
https://machinelearningmastery.com/start-here/#process
Hi
I built a neural machine translation model, but the score I get is 0 and I am not sure why.
Here is a good list of things to try:
https://machinelearningmastery.com/improve-deep-learning-performance/
Hey Jason, first of all thank you very much from the core of my heart for helping me understand this perfectly. I get an error after completing 150 iterations.
File “keras_first_network.py”, line 53, in
print(“\n%s: %.2f” %(model.metrics_names[1]*100))
TypeError: not enough arguments for format string
Sorry Sir my bad , actually I wrote it wrongly
Confirm that you have copied the line exactly:
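For reference, the evaluation line used elsewhere in this thread passes two values to the format string (the metric name and the score), while the line in the error above passes only one. A sketch with stand-in values in place of model.metrics_names and the result of model.evaluate():

```python
# Stand-ins for model.metrics_names and the scores returned by model.evaluate(X, Y).
metrics_names = ["loss", "accuracy"]
scores = [0.4772, 0.7812]

# Two placeholders (%s and %.2f) need a tuple of two values; %% prints a literal percent sign.
line = "\n%s: %.2f%%" % (metrics_names[1], scores[1] * 100)
print(line)
```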
Hi Dr Jason,
Thanks for the tutorial to get started using Keras.
I used the below snippet to directly load the dataset from the URL rather than downloading and saving as this makes the code more streamlined without having to navigate elsewhere.
# load pima indians dataset
datasource = numpy.DataSource().open("http://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data")
dataset = numpy.loadtxt(datasource, delimiter=",")
Thanks for the tip.
Thanks for this helpful resource!
I’m glad it helped.
Hi Dr Brownlee,
thank you very much for this great tutorial!
I would be grateful, if you could answer some questions:
1. What does the 7 in “numpy.random.seed(7)” mean?
2. In my case I have 3 input neurons and 2 output neurons. Is the correct notation:
X = dataset[:,0:3]
Y = dataset[:,3:4] ?
3. The batch size means how many training data are used in one epoch, am I right?
I thought we had to use the whole training data set for training. In that case I would set the batch size to the number of training data pairs I have obtained through experiments etc. In your example, does the batch size of 10 mean that the computer always uses the same 10 training samples in every epoch, or are the 10 samples randomly chosen among all training data before every epoch?
4. When evaluating the model, what does the loss mean (e.g. in loss: 0.5105 – acc: 0.7396)?
Is it the sum of values of the error function (e.g. mean_squared_error) of the output neurons?
You can use any random seed you like, more here:
https://machinelearningmastery.com/reproducible-results-neural-networks-keras/
You are referring to the columns in your data. Your network will also need to be configured with the correct number of inputs and outputs (e.g. input and output layers).
Batch size is the number of samples in the dataset to work through before updating network weights. One epoch is comprised of one or more batches.
Loss is the term being optimized by the network. Here we use log loss:
https://en.wikipedia.org/wiki/Cross_entropy
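To make the second and third points concrete, here is a small sketch. It assumes a made-up array with 3 input columns followed by 2 output columns, and the 768 rows and batch size of 10 used in the tutorial. Note that dataset[:,3:4] selects a single column, so two outputs need 3:5:

```python
import math
import numpy as np

# Made-up dataset: 4 rows, 3 input columns followed by 2 output columns.
dataset = np.arange(20).reshape(4, 5)

X = dataset[:, 0:3]        # columns 0, 1, 2
Y = dataset[:, 3:5]        # columns 3 and 4 (3:4 would give column 3 only)
one_col = dataset[:, 3:4]

# With 768 samples and a batch size of 10, the weights are updated once per batch,
# so one epoch consists of ceil(768 / 10) updates; the last batch has 8 samples.
n_samples, batch_size = 768, 10
updates_per_epoch = math.ceil(n_samples / batch_size)

print(X.shape, Y.shape, one_col.shape, updates_per_epoch)
```

Every sample is used once per epoch. By default the Keras fit() function shuffles the training data before each epoch (the shuffle argument), so the batches are not the same 10 rows every time.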
Thank you for your response, Dr Brownlee !!
I hope it helps.
Is there anyway to see the relationship between these inputs? Essentially understand which inputs affect the output the most, or perhaps which pairs of inputs affect the output the most?
Maybe pairing this with unsupervised deep learning? I want to have less of a “black box” for the developed network if at all possible. Thank you for your great content!
Yes, try RFE (recursive feature elimination):
https://machinelearningmastery.com/feature-selection-machine-learning-python/
Hi Jason,
Thank you for sharing your skills and competence.
I want to study the change in weights and predictions between each epoch run.
Have tried to use the model.train_on_batch method and the model.fit method with epoch=1 and batch_size equal all the samples.
But it seems like the model doesn’t save the new updated weights.
I print predictions before and after I dont see a change in the evaluation scores.
Parts of the code is printed below.
Any idea?
Thanks.
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
# Run one update of the model trained run with X and compared with Y
model.train_on_batch(X, Y)
# Fit the model
model.fit(X, Y, epochs=1, batch_size=768)
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
Sorry, I have not explored evaluating a Keras model this way.
Perhaps it is a fault, I would recommend preparing the smallest possible example that demonstrates the issue and post to the Keras GitHub issues.
Hi, I tried to apply this to the titanic data set, however the predictions were all 0.4. What do you suggest for:
# create model
model = Sequential()
model.add(Dense(12, input_dim=4, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']) # 'sgd'
model.fit(X, Y, epochs=15, batch_size=10)
This post will give you some ideas to lift the skill of your model:
https://machinelearningmastery.com/improve-deep-learning-performance/
Hi Dr Jason,
This is probably a stupid question but I cannot find out how to do it … and I am beginner on Neural Network.
I have roughly the same number of inputs (7) and one output. This output can take values between -3000 and +3000.
I want to build a neural network model in Python but I don’t know how to do it.
Do you have an example with outputs other than 0-1?
Thanks in advance
Camus
Ensure you scale your data then use the above tutorial to get started.
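One way to scale a target in the range -3000 to +3000 is to map it to [-1, 1] for training and map predictions back afterwards. A sketch with made-up target values; the helper name inverse_scale is hypothetical:

```python
import numpy as np

# Made-up target values in the range described above.
y = np.array([-3000.0, -1500.0, 0.0, 2250.0, 3000.0])

y_min, y_max = y.min(), y.max()

# Scale the target to [-1, 1] before training.
y_scaled = 2 * (y - y_min) / (y_max - y_min) - 1

def inverse_scale(values):
    """Map scaled predictions back to the original units."""
    return (values + 1) / 2 * (y_max - y_min) + y_min

print(y_scaled)
print(inverse_scale(y_scaled))
```

For a numeric target like this, a common choice is a linear activation on the output layer with a mean squared error loss, rather than the sigmoid and binary cross-entropy used in the tutorial.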
Hi Jason Brownlee
I am using the same data “pima-indians-diabetes.csv” but all predicted values are fractions less than 1, so they do not distinguish any class.
If I round them, they all become 0.
I am using the model.predict(x) function.
Could you kindly tell me what I am doing wrong, or how I can get correct predicted values?
Thank you
Confirm that you have copied all of the code exactly from the tutorial.
Hello Jason,
Thank you for your great example. I have some comments.
– Why did you choose 12 neurons for the hidden layer and not 24 or 32? Is it arbitrary?
– Same question about epochs and batch_size?
These values are very sensitive! I tried 32 neurons in the first layer, epochs=500 and batch_size=1000, and the result is very different… I am at 65% accuracy.
Thanks for your help.
Regards.
Yes, it is arbitrary. Tune the parameters of the model to your problem.
Wow, you’re still replying to comments more than a year later!!!… you’re great,, thanks..
Yep.
Thanks for your tutorial, I found it very useful to get me started with Keras. I’ve previously tried TensorFlow, but found it very difficult to work with. I do have a question for you though. I have both Theano and TensorFlow installed, how do I know which back-end Keras is using? Thanks again
Keras will print which backend it uses every time you run your code.
You can change the backend in the Keras configuration file (~/.keras/keras.json) which looks like:
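The file is JSON, and the entry that selects the backend is the "backend" key. A small sketch that reads it; read_backend is a hypothetical helper, not part of Keras, and it returns None when the file or key is missing:

```python
import json
import os

def read_backend(config_path):
    """Return the 'backend' entry of a Keras JSON config file, or None if absent."""
    if not os.path.exists(config_path):
        return None
    with open(config_path) as handle:
        return json.load(handle).get("backend")

print(read_backend(os.path.expanduser("~/.keras/keras.json")))
```

Editing that value (for example to "theano" or "tensorflow") and re-running the script should switch the backend that Keras reports at start-up.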
Hello Jason,
My understanding of Machine Learning or evaluating deep learning models is almost 0. But, this article gives me lot of information. It is explained in a simple and easy to understand language.
Thank you very much for this article. Would you suggest any good read to further explore Machine Learning or deep learning models please?
Thanks.
Yes, start right here:
https://machinelearningmastery.com/start-here/#deeplearning
If I have trained prediction models or neural network function scripts. How can I use them to make predictions in an application that will be used by end users? I want to use python but it seems I will have to redo the training in Python again. Is there a way I can rewrite the scripts in Python without retraining and just call the function of predicting?
You need to train and save the final model then load it to make predictions.
This post will make it clear:
https://machinelearningmastery.com/train-final-machine-learning-model/
Jason, I used your tutorial to install everything needed to run this tutorial. I followed your tutorial and ran the resulting program successfully. Can you please describe what the output means? I would like to thank you for your very informative tutorials.
768/768 [==============================] – 0s – loss: 0.4807 – acc: 0.7826
Epoch 148/150
768/768 [==============================] – 0s – loss: 0.4686 – acc: 0.7812
Epoch 149/150
768/768 [==============================] – 0s – loss: 0.4718 – acc: 0.7617
Epoch 150/150
768/768 [==============================] – 0s – loss: 0.4772 – acc: 0.7812
32/768 [>………………………..] – ETA: 0s
acc: 77.99%
It is summarizing the training of the model.
The final line evaluates the accuracy of the model’s predictions – really just to demonstrate how to make predictions.
Well done Shane.
Which output?
Hello Jason, i really liked your Work and it helped me a lot with my first steps.
But i am not really familiar with the numpy stuff:
So here is my Question:
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]
I get that numpy.loadtxt is extracting the information from the CSV file,
but what does the stuff in the Brackets mean like X = dataset[:,0:8]
why the “:” and why , 0:8
its probably pretty dumb but i can’t find a good explanation online 😀
thanks really much!
Good question Bene, it’s called array slicing:
https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
That helped me out, thank you Jason 🙂
Can I translate it into Chinese and put it on the Internet so that other Chinese people can read your article?
No, please do not.
It seems that using this line:
np.random.seed(5)
…is redundant i.e. the Keras output in a loop running the same model with the same configuration will yield a similar variety of results regardless if it’s set at all, or which number it is set to. Or am I missing something?
Deep learning algorithms are stochastic (random within a range). That means that they will make different predictions/learn different things when the same model is trained on the same data. This is a feature:
https://machinelearningmastery.com/randomness-in-machine-learning/
You can fix the random seed to ensure you get the same result, and it is a good idea for tutorials to help beginners out:
https://machinelearningmastery.com/reproducible-results-neural-networks-keras/
When evaluating the skill of a model, I would recommend repeating the experiment n times and taking skill as the average of the runs. See here for the procedure:
https://machinelearningmastery.com/evaluate-skill-deep-learning-models/
Does that help?
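The repeated-evaluation procedure can be sketched as follows. The function train_and_evaluate is a hypothetical stand-in that returns a noisy accuracy; in practice it would build, fit and score the model once:

```python
import random
import statistics

def train_and_evaluate(seed):
    """Hypothetical stand-in for one full train/evaluate run; returns an accuracy."""
    rng = random.Random(seed)
    return 0.77 + rng.uniform(-0.03, 0.03)

# Repeat the experiment n times and summarise the spread of results.
n_repeats = 10
scores = [train_and_evaluate(seed) for seed in range(n_repeats)]

mean_score = statistics.mean(scores)
stdev_score = statistics.stdev(scores)
print("accuracy: %.3f (+/- %.3f)" % (mean_score, stdev_score))
```

One possible reason a NumPy seed alone does not pin down Keras results: numpy.random.seed() only seeds NumPy's generator, while a backend such as TensorFlow keeps its own random state.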
Thanks Jason 🙂
I totally get what it should do, but as I had pointed out, it does not do it. If you run the codes you have provided above in a loop for say 10 times. First 10 with random seed set and the other 10 times without that line of code all together. Then compare the result. At least the result I’m getting, is suggesting the effect is not there i.e. both sets of 10 times will have similar variation in the result.
It may suggest that the model is overprescribed and easily addresses the training data.
Nice post by the way > https://machinelearningmastery.com/evaluate-skill-deep-learning-models/
Thanks for sharing it. Been lately thinking about the aspect of accuracy a lot, it seems that at the moment it’s a “hot mess” in terms of the way common tools do it out of the box. I think a lot of non PhD / non expert crowd (most people) will at least initially be easily confused and make the kinds of mistakes you point out in your post.
Thanks for all the amazing contributions you are making in this field!
I’m glad it helped.
Hi Jason,
i’m actually trying to find “spam filter for quora questions” where i have a dataset with label-0’s and 1’s and questions columns. please let me know the approach and path to build a model for this.
Thanks
Sounds like a great project.
The tutorials here on text classification will help:
https://machinelearningmastery.com/start-here/#nlp
Hello Jason, Thanks for a wonderful tutorial.
Can I use Genetic Algorithm for feature selection??
If yes, Could you please provide the link for it???
Thanks in advance.
Sure. Sorry, I don’t have any examples.
Generally, computers are so fast it might be easier to test all combinations in an exhaustive search.
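As a sketch of what an exhaustive search would enumerate for the 8 input variables in this tutorial: every non-empty subset of feature indices, each of which would need its own model fit and evaluation (not shown here).

```python
from itertools import combinations

n_features = 8

# Every non-empty subset of column indices 0..7, from single features up to all eight.
subsets = [
    combo
    for size in range(1, n_features + 1)
    for combo in combinations(range(n_features), size)
]

print(len(subsets))  # number of candidate feature sets to evaluate
print(subsets[:3], subsets[-1])
```

The count grows as 2^n - 1, so this is only practical for a small number of features.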
Hi Jason,
Thank you for your awesome tutorial.
I have a question for you.
Is there any guideline on how to decide on the number of neurons for our network?
For example, you used 12 for the 1st layer and 8 for the second layer.
How do you decide on that?
Thanks
No, there is no way to analytically determine the configuration of the network.
I use trial and error. You can grid search, random search, or copy configurations from tutorials or papers.
Hi Jason,
Thanks for a wonderful tutorial.
How much RAM and CPU does it take to run a model generated by a CNN?
Thanks
It depends on the data you are using to fit the model and the size of the model.
Very large models could be 500MB of RAM or more.
Hi ,
Please let me know how I can visualise the complete neural network in Keras.
I am looking for the complete architecture, like the number of neurons in the input layer, hidden layer and output layer, with weights.
Please have a look at the link below; someone has created a beautiful visualisation/architecture using the neuralnet package in R.
Please let me know whether we can create this type of model in Keras.
https://www.r-bloggers.com/fitting-a-neural-network-in-r-neuralnet-package/
Use the Keras visualization API:
https://keras.io/visualization/
Hello ANKUR, how are you?
Have you tried the Keras visualization suggested by Jason Brownlee?
If you have, please send me the code. I am also trying, but it did not work.
Please guide me.
Thank you Dr. Brownlee for the great tutorial,
I have a question about your code:
is the argument metrics=[‘accuracy’] necessary in the code and does it change the results of the neural network or is it just for showing me the accuracy during compiling?
thank you!!
No, it just prints out the accuracy of the model at the end of each epoch. Learn more about Keras metrics here:
https://machinelearningmastery.com/custom-metrics-deep-learning-keras-python/
Hi Jason,
your work here is really great. It helped me a lot.
I recently stumbled upon one thing I cannot understand:
For the pimas dataset you state:
<>
When I look at the table of the pimas dataset, the examples are in rows and the features in columns, so your input dimension is the number of columns. As far as I can see, you don’t change the table.
For neural networks, isn’t the input normally: examples = columns, features=rows?
Is this different for Keras? Or can I use both shapes? An if yes, what’s the difference in the construction of the net?
Thank you!!
No, features are columns, rows are instances or examples.
Thanks! 🙂
I had a lot of discussions because of that.
In Andrew Ng’s new Coursera course it’s explained as examples = columns, features = rows, but of course he doesn’t use Keras; he programs the neural networks from scratch.
I doubt that, I think you may have mixed it up. Columns are never examples.
That’s what I thought, but I looked it up in the notation for the new Coursera course (deeplearning.ai) and there it says: m is the number of examples in the dataset and n is the input size, where X superscript n x m is the input matrix …
But either way, you helped me! Thank you. 🙂
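If a course stores X with shape (n, m) as described above (features in rows, examples in columns), the two layouts hold the same data and differ only by a transpose. A sketch with a made-up array, using the rows-as-examples layout that this tutorial feeds to Keras:

```python
import numpy as np

# Layout used in this tutorial: rows are examples, columns are features.
X_rows_are_examples = np.arange(24).reshape(3, 8)    # 3 examples, 8 features

# Layout with features in rows and examples in columns: just the transpose.
X_cols_are_examples = X_rows_are_examples.T          # shape (8, 3)

print(X_rows_are_examples.shape, X_cols_are_examples.shape)

# Transposing back recovers the original array.
print(np.array_equal(X_cols_are_examples.T, X_rows_are_examples))
```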
Hi Jason, thank you so much for your tutorial, it helps me a lot. I need your help for the question below:
I copy the code and run it. Although I got the classification results, there were some warning messages in the process. As follows:
Warning (from warnings module):
File “C:\Users\llfor\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\callbacks.py”, line 120
% delta_t_median)
UserWarning: Method on_batch_end() is slow compared to the batch update (0.386946). Check your callbacks.
I don’t know why, and cannot find any answer to this question. I’m looking forward to your reply. Thanks again!
Sorry, I have not seen this message before. It looks like a warning, you might be able to ignore it.
Thanks for your reply. I’m a start-learner on deep learning.I’d like to put it aside temporarily.
Hi Jason,
Great article, thumbs up for that. I am getting this error when I try to run the file on the command prompt. Any suggestions. Thanks for you response.
#######################################################################
C:\Work\ML>python keras_first_network.py
Using TensorFlow backend.
2017-09-22 10:11:11.189829: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-22 10:11:11.190829: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
32/768 [>………………………..] – ETA: 0s
acc: 78.52%
#######################################################################
Looks like warning messages that you can ignore.
Thanks I got to know what the problem was. According to section 6 I had set verbose argument to 0 while calling “model.fit()”. Now all the epochs are getting printed.
Glad to hear it.
Hi Jason,
Thanks for the amazing article . Clear and straightforward.
I had some problems installing Keras but was advised to prefix
with tf.contrib.keras
so I have code like
model=tf.contrib.keras.models.Sequential()
Dense=tf.contrib.keras.layers.Dense
Now I try to train Keras on some small datafile to see how things work out:
1,1,0,0,8
1,2,1,0,4
1,0,0,1,5
1,0,1,0,7
0,1,0,0,8
1,4,1,0,4
1,0,2,1,1
1,0,1,0,7
The first 4 columns are inputs and the 5-th column is output.
I use the same code for training (adjust number of inputs) as in your article,
but the network only gets to 12.5% accuracy.
Any advice?
Thanks,
Valentin
Thanks Valentin.
I have a good list of suggestions for improving model performance here:
https://machinelearningmastery.com/improve-deep-learning-performance/
Hi Jason,
I tried replacing the pima data with random data as follows:
X_train = np.random.rand(18,61250)
X_test = np.random.rand(18,61250)
Y_train = np.array([0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0,
0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0,])
Y_test = np.array([1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0,
1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,])
_, input_size = X_train.shape #put this in input_dim in the first dense layer
I took the round() off of the predictions so I could see the full value and then inserted my random test data in model.fit():
predictions = model.predict(X_test)
preds = [x[0] for x in predictions]
print(preds)
model.fit(X_train, Y_train, epochs=100, batch_size=10, verbose=2, validation_data=(X_test,Y_test))
I found something slightly odd; I expected the predicted values to be around 0.50, plus or minus some, but instead, I got this:
[0.49525392, 0.49652839, 0.49729034, 0.49670222, 0.49342978, 0.49490061, 0.49570397, 0.4962129, 0.49774086, 0.49475089, 0.4958384, 0.49506786, 0.49696651, 0.49869373, 0.49537542, 0.49613148, 0.49636957, 0.49723724]
which is near 0.50 but always less than 0.50. I ran this a few times with different random seeds, so it’s not coincidental. Would you have any explanation for why it does this?
Thanks,
Priya
Perhaps calculate the mean of your training data and compare it to the predicted value. It might be simple sampling error.
I found out I was doing predictions before fitting the model. (I suppose that would mean the network hadn’t adjusted to the data’s distribution yet.)
Hello Jason,
I tried to train this model on my laptop, it is working fine. But I tried to train this model on google-cloud with the same instructions as in your example-5. But it is failing.
Can you let me know which changes are required for the model so that I can train it on the cloud?
Sorry, I don’t know about google cloud.
I have instructions here for running on AWS:
https://machinelearningmastery.com/develop-evaluate-large-deep-learning-models-keras-amazon-web-services/
Great post. Thanks for sharing.
You’re welcome.
Hi Jason,
Is there a way to store the model, once it is created so that I can use it for different input data sets as and when needed.
Yes, you can save it to file. See this tutorial:
https://machinelearningmastery.com/save-load-machine-learning-models-python-scikit-learn/
I get a syntax error for the
model.fit() line in this example. Is it due to library conflicts with theano and tensorflow if i have both installed?
Perhaps ensure your environment is up to date and that you copied the code exactly.
This tutorial can help with setting up your environment:
https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
Thanks, fixed!
Glad to hear it.
Hi Jason, thanks for the example.
How would you predict a single element from X? X[0] raises a ValueError
ValueError: Error when checking : expected dense_1_input to have shape (None, 8) but got array with shape (8, 1)
Thanks!
You can reshape it to have 1 row and 8 columns:
This post will give you further advice:
https://machinelearningmastery.com/index-slice-reshape-numpy-arrays-machine-learning-python/
Should it be: X[0].reshape((1,8)) ?
Yep!
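A short sketch of the reshape with a made-up two-row array, so the shapes are visible without a trained model. Slicing with X[0:1] is an alternative that keeps the row dimension:

```python
import numpy as np

# Made-up input data: 2 rows, 8 columns.
X = np.arange(16, dtype=float).reshape(2, 8)

row = X[0]                      # shape (8,): a flat vector, not a batch
single = row.reshape((1, 8))    # shape (1, 8): one sample with 8 features
also_single = X[0:1]            # slicing keeps the 2-D shape (1, 8)

print(row.shape, single.shape, also_single.shape)
```

Either single or also_single has the (samples, features) shape that model.predict() expects.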
Dear Sir ,
I have installed and configured the environment according to your directions but while running the program i have following error
“from keras.utils import np_utils”
What is the error exactly?
Hi Jason, thanks for the great tutorials. I just learnt and repeated the program in your “Your First Machine Learning Project in Python Step-By-Step” without problem. Now trying this one, getting stuck at the line “model = Sequential()” when the Interactive window throws: NameError: name ‘Sequential’ is not defined. tried to google, can’t find a solution. I did import Sequential from keras.models as in ur example code. copy pasted as it is. Thanks in advance for your help.
I’m running ur examples in Anaconda 4.4.0 environment in visual studio community version. relevant packages have been installed as in ur earlier tutorials instructed.
>> # create model
… model = Sequential()
…
Traceback (most recent call last):
File “”, line 2, in
NameError: name ‘Sequential’ is not defined
>>> model.add(Dense(12, input_dim=8, init=’uniform’, activation=’relu’))
Traceback (most recent call last):
File “”, line 1, in
AttributeError: ‘SVC’ object has no attribute ‘add’
This does not look good. Perhaps post the error to stack exchange or other keras support. I have a list of keras support sites here:
https://machinelearningmastery.com/get-help-with-keras/
Looks like you need to install Keras. I have a tutorial here on how to do that:
https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
Hi Jason,
Thanks a lot for this wonderful tutorial.
I have a question:
I want to use your code to predict the classification (1 or 0) of unknown samples. Should I create one common csv file having the train (known) as well as the test (unknown) data. Whereas the ‘classification’ column for the known data will have a known value, 1 or 0, for the unknown data, should I leave the column empty (and let the code decide the outcome)?
Thanks a lot
Great question.
No, you only need the inputs and the model can predict the outputs, call model.predict(X).
Also, this post will give a general idea on how to fit a final model:
https://machinelearningmastery.com/train-final-machine-learning-model/
Hi Jason,
This is really cool! I am blown away! Thanks so much for making it so simple for a beginner to have some hands on. I have a couple questions:
1) where are the weights, can I save and/or retrieve them?
2) if I want to train images with dogs and cats and later ask the neural network whether a new image has a cat or a dog, how do I get my input image to pass as an array and my output result to be “cat” or “dog”?
Thanks again and great job!
The weights are in the model, you can save them:
https://machinelearningmastery.com/save-load-keras-deep-learning-models/
Yes, you would save your model, then call model.predict() on the new data.
Hi Jason,
Are you familiar with a python tool/package that can build neural network as in the tutorial, but suitable for data stream mining?
Thanks,
Michael
Not really, sorry.
Hi, there. Could you please clarify why exactly you’ve built your network with 12 neurons in the first layer?
“The first layer has 12 neurons and expects 8 input variables. The second hidden layer has 8 neurons and finally, the output layer has 1 neuron to predict the class (onset of diabetes or not)…”
Shouldn’t it have 8 neurons at the start?
Thanks
The input layer has 8, the first hidden layer has 12. I chose 12 through a little trial and error.
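To make the layer sizes concrete, here is a small sketch in plain Python (no Keras needed) that counts the weights and biases in a fully connected network with 8 inputs, hidden layers of 12 and 8 neurons, and 1 output. The helper name dense_param_count is made up for illustration.

```python
def dense_param_count(n_inputs, n_units):
    """Weights plus biases for one fully connected layer."""
    return n_inputs * n_units + n_units


# 8 input variables -> 12 neurons -> 8 neurons -> 1 output neuron
layer_sizes = [8, 12, 8, 1]

# Pair each layer with the one that follows it: (8, 12), (12, 8), (8, 1)
counts = [dense_param_count(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)

print("parameters per layer:", counts)
print("total parameters:", total)
```

The 8 at the start is the number of input variables, not a layer of trainable neurons; the 12 is a free choice that can be tuned.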
Hi Jason,
Do you have or else could you recommend a beginner’s level image segmentation approach that uses deep learning? For example, I want to train some neural net to automatically “find” a particular feature out of an image.
Thanks!
Sorry, I don’t have image segmentation examples, perhaps in the future.
Hi Jason,
I just started my DL training a few weeks ago. According to what I learned in the course, in order to train the parameters for the NN, we need to run forward and backward propagation; however, looking at your Keras example, I don’t find any of these propagation processes. Does it mean that Keras has its own mechanism to find the parameters instead of using forward and backward propagation?
Thanks!
It is performing those operations under the covers for you.
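As a rough illustration of what is hidden inside a call to fit, here is a minimal sketch of one forward pass and one backward pass for a single sigmoid neuron, written in plain Python. It is not how Keras is implemented; the function train_step, the learning rate, and the toy data are all assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(w, b, x, y, lr=0.1):
    """One forward pass and one backward pass for a single sigmoid neuron."""
    # Forward propagation: weighted sum, activation, then the loss
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = sigmoid(z)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    # Backward propagation: for sigmoid plus cross-entropy, dLoss/dz = p - y
    dz = p - y
    new_w = [wi - lr * dz * xi for wi, xi in zip(w, x)]
    new_b = b - lr * dz
    return new_w, new_b, loss


# Toy example: one sample with two inputs and a target of 1
w, b = [0.0, 0.0], 0.0
x, y = [1.0, 2.0], 1
losses = []
for _ in range(20):
    w, b, loss = train_step(w, b, x, y)
    losses.append(loss)

print("first loss: %.4f, last loss: %.4f" % (losses[0], losses[-1]))
```

A library repeats this same idea for every layer and every batch, which is why none of it appears in the tutorial code.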
Hi Jason,
Can you explain why I got the following output:
ValueError Traceback (most recent call last)
in ()
—-> 1 model.compile(loss=’binary_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])
2 model.fit(X, Y, epochs=150, batch_size=10)
3 scores = model.evaluate(X, Y)
4 print(“\n%s: %.2f%%” % (model.metrics_names[1], scores[1]*100))
/Users/badrshomrani/anaconda/lib/python3.5/site-packages/keras/models.py in compile(self, optimizer, loss, metrics, sample_weight_mode, **kwargs)
545 metrics=metrics,
546 sample_weight_mode=sample_weight_mode,
–> 547 **kwargs)
548 self.optimizer = self.model.optimizer
549 self.loss = self.model.loss
/Users/badrshomrani/anaconda/lib/python3.5/site-packages/keras/engine/training.py in compile(self, optimizer, loss, metrics, loss_weights, sample_weight_mode, **kwargs)
620 loss_weight = loss_weights_list[i]
621 output_loss = weighted_loss(y_true, y_pred,
–> 622 sample_weight, mask)
623 if len(self.outputs) > 1:
624 self.metrics_tensors.append(output_loss)
/Users/badrshomrani/anaconda/lib/python3.5/site-packages/keras/engine/training.py in weighted(y_true, y_pred, weights, mask)
322 def weighted(y_true, y_pred, weights, mask=None):
323 # score_array has ndim >= 2
–> 324 score_array = fn(y_true, y_pred)
325 if mask is not None:
326 # Cast the mask to floatX to avoid float64 upcasting in theano
/Users/badrshomrani/anaconda/lib/python3.5/site-packages/keras/objectives.py in binary_crossentropy(y_true, y_pred)
46
47 def binary_crossentropy(y_true, y_pred):
—> 48 return K.mean(K.binary_crossentropy(y_pred, y_true), axis=-1)
49
50
/Users/badrshomrani/anaconda/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in binary_crossentropy(output, target, from_logits)
1418 output = tf.clip_by_value(output, epsilon, 1 – epsilon)
1419 output = tf.log(output / (1 – output))
-> 1420 return tf.nn.sigmoid_cross_entropy_with_logits(output, target)
1421
1422
/Users/badrshomrani/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/nn_impl.py in sigmoid_cross_entropy_with_logits(_sentinel, labels, logits, name)
147 # pylint: disable=protected-access
148 nn_ops._ensure_xent_args(“sigmoid_cross_entropy_with_logits”, _sentinel,
–> 149 labels, logits)
150 # pylint: enable=protected-access
151
/Users/badrshomrani/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py in _ensure_xent_args(name, sentinel, labels, logits)
1696 if sentinel is not None:
1697 raise ValueError(“Only call %s with ”
-> 1698 “named arguments (labels=…, logits=…, …)” % name)
1699 if labels is None or logits is None:
1700 raise ValueError(“Both labels and logits must be provided.”)
ValueError: Only call sigmoid_cross_entropy_with_logits with named arguments (labels=…, logits=…, …)
Perhaps double check you have the latest versions of the keras and tensorflow libraries installed?!
keras was outdated
Glad to hear you fixed it.
Hi Jason, thanks for your short tutorial, helps a lot to actually get your hands dirty with a simple example.
I have tried 5 different parameters and got some interesting results to see what would happen. Unfortunately, I didn’t record the running time.
                  Test 1    Test 2    Test 3    Test 4    Test 5    Test 6    Test 7
Number of layers  3         3         3         3         3         3         4
Train set         768       768       768       768       768       768       768
Iterations        150       100       1000      1000      1000      150       150
Rate of update    10        10        10        5         1         1         5
Errors            173       182       175       139       161       169       177
Values            768       768       768       768       768       768       768
% Error           23,0000%  23,6979%  22,7865%  18,0990%  20,9635%  22,0052%  23,0469%
I can’t seem to see a trend here that could put me on the right track to adjust my hyperparameters.
Do you have any advice on that?
Something is wrong. Here is a good list of things to try:
https://machinelearningmastery.com/improve-deep-learning-performance/
Hi, I tried to implement the above example with fer2013.csv but I receive an error. Is it possible to help me implement this correctly?
Sorry, I cannot debug your code.
What is the problem exactly?
Hello,
I have a somewhat general question.
I have to do forecasting for restaurant sales (meaning that I have to predict 4 meals based on historical daily sales data), weather conditions (such as temperature, rain, etc.), official holidays and in/off season. I have to perform that forecasting using neural networks.
Unfortunately, I am not very skilled in Python. On my computer I have Python 2.7 and I have installed Anaconda. I am trying to learn by practicing with your code, Mr. Brownlee, but somehow I cannot run the code at all (in Spyder). Can you tell me which version of Python and Anaconda I have to install on my computer, and in which environment (JupyterLab, notebook, qtconsole, Spyder, etc.) I can run the code so that it works and does not give errors from the very beginning?
I will be very thankful for your response
KG
Tanya
Perhaps this tutorial will help you setup and confirm your environment:
https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
I would also recommend running code from the command line, as IDEs and notebooks can introduce and hide errors.
Hi Dr. Brownlee.
I looked over the tutorial and I had a question regarding reading the data from a binary file. For instance, I am working on solving the sliding-tile n-puzzle using neural networks, but I seem to have trouble loading my data, which is in a binary file and gives the number of moves required to solve the n-puzzle. I am not sure if you have dealt with this before, but any help would be appreciated.
Sorry, I don’t know about your binary file.
Perhaps after you load your data, you can convert it to a numpy array so that you can provide it to a neural net?
Thanks for the tip, I’ll try it.
Thank you very very much for all your great tutorials.
If I wanted to add a batch layer after the input layer, how should I do it?
I ask because I applied this tutorial to a different dataset and features, and I think I need normalization or standardization and want to do it the easiest way.
Thank you.
I recommend preparing the data prior to fitting the model.
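One common way to prepare the data is to standardize each input column to zero mean and unit variance before calling fit. The sketch below does this in plain Python with the statistics module; the function name standardize_columns and the sample values are invented for illustration. In practice a library scaler such as scikit-learn's StandardScaler does the same job, and the statistics are normally computed on the training data only.

```python
from statistics import mean, pstdev


def standardize_columns(rows):
    """Rescale every column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # A constant column has a standard deviation of 0; divide by 1 instead
    stds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Made-up rows with two input variables on very different scales
X = [[6.0, 148.0], [1.0, 85.0], [8.0, 183.0]]
X_scaled = standardize_columns(X)

for row in X_scaled:
    print(["%.3f" % v for v in row])
```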
Thanks for sharing such nice tutorials; they helped me a lot. I want to print the confusion matrix from the above example. And one more question.
If I have
20 input variables
1 class label (binary)
and 400 instances
how would I know how to set the Dense layer parameters for the first layer, hidden layer, and output layer, like the 12, 8, 1 you used in the example above?
I recommend trial and error to configure the number of neurons in the hidden layer to see what works best for your specific problem.
C:\Users\zaheer\AppData\Local\Programs\Python\Python36\python.exe C:/Users/zaheer/PycharmProjects/PythonBegin/Bin-CLNCL-Copy.py
Using TensorFlow backend.
Traceback (most recent call last):
File “C:/Users/zaheer/PycharmProjects/PythonBegin/Bin-CLNCL-Copy.py”, line 28, in
model.fit(x_train , y_train , epochs=100, batch_size=100)
File “C:\Users\zaheer\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\models.py”, line 960, in fit
validation_steps=validation_steps)
File “C:\Users\zaheer\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py”, line 1574, in fit
batch_size=batch_size)
File “C:\Users\zaheer\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py”, line 1407, in _standardize_user_data
exception_prefix=’input’)
File “C:\Users\zaheer\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py”, line 153, in _standardize_input_data
str(array.shape))
ValueError: Error when checking input: expected dense_1_input to have shape (None, 20) but got array with shape (362, 1)
Ensure the input shape matches your data.
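The error above says the first layer expects 20 columns per row but the array passed to fit has only 1. A quick check like the following sketch, run before fitting, can catch this earlier. It uses plain Python lists, and check_input_shape is a hypothetical helper, not part of Keras.

```python
def check_input_shape(rows, expected_features):
    """Return (n_rows, n_cols) or raise if the column count is unexpected."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    if n_cols != expected_features:
        raise ValueError(
            "expected %d input columns but got shape (%d, %d)"
            % (expected_features, n_rows, n_cols)
        )
    return (n_rows, n_cols)


# 362 rows with a single column, as in the traceback above
x_train = [[0.0] for _ in range(362)]
try:
    check_input_shape(x_train, expected_features=20)
except ValueError as err:
    print("Shape problem:", err)
```

With NumPy arrays, printing x_train.shape before calling fit gives the same information.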
Dear Jason! Great job a very simple guide.
I am trying to run the exact code but there is an error:
str(array.shape))
ValueError: Error when checking target: expected dense_3 to have shape (None, 1) but got array with shape (768, 8)
How can I resolve this?
I have windows 10 and spyder.
Sorry to hear that, perhaps confirm that you have the latest version of Numpy and Keras installed?
After running this code, I want to calculate the accuracy. How do I do that?
I want to split the data set into test data and training data,
then evaluate the model and calculate the accuracy.
Thank you, Dr.
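One way to do this is to shuffle the rows and hold some of them back for testing. The sketch below shows the idea in plain Python; split_rows, the 33% test fraction, and the seed are assumptions for illustration, and scikit-learn's train_test_split is the usual tool for this.

```python
import random


def split_rows(rows, test_fraction=0.33, seed=7):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for the 768 rows of the dataset: row indices only
rows = list(range(768))
train, test = split_rows(rows)

print("train rows:", len(train))
print("test rows:", len(test))
```

The model would then be fit on the training rows only, and model.evaluate called on the test rows to get an accuracy estimate for data the model has not seen.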
How many hidden layers are there in the model?
There are 2 hidden layers, 1 input layer and 1 output layer.
Hi there. This blog is awesome, like Adrian’s pyimagesearch blog. I have one question: do you have, or will you have, a tutorial on the Keras framework with SSD or YOLO architectures?
Thanks for the suggestion, I hope to cover them in the future.
Thanks for your awesome article.
I am really enjoying
‘Machine Learning Mastery’!!
Thanks!
Hello Jason!
This is an awesome article!
I am writing a report for a subject in university and I have used your code during my implementation, would it be possible to cite this post in bibtex?
Thank you!
Sure, you can cite the webpage directly.
My question is regarding predict. I used to get decimals in the prediction array. Suddenly, I started seeing only Integers (0 or 1) in the run. Any idea what could be causing the change?
predictions = model.predict(X2)
predictions
Out[3]:
array([[ 0.],
[ 0.],
[ 0.],
…,
[ 0.],
[ 0.],
[ 0.]], dtype=float32)
Perhaps check the activation function on the output layer?
# create model. Fully connected layers are defined using the Dense class
model = Sequential()
model.add(Dense(12, input_dim=len(x_columns), activation=’relu’)) #12 neurons, 8 inputs
model.add(Dense(8, activation=’relu’)) #Hidden layer with 8 neurons
model.add(Dense(1, activation=’sigmoid’)) #1 output layer. Sigmoid give 0/1
================== RESTART: /Users/apple/Documents/deep1.py ==================
Using TensorFlow backend.
Traceback (most recent call last):
File “/Users/apple/Documents/deep1.py”, line 20, in
model.compile(loss=’binary_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/models.py”, line 826, in compile
**kwargs)
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/engine/training.py”, line 827, in compile
sample_weight, mask)
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/engine/training.py”, line 426, in weighted
score_array = fn(y_true, y_pred)
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/losses.py”, line 77, in binary_crossentropy
return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)
File “/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py”, line 3069, in binary_crossentropy
logits=output)
TypeError: sigmoid_cross_entropy_with_logits() got an unexpected keyword argument ‘labels’
>>>
I have not seen this error, sorry. Perhaps try posting to stack overflow?
Hello Mr.Janson
After installing Anaconda and deep learning libraries, I read your Free mini-course and I tried to write the code about the handwritten digit recognition.
I wrote the codes in jupyter notebook, am I right?
if not where should I write the codes ?
and if I want to use another dataset (my own data set) how can I use in the code?
and how can I see the result, for example the accuracy percentage?
I am really sorry for my simple questions! I have written a lot of code in “Matlab” but I am really a beginner in Python and Anaconda, and my teacher requires me to use Python and Keras for my project.
thank you very much for your help
A notebook is fine.
You can write code in a Python script and then run the script directly.
Hello Mr.Janson again
I wrote the code below from your free mini-course for handwritten digit recognition, but after running it I got a SyntaxError:
from keras.datasets import mnist
…
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28)
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28)
from keras.utils import np_utils
…
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
model = Sequential()
model.add(Conv2D(32, (3, 3), padding=’valid’, input_shape=(1, 28, 28),
activation=’relu’))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation=’relu’))
model.add(Dense(num_classes, activation=’softmax’))
model.compile(loss=’categorical_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])
File “”, line 2
2 model.add(Conv2D(32, (3, 3), padding=’valid’, input_shape=(1, 28, 28),
^
SyntaxError: invalid syntax
would you please help me?!
thanks a lot
This:
should be:
Thank you for the awesome blog and explanations. I have just one question: how can we get the values predicted by the model? Many thanks
As follows:
Thank you for your prompt answer. I am trying to learn how keras models work and I used. I trained the model like this:
model.compile(loss=’mean_squared_error’, optimizer=’sgd’, metrics=[‘MSE’])
As output I have those lines
Epoch 10000/10000
10/200 [>………………………..] – ETA: 0s – loss: 0.2489 – mean_squared_error: 0.2489
200/200 [==============================] – 0s 56us/step – loss: 0.2652 – mean_squared_error: 0.2652
My question is: what is the difference between the two lines (MSE values)?
They should be the same thing. One may be calculated at the end of each batch, and one at the end of each epoch.
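As far as I know, the value on the progress bar is a running average over the batches seen so far in the epoch, so the first line (after 10 of 200 samples) reflects only the first batch, while the last line reflects all of them. The sketch below shows that arithmetic. Only the first number comes from the output above; the other batch losses are made up for the example.

```python
def running_means(batch_losses):
    """Mean of the losses seen so far, reported after every batch."""
    means, total = [], 0.0
    for i, loss in enumerate(batch_losses, start=1):
        total += loss
        means.append(total / i)
    return means


# First value is from the log above; the rest are hypothetical
batch_losses = [0.2489, 0.2700, 0.2610, 0.2809]
progress = running_means(batch_losses)

for step, value in enumerate(progress, start=1):
    print("after batch %d: loss %.4f" % (step, value))
```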
hello
After running it again, it shows an error:
NameError Traceback (most recent call last)
in ()
—-> 1 model = Sequential()
2 model.add(Conv2D(32, (3, 3), padding=’valid’, input_shape=(1, 28, 28), activation=’relu’))
3 model.add(MaxPooling2D(pool_size=(2, 2)))
4 model.add(Flatten())
5 model.add(Dense(128, activation=’relu’))
NameError: name ‘Sequential’ is not defined
You are missing the imports. Ensure you copy all code from the complete example at the end.
from keras.datasets import mnist
…
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28)
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28)
from keras.utils import np_utils
…
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
model = Sequential()
2 model.add(Conv2D(32, (3, 3), padding=’valid’, input_shape=(1, 28, 28), activation=’relu’))
3 model.add(MaxPooling2D(pool_size=(2, 2)))
4 model.add(Flatten())
5 model.add(Dense(128, activation=’relu’))
6 model.add(Dense(num_classes, activation=’softmax’))
7 model.compile(loss=’categorical_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])
hello
Please tell me how I can find out whether tensorflow and keras are correctly installed on my system.
Maybe that is the problem, because no code runs in my Jupyter and no “import” works (for example, import pandas).
thank you
See this post:
https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
Hi. I’m totally new to machine learning and I’m trying to wrap my head around it.
I have a problem I can’t quite solve yet. And don’t know where to start actually.
I have a dictionary with a few key:value pairs. The key is a random 4 digit number from 0000 to 9999. The value for each key is set as follows: if a digit in the number is 0, 6 or 9 then its weight is 1, if a digit is 8 then its weight is 2, and any other digit has a weight of 0. The weights are then summed to give the value for the key (example: { ‘0000’: 4, ‘1234’: 0, ‘1692’: 2, ‘8800’: 6} and so on).
Now I’m trying to build a model that will predict the correct value of a given key (i.e. if I give it 2222 the answer is 0, if I give it 9011 the answer is 2). What I did first was create a CSV file with 5 columns: the first four are the key from my dictionary split into single digits, and the fifth column is the value for each key. Next I created a dataset and defined a model (like this tutorial but with input_dim=4). Now when I train the model the accuracy won’t go higher than ~30%. Also your model is based on binary output, whereas mine should output an integer from 0 to 8. Where do I go from here?
Thank you for all your effort in advance! 🙂
This post might help you nail down your problem as a predictive modeling problem:
https://machinelearningmastery.com/how-to-define-your-machine-learning-problem/
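To pin the problem down, it can help to write the labelling rule as code first. The sketch below reproduces the rule described in the question in plain Python; the names DIGIT_WEIGHTS, key_value and to_row are invented for illustration.

```python
# Weight of each digit as described in the question; all other digits are 0
DIGIT_WEIGHTS = {"0": 1, "6": 1, "9": 1, "8": 2}


def key_value(key):
    """Sum of the digit weights for a 4-character key such as '8800'."""
    return sum(DIGIT_WEIGHTS.get(d, 0) for d in key)


def to_row(key):
    """Four digit columns followed by the target value, as in the CSV."""
    return [int(d) for d in key] + [key_value(key)]


for key in ["0000", "1234", "1692", "8800", "2222", "9011"]:
    print(key, "->", to_row(key))
```

The target ranges from 0 to 8, so a single sigmoid output with binary cross-entropy, as used in this tutorial, is not a natural fit; the problem is closer to a 9-class classification or a regression.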
There is one thing I just don’t get.
An example of row data is 6,148,72,35,0,33.6,0.627,50,1
I guess the number at the end is whether the person has diabetes (1) or does not (0), but what I don’t understand is how I know the ‘prediction’ is about that 0 or 1. There are a lot of other variables in the data, and I don’t see ‘diabetes’ being a label for any of them.
So, how do I know, or how do I set, which variable (number) I want to predict?
You interpret the prediction in your application or usage.
The model does not care what the inputs and outputs are, it does the best it can. It does not intrinsically care about diabetes.
hi,
@Jason Brownlee, Master of Keras Python.
I’m developing a face recognition test. I successfully used Rprop, which was good for static images or face pictures, and I also have SVM test results.
In your experience, do you think Keras is better or more powerful than Rprop?
I ask because I was also thinking of using Keras (1:1) on the final result of Rprop (1:many).
Or which do you think is the better system?
Thanks in advance for the advice.
I also heard that one of the leading commercial face recognizers uses a PNN (using libopenblas), so I am really unsure which one to choose for my final thesis and application.
What do you mean by rprop? I believe it is just an optimization algorithm, whereas Keras is a deep learning library.
https://en.wikipedia.org/wiki/Rprop
Ok, I think I understand you.
I used Accord.Net
Rprop testing was good
MLR testing was good
SVM testing was good
RBM testing was good
I used classification for face images.
The models are only good for static 100×100 face pictures,
but if I use a different picture of them,
all 4 of these tests fail.
Do you think using Keras for face image recognition will give a good result or a good prediction?
If Keras gives a good result, then I’ll have to use cesarsouza’s Keras C#:
https://github.com/cesarsouza/keras-sharp
thanks for the reply.
Try it and see.
What is the difference between the accuracy we get when we fit the model and the accuracy_score() of sklearn.metrics? What do they mean exactly?
Accuracy is a summary of the number of predictions that were made correctly out of all predictions that were made.
It is used as an estimate of model skill on new out of sample data.
Can weather forecasting be done using an RNN?
No. Weather forecasting is done with ensembles of physics simulations on very large computers.
We aren’t predicting anything during the fit (it’s just training, like mapping F(x)=Y),
but we still get acc. What is this acc?
Epoch 1/150
768/768 [==============================] – 1s 1ms/step – loss: 0.6771 – acc: 0.6510
Thank you in advance
Predictions are made as part of back propagating error.
Hi Jason,
Many thanks to you for a great tutorial. I have a couple of questions for you, as follows.
1). How can I get the score of a prediction?
2). How can I output the result of a prediction run to a file in which the output is listed vertically?
I see you everywhere to answer questions and help people. Your time and patience were greatly appreciated!
Charles
You can make predictions with a model as follows:
yhat = model.predict(X)
You can then save the numpy array result to file.
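For a file with one prediction per line, something like the sketch below would work. It uses a plain list of made-up scores standing in for the output of model.predict(X), and the file name predictions.txt is an assumption. The array returned by predict usually has one column per output, so it may need flattening first, and numpy.savetxt can write such an array directly.

```python
import os
import tempfile


def save_predictions(predictions, path):
    """Write one prediction per line so the file reads vertically."""
    with open(path, "w") as handle:
        for value in predictions:
            handle.write("%.4f\n" % value)


# Made-up scores standing in for the output of model.predict(X)
yhat = [0.8123, 0.1044, 0.5671]
out_path = os.path.join(tempfile.gettempdir(), "predictions.txt")
save_predictions(yhat, out_path)

# Read the file back to confirm what was written
with open(out_path) as handle:
    lines = handle.read().splitlines()
print(lines)
```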
Hi, I’ve just finished this tutorial, but my only problem is understanding the results: what do accuracy and loss mean, and what are we actually finding out?
I’m really new to the whole neural networks thing and don’t really understand them yet. I’d be very grateful if you’re able to reply.
Many Thanks
Callum
Accuracy is the model skill in terms of the number of correct predictions divided by the total number of predictions.
Loss is the function that the network is optimising, something differentiable and relatable to the metric of interest for the model, in this case logarithmic loss used for classification.
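The two numbers can be computed by hand on a tiny example. The sketch below is plain Python with made-up labels and predicted probabilities; the function names and the 0.5 threshold are assumptions for illustration, not the Keras internals.

```python
import math


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that match the label after thresholding."""
    preds = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(1 for t, p in zip(y_true, preds) if t == p)
    return correct / len(y_true)


def log_loss(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy (logarithmic loss)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep log() away from 0
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]  # hypothetical model outputs

print("accuracy: %.2f" % accuracy(y_true, y_prob))
print("log loss: %.4f" % log_loss(y_true, y_prob))
```

Accuracy only counts right and wrong answers, while the loss also reflects how confident each prediction was, which is why the loss can keep changing while accuracy stays the same.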
Hi Jason,
First of all congratulations for your awesome work, I finally got the hang of ML (hopefully, haha).
So, testing some changes in the number of neurons and batch size/epochs, I achieved 99.87% of accuracy.
The parameters I used were:
# create model
model = Sequential()
model.add(Dense(240, input_dim=8, init=’uniform’, activation=’relu’))
model.add(Dense(160, init=’uniform’, activation=’relu’))
model.add(Dense(1, init=’uniform’, activation=’sigmoid’))
# Compile model
model.compile(loss=’binary_crossentropy’, optimizer=’adam’, metrics=[‘accuracy’])
# Fit the model
model.fit(X, Y, epochs=1500, batch_size=100, verbose=2)
And when I run it, I always get 99,87% accuracy, which I think is a good thing, right? Please tell me if I did something wrong or if this is a false positive.
Thank you in advance and sorry for the bad English 😉
That accuracy is great; there will always be some error.
The above example is very good, sir. I want to do price change prediction for electronics in an online shopping project. Can you give any suggestions for my project? If you have an example of price prediction using a neural network, please send a link, sir.
I would recommend following this process:
https://machinelearningmastery.com/start-here/#process
Hi, very helpful example. But I still don’t understand why you load
X = dataset[:,0:8]
Y = dataset[:,8]
If I do
X = dataset[:,0:7] it won’t work
You can learn more about indexing and slicing numpy arrays here:
https://machinelearningmastery.com/index-slice-reshape-numpy-arrays-machine-learning-python/
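The key point is that the stop index of a slice is not included. The sketch below shows this with a plain Python list, using the sample row quoted earlier in the comments; the same rule applies to the column slice in dataset[:,0:8].

```python
# One row of the dataset: 8 input values followed by the class label
row = [6, 148, 72, 35, 0, 33.6, 0.627, 50, 1]

inputs = row[0:8]     # positions 0..7, the stop index 8 is excluded
label = row[8]        # position 8, the value to predict
too_short = row[0:7]  # positions 0..6, only 7 of the 8 inputs

print("inputs:", inputs)
print("label:", label)
print("row[0:7] has", len(too_short), "values")
```

With 0:7 the model receives 7 columns while the first layer was defined with input_dim=8, which is why it fails.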
Thank you for the tutorial.
Perhaps, someone already told you this. The data set is no longer available.
Thanks for the note, I’ll fix that up ASAP.
Thanks very much for the concise example! As an “interested amateur” with more experience coding for scientific data manipulation than for software development, a simple, high-level explanation like this one is much appreciated. I find sometimes that documentation pages can be a bit low-level for my liking, even with coding experience in multiple languages. This article was all I needed to get started, and was much more helpful than other “official tutorials.”
Thanks, I’m glad to hear that Wesley.
Thank you for your tutorial, but the data set is not accessible. Could you please fix it?
Thanks, I’ll fix it.
hello
I have found some code for converting my image data to MNIST format, but I get the error below.
would you please help me?
import os
from PIL import Image
from array import *
from random import shuffle
# Load from and save to
Names = [['./training-images','train'], ['./test-images','test']]
for name in Names:
    data_image = array('B')
    data_label = array('B')
    FileList = []
    for dirname in os.listdir(name[0])[1:]: # [1:] Excludes .DS_Store from Mac OS
        path = os.path.join(name[0],dirname)
        for filename in os.listdir(path):
            if filename.endswith(".png"):
                FileList.append(os.path.join(name[0],dirname,filename))
    shuffle(FileList) # Usefull for further segmenting the validation set
    for filename in FileList:
        label = int(filename.split('/')[2])
        Im = Image.open(filename)
        pixel = Im.load()
        width, height = Im.size
        for x in range(0,width):
            for y in range(0,height):
                data_image.append(pixel[y,x])
        data_label.append(label) # labels start (one unsigned byte each)
    hexval = "{0:#0{1}x}".format(len(FileList),6) # number of files in HEX
    # header for label array
    header = array('B')
    header.extend([0,0,8,1,0,0])
    header.append(int('0x'+hexval[2:][:2],16))
    header.append(int('0x'+hexval[2:][2:],16))
    data_label = header + data_label
    # additional header for images array
    if max([width,height]) <= 256:
        header.extend([0,0,0,width,0,0,0,height])
    else:
        raise ValueError('Image exceeds maximum size: 256x256 pixels');
    header[3] = 3 # Changing MSB for image data (0x00000803)
    data_image = header + data_image
    output_file = open(name[1]+'-images-idx3-ubyte', 'wb')
    data_image.tofile(output_file)
    output_file.close()
    output_file = open(name[1]+'-labels-idx1-ubyte', 'wb')
    data_label.tofile(output_file)
    output_file.close()
# gzip resulting files
for name in Names:
    os.system('gzip '+name[1]+'-images-idx3-ubyte')
    os.system('gzip '+name[1]+'-labels-idx1-ubyte')
FileNotFoundError Traceback (most recent call last)
in ()
13
14 FileList = []
—> 15 for dirname in os.listdir(name[0])[1:]: # [1:] Excludes .DS_Store from Mac OS
16 path = os.path.join(name[0],dirname)
17 for filename in os.listdir(path):
FileNotFoundError: [WinError 3] The system cannot find the path specified: ‘./training-images’
Looks like the code cannot find your images. Perhaps change the path in the code?
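A quick way to confirm this is to print the working directory and check whether the folders exist before running the conversion. The sketch below uses only the standard library; missing_dirs is a made-up helper name, and the two folder names are the ones from the script above.

```python
import os


def missing_dirs(paths):
    """Return the paths that are not existing directories."""
    return [p for p in paths if not os.path.isdir(p)]


folders = ["./training-images", "./test-images"]

# Relative paths are resolved against the current working directory
print("Working directory:", os.getcwd())
print("Missing folders:", missing_dirs(folders))
```

If both folders are listed as missing, either move the images next to the script or replace the relative paths with absolute ones.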
Thanks a lot sir, this was a very good and intuitive tutorial
Thanks, I’m glad it helped.
I got a prediction model running successfully for fraud detection. My dataset is over 50 million and growing. I am seeing a peculiar issue.
When the loaded data is 10 million rows or fewer, my prediction is OK.
As soon as I load 11 million rows, my prediction saturates to a particular value (say 0.48) and keeps on repeating. That is, all predictions will be 0.48, irrespective of the input.
I have tried multiple combinations of the dense model.
# create model
model = Sequential()
model.add(Dense(32, input_dim=4, activation=’tanh’))
model.add(Dense(28, activation=’tanh’))
model.add(Dense(24, activation=’tanh’))
model.add(Dense(20, activation=’tanh’))
model.add(Dense(16, activation=’tanh’))
model.add(Dense(12, activation=’tanh’))
model.add(Dense(8, activation=’tanh’))
model.add(Dense(1, activation=’sigmoid’))
Perhaps check whether you need to train on all data, often a small sample is sufficient.
Oh. I believe that the machine learning accuracy will improve as we get more data over time.
Hi,
How do you define number of hidden layers and neurons per layer?
There are no good heuristics, trial and error is a good approach. Discover what works best for your specific data.
I executed the code and got the output, but how do I use this prediction in an application?
Depends on the application.