Use Keras Deep Learning Models with Scikit-Learn in Python

Keras is one of the most popular deep learning libraries in Python for research and development because of its simplicity and ease of use.

scikit-learn is the most popular library for general machine learning in Python.

In this post you will discover how you can use deep learning models from Keras with the scikit-learn library in Python.

This will allow you to leverage the power of the scikit-learn library for tasks like model evaluation and model hyper-parameter optimization.

Let’s get started.

  • Update: For a larger example of tuning hyperparameters with Keras, see the post:
  • Update Oct/2016: Updated examples for Keras 1.1.0 and scikit-learn v0.18.
  • Update Jan/2017: Fixed a bug in printing the results of the grid search.
  • Update Mar/2017: Updated example for Keras 2.0.2, TensorFlow 1.0.1 and Theano 0.9.0.
  • Update Mar/2018: Added alternate link to download the dataset as the original appears to have been taken down.
Photo by Alan Levine, some rights reserved.


Keras is a popular library for deep learning in Python, but its focus is deep learning alone. In fact, it strives for minimalism, focusing only on what you need to quickly and simply define and build deep learning models.

The scikit-learn library in Python is built upon the SciPy stack for efficient numerical computation. It is a fully featured library for general machine learning and provides many utilities that are useful in the development of deep learning models. Not least:

  • Evaluation of models using resampling methods like k-fold cross validation.
  • Efficient search and evaluation of model hyper-parameters.

The Keras library provides a convenient wrapper for deep learning models to be used as classification or regression estimators in scikit-learn.

In the next sections, we will work through examples of using the KerasClassifier wrapper for a classification neural network created in Keras and used in the scikit-learn library.

The test problem is the Pima Indians onset of diabetes classification dataset. This is a small dataset with all numerical attributes that is easy to work with. Download the dataset and place it in your current working directory with the name pima-indians-diabetes.csv (update: download from here).

The following examples assume you have successfully installed Keras and scikit-learn.


Evaluate Deep Learning Models with Cross Validation

The KerasClassifier and KerasRegressor classes in Keras take an argument build_fn, which is the function to call to create your model.

You must define a function, named whatever you like, that defines your model, compiles it, and returns it.

In the example below, we define a function create_model() that creates a simple multi-layer neural network for the problem.

We pass this function to the KerasClassifier class via the build_fn argument. We also pass in the additional arguments nb_epoch=150 and batch_size=10. These are automatically bundled up and passed on to the fit() function, which is called internally by the KerasClassifier class.

In this example, we use the scikit-learn StratifiedKFold to perform 10-fold stratified cross-validation. This is a resampling technique that can provide a robust estimate of the performance of a machine learning model on unseen data.

We use the scikit-learn function cross_val_score() to evaluate our model using the cross-validation scheme and print the results.

Running the example displays the skill of the model for each of the 10 folds. A total of 10 models are created and evaluated, and the final average accuracy is displayed.

Grid Search Deep Learning Model Parameters

The previous example showed how easy it is to wrap your deep learning model from Keras and use it in functions from the scikit-learn library.

In this example, we go a step further. The function that we specify to the build_fn argument when creating the KerasClassifier wrapper can take arguments. We can use these arguments to further customize the construction of the model. In addition, we know we can provide arguments to the fit() function.

In this example, we use a grid search to evaluate different configurations for our neural network model and report on the combination that provides the best-estimated performance.

The create_model() function is defined to take two arguments, optimizer and init, both of which must have default values. This will allow us to evaluate the effect of using different optimization algorithms and weight initialization schemes for our network.

After creating our model, we define arrays of values for the parameter we wish to search, specifically:

  • Optimizers for searching different weight values.
  • Initializers for preparing the network weights using different schemes.
  • Epochs for training the model for a different number of exposures to the training dataset.
  • Batches for varying the number of samples before a weight update.

The options are specified in a dictionary and passed to the GridSearchCV scikit-learn class. This class will evaluate a version of our neural network model for each combination of parameters (2 x 3 x 3 x 3 for the combinations of optimizers, initializations, epochs and batches). Each combination is then evaluated using the default of 3-fold stratified cross validation.

That is a lot of models and a lot of computation. This is not a scheme that you want to use lightly because of the time it will take. It may be useful for you to design small experiments with a smaller subset of your data that will complete in a reasonable time. This is reasonable in this case because of the small network and the small dataset (less than 1000 instances and 9 attributes).

Finally, the performance and combination of configurations for the best model are displayed, followed by the performance of all combinations of parameters.

This might take about 5 minutes to complete on your workstation, executed on the CPU (rather than the GPU). Running the example shows the results below.

We can see that the grid search discovered that using a uniform initialization scheme, rmsprop optimizer, 150 epochs and a batch size of 5 achieved the best cross-validation score of approximately 75% on this problem.


Summary

In this post, you discovered how you can wrap your Keras deep learning models and use them in the scikit-learn general machine learning library.

You can see that using scikit-learn for standard machine learning operations such as model evaluation and model hyperparameter optimization can save a lot of time over implementing these schemes yourself.

Wrapping your model allowed you to leverage powerful tools from scikit-learn to fit your deep learning models into your general machine learning process.

Do you have any questions about using Keras models in scikit-learn or about this post? Ask your question in the comments and I will do my best to answer.


149 Responses to Use Keras Deep Learning Models with Scikit-Learn in Python

  1. Shruthi June 2, 2016 at 6:38 am #

    First, this is extremely helpful. Thanks a lot.

    I’m new to keras and i was trying to optimize other parameters like dropout and number of hidden neurons. The grid search works for the parameters listed above in your example. However, when i try to optimize for dropout the code errors out saying it’s not a legal parameter name. I thought specifying the name as it is in the create_model() function should be enough; obviously I’m wrong.

    in short: if i had to optimize for dropout using GridSearchCV, how would the changes to your code look?

    apologies if my question is naive, trying to learn keras, python and deep learning all at once. Thanks,


    • Jason Brownlee June 23, 2016 at 10:20 am #

      Great question.

      As you say, you simply add a new parameter to the create_model() function called dropout_rate then make use of that parameter when creating your dropout layers.

      Below is an example of grid searching dropout values in Keras:

      Running the example produces the following output:

      I hope that helps

  2. Rish June 22, 2016 at 10:01 pm #

    Hi Jason,

    Thanks for the post, this is awesome. I’ve found the grid search very helpful.

    One quick question: is there a way to incorporate early stopping into the grid search? With a particular model I am playing with, I find it can often over-train and consequently my validation loss suffers. Whilst I could incorporate an array of epoch parameters (like in your example), it seems more efficient to just have it stop if the validation accuracy increases over a small number of epochs. Unless you have a better idea?

    Thanks again!

    • Jason Brownlee June 23, 2016 at 5:30 am #

      Great comment Rish and really nice idea.

I don’t think scikit-learn supports early stopping in its parameter searching. You would have to rig up your own parameter search and add an early-stop clause to it, or consider modifying sklearn itself.

I also wonder whether you could hook up Keras checkpointing and capture the best parameter combinations along the way to file (checkpoint callback), allowing you to kill the search at any time.

      • Rish July 19, 2016 at 12:06 pm #

        Thanks for the reply! 🙂 I wonder too!

        • Vadim March 13, 2017 at 8:16 am #

          Hey Jason,

          Awesome article. Did you by any chance find a way to hookup callback functions to grid search? In this case it would be possible to have Tensorboard aggregating and visualizing the grid search outcomes.

  3. Rishabh August 17, 2016 at 3:13 am #

    Hi Jason,

    Great Post. Thanks for this.

    One problem that I have always faced with training a deep learning model (in H2O as well) is that the predicted probability distribution is always flat (as in, very little variation in probability across the sample). Any other ML model, e.g. RF/GBM, is easier to tune and gives good results in most cases. So my doubt is twofold:

    1. Except for, let’s say, image data where a CNN might be a good thing to try, in what scenarios should we try to fit a deep learning model?
    2. I think the issues that I face with deep learning models are usually due to underfitting. Can you please give some tips on how to tune a deep learning model (other ML models are easier to tune)?


    • Jason Brownlee August 17, 2016 at 9:53 am #

      Deep learning is good on raw data like images, text, audio and similar. It can work on complex tabular data, but often feature engineering + xgboost can do better in practice.

      Keep adding layers/neurons (capacity) and training longer until performance flattens out.

      • Rishabh August 17, 2016 at 7:51 pm #

        Thanks. I will try to add more layers and add some regularization parameters as well

        • Jason Brownlee August 18, 2016 at 7:16 am #

          Good luck Rishabh, let me know how you go.

          • Rishabh September 10, 2016 at 11:26 pm #

            Hi Jason, I was able to tune the model, thanks to your post although GBM had a better fit. I had included prelu layers which improved the fit.

            One question, is there an optimal way to find number of hidden neurons and hidden layer count or grid search the only option?

          • Jason Brownlee September 12, 2016 at 8:27 am #

            Tuning (grid/random) is the best option I know of Rishabh.

          • Rishabh September 12, 2016 at 5:03 pm #

            Thanks. Grid search takes a toll on my 16 GB laptop, hence searching for an optimal way.

          • Jason Brownlee September 13, 2016 at 8:10 am #

            It is punishing. Small grids are kinder on memory.

  4. xiao September 17, 2016 at 10:59 pm #

    Hi Jason,
    Thanks for the post. I’m a beginner with Keras, and I met some problems using Keras with scikit-learn recently. If it’s convenient, could you do me a favor?

    Details here:

    Thank you!!!

    • Jason Brownlee September 18, 2016 at 8:01 am #

      Ouch. Nothing comes to mind Xiao, sorry.

      I would debug it by cutting it back to the simplest possible network/code that generates the problem then find the line that causes the problem, and go from there.

      If you need a result fast, I would use the same decompositional approach to rapidly find a workaround.

      Let me know how you go.

      • xiao September 18, 2016 at 10:43 am #

        Thanks a lot, Jason.
        I need the result fast if possible.

      • xiao September 21, 2016 at 12:22 pm #

        Hi, Jason. Got any idea?

        • Jason Brownlee September 22, 2016 at 8:06 am #

          Yes, I gave you an approach to debug the problem in the previous comment Xiao.

          I don’t know any more than that, sorry.

          • xiao September 22, 2016 at 4:52 pm #

            Thanks, I’ll try it out.

  5. Josh September 22, 2016 at 12:12 am #

    Jason, thanks for the tutorial, it saved me a lot of time. I am running a huge amount of data on a remote server from shell files. The output of the model is written to an additional shell file in case there are errors. However, when I run my code, following your approach above, it outputs the status of training, i.e. epoch number and accuracy, for every model in the grid search. Is there a way to suppress this output? I tried to use “verbose=0” as an additional argument both in calling “fit”, which created an error, and in GridSearchCV, which did not do anything.


    • Jason Brownlee September 22, 2016 at 8:17 am #

      Great question Josh.

      Pass verbose=0 into the constructor of your classifier:
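Something like the following (the model definition itself is illustrative; the point is where verbose goes):

```python
# Sketch: silence per-epoch training output by passing verbose=0 to the
# wrapper constructor; it is bundled with the other keyword arguments and
# forwarded to fit() internally.
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier

def create_model():
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10, verbose=0)
```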

  6. Tom October 5, 2016 at 9:09 pm #

    Hello Jason,
    First of all thank you so much for your guides and examples regarding Keras and deep learning!! Please keep on going 🙂
    Question 1
    Is it possible to save the best trained model with grid and set some callbacks (for early stopping as well)? I wanted to implement saving the best model by doing
    checkpoint=ModelCheckpoint(filepath, monitor=’val_acc’, verbose=0, save_best_only=True, mode=’max’)
    grid_result = grid.fit(X, Y, callbacks=[checkpoint])
    but TypeError: fit() got an unexpected keyword argument ‘callbacks’
    Question 2
    Is there a way to visualize the trained weights and literally see the created network? I want to make neural networks a bit more practical instead of only classification. I know you can plot the model: plot(model, to_file=’model.png’) but I want to implement my data in this model.
    Thanks in advance,

    • Jason Brownlee October 6, 2016 at 9:37 am #

      Hi Tom, I’m glad you’re finding my material valuable.

      Sorry, I don’t know about call-backs in the grid search. Sounds like it might just make a big mess (i.e. not designed to do this).

      I don’t know about built in ways to visualize the weights. Personally, I like to look at the weights for production models to see if something crazy is going on – but just the raw numbers.

      • David August 22, 2017 at 4:46 pm #

        Hi Jason,
        Thanks for the very well written article! Very clear and easy to understand.
        I have a callback that does different types of learning rate annealing. It has four parameters I’d like to optimise.
        Based on your above comment I’m guessing the SciKit wrapper won’t work for optimising this?
        Do you know how I would do this?
        Many thanks for your help.

        • Jason Brownlee August 23, 2017 at 6:41 am #

          You might have to write your own for loop David. In fact, I’d recommend it for the experience and control it offers.

  7. Soren Pallesen October 18, 2016 at 5:08 pm #

    Hi there, thanks for all your inspiration. When running the above example i get a slightly different result:

    Best: 0.751302 using {‘optimizer’: ‘rmsprop’, ‘batch_size’: 5, ‘init’: ‘normal’, ‘nb_epoch’: 150}

    e.i. init: normal and not uniform as in your example.

    Is it normal with these variations?

    • Jason Brownlee October 19, 2016 at 9:16 am #

      Hi Soren,

      Results can vary based on the initialization method. It is hard to predict how they will vary for a given problem.

  8. Saddam December 15, 2016 at 9:20 pm #

    Hey am getting an error “ValueError: init is not a legal parameter”
    code is as follows.
    init = [‘glorot_uniform’, ‘normal’, ‘uniform’]
    batches = numpy.array([50, 100, 150])
    param_grid = dict(init=init)
    grid = GridSearchCV(estimator=self.model_comp, param_grid=param_grid)
    grid_result = grid.fit(X_train, Y_train)

    i can’t figure out what am doing wrong. Help me out here.

    • Jason Brownlee December 16, 2016 at 5:41 am #

      Sorry Saddam, the cause does not seem obvious.

      Perhaps post more of the error message or consider posting the question on stack overflow?

      • Palash Goyal January 15, 2017 at 11:36 pm #

        Hi Saddam and Jason,

        The error is due to the init parameter ‘glorot_uniform’.
        Seems like it has been deprecated or something, once you remove this from the possible values (i.e., init=[‘uniform’,’normal’]) your code will work.


        • Jason Brownlee January 16, 2017 at 10:41 am #

          It does not appear to be deprecated:

          • Nils Holgersson January 23, 2017 at 4:11 am #

            Don’t forget to add the parameter (related to parameters inside the creation of your model) that you want to iterate over as input parameter to the function that creates your model.

            def create_model(init=’normal’):

    • Fareed February 8, 2018 at 11:27 pm #

      check you have defined the ‘init’ as parameter in the function. if not define it as parameter in the function. once done, when you create the instance of KerasClassifier and call that in the GRidSearchCV(), it will not throw the error.

  9. Tameru December 21, 2016 at 9:44 am #

    Hi Jason,

    recently, I tried to use K-fold cross validation for Image classification problem and found the following error

    training Image X shape= (2041,64,64)
    label y shape= (2041,2)


    model = KerasClassifier(build_fn=creat_model, nb_epoch=15, batch_size=10, verbose=0)

    # evaluate using 6-fold cross validation

    kfold = StratifiedKFold(n_splits=6, shuffle=False, random_state=seed)

    results = cross_val_score(model, x, y, cv=kfold)

    print results

    print ‘Mean=’,


    an error:

    IndexError Traceback (most recent call last)
    in ()
    2 # evaluate using 6-fold cross validation
    3 kfold = StratifiedKFold(n_splits=6, shuffle=False, random_state=seed)
    —-> 4 results = cross_val_score(model, x, y, cv=kfold)
    5 print results
    6 print ‘Mean=’,

    IndexError: too many indices for array

    I don’t understand what is wrong here?

    • Jason Brownlee December 22, 2016 at 6:29 am #

      Sorry Tameru, I have not seen this error before. Perhaps try stack overflow?

    • coyan January 5, 2017 at 6:50 pm #

      Hi, I have the same problem. Do you have any easy way to solve it?


      • Jason Brownlee January 6, 2017 at 9:07 am #

        Sounds like it could be a data issue.

        Perhaps your data does not have enough observations in each class to split into 6 groups?

        – Try different values of k.
        – Try using KFold rather than StratifiedKFold
        – Try a train/test split

        • Tameru January 8, 2017 at 7:04 am #

          @ Coyan, I am still trying to solve it. let us try Jason’s advice. Thanks, Jason.

    • JiaMingLin January 20, 2017 at 8:47 pm #


      I also encounter this problem, and I guess the scikit-learn k-fold functions do not accept the “one hot” vectors. You may try on StratifiedShuffleSplit with the list of “one hot” vectors.

      And that means you can only evaluate Keras model with scikit-learn in the binary classification problem.
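One common workaround (not from the thread, but standard with scikit-learn splitters, which expect a 1-D array of class labels) is to collapse one hot labels back to integer class indices for the splitter:

```python
# Sketch: one hot labels such as [[1,0],[0,1],...] have too many dimensions
# for StratifiedKFold; argmax recovers the integer class index per row.
import numpy
from sklearn.model_selection import StratifiedKFold

y_onehot = numpy.array([[1, 0], [0, 1], [1, 0], [0, 1]])  # toy labels
y_int = numpy.argmax(y_onehot, axis=1)                    # -> array([0, 1, 0, 1])

X = numpy.zeros((4, 2))  # dummy feature matrix
kfold = StratifiedKFold(n_splits=2, shuffle=True, random_state=7)
for train_idx, test_idx in kfold.split(X, y_int):
    # the returned indices can slice both X and the original y_onehot
    print(train_idx, test_idx)
```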

  10. Ali January 15, 2017 at 7:52 pm #

    Dear Dr. Brownlee,

    Your results here are around 75%. My experiment with my data result in around 85%. Is this considered good result?

    Because DNN and RNN are known for their great performances. I wonder if it’s normal and how we can improve the results.


  11. Miriam February 3, 2017 at 1:01 am #

    Hi Jason,

    I cannot tell you how awesome your tutorials are in terms of saving me time trying to understand Keras,

    However, I have run into a conceptual wall re: training, validation and testing. I originally understood that wrapping Keras in Gridsearch helped me tune my hyperparameters. So with GridsearchCV, there is no separate training and validation sets. This I can live with as it is the case with any CV.

    But then I want to use Keras to predict my outcomes on the model with optimized hyperparameters. Every example I see for uses the argument validation_data (or validation_split), and I understand that we’re using our test set as a validation set, a real no-no.

    Please see for a discussion of this and proof that I am not the only one confused.

    SO MY QUESTION IS: In completing your wonderful cookbook how to’s for novices, after I have found all my hyperparameters, how do I run my test data?

    If I use, won’t the test data be unlawfully used to retrain? What exactly is happening with the validation_data or _split argument in in Keras?

    • Jason Brownlee February 3, 2017 at 10:05 am #

      Hi Miriam,

      Generally, you can hold out a separate validation set for evaluating a final model and set of parameters. This is recommended.

      Model selection and tuning can be performed on the same test set using a suitable resampling method (k-fold cross validation with repeats). Ideally, separate datasets would be used for algorithm selection and parameter selection, but this is often too expensive (in terms of available data).

  12. wenger March 6, 2017 at 5:12 pm #

    i follow the step ,but error

    ValueError: optimizer is not a legal parameter

    i don’t konw how to deal with it

    • Jason Brownlee March 7, 2017 at 9:35 am #

      I’m sorry to hear that wenger. Perhaps confirm that you have the latest versions of Keras and sklearn installed?

      • Pedro Cadahia February 5, 2018 at 9:19 pm #

        Hi wenger, that error is produced because of a coding error in the example. You should change the function so that it takes an argument called optimizer, like create_model(optimizer). In the same function, edit model.compile() and replace the fixed optimizer, which is wrong if you want to do a grid search: change ‘adam’ to optimizer=optimizer. By doing this you will be able to run the code again and find the best solver.

        Happy deep learning!
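In code, the change described above might look like this sketch (layer sizes illustrative):

```python
# Sketch: optimizer becomes a build-function argument and is forwarded to
# compile() instead of a hard-coded 'adam', so GridSearchCV can vary it.
from keras.models import Sequential
from keras.layers import Dense

def create_model(optimizer='adam'):
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # the grid search value is forwarded here rather than a fixed string
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
```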

    • Xian Jing February 2, 2018 at 12:11 am #

      I have met your problem, and I find that maybe you haven’t passed the optimizer to the function that builds the model.

  13. myk March 31, 2017 at 9:25 pm #

    hello how can i implement SVM machine learning algorithm by using scikit library with keras

    • Jason Brownlee April 1, 2017 at 5:54 am #

      Keras is for deep learning, not SVM. You only need sklearn.

  14. Ronen April 3, 2017 at 10:02 pm #

    Hi Jason,

    As always very informative and skillfully written post.

    In the face of an extremely unbalanced data set, how would you pipeline an under-sampling pre-processing step in the example above?

    Thanks !

  15. Jens April 16, 2017 at 6:54 am #

    Hey Jason,

    I played around with your code for my own project and encountered an issue that I get different results when using the pipeline (both without standardization)
    I posted this also on crossvalidated:

    Can you help me with that?
    Thank you

  16. Carlton banks April 30, 2017 at 11:15 am #

    Could you provide an example in which data generator and fit_generator is being used..

  17. Anni May 3, 2017 at 7:59 pm #

    Hi Jason,

    Thanks for the post, this is awesome. But i’m facing

    ValueError: Can’t handle mix of multilabel-indicator and binary

    when i set scoring function to precision, recall, or f1 in cross validation. But it works fine if i didn’t set scoring function just like you did. Do you have any easy way to solve it?

    Big Thanks!

    Here’s the code :
    scores2=cross_val_score(model, X_train.as_matrix(), y_train, cv=10, scoring=’precision’)

    • Jason Brownlee May 4, 2017 at 8:06 am #

      Is this happening with the dataset used in this tutorial?

      • Anni May 5, 2017 at 4:57 am #

        No, actually its happening with breast cancer data set from UCI Machine Learning. I used train_test_split to split the data into training and testing. But when I tried to fit the model, i got

        IndexError: indices are out-of-bounds.

        So I tried to modify y_train by following code :

        y_train = np_utils.to_categorical(y_train)

        Do you have any idea ? i tried to solve this error for a week and still cannot fixed the problem.

        • Jason Brownlee May 5, 2017 at 7:34 am #

          I believe all the variables in that dataset are categorical.

          I expect you will need to use an integer encoding and a one hot encoding for each variable.

          • Anni May 5, 2017 at 10:24 pm #

            Okay, I will try your suggestion. Thanks for your reply 🙂

  18. Edward May 12, 2017 at 7:24 am #

    Jason is there something in deep learning like feature_importance in xgboost?
    For images it makes no sense though in this case it can be important

    • Jason Brownlee May 12, 2017 at 7:52 am #

      There may be, I’m not aware of it.

      You could use a neural net within an RFE process.

      • Edward May 12, 2017 at 7:59 am #

        Thanks for your reply, Jason! Your blog is GOLD 🙂

  19. Adrian May 26, 2017 at 12:04 am #

    Do you know how to save the hyperparameters with the TensorBoard callback?

  20. Anisa May 28, 2017 at 6:12 pm #

    Hi Jason!

    I’m doing grid search with my own scoring function, but I need to get results like accuracy and recall from the training model. So, I use cross_val_score with the best params that I get from grid search. But then cross_val_score produces a different result from the best score that I got from grid search. Do you have any idea how to solve my problem?


  21. Shabran May 30, 2017 at 3:41 pm #

    Hi Jason!

    I’m doing grid search with my own function as the scoring function, but I need to report other metrics for the best params that I got from grid search. So, I’m doing cross-validation with the best params. But the problem is that cross-validation produces a different result from the best score from grid search. The difference is really significant. Do you have any idea how to solve this problem?


  22. Szymon June 6, 2017 at 6:52 am #

    How could we use Keras with ensembles, let’s take voting? When base models are from the sklearn library everything works fine, but with Keras I’m getting ‘TypeError: cannot create ‘sys.flags’ instances’. Do you know any workaround?

    • Jason Brownlee June 6, 2017 at 10:09 am #

      You can perform voting manually on the lists of predictions from each sub model.
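A minimal sketch of what manual (soft) voting can look like, with made-up predicted probabilities from three hypothetical sub-models:

```python
# Sketch: average each sub-model's predicted probabilities, then threshold
# to get the ensemble's class labels. All values below are made up.
import numpy

preds_model_a = numpy.array([0.9, 0.2, 0.6])
preds_model_b = numpy.array([0.7, 0.4, 0.3])
preds_model_c = numpy.array([0.8, 0.1, 0.5])

avg = (preds_model_a + preds_model_b + preds_model_c) / 3.0
votes = (avg > 0.5).astype(int)
print(votes)  # -> [1 0 0]
```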

  23. Kirana June 14, 2017 at 3:12 pm #

    I’m evaluating 3 algorithms (SVM-RBF, XGBoost, and MLP) against small datasets. Is it true that SVM and XGBoost are suitable for small datasets whereas deep learning requires “relatively” large datasets to work well? Could you please explain this to me?

    Thanks a lot.

    • Jason Brownlee June 15, 2017 at 8:42 am #

      Perhaps, let the results and hard data drive your model selection.

  24. James June 17, 2017 at 2:55 am #

    Great article. It seems similar to what I am currently learning about. I’m using the sklearn wrapper for Keras; particularly the KerasClassifier and sklearn’s cross_val_score(). I am running into an issue with the n_jobs parameter for cross_val_score. I’d like to take advantage of the GPU on my Windows 10 machine, but am getting Blas GEMM launch failed when n_jobs = -1 or any value above 1. Also getting different results if I run it from the shell prompt vs. the python interpreter. Any ideas what I can do to get this to work?

    Windows 10
    Geforce 1080 Ti
    Tensorflow GPU
    Python 3.6 via Anaconda
    Keras 2.0.5

    • Jason Brownlee June 17, 2017 at 7:33 am #

      Sorry James, I don’t have good advice on setting up GPUs on windows for Keras.

      Perhaps you can post on stackoverflow?

  25. Anastasios Selalmazidis June 18, 2017 at 2:17 am #

    Hello Jason,

    When I run your first example, the one which uses StratifiedKFold, with a multiclass dataset of mine, I get an error. Isn’t it possible to run StratifiedKFold with multiclass? I also have the same problem, “IndexError: too many indices for array”, when I try to run a GridSearch with StratifiedKFold.

  26. Sean June 27, 2017 at 5:24 pm #

    Hi Jason,
    I followed your steps and the program takes more than 40 minutes to run, so I gave up waiting in the middle. Do you know if there is any way to speed up GridSearchCV? Or is it normal to wait more than 40 minutes to run this code on a 2013 Mac? Thank you

    • Jason Brownlee June 28, 2017 at 6:18 am #

      You could cut down the number of parameters that are being searched.

      I often run grid searches that run for weeks on AWS.

  27. Clarence Wong June 27, 2017 at 8:31 pm #

    Hi Jason,

    How do I save the model when wrapping a Keras Classifier in Scikit’s GridSearchCV?

    Do I treat it as a scikit object and use pickle/joblib or do I use the method native to Keras?

    • Jason Brownlee June 28, 2017 at 6:23 am #

      Sorry, I have not tried this. It may not be supported out of the box.

  28. Karthik July 15, 2017 at 11:39 am #

    Hi Jason,

    I want to perform stratified K-fold cross-validation with a model that predicts a distance and not a binary label. Is it possible to provide a distance threshold to the cross-validation method in scikit-learn (or is there some other approach), i.e., distance < 0.5 to be treated as a positive label (y=1) and a negative label (y=0) otherwise?

    • Jason Brownlee July 16, 2017 at 7:56 am #

      Stratification requires a class value.

      Perhaps you can frame your problem so the outcome distances are thresholded class labels.

      Note, this is a question of how you frame your prediction problem and prepare your data, not the sklearn library.
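A one-line sketch of that reframing, using the 0.5 threshold from the question above (the distance values are made up):

```python
# Sketch: turn predicted distances into class labels so that stratified
# splitting has a class value to work with.
import numpy

distances = numpy.array([0.2, 0.7, 0.45, 0.9])  # made-up distances
labels = (distances < 0.5).astype(int)          # distance < 0.5 -> y=1, else y=0
print(labels)  # -> [1 0 1 0]
```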

  29. Merco July 15, 2017 at 11:12 pm #

    Very cool!! And you can use Recursive Feature Elimination sklearn function by this way too?

  30. ambika July 19, 2017 at 4:33 pm #

    Please could you tell me formula for relu function , i need it for regression.

  31. ds July 21, 2017 at 7:34 pm #

    Is it possible to use your code with more complex model architectures? I have 3 different inputs and branches (with ConvLayers), which are concatenated at the end to form a dense layer.
    I tried to call the function as follows:
    grid_result ={‘input_1’: train_input_1, ‘input_2’: train_input_2, ‘input_3’: train_input_3}, {‘main_output’: lebels_train})

    I’m getting an error:
    “ValueError: found input variables with inconsistent numbers of samples: [3, 1]”

    Do you have any experience on that?

    • Jason Brownlee July 22, 2017 at 8:32 am #

      I don’t sorry.

      I would recommend using native Keras.

  32. Nas October 10, 2017 at 4:23 pm #

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.wrappers.scikit_learn import KerasClassifier
    from sklearn.model_selection import StratifiedKFold
    from sklearn.model_selection import cross_val_score
    import numpy

    def create_model():
        # create model
        model = Sequential()
        model.add(Dense(12, input_dim=8, activation='relu'))
        model.add(Dense(8, activation='relu'))
        model.add(Dense(1, activation='sigmoid'))
        # Compile model
        model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model

    seed = 7
    dataset = numpy.loadtxt("/home/nasrin/nslkdd/NSL_KDD-master/KDDTrain+.csv", delimiter=",")
    X = dataset[:,0:41]
    Y = dataset[:,41]

    model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10, verbose=0)
    kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    results = cross_val_score(model, X, Y, cv=kfold)

    Using TensorFlow backend.
    Traceback (most recent call last):
    File “”, line 20, in
    dataset = numpy.loadtxt(“/home/nasrin/nslkdd/NSL_KDD-master/KDDTrain+.csv”, delimiter=”,”)
    File “/home/nasrin/.local/lib/python3.5/site-packages/numpy/lib/”, line 1024, in loadtxt
    items = [conv(val) for (conv, val) in zip(converters, vals)]
    File “/home/nasrin/.local/lib/python3.5/site-packages/numpy/lib/”, line 1024, in
    items = [conv(val) for (conv, val) in zip(converters, vals)]
    File “/home/nasrin/.local/lib/python3.5/site-packages/numpy/lib/”, line 725, in floatconv
    return float(x)
    ValueError: could not convert string to float: b’tcp’
    sorry to bother you, what step i missed here.
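The b'tcp' in the traceback suggests the file contains categorical string columns (in NSL-KDD these include protocol, service and flag), which numpy.loadtxt cannot convert to float. A sketch of one common fix, encoding those columns to integers before casting (the toy rows and column indices below are illustrative, not the real file layout):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for rows of a KDD-style file: some columns are strings.
rows = [
    ["0", "tcp", "http", "181", "5450"],
    ["0", "udp", "domain", "146", "0"],
    ["0", "tcp", "smtp", "232", "8153"],
]
data = np.array(rows, dtype=object)

# Encode each non-numeric column to integers before casting to float.
for col in (1, 2):  # indices of the categorical columns (illustrative)
    data[:, col] = LabelEncoder().fit_transform(data[:, col])

X = data.astype(float)  # now succeeds: every cell is numeric
```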

  33. Piotr October 12, 2017 at 10:33 pm #

    Do you know, how to do K.clear_session() inside cross_val_score(), between folds?

    I have huge CNN networks, but they fit in memory when I do just one training run. The problem is when I do cross-validation using scikit-learn's cross_val_score function: I see that memory increases with each fold. Do you know how to change that? After all, after each fold we only need to remember the results, not the huge model with all its weights.

    I’ve tried to use the on_train_end callback from Keras, but this doesn’t work, as the model is wiped out before evaluating. Do you know if another solution exists? Unfortunately I don’t see any callbacks in the cross_val_score function…

    I will be very glad for your help 🙂

  34. Sanaz October 25, 2017 at 10:17 pm #

    So many thanks for such a helpful code! I have a problem, despite defining the init both in the model and the dictionary in the same way that you have defined it here, I get an error:

    '{} is not a legal parameter'.format(params_name))

    ValueError: init is not a legal parameter

    Could you please help me with this problem?

  35. Ab November 9, 2017 at 9:31 pm #

    Thanks for your nice article.
    I have tried to use AdaBoostClassifier(model, n_estimators=2, learning_rate=1.5, algorithm="SAMME")
    and used a CNN as ‘model’. However I get the following error:

    File “”, line 1, in
    runfile(‘/home/aboozar/Sphere_FE/adaboost/’, wdir=’~/adaboost’)

    File “/usr/local/lib/python2.7/dist-packages/spyder/utils/site/”, line 688, in runfile
    execfile(filename, namespace)

    File “/usr/local/lib/python2.7/dist-packages/spyder/utils/site/”, line 93, in execfile
    builtins.execfile(filename, *where)

    File “/~/”, line 234, in, y_train)

    File “/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/”, line 413, in fit
    return super(AdaBoostClassifier, self).fit(X, y, sample_weight)

    File “/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/”, line 130, in fit

    File “/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/”, line 431, in _validate_estimator
    % self.base_estimator_.__class__.__name__)

    ValueError: KerasClassifier doesn’t support sample_weight.

    Do you have any advice?

  36. Arooj December 15, 2017 at 5:14 pm #

    I am new to keras. Just copied the above code with grid and executed it but getting this error:

    “ValueError: init is not a legal parameter”

    Your guidance is appreciated.

    • Jason Brownlee December 16, 2017 at 5:24 am #

      Double check that you copied all of the code with the same spacing.

      Also double check you have the latest version of Keras and sklearn installed.

  37. michele December 18, 2017 at 10:20 pm #

    Hi, great tutorial, so thanks!

    I have a question: if I want to use KerasClassifier with my own score function, let’s say maximizing F1, instead of accuracy, in a grid search scenario or similar.
    What should I do?


    • Jason Brownlee December 19, 2017 at 5:19 am #

      You can specify the scoring function used by sklearn when evaluating the model with CV or what have you, the “scoring” attribute.

      • michele December 20, 2017 at 12:50 am #

        Thanks for the answer. But, if I understood correctly, the score function of KerasClassifier must be differentiable, since it is also used as the loss function, and F1 is not.

        • Jason Brownlee December 20, 2017 at 5:46 am #

          No, that is the loss function of the keras model itself, a different thing from the sklearn’s evaluation of model predictions.
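As a concrete sketch of that separation (a plain sklearn classifier stands in for KerasClassifier here so the example runs standalone; the scoring argument is used the same way with the wrapper):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, random_state=7)

# Any estimator with fit/predict works here, including KerasClassifier.
model = LogisticRegression(solver="liblinear")

# "f1" is computed by sklearn from the model's predictions; it need not be
# differentiable because it is never used as the training loss.
scores = cross_val_score(model, X, y, cv=5, scoring="f1")
```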

  38. Ankur Singh December 19, 2017 at 3:24 pm #

    Hi Jason, your blog is amazing. I have your Deep Learning book as well.

    I have a question: I don’t want the stratified KFold in my code. I have my own validation data. Can I train my model on a given data and check the best scores on a different validation data using Grid Search?

  39. Claudio January 3, 2018 at 2:11 am #

    HI Jason…
    Thanks so much for this set of articles on Keras. I think it’s simply awesome.

    I have a question on the best way to run the grid search of deep learning model parameters on a Spark cluster. Is it just a matter of porting the code there without any change, with the optimization magically enabled (each combination of parameters tested in parallel on different nodes), or should we import other libraries or make some changes to enable that?

    thanks a lot in advance

    • Jason Brownlee January 3, 2018 at 5:39 am #

      Sorry, I cannot give you good advice for tuning Keras model with Spark.

  40. vikram singh January 4, 2018 at 11:29 pm #

    How do I do cross-validation when we have a multi-label classification problem?
    Whenever I pass Y_train, I get ‘IndexError: too many indices for array’. How do I resolve this?

  41. Sam Miller January 5, 2018 at 6:56 am #

    Hi Jason,
    How do you obtain the precision and recall scores from the model when using k-folds with KerasClassifier? Is there a method of generating the sklearn.metrics classification report after applying cross_val_score?
    I need these values as my dataset is imbalanced and I want to compare the results from before and after undersampling the data by generating a confusion matrix, ROC curve and precision-recall curve.

    • Jason Brownlee January 5, 2018 at 11:35 am #

      Yes, change the “scoring” argument to one or a list of metrics to report.
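For example, cross_validate accepts several metrics at once and reports each per fold (again with a plain sklearn classifier standing in for the wrapper):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=200, random_state=7)
model = LogisticRegression(solver="liblinear")

# A list of metric names yields one array of per-fold scores per metric.
results = cross_validate(model, X, y, cv=5,
                         scoring=["precision", "recall"])
precisions = results["test_precision"]
recalls = results["test_recall"]
```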

  42. yerra January 8, 2018 at 2:47 am #

    Hi Brownlee ,

    Thanks for the wonderful material on Machine Learning and Deep Learning. I have a dataset which is already split into training and test sets. Using Keras, how do I train the model and then predict on the test data?

  43. Reed Guo January 21, 2018 at 1:13 am #

    Hi, Jason

    How to find the best number of hidden layers and number of neurons?

    Can you post the python code? I didn’t find any useful posts by Google.

    Thank you very much.

    • Jason Brownlee January 21, 2018 at 9:12 am #

      Great question.

      You must use trial and error with your specific model on your specific data.
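One way to organize that trial and error is to expose the architecture as a tunable parameter and grid search over it. A runnable sketch using sklearn's MLPClassifier as a stand-in (with KerasClassifier you would instead give create_model a parameter such as neurons and list it in param_grid the same way):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=150, random_state=7)

# Each candidate architecture: one or two hidden layers of various widths.
param_grid = {"hidden_layer_sizes": [(4,), (8,), (8, 4)]}

grid = GridSearchCV(MLPClassifier(max_iter=300, random_state=7),
                    param_grid, cv=3)
grid.fit(X, y)
best = grid.best_params_["hidden_layer_sizes"]
```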

  44. Reed Guo January 21, 2018 at 7:23 pm #

    Hi, Jason

    Thanks for your response.

    I still feel confused. I don’t know how to do it (use grid search, or just try again and again?).

    Can you provide a clip of python code for the example in this course?

  45. Reed Guo January 23, 2018 at 12:19 pm #

    Hi, Jason

    Thank you very very much.

  46. Magesh Rathnam February 2, 2018 at 3:48 am #

    Hi Jason – thanks for the post. I was not aware of this wrapper. Do we have a similar wrapper for a regressor too? I am new to ML and I was trying to do a house price prediction problem. I see that it can be done with scikit-learn’s random forest regressors, but I want to see if we can do the same with Keras, since I started with Keras and find it a little easier. I tried with a simple sequential model with multiple layers, but it did not work. Can you please let me know how I can implement Keras for my problem?


  47. Atefeh February 6, 2018 at 12:28 am #


    I am facing this error:

    what is the problem?

  48. Atefeh February 6, 2018 at 5:35 pm #


    Again, after importing KerasClassifier, what should I import for KFold?

    from keras.wrappers.scikit_learn import KerasClassifier
    def create_model():

    return model
    model=KerasClassifier(build_fn=create_model, epochs=150, batch_size=10)
    kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)

    NameError Traceback (most recent call last)
    in ()
    6 return model
    7 model=KerasClassifier(build_fn=create_model, epochs=150, batch_size=10)
    —-> 8 kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)

    NameError: name ‘StratifiedKFold’ is not defined

    thank you

    • Jason Brownlee February 7, 2018 at 9:22 am #

      Looks like you are still missing some imports.

      Consider using existing code from the blog post as a starting point?

  49. K V Subrahmanyam February 13, 2018 at 8:41 pm #


    Thanks for the brilliant post. I have one question: why are we not passing epochs and batches as parameters to the create_model function? Can you please help me understand this?


    • Jason Brownlee February 14, 2018 at 8:18 am #

      Because we are passing them when we fit the model. The function is only used to define the model, not fit it.

  50. Janina February 18, 2018 at 9:27 am #

    Hi Jason, even though your post seems very straightforward, I still struggle to implement this grid search approach. It does not give me any error message, BUT it just runs on and on forever without printing anything. I deliberately tried it with very few epochs and very few hyperparameters to search. Without grid search, one epoch runs through extremely fast, so I don’t think I just need to give it more time. It simply doesn’t do anything.

    My code:

    Thanks a lot for any help!!

    • Jason Brownlee February 19, 2018 at 8:59 am #

      Perhaps try using a smaller sample of your data?

      Perhaps perform the grid search manually with your own for loop? or distributed across machines?

  51. Rishab Verma March 2, 2018 at 5:33 pm #

    Nothing can be more appreciated than looking at the replies that you have made to other people’s issues.

    I also have an issue. I want to train an SVM classifier on the weights that I have already saved from a model that used the fully connected layer of a ResNet. How do I feed the values into the SVM classifier?
    Any guidance is appreciated.

    • Jason Brownlee March 3, 2018 at 8:07 am #

      Sorry, I don’t know how to load neural net weights into an SVM.

  52. Choi March 6, 2018 at 2:09 pm #

    Hi Jason,
    Thank you for your tutorials.
    Is there any other way to fit the model without using ‘KerasClassifier’?
    Because I want to train the model with augmented data, which is achieved by using ‘ImageDataGenerator’.

    More specifically,

    You can change the arguments X_train and Y_train with the ‘fit()’ function, as written below.

    history = model.fit(X_train, Y_train, nb_epoch=nb_epoch, batch_size=batch_size, shuffle=True, verbose=1)

    However, you can’t change the arguments X_train and Y_train using the ‘KerasClassifier’ function, as written below, because there are no arguments for input data in this function.

    model = KerasClassifier(build_fn=create_model, epochs=50, batch_size=1, verbose=0)

    Thanks in advance!

  53. Elena Markova April 19, 2018 at 1:42 am #

    Hi Jason,

    Thank you for the article
    Can you explain how the k-fold cross-validation example with scikit-learn differs from just using validation_split = 1/k in Keras when fitting a model? Sorry, I’m new to machine learning. Is the only difference that the validation split option in Keras never changes or shuffles its validation data?

    Thanks in advance!

    • Jason Brownlee April 19, 2018 at 6:36 am #

      A validation split is a single split of the data. One model evaluated on one dataset.

      k-fold cross-validation creates k models evaluated on k disjoint test sets.

      It is a better estimate of the skill of the model trained on a random sample of data of a given size, but it comes at an increased computational cost, especially for deep learning models that are slow to train.
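The difference is visible in the fold indices themselves: a validation split holds out one fixed slice, while k-fold rotates k disjoint test sets that together cover every sample. A small sketch:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)

# k-fold: k models, k disjoint test sets that together cover all samples.
kfold = KFold(n_splits=5, shuffle=True, random_state=7)
test_sets = [set(test_idx) for _, test_idx in kfold.split(X)]
covered = set().union(*test_sets)

# By contrast, Keras's validation_split=0.2 holds out one fixed 20% slice
# (the tail of the data) for a single training run.
```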

  54. Saber May 27, 2018 at 12:45 am #

    I have read in a few places that k-fold CV is not very common with DL models as they are computationally heavy. But k-fold CV is also used to pick an “optimal” decision/classification threshold for a desired FPR, FNR, etc. before going to the test data set. Do you have any suggestions on how to do “threshold optimization” with DL models? Is k-fold CV the only option?


    • Jason Brownlee May 27, 2018 at 6:46 am #

      Makes sense.

      Yes, each split you can estimate the threshold to use from the train data and test it on the hold out fold.
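A sketch of that per-fold procedure (the candidate thresholds and the stand-in model are illustrative; any probability-producing model can play the role of the network):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=200, random_state=7)
model = LogisticRegression(solver="liblinear")

fold_scores = []
for train_idx, test_idx in StratifiedKFold(n_splits=5, shuffle=True,
                                           random_state=7).split(X, y):
    model.fit(X[train_idx], y[train_idx])
    # Estimate the best threshold on the training folds only...
    train_probs = model.predict_proba(X[train_idx])[:, 1]
    candidates = np.arange(0.1, 0.9, 0.1)
    best_t = max(candidates, key=lambda t: f1_score(
        y[train_idx], (train_probs >= t).astype(int)))
    # ...then apply it, untouched, to the held-out fold.
    test_probs = model.predict_proba(X[test_idx])[:, 1]
    fold_scores.append(f1_score(y[test_idx], (test_probs >= best_t).astype(int)))
```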

  55. Sazid May 27, 2018 at 8:59 pm #

    Which will give me better results: the Keras scikit-learn wrapper class, or general machine learning algorithms such as KNN, SVM, decision trees, and random forests?
    Please help me, I’m confused.

  56. Mik June 2, 2018 at 1:12 am #

    Hey Jason,

    I’ve followed your blog for some time now and I find your posts on ML very useful, as it sort of fills out some of the holes that many documentation sites leave out. For example, your posts on Keras are lightyears ahead of Keras’ own documentation in terms of clarity. So, please keep up the fantastic work mate.

    I have run into a periodic problem when using Keras. Far from every time, but occasionally (I’d say 1 in every 10 times or so), the code fails with this error:

    Exception ignored in: <bound method BaseSession.__del__ of >
    Traceback (most recent call last):
    File “/home/mede/virtualenv/lib/python3.5/site-packages/tensorflow/python/client/”, line 707, in __del__

    I believe it’s the same error that these Stack Overflow posts describe. Naturally, I have tried to apply the fixes that those posts suggest. However, since (as far as I can see) the error occurs within the loop that executes the grid search, I cannot delete the TensorFlow session between each run. Do you know of, or can you think of, a workaround for this?

    Thanks in advance!

    • Jason Brownlee June 2, 2018 at 6:35 am #

      I have not seen this before sorry.

      Perhaps ensure that all libs and dependent libs are up to date?
      Perhaps move to Py 3.6?

  57. Mik June 4, 2018 at 7:27 am #

    OK, I’ll give it a shot, thanks.

  58. Chase July 11, 2018 at 6:33 am #

    Thanks for the great content!

    I’m looking for some guidance on building learning curves that specifically show the training and test error of a sequential keras model as the training examples increases. I’ve put together some code already and have gotten hung up on an error that is making me rethink my approach. Any suggestions would be appreciated.

    For reference this is the current error:

    TypeError: Cannot clone object (type ): it does not seem to be a scikit-learn estimator as it does not implement a 'get_params' methods.
