Use Keras Deep Learning Models with Scikit-Learn in Python

Keras is one of the most popular deep learning libraries in Python for research and development because of its simplicity and ease of use.

The scikit-learn library is the most popular library for general machine learning in Python.

In this post you will discover how you can use deep learning models from Keras with the scikit-learn library in Python.

This will allow you to leverage the power of the scikit-learn library for tasks like model evaluation and model hyper-parameter optimization.

Let’s get started.

Overview

Keras is a popular library for deep learning in Python, but its focus is deep learning alone. In fact, it strives for minimalism, providing only what you need to quickly and simply define and build deep learning models.

The scikit-learn library in Python is built upon the SciPy stack for efficient numerical computation. It is a fully featured library for general machine learning and provides many utilities that are useful in the development of deep learning models. Not least:

  • Evaluation of models using resampling methods like k-fold cross validation.
  • Efficient search and evaluation of model hyper-parameters.

The Keras library provides a convenient wrapper for deep learning models to be used as classification or regression estimators in scikit-learn.

In the next sections, we will work through examples of using the KerasClassifier wrapper for a classification neural network created in Keras and used in the scikit-learn library.

The test problem is the Pima Indians onset of diabetes classification dataset. This is a small dataset with all numerical attributes that is easy to work with. Download the dataset and place it in your current working directory with the name pima-indians-diabetes.csv.

The following examples assume you have successfully installed Keras and scikit-learn.

Evaluate Deep Learning Models with Cross Validation

The KerasClassifier and KerasRegressor classes in Keras take an argument build_fn, which is the function to call to create your model.

You must define a function, named whatever you like, that defines your model, compiles it, and returns it.

In the example below, we define a function create_model() that creates a simple multi-layer neural network for the problem.

We pass this function to the KerasClassifier class via the build_fn argument. We also pass in additional arguments of nb_epoch=150 and batch_size=10. These are automatically bundled up and passed on to the fit() function, which is called internally by the KerasClassifier class.

In this example, we use the scikit-learn StratifiedKFold to perform 10-fold stratified cross-validation. This is a resampling technique that can provide a robust estimate of the performance of a machine learning model on unseen data.

We use the scikit-learn function cross_val_score() to evaluate our model using the cross-validation scheme and print the results.
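
Putting this together, the complete example might look like the following minimal sketch. It assumes the keras.wrappers.scikit_learn module is available; older Keras used nb_epoch (as in the text above) where newer versions expect epochs, and the two hidden layers of 12 and 8 neurons are simply one reasonable choice for this dataset.

# MLP for the Pima Indians dataset evaluated with 10-fold stratified cross-validation
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
import numpy

# function to create the model, required by KerasClassifier
def create_model():
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# fix the random seed for reproducibility
seed = 7
numpy.random.seed(seed)

# load the Pima Indians dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X = dataset[:, 0:8]
Y = dataset[:, 8]

# wrap the model and evaluate it with 10-fold stratified cross-validation
model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(model, X, Y, cv=kfold)
print(results.mean())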

Running the example displays the skill of the model for each epoch. A total of 10 models are created and evaluated and the final average accuracy is displayed.

Grid Search Deep Learning Model Parameters

The previous example showed how easy it is to wrap your deep learning model from Keras and use it in functions from the scikit-learn library.

In this example, we go a step further. The function that we specify to the build_fn argument when creating the KerasClassifier wrapper can take arguments. We can use these arguments to further customize the construction of the model. In addition, we know we can provide arguments to the fit() function.

In this example, we use a grid search to evaluate different configurations for our neural network model and report on the combination that provides the best-estimated performance.

The create_model() function is defined to take two arguments, optimizer and init, both of which must have default values. This will allow us to evaluate the effect of using different optimization algorithms and weight initialization schemes for our network.

After creating our model, we define arrays of values for the parameters we wish to search, specifically:

  • Optimizers for searching different weight values.
  • Initializers for preparing the network weights using different schemes.
  • Epochs for training the model for a different number of exposures to the training dataset.
  • Batches for varying the number of samples before a weight update.

The options are specified in a dictionary and passed to the configuration of the GridSearchCV scikit-learn class. This class will evaluate a version of our neural network model for each combination of parameters (2 x 3 x 3 x 3 for the combinations of optimizers, initializations, epochs and batches). Each combination is then evaluated using the default of 3-fold stratified cross-validation.

That is a lot of models and a lot of computation. This is not a scheme that you want to use lightly because of the time it will take. It may be useful for you to design small experiments with a smaller subset of your data that will complete in a reasonable time. This is reasonable in this case because of the small network and the small dataset (less than 1000 instances and 9 attributes).

Finally, the performance and combination of configurations for the best model are displayed, followed by the performance of all combinations of parameters.
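
A minimal sketch of this grid search follows, assuming the same dataset and wrapper as before. Note that Keras 2 uses kernel_initializer on layers (Keras 1, for which this post was written, used init) and epochs rather than nb_epoch; the specific value lists below are assumptions chosen to match the 2 x 3 x 3 x 3 grid described above.

# grid search optimizer, initialization, epochs and batch size for the Pima Indians MLP
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
import numpy

# function to create the model, taking the parameters we want to tune (with default values)
def create_model(optimizer='rmsprop', init='glorot_uniform'):
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer=init, activation='relu'))
    model.add(Dense(8, kernel_initializer=init, activation='relu'))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

seed = 7
numpy.random.seed(seed)
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X = dataset[:, 0:8]
Y = dataset[:, 8]

model = KerasClassifier(build_fn=create_model)

# define the grid of parameters to search (2 x 3 x 3 x 3 = 54 combinations)
optimizers = ['rmsprop', 'adam']
inits = ['glorot_uniform', 'normal', 'uniform']
epochs = [50, 100, 150]
batches = [5, 10, 20]
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=inits)

grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X, Y)

# summarize the best result, then the score for every combination
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))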

This might take about 5 minutes to complete on your workstation executed on the CPU (rather than the GPU). Running the example shows the results below.

We can see that the grid search discovered that using a uniform initialization scheme, rmsprop optimizer, 150 epochs and a batch size of 5 achieved the best cross-validation score of approximately 75% on this problem.

Summary

In this post, you discovered how you can wrap your Keras deep learning models and use them in the scikit-learn general machine learning library.

You can see that using scikit-learn for standard machine learning operations such as model evaluation and model hyperparameter optimization can save a lot of time over implementing these schemes yourself.

Wrapping your model allowed you to leverage powerful tools from scikit-learn to fit your deep learning models into your general machine learning process.

Do you have any questions about using Keras models in scikit-learn or about this post? Ask your question in the comments and I will do my best to answer.

98 Responses to Use Keras Deep Learning Models with Scikit-Learn in Python

  1. Shruthi June 2, 2016 at 6:38 am #

    First, this is extremely helpful. Thanks a lot.

    I’m new to keras and i was trying to optimize other parameters like dropout and number of hidden neurons. The grid search works for the parameters listed above in your example. However, when i try to optimize for dropout the code errors out saying it’s not a legal parameter name. I thought specifying the name as it is in the create_model() function should be enough; obviously I’m wrong.

    in short: if i had to optimize for dropout using GridSearchCV, how would the changes to your code look?

    apologies if my question is naive, trying to learn keras, python and deep learning all at once. Thanks,

    Shruthi

    • Jason Brownlee June 23, 2016 at 10:20 am #

      Great question.

      As you say, you simply add a new parameter to the create_model() function called dropout_rate then make use of that parameter when creating your dropout layers.

      Below is an example of grid searching dropout values in Keras:
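
      (A minimal sketch, assuming the same Pima dataset as the post and a dropout_rate argument with a default value; the specific dropout values here are only illustrative.)

      # grid search the dropout rate via a parameter on create_model()
      from keras.models import Sequential
      from keras.layers import Dense, Dropout
      from keras.wrappers.scikit_learn import KerasClassifier
      from sklearn.model_selection import GridSearchCV
      import numpy

      def create_model(dropout_rate=0.0):
          model = Sequential()
          model.add(Dense(12, input_dim=8, activation='relu'))
          model.add(Dropout(dropout_rate))  # the tunable parameter is used here
          model.add(Dense(1, activation='sigmoid'))
          model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
          return model

      dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
      X, Y = dataset[:, 0:8], dataset[:, 8]

      model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=10, verbose=0)
      param_grid = dict(dropout_rate=[0.0, 0.1, 0.2, 0.3])
      grid = GridSearchCV(estimator=model, param_grid=param_grid)
      grid_result = grid.fit(X, Y)
      print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))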

      Running the example reports the cross-validation score for each dropout value and the best combination found.

      I hope that helps

  2. Rish June 22, 2016 at 10:01 pm #

    Hi Jason,

    Thanks for the post, this is awesome. I’ve found the grid search very helpful.

    One quick question: is there a way to incorporate early stopping into the grid search? With a particular model I am playing with, I find it can often over-train and consequently my validation loss suffers. Whilst I could incorporate an array of epoch parameters (like in your example), it seems more efficient to just have it stop if the validation accuracy increases over a small number of epochs. Unless you have a better idea?

    Thanks again!

    • Jason Brownlee June 23, 2016 at 5:30 am #

      Great comment Rish and really nice idea.

      I don’t think scikit-learn supports early stopping in its parameter searching. You would have to rig up your own parameter search and add an early stop clause to it, or consider modifying sklearn itself.

      I also wonder whether you could hook up Keras checkpointing and capture the best parameter combinations along the way to file (checkpoint callback) and allow you to kill the search at any time.

      • Rish July 19, 2016 at 12:06 pm #

        Thanks for the reply! 🙂 I wonder too!

        • Vadim March 13, 2017 at 8:16 am #

          Hey Jason,

          Awesome article. Did you by any chance find a way to hookup callback functions to grid search? In this case it would be possible to have Tensorboard aggregating and visualizing the grid search outcomes.

  3. Rishabh August 17, 2016 at 3:13 am #

    Hi Jason,

    Great Post. Thanks for this.

    One problem that I have always faced with training a deep learning model (in H2O as well) is that the predicted probability distribution is always flat (very little variation in probability across the sample). Any other ML model, e.g. RF/GBM, is easier to tune and gives good results in most cases. So my doubt is two-fold:

    1. Except for, let's say, image data where a CNN might be a good thing to try, in what scenarios should we try to fit a deep learning model?
    2. I think the issues that I face with deep learning models are usually due to underfitting. Can you please give some tips on how to tune a deep learning model (other ML models are easier to tune)?

    Thanks

    • Jason Brownlee August 17, 2016 at 9:53 am #

      Deep learning is good on raw data like images, text, audio and similar. It can work on complex tabular data, but often feature engineering + xgboost can do better in practice.

      Keep adding layers/neurons (capacity) and training longer until performance flattens out.

      • Rishabh August 17, 2016 at 7:51 pm #

        Thanks. I will try to add more layers and add some regularization parameters as well

        • Jason Brownlee August 18, 2016 at 7:16 am #

          Good luck Rishabh, let me know how you go.

          • Rishabh September 10, 2016 at 11:26 pm #

            Hi Jason, I was able to tune the model, thanks to your post although GBM had a better fit. I had included prelu layers which improved the fit.

            One question, is there an optimal way to find the number of hidden neurons and hidden layer count, or is grid search the only option?

          • Jason Brownlee September 12, 2016 at 8:27 am #

            Tuning (grid/random) is the best option I know of Rishabh.

          • Rishabh September 12, 2016 at 5:03 pm #

            Thanks. Grid search takes a toll on my 16 GB laptop, hence searching for an optimal way.

          • Jason Brownlee September 13, 2016 at 8:10 am #

            It is punishing. Small grids are kinder on memory.

  4. xiao September 17, 2016 at 10:59 pm #

    Hi Jason,
    Thanks for the post. I'm a beginner with keras, and I met some problems using keras with sk-learn recently. If it's convenient, could you do me a favor?

    Details here:
    http://stackoverflow.com/questions/39467496/error-when-using-keras-sk-learn-api

    Thank you!!!

    • Jason Brownlee September 18, 2016 at 8:01 am #

      Ouch. Nothing comes to mind Xiao, sorry.

      I would debug it by cutting it back to the simplest possible network/code that generates the problem then find the line that causes the problem, and go from there.

      If you need a result fast, I would use the same decompositional approach to rapidly find a workaround.

      Let me know how you go.

      • xiao September 18, 2016 at 10:43 am #

        Thanks a lot, Jason.
        I need the result fast if possible.

      • xiao September 21, 2016 at 12:22 pm #

        Hi, Jason. Got any idea?

        • Jason Brownlee September 22, 2016 at 8:06 am #

          Yes, I gave you an approach to debug the problem in the previous comment Xiao.

          I don’t know any more than that, sorry.

          • xiao September 22, 2016 at 4:52 pm #

            Thanks, I’ll try it out.

  5. Josh September 22, 2016 at 12:12 am #

    Jason, thanks for the tutorial, it saved me a lot of time. I am running a huge amount of data on a remote server from shell files. The output of the model is written to an additional shell file in case there are errors. However, when I run my code, following your approach above, it outputs the status of training, i.e. epoch number and accuracy, for every model in gridsearch. Is there a way to suppress this output? I tried to use "verbose=0" as an additional argument both in calling "fit", which created an error, and in GridsearchCV, which did not do anything.

    Thanks

    • Jason Brownlee September 22, 2016 at 8:17 am #

      Great question Josh.

      Pass verbose=0 into the constructor of your classifier:
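
      For example (a sketch, assuming the create_model() function from the post; older Keras used nb_epoch instead of epochs):

      # verbose=0 here silences the per-epoch training output for every model in the search
      model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10, verbose=0)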

  6. Tom October 5, 2016 at 9:09 pm #

    Hello Jason,
    First of all thank you so much for your guides and examples regarding Keras and deep learning!! Please keep on going 🙂
    Question 1
    Is it possible to save the best trained model with grid and set some callbacks (for early stopping as well)? I wanted to implement saving the best model by doing
    checkpoint=ModelCheckpoint(filepath, monitor=’val_acc’, verbose=0, save_best_only=True, mode=’max’)
    grid_result = grid.fit(X, Y, callbacks=[checkpoint])
    but TypeError: fit() got an unexpected keyword argument ‘callbacks’
    Question 2
    Is there a way to visualize the trained weights and literally see the created network? I want to make neural networks a bit more practical instead of only classification. I know you can plot the model: plot(model, to_file='model.png'), but I want to implement my data in this model.
    Thanks in advance,
    Tom

    • Jason Brownlee October 6, 2016 at 9:37 am #

      Hi Tom, I’m glad you’re finding my material valuable.

      Sorry, I don’t know about call-backs in the grid search. Sounds like it might just make a big mess (i.e. not designed to do this).

      I don’t know about built in ways to visualize the weights. Personally, I like to look at the weights for production models to see if something crazy is going on – but just the raw numbers.

      • David August 22, 2017 at 4:46 pm #

        Hi Jason,
        Thanks for the very well written article! Very clear and easy to understand.
        I have a callback that does different types of learning rate annealing. It has four parameters I’d like to optimise.
        Based on your above comment I’m guessing the SciKit wrapper won’t work for optimising this?
        Do you know how I would do this?
        Many thanks for your help.
        David

        • Jason Brownlee August 23, 2017 at 6:41 am #

          You might have to write your own for loop David. In fact, I’d recommend it for the experience and control it offers.

  7. Soren Pallesen October 18, 2016 at 5:08 pm #

    Hi there, thanks for all your inspiration. When running the above example i get a slightly different result:

    Best: 0.751302 using {‘optimizer’: ‘rmsprop’, ‘batch_size’: 5, ‘init’: ‘normal’, ‘nb_epoch’: 150}

    i.e. init: normal and not uniform as in your example.

    Is it normal with these variations?

    • Jason Brownlee October 19, 2016 at 9:16 am #

      Hi Soren,

      Results can vary based on the initialization method. It is hard to predict how they will vary for a given problem.

  8. Saddam December 15, 2016 at 9:20 pm #

    Hey am getting an error “ValueError: init is not a legal parameter”
    code is as follows.
    init = [‘glorot_uniform’, ‘normal’, ‘uniform’]
    batches = numpy.array([50, 100, 150])
    param_grid = dict(init=init)
    print(str(self.model_comp.get_params()))
    grid = GridSearchCV(estimator=self.model_comp, param_grid=param_grid)
    grid_result = grid.fit(X_train, Y_train)

    i can’t figure out what am doing wrong. Help me out here.

    • Jason Brownlee December 16, 2016 at 5:41 am #

      Sorry Saddam, the cause does not seem obvious.

      Perhaps post more of the error message or consider posting the question on stack overflow?

      • Palash Goyal January 15, 2017 at 11:36 pm #

        Hi Saddam and Jason,

        The error is due to the init parameter ‘glorot_uniform’.
        Seems like it has been deprecated or something, once you remove this from the possible values (i.e., init=[‘uniform’,’normal’]) your code will work.

        Thanks

        • Jason Brownlee January 16, 2017 at 10:41 am #

          It does not appear to be deprecated:
          https://keras.io/initializations/

          • Nils Holgersson January 23, 2017 at 4:11 am #

            Don’t forget to add the parameter (related to parameters inside the creation of your model) that you want to iterate over as input parameter to the function that creates your model.

            def create_model(init=’normal’):

  9. Tameru December 21, 2016 at 9:44 am #

    Hi Jason,

    recently, I tried to use K-fold cross validation for Image classification problem and found the following error

    training Image X shape= (2041,64,64)
    label y shape= (2041,2)

    code:

    model = KerasClassifier(build_fn=creat_model, nb_epoch=15, batch_size=10, verbose=0)

    # evaluate using 6-fold cross validation

    kfold = StratifiedKFold(n_splits=6, shuffle=False, random_state=seed)

    results = cross_val_score(model, x, y, cv=kfold)

    print results

    print ‘Mean=’,

    print(results.mean())

    an error:

    IndexError Traceback (most recent call last)
    in ()
    2 # evaluate using 6-fold cross validation
    3 kfold = StratifiedKFold(n_splits=6, shuffle=False, random_state=seed)
    —-> 4 results = cross_val_score(model, x, y, cv=kfold)
    5 print results
    6 print ‘Mean=’,

    IndexError: too many indices for array

    I don’t understand what is wrong here?
    Thanks,

    • Jason Brownlee December 22, 2016 at 6:29 am #

      Sorry Tameru, I have not seen this error before. Perhaps try stack overflow?

    • coyan January 5, 2017 at 6:50 pm #

      Hi, I have the same problem. Do you have any easy way to solve it?

      Thanks!

      • Jason Brownlee January 6, 2017 at 9:07 am #

        Sounds like it could be a data issue.

        Perhaps your data does not have enough observations in each class to split into 6 groups?

        Ideas:
        – Try different values of k.
        – Try using KFold rather than StratifiedKFold
        – Try a train/test split

        • Tameru January 8, 2017 at 7:04 am #

          @ Coyan, I am still trying to solve it. let us try Jason’s advice. Thanks, Jason.

    • JiaMingLin January 20, 2017 at 8:47 pm #

      Hi

      I also encounter this problem, and I guess the scikit-learn k-fold functions do not accept the “one hot” vectors. You may try on StratifiedShuffleSplit with the list of “one hot” vectors.

      And that means you can only evaluate Keras model with scikit-learn in the binary classification problem.

  10. Ali January 15, 2017 at 7:52 pm #

    Dear Dr. Brownlee,

    Your results here are around 75%. My experiment with my data result in around 85%. Is this considered good result?

    Because DNN and RNN are known for their great performances. I wonder if it’s normal and how we can improve the results.

    Regards,

  11. Miriam February 3, 2017 at 1:01 am #

    Hi Jason,

    I cannot tell you how awesome your tutorials are in terms of saving me time trying to understand Keras,

    However, I have run into a conceptual wall re: training, validation and testing. I originally understood that wrapping Keras in Gridsearch helped me tune my hyperparameters. So with GridsearchCV, there is no separate training and validation sets. This I can live with as it is the case with any CV.

    But then I want to use Keras to predict my outcomes on the model with optimized hyperparameters. Every example I see for model.fit/model.evaluate uses the argument validation_data (or validation_split), and I understand that we're using our test set as a validation set — a real no-no.

    Please see https://github.com/fchollet/keras/issues/1753 for a discussion of this and proof that I am not the only one confused.

    SO MY QUESTION IS: In completing your wonderful cookbook how to’s for novices, after I have found all my hyperparameters, how do I run my test data?

    If I use model.fit, won’t the test data be unlawfully used to retrain? What exactly is happening with the validation_data or _split argument in model.fit in keras????

    • Jason Brownlee February 3, 2017 at 10:05 am #

      Hi Miriam,

      Generally, you can hold out a separate validation set for evaluating a final model and set of parameters. This is recommended.

      Model selection and tuning can be performed on the same test set using a suitable resampling method (k-fold cross-validation with repeats). Ideally, separate datasets would be used for algorithm selection and parameter selection, but this is often too expensive (in terms of available data).

  12. wenger March 6, 2017 at 5:12 pm #

    i follow the step ,but error

    ValueError: optimizer is not a legal parameter

    i don’t konw how to deal with it

    • Jason Brownlee March 7, 2017 at 9:35 am #

      I’m sorry to hear that wenger. Perhaps confirm that you have the latest versions of Keras and sklearn installed?

  13. myk March 31, 2017 at 9:25 pm #

    hello how can i implement SVM machine learning algorithm by using scikit library with keras

    • Jason Brownlee April 1, 2017 at 5:54 am #

      Keras is for deep learning, not SVM. You only need sklearn.

  14. Ronen April 3, 2017 at 10:02 pm #

    Hi Jason,

    As always very informative and skillfully written post.

    In the face of an extremely unbalanced dataset, how would you pipeline an under-sampling pre-processing step into the example above?

    Thanks !

  15. Jens April 16, 2017 at 6:54 am #

    Hey Jason,

    I played around with your code for my own project and encountered an issue that I get different results when using the pipeline (both without standardization)
    I posted this also on crossvalidated:
    https://stats.stackexchange.com/questions/273911/different-results-for-keras-sklearn-wrapper-with-and-without-use-of-pipline

    Can you help me with that?
    Thank you

  16. Carlton banks April 30, 2017 at 11:15 am #

    Could you provide an example in which a data generator and fit_generator are being used?

  17. Anni May 3, 2017 at 7:59 pm #

    Hi Jason,

    Thanks for the post, this is awesome. But i’m facing

    ValueError: Can’t handle mix of multilabel-indicator and binary

    when i set scoring function to precision, recall, or f1 in cross validation. But it works fine if i didn’t set scoring function just like you did. Do you have any easy way to solve it?

    Big Thanks!

    Here’s the code :
    scores2=cross_val_score(model, X_train.as_matrix(), y_train, cv=10, scoring=’precision’)

    • Jason Brownlee May 4, 2017 at 8:06 am #

      Is this happening with the dataset used in this tutorial?

      • Anni May 5, 2017 at 4:57 am #

        No, actually its happening with breast cancer data set from UCI Machine Learning. I used train_test_split to split the data into training and testing. But when I tried to fit the model, i got

        IndexError: indices are out-of-bounds.

        So I tried to modify y_train by following code :

        y_train = np_utils.to_categorical(y_train)

        Do you have any idea ? i tried to solve this error for a week and still cannot fixed the problem.

        • Jason Brownlee May 5, 2017 at 7:34 am #

          I believe all the variables in that dataset are categorical.

          I expect you will need to use an integer encoding and a one hot encoding for each variable.

          • Anni May 5, 2017 at 10:24 pm #

            Okay, I will try your suggestion. Thanks for your reply 🙂

  18. Edward May 12, 2017 at 7:24 am #

    Jason is there something in deep learning like feature_importance in xgboost?
    For images it makes no sense though in this case it can be important

    • Jason Brownlee May 12, 2017 at 7:52 am #

      There may be, I’m not aware of it.

      You could use a neural net within an RFE process.

      • Edward May 12, 2017 at 7:59 am #

        Thanks for your reply, Jason! Your blog is GOLD 🙂

  19. Adrian May 26, 2017 at 12:04 am #

    Do you know how to save the hyperparameters with the TensorBoard callback?

  20. Anisa May 28, 2017 at 6:12 pm #

    Hi Jason!

    I'm doing grid search with my own scoring function, but I need to get results like accuracy and recall from the training model. So, I use cross_val_score with the best params that I get from grid search. But then cross_val_score produces a different result from the best score that I got from grid search. Do you have any idea to solve my problem?

    Thanks,

  21. Shabran May 30, 2017 at 3:41 pm #

    Hi Jason!

    I'm doing grid search with my own function as the scoring function, but I need to report other metrics from the best params that I got from grid search. So, I'm doing cross-validation with the best params. But the problem is that cross-validation produces a different result than the best score from grid search. The difference is really significant. Do you have any idea to solve this problem?

    Thanks.

  22. Szymon June 6, 2017 at 6:52 am #

    How could we use Keras with ensembles, let's say Voting? When base models are from the sklearn library everything works fine, but with Keras I'm getting 'TypeError: cannot create 'sys.flags' instances'. Do you know of any workaround?

    • Jason Brownlee June 6, 2017 at 10:09 am #

      You can perform voting manually on the lists of predictions from each sub model.
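
      For example, a soft-voting sketch over a list of already-fitted Keras models (the names here are illustrative, not from the post):

      import numpy as np

      def vote(models, X):
          # average the predicted probabilities of the fitted models, then threshold
          probs = np.mean([m.predict(X) for m in models], axis=0)
          return (probs > 0.5).astype(int)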

  23. Kirana June 14, 2017 at 3:12 pm #

    I'm evaluating 3 algorithms (SVM-RBF, XGBoost, and MLP) against small datasets. Is it true that SVM and XGBoost are suitable for small datasets whereas deep learning requires "relatively" large datasets to work well? Could you please explain this to me?

    Thanks a lot.

    • Jason Brownlee June 15, 2017 at 8:42 am #

      Perhaps, let the results and hard data drive your model selection.

  24. James June 17, 2017 at 2:55 am #

    Great article. It seems similar to what I am currently learning about. I’m using the sklearn wrapper for Keras; particularly the KerasClassifier and sklearn’s cross_val_score(). I am running into an issue with the n_jobs parameter for cross_val_score. I’d like to take advantage of the GPU on my Windows 10 machine, but am getting Blas GEMM launch failed when n_jobs = -1 or any value above 1. Also getting different results if I run it from the shell prompt vs. the python interpreter. Any ideas what I can do to get this to work?

    Windows 10
    Geforce 1080 Ti
    Tensorflow GPU
    Python 3.6 via Anaconda
    Keras 2.0.5

    • Jason Brownlee June 17, 2017 at 7:33 am #

      Sorry James, I don’t have good advice on setting up GPUs on windows for Keras.

      Perhaps you can post on stackoverflow?

  25. Anastasios Selalmazidis June 18, 2017 at 2:17 am #

    Hello Jason,

    when I run your first example, the one which uses StratifiedKFold, with a multiclass dataset of mine, I get an error. Isn't it possible to run StratifiedKFold with multiclass? I also get the same "IndexError: too many indices for array" problem when I try to run a GridSearch with StratifiedKFold.

  26. Sean June 27, 2017 at 5:24 pm #

    Hi Jason,
    I follow your steps and the program takes more than 40 minutes to run. Therefore, I gave up waiting in the middle. Do you know if there is any way to speed up the GridSearchCV? Or is it normal to wait more than 40 minutes to run this code on a 2013 Mac? Thank you

    • Jason Brownlee June 28, 2017 at 6:18 am #

      You could cut down the number of parameters that are being searched.

      I often run grid searches that run for weeks on AWS.

  27. Clarence Wong June 27, 2017 at 8:31 pm #

    Hi Jason,

    How do I save the model when wrapping a Keras Classifier in Scikit’s GridSearchCV?

    Do I treat it as a scikit object and use pickle/joblib or do I use the model.save method native to Keras?

    • Jason Brownlee June 28, 2017 at 6:23 am #

      Sorry, I have not tried this. It may not be supported out of the box.

  28. Karthik July 15, 2017 at 11:39 am #

    Hi Jason,

    I want to perform stratified K-fold cross-validation with a model that predicts a distance and not a binary label. Is it possible to provide a distance threshold to the cross-validation method in scikit-learn (or is there some other approach), i.e., distance < 0.5 to be treated as a positive label (y=1) and a negative label (y=0) otherwise?

    • Jason Brownlee July 16, 2017 at 7:56 am #

      Stratification requires a class value.

      Perhaps you can frame your problem so the outcome distances are thresholded class labels.

      Note, this is a question of how you frame your prediction problem and prepare your data, not the sklearn library.

  29. Merco July 15, 2017 at 11:12 pm #

    Very cool!! And can you use the Recursive Feature Elimination sklearn function this way too?

  30. ambika July 19, 2017 at 4:33 pm #

    Please could you tell me the formula for the relu function? I need it for regression.

  31. ds July 21, 2017 at 7:34 pm #

    Is it possible to use your code with more complex model architectures? I have 3 different inputs and branches (with ConvLayers), which are concatenated at the end to form a dense layer.
    I tried to call the grid.fit() function as follows:
    grid_result = grid.fit({‘input_1’: train_input_1, ‘input_2’: train_input_2, ‘input_3’: train_input_3}, {‘main_output’: lebels_train})

    I’m getting an error:
    “ValueError: found input variables with inconsistent numbers of samples: [3, 1]”

    Do you have any experience on that?

    • Jason Brownlee July 22, 2017 at 8:32 am #

      I don’t sorry.

      I would recommend using native Keras.

  32. Nas October 10, 2017 at 4:23 pm #

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.wrappers.scikit_learn import KerasClassifier
    from sklearn.model_selection import StratifiedKFold
    from sklearn.model_selection import cross_val_score
    import numpy

    def create_model():
        # create model
        model = Sequential()
        model.add(Dense(12, input_dim=8, activation='relu'))
        model.add(Dense(8, activation='relu'))
        model.add(Dense(1, activation='sigmoid'))
        # Compile model
        model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model

    seed = 7
    numpy.random.seed(seed)
    dataset = numpy.loadtxt("/home/nasrin/nslkdd/NSL_KDD-master/KDDTrain+.csv", delimiter=",")
    X = dataset[:, 0:41]
    Y = dataset[:, 41]

    model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10, verbose=0)
    kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    results = cross_val_score(model, X, Y, cv=kfold)
    print(results.mean())

    Using TensorFlow backend.
    Traceback (most recent call last):
    File “nsl2.py”, line 20, in
    dataset = numpy.loadtxt(“/home/nasrin/nslkdd/NSL_KDD-master/KDDTrain+.csv”, delimiter=”,”)
    File “/home/nasrin/.local/lib/python3.5/site-packages/numpy/lib/npyio.py”, line 1024, in loadtxt
    items = [conv(val) for (conv, val) in zip(converters, vals)]
    File “/home/nasrin/.local/lib/python3.5/site-packages/numpy/lib/npyio.py”, line 1024, in
    items = [conv(val) for (conv, val) in zip(converters, vals)]
    File “/home/nasrin/.local/lib/python3.5/site-packages/numpy/lib/npyio.py”, line 725, in floatconv
    return float(x)
    ValueError: could not convert string to float: b’tcp’
    Sorry to bother you, what step did I miss here?

  33. Piotr October 12, 2017 at 10:33 pm #

    Do you know, how to do K.clear_session() inside cross_val_score(), between folds?

    I have huge CNN networks, but they fit in my memory when I do just one training. The problem is when I do cross-validation using scikit-learn and the cross_val_score function. I see that memory is increasing with each fold. Do you know how to change that? After all, after each fold we have to remember just the results, not the huge model with all its weights.

    I’ve tried to use on_train_end callback from keras, but this doesn’t work as model is wiped out before evaluating. So do you know if exists other solution? Unfortunately I don’t see any callbacks in cross_val_score function…

    I will be very glad for your help 🙂

  34. Sanaz October 25, 2017 at 10:17 pm #

    So many thanks for such a helpful code! I have a problem, despite defining the init both in the model and the dictionary in the same way that you have defined it here, I get an error:

    ‘{} is not a legal parameter’.format(params_name))

    ValueError: init is not a legal parameter

    Could you please help me with this problem?

  35. Ab November 9, 2017 at 9:31 pm #

    Hi,
    Thanks for your nice paper.
    I have tried to use AdaBoostClassifier( model, n_estimators=2, learning_rate=1.5, algorithm=”SAMME”)
    and used CNN as ‘model’. However I get the following error:

    File “”, line 1, in
    runfile(‘/home/aboozar/Sphere_FE/adaboost/adaboost_CNN3.py’, wdir=’~/adaboost’)

    File “/usr/local/lib/python2.7/dist-packages/spyder/utils/site/sitecustomize.py”, line 688, in runfile
    execfile(filename, namespace)

    File “/usr/local/lib/python2.7/dist-packages/spyder/utils/site/sitecustomize.py”, line 93, in execfile
    builtins.execfile(filename, *where)

    File “/~/adaboost_CNN3.py”, line 234, in
    bdt_discrete.fit(X_train, y_train)

    File “/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/weight_boosting.py”, line 413, in fit
    return super(AdaBoostClassifier, self).fit(X, y, sample_weight)

    File “/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/weight_boosting.py”, line 130, in fit
    self._validate_estimator()

    File “/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/weight_boosting.py”, line 431, in _validate_estimator
    % self.base_estimator_.__class__.__name__)

    ValueError: KerasClassifier doesn’t support sample_weight.

    Do you have any advice?

Leave a Reply