HyperOpt for Automated Machine Learning With Scikit-Learn

By Jason Brownlee on September 7, 2020 in Python Machine Learning

Automated Machine Learning (AutoML) refers to techniques for automatically discovering well-performing models for predictive modeling tasks with very little user involvement.

HyperOpt is an open-source library for large-scale AutoML. HyperOpt-Sklearn is a wrapper for HyperOpt that brings AutoML to the popular Scikit-Learn machine learning library, including its suite of data preparation transforms and its classification and regression algorithms.

In this tutorial, you will discover how to use HyperOpt for automatic machine learning with Scikit-Learn in Python.

After completing this tutorial, you will know:

Hyperopt-Sklearn is an open-source library for AutoML with scikit-learn data preparation and machine learning models.
How to use Hyperopt-Sklearn to automatically discover top-performing models for classification tasks.
How to use Hyperopt-Sklearn to automatically discover top-performing models for regression tasks.

Let’s get started.

HyperOpt for Automated Machine Learning With Scikit-Learn
Photo by Neil Williamson, some rights reserved.

Tutorial Overview

This tutorial is divided into four parts; they are:

HyperOpt and HyperOpt-Sklearn
How to Install and Use HyperOpt-Sklearn
HyperOpt-Sklearn for Classification
HyperOpt-Sklearn for Regression

HyperOpt and HyperOpt-Sklearn

HyperOpt is an open-source Python library for Bayesian optimization developed by James Bergstra.

It is designed for large-scale optimization for models with hundreds of parameters and allows the optimization procedure to be scaled across multiple cores and multiple machines.

The library was explicitly used to optimize machine learning pipelines, including data preparation, model selection, and model hyperparameters.

Our approach is to expose the underlying expression graph of how a performance metric (e.g. classification accuracy on validation examples) is computed from hyperparameters that govern not only how individual processing steps are applied, but even which processing steps are included.

— Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures, 2013.

HyperOpt is challenging to use directly, requiring the optimization procedure and search space to be carefully specified.

An extension to HyperOpt was created called HyperOpt-Sklearn that allows the HyperOpt procedure to be applied to data preparation and machine learning models provided by the popular Scikit-Learn open-source machine learning library.

HyperOpt-Sklearn wraps the HyperOpt library and allows for the automatic search of data preparation methods, machine learning algorithms, and model hyperparameters for classification and regression tasks.

… we introduce Hyperopt-Sklearn: a project that brings the benefits of automatic algorithm configuration to users of Python and scikit-learn. Hyperopt-Sklearn uses Hyperopt to describe a search space over possible configurations of Scikit-Learn components, including preprocessing and classification modules.

— Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn, 2014.

Now that we are familiar with HyperOpt and HyperOpt-Sklearn, let’s look at how to use HyperOpt-Sklearn.

How to Install and Use HyperOpt-Sklearn

The first step is to install the HyperOpt library.

This can be achieved using the pip package manager as follows:

sudo pip install hyperopt

Once installed, we can confirm that the installation was successful and check the version of the library by typing the following command:

sudo pip show hyperopt

This will summarize the installed version of HyperOpt, confirming that a modern version is being used.

Name: hyperopt
Version: 0.2.3
Summary: Distributed Asynchronous Hyperparameter Optimization
Home-page: http://hyperopt.github.com/hyperopt/
Author: James Bergstra
Author-email: james.bergstra@gmail.com
License: BSD
Location: ...
Requires: tqdm, six, networkx, future, scipy, cloudpickle, numpy
Required-by:
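
The same check can also be made from inside Python. The following is a minimal sketch that uses the standard library's importlib.metadata module (Python 3.8 or later); the helper name installed_version is an illustrative choice and not part of HyperOpt.

```python
# check installed package versions from Python (illustrative helper)
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# report the libraries used in this tutorial
for name in ("hyperopt", "hpsklearn"):
    found = installed_version(name)
    print(name, found if found is not None else "not installed")
```

A missing package is reported rather than raising an error, which makes the snippet safe to run before either library is installed.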

Next, we must install the HyperOpt-Sklearn library.

This too can be installed using pip, although we must perform this operation manually by cloning the repository and running the installation from the local files, as follows:

git clone https://github.com/hyperopt/hyperopt-sklearn.git
cd hyperopt-sklearn
sudo pip install .
cd ..

Again, we can confirm that the installation was successful by checking the version number with the following command:

sudo pip show hpsklearn

This will summarize the installed version of HyperOpt-Sklearn, confirming that a modern version is being used.

Name: hpsklearn
Version: 0.0.3
Summary: Hyperparameter Optimization for sklearn
Home-page: http://hyperopt.github.com/hyperopt-sklearn/
Author: James Bergstra
Author-email: anon@anon.com
License: BSD
Location: ...
Requires: nose, scikit-learn, numpy, scipy, hyperopt
Required-by:

Now that the required libraries are installed, we can review the HyperOpt-Sklearn API.

Using HyperOpt-Sklearn is straightforward. The search process is defined by creating and configuring an instance of the HyperoptEstimator class.

The algorithm used for the search can be specified via the “algo” argument, the number of evaluations performed in the search is specified via the “max_evals” argument, and a limit can be imposed on evaluating each pipeline via the “trial_timeout” argument.

...
# define search
model = HyperoptEstimator(..., algo=tpe.suggest, max_evals=50, trial_timeout=120)

Many different optimization algorithms are available, including:

Random Search
Tree of Parzen Estimators
Annealing
Tree
Gaussian Process Tree

The “Tree of Parzen Estimators” is a good default, and you can learn more about the types of algorithms in the paper “Algorithms for Hyper-Parameter Optimization.”
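
To make the idea of a fixed evaluation budget concrete, here is a minimal sketch of random search, the simplest algorithm in the list, written in plain Python. It is not HyperOpt's implementation: the function random_search and the toy loss are made up for illustration, and the toy loss stands in for "fit and score one pipeline". The max_evals parameter plays the same role as the “max_evals” argument described above.

```python
# random search over a toy objective (a stand-in for "evaluate one pipeline")
import random


def random_search(objective, low, high, max_evals, seed=1):
    """Sample max_evals candidates uniformly and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_x, best_loss = None, float("inf")
    for _ in range(max_evals):
        x = rng.uniform(low, high)
        loss = objective(x)
        if loss < best_loss:
            best_x, best_loss = x, loss
    return best_x, best_loss


# toy loss with its minimum at x = 2
def toy_loss(x):
    return (x - 2.0) ** 2


best_x, best_loss = random_search(toy_loss, -10.0, 10.0, max_evals=50)
print("best x: %.3f, loss: %.3f" % (best_x, best_loss))
```

Random search ignores the results of earlier trials when proposing the next candidate. Model-based methods such as Tree of Parzen Estimators use those results to focus on promising regions, which is why they tend to be a better default when each evaluation is expensive.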

For classification tasks, the “classifier” argument specifies the search space of models; for regression, the “regressor” argument does the same. Both can be set to predefined lists of models provided by the library, e.g. “any_classifier” and “any_regressor”.

Similarly, the search space of data preparation is specified via the “preprocessing” argument and can also use a predefined list of preprocessing steps via “any_preprocessing”.

...
# define search
model = HyperoptEstimator(classifier=any_classifier('cla'), preprocessing=any_preprocessing('pre'), ...)

For more on the other arguments to the search, you can review the source code for the class directly:

Arguments to the HyperoptEstimator Class

Once the search is defined, it can be executed by calling the fit() function.

...
# perform the search
model.fit(X_train, y_train)

At the end of the run, the best-performing model can be evaluated on new data by calling the score() function.

...
# summarize performance
acc = model.score(X_test, y_test)
print("Accuracy: %.3f" % acc)

Finally, we can retrieve the Pipeline of transforms, models, and model configurations that performed the best on the training dataset via the best_model() function.

...
# summarize the best model
print(model.best_model())

Now that we are familiar with the API, let’s look at some worked examples.

HyperOpt-Sklearn for Classification

In this section, we will use HyperOpt-Sklearn to discover a model for the sonar dataset.

The sonar dataset is a standard machine learning dataset comprised of 208 rows of data with 60 numerical input variables and a target variable with two class values, i.e. binary classification.

Using a test harness of repeated stratified 10-fold cross-validation with three repeats, a naive model can achieve an accuracy of about 53 percent. A top-performing model can achieve accuracy on this same test harness of about 88 percent. This provides the bounds of expected performance on this dataset.
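
The naive figure of about 53 percent is what you would expect from a model that always predicts the most common class. The sketch below reproduces that arithmetic in plain Python. The class counts (111 mines and 97 rocks) are the commonly reported distribution for this dataset and are an assumption here, not something loaded from the file.

```python
# accuracy of a naive majority-class model (class counts are an assumption)
from collections import Counter

# assumed class distribution for sonar: 111 mines (M) and 97 rocks (R)
labels = ["M"] * 111 + ["R"] * 97

# a naive model always predicts the most common class
majority_class, majority_count = Counter(labels).most_common(1)[0]
naive_accuracy = majority_count / len(labels)
print("Always predict '%s': %.3f" % (majority_class, naive_accuracy))
```

Any pipeline found by the search should score clearly above this baseline to be considered skillful.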

The dataset involves predicting whether sonar returns indicate a rock or simulated mine.

No need to download the dataset; we will download it automatically as part of our worked examples.

The example below downloads the dataset and summarizes its shape.

# summarize the sonar dataset
from pandas import read_csv
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.csv'
dataframe = read_csv(url, header=None)
# split into input and output elements
data = dataframe.values
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)

Running the example downloads the dataset and splits it into input and output elements. As expected, we can see that there are 208 rows of data with 60 input variables.

(208, 60) (208,)

Next, let’s use HyperOpt-Sklearn to find a good model for the sonar dataset.

We can perform some basic data preparation, including converting the target string to class labels, then split the dataset into train and test sets.

...
# minimally prepare dataset
X = X.astype('float32')
y = LabelEncoder().fit_transform(y.astype('str'))
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
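
If it is unclear what converting the target string to class labels means, the following plain-Python sketch shows the idea: each distinct label is mapped to an integer in sorted order, which mirrors how LabelEncoder orders its classes. The function encode_labels is made up for illustration and is not scikit-learn code.

```python
# plain-Python sketch of integer label encoding (not scikit-learn's code)
def encode_labels(labels):
    """Map each distinct label to an integer, in sorted label order."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels], classes


encoded, classes = encode_labels(["R", "M", "M", "R", "M"])
print(classes)
print(encoded)
```

For the sonar labels this means "M" becomes 0 and "R" becomes 1.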

Next, we can define the search procedure. We will explore all classification algorithms and all data transforms available to the library and use the TPE, or Tree of Parzen Estimators, search algorithm, described in “Algorithms for Hyper-Parameter Optimization.”

The search will evaluate 50 pipelines and limit each evaluation to 30 seconds.

...
# define search
model = HyperoptEstimator(classifier=any_classifier('cla'), preprocessing=any_preprocessing('pre'), algo=tpe.suggest, max_evals=50, trial_timeout=30)
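
These two settings together put a rough ceiling on how long the search can run. A small sketch of the arithmetic, assuming trials are evaluated one at a time and ignoring any overhead between trials (the helper name is made up for illustration):

```python
# rough upper bound on search time, assuming trials run one at a time
def worst_case_minutes(max_evals, trial_timeout):
    """Upper bound in minutes if every trial used its full timeout (seconds)."""
    return max_evals * trial_timeout / 60.0


print("Worst case: %.1f minutes" % worst_case_minutes(50, 30))
```

In practice most pipelines on a dataset this small finish well before the timeout, so the actual run time is usually much shorter than this bound.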

We then start the search.

...
# perform the search
model.fit(X_train, y_train)

At the end of the run, we will report the performance of the model on the holdout dataset and summarize the best performing pipeline.

...
# summarize performance
acc = model.score(X_test, y_test)
print("Accuracy: %.3f" % acc)
# summarize the best model
print(model.best_model())

Tying this together, the complete example is listed below.

# example of hyperopt-sklearn for the sonar classification dataset
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from hpsklearn import HyperoptEstimator
from hpsklearn import any_classifier
from hpsklearn import any_preprocessing
from hyperopt import tpe
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.csv'
dataframe = read_csv(url, header=None)
# split into input and output elements
data = dataframe.values
X, y = data[:, :-1], data[:, -1]
# minimally prepare dataset
X = X.astype('float32')
y = LabelEncoder().fit_transform(y.astype('str'))
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# define search
model = HyperoptEstimator(classifier=any_classifier('cla'), preprocessing=any_preprocessing('pre'), algo=tpe.suggest, max_evals=50, trial_timeout=30)
# perform the search
model.fit(X_train, y_train)
# summarize performance
acc = model.score(X_test, y_test)
print("Accuracy: %.3f" % acc)
# summarize the best model
print(model.best_model())

Running the example may take a few minutes.

The progress of the search will be reported and you will see some warnings that you can safely ignore.

At the end of the run, the best-performing model is evaluated on the holdout dataset and the Pipeline discovered is printed for later use.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

In this case, we can see that the chosen model achieved an accuracy of about 85.5 percent on the holdout test set. The Pipeline involves a gradient boosting model with no pre-processing.

Accuracy: 0.855
{'learner': GradientBoostingClassifier(ccp_alpha=0.0, criterion='friedman_mse', init=None,
                           learning_rate=0.009132299586303643, loss='deviance',
                           max_depth=None, max_features='sqrt',
                           max_leaf_nodes=None, min_impurity_decrease=0.0,
                           min_impurity_split=None, min_samples_leaf=1,
                           min_samples_split=2, min_weight_fraction_leaf=0.0,
                           n_estimators=342, n_iter_no_change=None,
                           presort='auto', random_state=2,
                           subsample=0.6844206624548879, tol=0.0001,
                           validation_fraction=0.1, verbose=0,
                           warm_start=False), 'preprocs': (), 'ex_preprocs': ()}

The printed model can then be used directly, e.g. the code copy-pasted into another project.
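
As a rough illustration of reusing a printed configuration, the sketch below rebuilds a GradientBoostingClassifier by hand with the main hyperparameters from the output above. Two things are assumptions made for this example: the data is a synthetic stand-in created with make_classification rather than the sonar file, and arguments that have been removed or renamed in newer scikit-learn releases (such as presort, min_impurity_split and loss='deviance') are left at their defaults.

```python
# rebuild the reported learner by hand and fit it on stand-in data
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# synthetic stand-in for sonar: 208 rows, 60 numerical inputs, two classes
X, y = make_classification(n_samples=208, n_features=60, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)

# hyperparameters copied from the printed output; removed arguments omitted
model = GradientBoostingClassifier(
    learning_rate=0.009132299586303643,
    max_depth=None,
    max_features='sqrt',
    n_estimators=342,
    subsample=0.6844206624548879,
    random_state=2,
)
model.fit(X_train, y_train)
acc = model.score(X_test, y_test)
print("Accuracy: %.3f" % acc)
```

The accuracy printed here describes the synthetic data only and is not comparable to the sonar result. To reproduce the tutorial result, swap in the sonar loading and preparation code from the complete example.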

Next, let’s take a look at using HyperOpt-Sklearn for a regression predictive modeling problem.

HyperOpt-Sklearn for Regression

In this section, we will use HyperOpt-Sklearn to discover a model for the housing dataset.

The housing dataset is a standard machine learning dataset comprised of 506 rows of data with 13 numerical input variables and a numerical target variable.

Using a test harness of repeated 10-fold cross-validation with three repeats, a naive model can achieve a mean absolute error (MAE) of about 6.6. A top-performing model can achieve a MAE on this same test harness of about 1.9. This provides the bounds of expected performance on this dataset.

The dataset involves predicting the house price given details of the house’s suburb in the American city of Boston.

No need to download the dataset; we will download it automatically as part of our worked examples.

The example below downloads the dataset and summarizes its shape.

# summarize the housing dataset
from pandas import read_csv
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'
dataframe = read_csv(url, header=None)
# split into input and output elements
data = dataframe.values
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)

Running the example downloads the dataset and splits it into input and output elements. As expected, we can see that there are 506 rows of data with 13 input variables.

(506, 13) (506,)

Next, we can use HyperOpt-Sklearn to find a good model for the housing dataset.

Using HyperOpt-Sklearn for regression is the same as using it for classification, except the “regressor” argument must be specified.

In this case, we want to optimize the MAE, so we will set the “loss_fn” argument to the mean_absolute_error() function provided by the scikit-learn library.

...
# define search
model = HyperoptEstimator(regressor=any_regressor('reg'), preprocessing=any_preprocessing('pre'), loss_fn=mean_absolute_error, algo=tpe.suggest, max_evals=50, trial_timeout=30)
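
For readers new to the metric, MAE is the average of the absolute differences between expected and predicted values, so lower is better and the units match the target. Here is a plain-Python sketch of the calculation. The function mean_abs_error and the example values are made up for illustration; in the tutorial code the scikit-learn function is used instead.

```python
# plain-Python sketch of mean absolute error (not scikit-learn's code)
def mean_abs_error(y_true, y_pred):
    """Average absolute difference between expected and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# made-up values, for illustration only
expected = [24.0, 21.6, 34.7]
predicted = [25.0, 20.6, 31.7]
print("MAE: %.3f" % mean_abs_error(expected, predicted))
```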

Tying this together, the complete example is listed below.

# example of hyperopt-sklearn for the housing regression dataset
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from hpsklearn import HyperoptEstimator
from hpsklearn import any_regressor
from hpsklearn import any_preprocessing
from hyperopt import tpe
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'
dataframe = read_csv(url, header=None)
# split into input and output elements
data = dataframe.values
data = data.astype('float32')
X, y = data[:, :-1], data[:, -1]
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# define search
model = HyperoptEstimator(regressor=any_regressor('reg'), preprocessing=any_preprocessing('pre'), loss_fn=mean_absolute_error, algo=tpe.suggest, max_evals=50, trial_timeout=30)
# perform the search
model.fit(X_train, y_train)
# summarize performance
mae = model.score(X_test, y_test)
print("MAE: %.3f" % mae)
# summarize the best model
print(model.best_model())

Running the example may take a few minutes.

The progress of the search will be reported and you will see some warnings that you can safely ignore.

At the end of the run, the best performing model is evaluated on the holdout dataset and the Pipeline discovered is printed for later use.

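In this case, we can see that the chosen model reports a score of about 0.883 on the holdout test set, which appears skillful. Note that, to my understanding, the score() function for a regression search reports the coefficient of determination (R^2) of the best learner rather than the “loss_fn” used during the search, so this value is most likely an R^2 and not a MAE; a MAE on the holdout set can be computed with mean_absolute_error(y_test, model.predict(X_test)). The Pipeline involves an XGBRegressor model with no pre-processing.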

Note: for the search to use XGBoost, you must have the XGBoost library installed.

MAE: 0.883
{'learner': XGBRegressor(base_score=0.5, booster='gbtree',
             colsample_bylevel=0.5843250948679669, colsample_bynode=1,
             colsample_bytree=0.6635160670570662, gamma=6.923399395303031e-05,
             importance_type='gain', learning_rate=0.07021104887683309,
             max_delta_step=0, max_depth=3, min_child_weight=5, missing=nan,
             n_estimators=4000, n_jobs=1, nthread=None, objective='reg:linear',
             random_state=0, reg_alpha=0.5690202874759704,
             reg_lambda=3.3098341637038, scale_pos_weight=1, seed=1,
             silent=None, subsample=0.7194797262656784, verbosity=1), 'preprocs': (), 'ex_preprocs': ()}

Summary

In this tutorial, you discovered how to use HyperOpt for automatic machine learning with Scikit-Learn in Python.

Specifically, you learned:

Hyperopt-Sklearn is an open-source library for AutoML with scikit-learn data preparation and machine learning models.
How to use Hyperopt-Sklearn to automatically discover top-performing models for classification tasks.
How to use Hyperopt-Sklearn to automatically discover top-performing models for regression tasks.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

23 Responses to HyperOpt for Automated Machine Learning With Scikit-Learn

Harrison September 11, 2020 at 7:31 pm #

Very educative blog, thanks for sharing.

- Jason Brownlee September 12, 2020 at 6:07 am #
  
  Thanks!
  
Zach September 12, 2020 at 8:59 am #

I’ve been coming across your work regularly for years. It’s often the best around. In grad school a few years ago, I used your work on building a neural network from scratch as a boilerplate. I was able to add quite a few features to it, but it was such a clean walkthrough that it really helped me and my classmates learn how the entire process works. Anyway, this post is particularly relevant as I’m researching how to use HyperOpt at work. I decided to post just to let you know how thankful I am, and I’ve even started my own blog! Thank you very much, keep up the great work!

- Jason Brownlee September 12, 2020 at 1:18 pm #
  
  Thanks!
  
  Well done on your progress!
  

Anthony The Koala September 17, 2020 at 10:57 pm #

Dear Dr Jason,
I am getting runtime errors when I ran the first of your programs.
* I pipped hyperopt
* I could not use the following code because I don’t have access rights.

git clone git@github.com:hyperopt/hyperopt-sklearn.git
cd hyperopt-sklearn
sudo pip install .
cd ..

* Rather, I installed GitHub for Windows and cloned the hyperopt-sklearn repository.
* Went to my C:\Users\A\Documents\GitHub\hyperopt-sklearn and did
* pip install .
* successful installation.No problems.
* Despite successful installation running the first of your programs produced runtime errors
* THIS IS DESPITE including this in the code

import os
os.environ['OMP_NUM_THREADS'] = "1"

WARN: OMP_NUM_THREADS=None =>
... If you are using openblas if you are using openblas set OMP_NUM_THREAD
 risk subprocess calls hanging indefinitely
  0%|                                    | 0/1 [00:00
... If you are using openblas if you are using openblas set OMP_NUM_THREAD
 risk subprocess calls hanging indefinitely
Error.  nthreads cannot be larger than environment variable "NUMEXPR_MAX_T
  0%|                                    | 0/1 [00:00<?, ?trial/s, best lo
ob exception:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

  0%|                                    | 0/1 [00:00<?, ?trial/s, best lo
Traceback (most recent call last):
  File "", line 1, in 
  File "c:\python38\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "c:\python38\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "c:\python38\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "c:\python38\lib\multiprocessing\spawn.py", line 287, in _fixup_mai
_path
    main_content = runpy.run_path(main_path,
  File "c:\python38\lib\runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "c:\python38\lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "c:\python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Python38\test3.py", line 25, in 
    model.fit(X_train, y_train)
  File "c:\python38\lib\site-packages\hpsklearn\estimator.py", line 787, i
    fit_iter.send(increment)
  File "c:\python38\lib\site-packages\hpsklearn\estimator.py", line 688, i
iter
    hyperopt.fmin(fn_with_timeout,
  File "c:\python38\lib\site-packages\hyperopt\fmin.py", line 469, in fmin
    return trials.fmin(
  File "c:\python38\lib\site-packages\hyperopt\base.py", line 671, in fmin
    return fmin(
  File "c:\python38\lib\site-packages\hyperopt\fmin.py", line 509, in fmin
    rval.exhaust()
  File "c:\python38\lib\site-packages\hyperopt\fmin.py", line 330, in exha
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "c:\python38\lib\site-packages\hyperopt\fmin.py", line 286, in run
    self.serial_evaluate()
  File "c:\python38\lib\site-packages\hyperopt\fmin.py", line 165, in seri
luate
    result = self.domain.evaluate(spec, ctrl)
  File "c:\python38\lib\site-packages\hyperopt\base.py", line 894, in eval
    rval = self.fn(pyll_rval)
  File "c:\python38\lib\site-packages\hpsklearn\estimator.py", line 645, i
ith_timeout
    th.start()
  File "c:\python38\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "c:\python38\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "c:\python38\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "c:\python38\lib\multiprocessing\popen_spawn_win32.py", line 45, in
t__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "c:\python38\lib\multiprocessing\spawn.py", line 154, in get_prepar
data
    _check_not_importing_main()
  File "c:\python38\lib\multiprocessing\spawn.py", line 134, in _check_not
ting_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

I don’t know why the program is producing runtime errors.

Note again: the installation via pip and git was successful.

Thank you,
Anthony of Sydney

Jason Brownlee September 18, 2020 at 6:47 am #

I’m sorry to hear that, Anthony. I don’t have experience with Windows.

Perhaps try posting/searching on stackoverflow?

Reply
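The RuntimeError text quoted above points at the usual cause on Windows: child processes are started by re-importing the main script, so any code that launches processes at module level runs again in the child. A minimal sketch of the idiom the message asks for is below. It uses only the standard library; `square`, `_worker`, `run_in_child`, and `main` are hypothetical names for illustration, and the comment about `model.fit` is an assumption about where the tutorial script would start its subprocess.

```python
import multiprocessing as mp


def square(value):
    """Work done in the child process (a stand-in for a model fit)."""
    return value * value


def _worker(value, queue):
    # Runs in the child process and sends the result back to the parent.
    queue.put(square(value))


def run_in_child(value):
    """Start one child process, wait for its result, and return it."""
    queue = mp.Queue()
    process = mp.Process(target=_worker, args=(value, queue))
    process.start()
    result = queue.get()
    process.join()
    return result


def main():
    # Assumption: anything that starts processes, such as a call like
    # model.fit(X_train, y_train) in the tutorial script, belongs here
    # rather than at module level.
    print(run_in_child(7))


if __name__ == "__main__":
    mp.freeze_support()  # only matters when freezing into an executable
    main()
```

With the guard in place, re-importing the script in a child process defines the functions but does not call `main()` again, which is what the "bootstrapping phase" error is complaining about.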

Anthony The Koala September 17, 2020 at 11:25 pm #

Dear Dr Jason,
I typed in the same program line-by-line in IDLE, the Python IDE, and there were NO RUNTIME ERRORS!!
I don’t know why there were no runtime errors. In the previous attempt, I simply copied the code from this page and ran it as a script.
This time I entered the code line-by-line.
Here is a sample of output:

..........
from hpsklearn import HyperoptEstimator, any_classifier
WARN: OMP_NUM_THREADS=None =>
... If you are using openblas if you are using openblas set OMP_NUM_THREADS=1 or risk subprocess calls hanging indefinitely
model.fit(X_train,y_train)

  0%|          | 0/1 [00:00<?, ?trial/s, best loss=?]
100%|██████████| 1/1 [00:07<00:00,  7.45s/trial, best loss: 0.2142857142857143]
100%|██████████| 1/1 [00:07<00:00,  7.46s/trial, best loss: 0.2142857142857143]

 50%|█████     | 1/2 [00:00<?, ?trial/s, best loss=?]
100%|██████████| 2/2 [00:07<00:00,  7.59s/trial, best loss: 0.1785714285714286]
100%|██████████| 2/2 [00:07<00:00,  7.65s/trial, best loss: 0.1785714285714286]

 67%|██████▋   | 2/3 [00:00<?, ?trial/s, best loss=?]
100%|██████████| 3/3 [00:07<00:00,  7.44s/trial, best loss: 0.1428571428571429]
100%|██████████| 3/3 [00:07<00:00,  7.47s/trial, best loss: 0.1428571428571429]
......................
.....................
 98%|█████████▊| 49/50 [00:00<?, ?trial/s, best loss=?]
100%|██████████| 50/50 [00:07<00:00,  7.77s/trial, best loss: 0.0714285714285714]
100%|██████████| 50/50 [00:07
>>> acc=model.score(X_test, y_test)
>>> acc
0.8405797101449275
>>> print(model.best_model())
{'learner': ExtraTreesClassifier(bootstrap=False, ccp_alpha=0.0, class_weight=None,
                     criterion='gini', max_depth=None,
                     max_features=0.4474405338213263, max_leaf_nodes=None,
                     max_samples=None, min_impurity_decrease=0.0,
                     min_impurity_split=None, min_samples_leaf=2,
                     min_samples_split=2, min_weight_fraction_leaf=0.0,
                     n_estimators=388, n_jobs=1, oob_score=False,
                     random_state=4, verbose=False, warm_start=False), 'preprocs': (Normalizer(copy=True, norm='l1'),), 'ex_preprocs': ()}

Strange that it worked.
But…
Why did you get GradientBoostingClassifier at accuracy=0.855 while I ‘got’ ExtraTreesClassifier at accuracy=0.84?

And I still don’t know why I got runtime errors in my first attempt but none in this attempt.

Thank you,
Anthony of Sydney

Jason Brownlee September 18, 2020 at 6:48 am #

Well done!

Reply

Anthony The Koala September 18, 2020 at 12:14 pm #

Dear Dr Jason,
I extended the last few lines of the program starting with:

# summarize performance
mae = model.score(X_test, y_test)
print("MAE: %.3f" % mae)
# summarize the best model
print(model.best_model())
#Now find the predicted values of the test_x
predict_test = model.predict(X_test)
#Now find the correlation between predict_test and y_test
from scipy.stats import pearsonr
corr, p_value = pearsonr(predict_test,y_test)
corr, p_value
(0.9334779887384739, 2.1861642011348944e-75)
#Notice that R^2 = corr^2 = mae
mae
0.8710212877137651
mae**0.5
0.9332852124156715
#Now plot the data
import matplotlib.pyplot as plt
plt.scatter(y_test,predict_test)
plt.show()
#Note the strong correlation between the predicted y and test y

Conclusion:
Correlation between predicted y and test_y is 0.933, with p-value << sig_value, where sig_value = 0.05.
There is a strong linear relation between predicted y and test_y, with mae = R^2 = 0.871.

Thank you,
Anthony of Sydney

Jason Brownlee September 18, 2020 at 2:49 pm #

I should hope so. Good test!

Reply
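A note on the "R^2 = corr^2" remark in the comment above: the two values are close there (0.871 against 0.9335^2, roughly 0.8714), but they are not the same quantity in general. The sketch below uses only the standard library and made-up numbers to show a case where the squared Pearson correlation is perfect while the coefficient of determination is poor. `pearson_r` and `r2_score` are hypothetical helper names written for this illustration, not calls into any library.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative data: predictions track the targets exactly but are
# shifted up by a constant of 1.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 3.0, 4.0, 5.0]

print("corr^2:", pearson_r(y_true, y_pred) ** 2)  # correlation ignores the shift
print("R^2:   ", r2_score(y_true, y_pred))        # R^2 penalizes the shift
```

Correlation only measures how well the predictions line up with the targets after an arbitrary shift and rescale, while R^2 measures the actual squared error, so a scatter plot plus an error metric gives a fuller picture than correlation alone.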

Cecile S October 21, 2020 at 8:58 pm #

Dear Jason,

Thank you for this very useful post! I am testing hyperopt sklearn for a classification problem, where I want to optimize the balanced accuracy. How can I define the loss_fn in this case ? Or more generally, how can I define the loss_fn given a specific sklearn metric scoring (balanced accuracy, f1, recall, …)?

Thank you in advance!

Reply
- Jason Brownlee October 22, 2020 at 6:42 am #
  
  You pass the function you want to minimize as the loss_fn argument.
  
  You can use the complement or reciprocal of a maximising score, e.g. 1 / score
  
  More details here:
  https://github.com/hyperopt/hyperopt-sklearn/blob/master/hpsklearn/estimator.py#L429
  
  Reply
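The advice in the reply above (take the complement of a score that should be maximized) can be sketched as follows for balanced accuracy. This is a standard-library illustration under stated assumptions, not the library's own code: `balanced_accuracy` and `balanced_accuracy_loss` are hypothetical helpers, balanced accuracy is taken to be the mean of per-class recall, and the commented-out usage assumes that `loss_fn` is called as `loss_fn(y_true, y_pred)`. With scikit-learn installed, `sklearn.metrics.balanced_accuracy_score` could replace the hand-written helper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


def balanced_accuracy_loss(y_true, y_pred):
    """Complement of the score, so that lower values are better."""
    return 1.0 - balanced_accuracy(y_true, y_pred)


# Hypothetical usage, assuming loss_fn receives (y_true, y_pred):
# model = HyperoptEstimator(classifier=any_classifier('cla'),
#                           loss_fn=balanced_accuracy_loss)

# Small check on made-up labels: class 0 recall is 2/3, class 1 recall is 1.
print(balanced_accuracy_loss([0, 0, 0, 1], [0, 0, 1, 1]))
```

The same pattern should carry over to other scores where higher is better, such as F1 or recall: wrap the metric in a function that returns one minus the score.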
Michael December 4, 2020 at 10:38 am #

Great post. I have been working with the base hyperopt library for a year or so across elements of the machine learning workflow (data processing, hyper-parameter selection) but did not realise that through hyperopt-sklearn it could also be used for model selection. There are some old projects that I will now resuscitate to see how they perform with this tweak. Thank you!

Reply
- Jason Brownlee December 4, 2020 at 1:21 pm #
  
  Thanks!
  
  You’re very welcome, good luck.
  
  Reply
Shrey Jain July 9, 2021 at 3:34 pm #

Hi Jason, nice blog! One question though. You have used Hyperopt for selecting the best model. Can we use Hyperopt for hyperparameter tuning as well as selecting the best model? That would mean far fewer lines of code and good overall performance.

Reply
- Jason Brownlee July 10, 2021 at 6:06 am #
  
  Yes, I believe so.
  
  Reply
Marco Cerliani December 28, 2021 at 1:33 am #

I suggest shap-hypetune to industrialize parameter tuning (and also feature selection) with xgboost and hyperopt (https://github.com/cerlymarco/shap-hypetune)

Reply
- James Carmichael January 10, 2022 at 11:21 am #
  
  Thank you for the feedback Marco!
  
  Reply
Sam April 20, 2022 at 11:57 pm #

Hello Jason,

I am in the process of finding the “best” hyperparameters for my SARIMA statistical model: p, d, q, P, D, Q, and s.

I started off with a “brute force” grid search in which I used nested for loops to try a range of the parameters, record the mean absolute percentage error resulting from each model, and finally print the model which returned the least error. This process, however, takes a long time to run because of the ranges I set for the hyperparameters.

Hence I would like to use an optimization algorithm such as Hyperopt.

My questions are:
– Is it possible to use this library for SARIMA hyperparameter optimization?
– Did you perhaps do such a thing before or have a blog about it?
– If it is not possible to use Hyperopt for this purpose, are there any other libraries you might suggest?

Thank you for your efforts!

Reply
- James Carmichael April 21, 2022 at 8:56 am #
  
  Hi Sam…Hyperopt can be used for this purpose:
  
  https://medium.com/district-data-labs/parameter-tuning-with-hyperopt-faa86acdfdce
  
  Reply
Ari January 6, 2023 at 5:50 am #

Dear Jason – Within a Conda environment (based on Python 3.9 and TensorFlow 2.11), when I try the Classification example above, I’m getting a ‘base_estimator’ error. I’m a bit confused and would appreciate your thoughts. Thanks.

100%|██████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.48s/trial, best loss: 0.3928571428571429]
100%|██████████████████████████████████████████████████| 2/2 [00:01<00:00, 1.50s/trial, best loss: 0.3214285714285714]
100%|██████████████████████████████████████████████████| 3/3 [00:01<00:00, 1.65s/trial, best loss: 0.2142857142857143]
100%|██████████████████████████████████████████████████| 4/4 [00:01<00:00, 1.41s/trial, best loss: 0.2142857142857143]
100%|██████████████████████████████████████████████████| 5/5 [00:01<00:00, 1.52s/trial, best loss: 0.2142857142857143]
83%|███████████████████████████████████████████████████████████████▎ | 5/6 [00:00<?, ?trial/s, best loss=?]
job exception: The 'base_estimator' parameter of AdaBoostClassifier must be an object implementing 'fit' and 'predict' or a str among {'deprecated'}. Got None instead.

83%|███████████████████████████████████████████████████████████████▎ | 5/6 [00:01 24 model.fit(X_train, y_train)
25 # summarize performance
26 acc = model.score(X_test, y_test)

File ~\hyperopt-sklearn\hpsklearn\estimator\estimator.py:464, in hyperopt_estimator.fit(self, X, y, EX_list, valid_size, n_folds, cv_shuffle, warm_start, random_state)
461 try:
462 increment = min(self.fit_increment,
463 adjusted_max_evals – len(self.trials.trials))
–> 464 fit_iter.send(increment)
466 if self.fit_increment_dump_filename is not None:
467 with open(self.fit_increment_dump_filename, “wb”) as dump_file:

File ~\hyperopt-sklearn\hpsklearn\estimator\estimator.py:339, in hyperopt_estimator.fit_iter(self, X, y, EX_list, valid_size, n_folds, cv_shuffle, warm_start, random_state)
337 # Workaround for rstate issue #35
338 if “rstate” in inspect.getfullargspec(hyperopt.fmin).args:
–> 339 hyperopt.fmin(_fn_with_timeout,
340 space=self.space,
341 algo=self.algo,
342 trials=self.trials,
343 max_evals=len(self.trials.trials) + increment,
344 # — let exceptions crash the program, so we notice them.
345 catch_eval_exceptions=False,
346 return_argmin=False) # — in case no success so far
347 else:
348 if self.seed is None:

File ~\.conda\envs\py39tf211\lib\site-packages\hyperopt\fmin.py:540, in fmin(fn, space, algo, max_evals, timeout, loss_threshold, trials, rstate, allow_trials_fmin, pass_expr_memo_ctrl, catch_eval_exceptions, verbose, return_argmin, points_to_evaluate, max_queue_len, show_progressbar, early_stop_fn, trials_save_file)
537 fn = __objective_fmin_wrapper(fn)
539 if allow_trials_fmin and hasattr(trials, “fmin”):
–> 540 return trials.fmin(
541 fn,
542 space,
543 algo=algo,
544 max_evals=max_evals,
545 timeout=timeout,
546 loss_threshold=loss_threshold,
547 max_queue_len=max_queue_len,
548 rstate=rstate,
549 pass_expr_memo_ctrl=pass_expr_memo_ctrl,
550 verbose=verbose,
551 catch_eval_exceptions=catch_eval_exceptions,
552 return_argmin=return_argmin,
553 show_progressbar=show_progressbar,
554 early_stop_fn=early_stop_fn,
555 trials_save_file=trials_save_file,
556 )
558 if trials is None:
559 if os.path.exists(trials_save_file):

File ~\.conda\envs\py39tf211\lib\site-packages\hyperopt\base.py:671, in Trials.fmin(self, fn, space, algo, max_evals, timeout, loss_threshold, max_queue_len, rstate, verbose, pass_expr_memo_ctrl, catch_eval_exceptions, return_argmin, show_progressbar, early_stop_fn, trials_save_file)
666 # — Stop-gap implementation!
667 # fmin should have been a Trials method in the first place
668 # but for now it’s still sitting in another file.
669 from .fmin import fmin
–> 671 return fmin(
672 fn,
673 space,
674 algo=algo,
675 max_evals=max_evals,
676 timeout=timeout,
677 loss_threshold=loss_threshold,
678 trials=self,
679 rstate=rstate,
680 verbose=verbose,
681 max_queue_len=max_queue_len,
682 allow_trials_fmin=False, # — prevent recursion
683 pass_expr_memo_ctrl=pass_expr_memo_ctrl,
684 catch_eval_exceptions=catch_eval_exceptions,
685 return_argmin=return_argmin,
686 show_progressbar=show_progressbar,
687 early_stop_fn=early_stop_fn,
688 trials_save_file=trials_save_file,
689 )

File ~\.conda\envs\py39tf211\lib\site-packages\hyperopt\fmin.py:586, in fmin(fn, space, algo, max_evals, timeout, loss_threshold, trials, rstate, allow_trials_fmin, pass_expr_memo_ctrl, catch_eval_exceptions, verbose, return_argmin, points_to_evaluate, max_queue_len, show_progressbar, early_stop_fn, trials_save_file)
583 rval.catch_eval_exceptions = catch_eval_exceptions
585 # next line is where the fmin is actually executed
–> 586 rval.exhaust()
588 if return_argmin:
589 if len(trials.trials) == 0:

File ~\.conda\envs\py39tf211\lib\site-packages\hyperopt\fmin.py:364, in FMinIter.exhaust(self)
362 def exhaust(self):
363 n_done = len(self.trials)
–> 364 self.run(self.max_evals – n_done, block_until_done=self.asynchronous)
365 self.trials.refresh()
366 return self

File ~\.conda\envs\py39tf211\lib\site-packages\hyperopt\fmin.py:300, in FMinIter.run(self, N, block_until_done)
297 time.sleep(self.poll_interval_secs)
298 else:
299 # — loop over trials and do the jobs directly
–> 300 self.serial_evaluate()
302 self.trials.refresh()
303 if self.trials_save_file != “”:

File ~\.conda\envs\py39tf211\lib\site-packages\hyperopt\fmin.py:178, in FMinIter.serial_evaluate(self, N)
176 ctrl = base.Ctrl(self.trials, current_trial=trial)
177 try:
–> 178 result = self.domain.evaluate(spec, ctrl)
179 except Exception as e:
180 logger.error(“job exception: %s” % str(e))

File ~\.conda\envs\py39tf211\lib\site-packages\hyperopt\base.py:892, in Domain.evaluate(self, config, ctrl, attach_attachments)
883 else:
884 # — the “work” of evaluating config can be written
885 # either into the pyll part (self.expr)
886 # or the normal Python part (self.fn)
887 pyll_rval = pyll.rec_eval(
888 self.expr,
889 memo=memo,
890 print_node_on_error=self.rec_eval_print_node_on_error,
891 )
–> 892 rval = self.fn(pyll_rval)
894 if isinstance(rval, (float, int, np.number)):
895 dict_rval = {“loss”: float(rval), “status”: STATUS_OK}

File ~\hyperopt-sklearn\hpsklearn\estimator\estimator.py:311, in hyperopt_estimator.fit_iter.._fn_with_timeout(*args, **kwargs)
309 assert fn_rval[0] in (“raise”, “return”)
310 if fn_rval[0] == “raise”:
–> 311 raise fn_rval[1]
313 # — remove potentially large objects from the rval
314 # so that the Trials() object below stays small
315 # We can recompute them if necessary, and it’s usually
316 # not necessary at all.
317 if fn_rval[1][“status”] == hyperopt.STATUS_OK:

InvalidParameterError: The ‘base_estimator’ parameter of AdaBoostClassifier must be an object implementing ‘fit’ and ‘predict’ or a str among {‘deprecated’}. Got None instead.

Reply
- James Carmichael January 6, 2023 at 8:10 am #
  
  Hi Ari…Please narrow your query to a single question so that we may better assist you.
  
  Reply
  - Ari January 6, 2023 at 3:01 pm #
    
    Hi James, My apologies for the confusion. It’s actually just one question. I’m getting the ‘base_estimator’ error that I have posted. Thanks.
    
    Reply
