Last Updated on August 27, 2020

XGBoost is an efficient implementation of gradient boosting for classification and regression problems.

It is both fast and efficient, performing well, if not the best, on a wide range of predictive modeling tasks and is a favorite among data science competition winners, such as those on Kaggle.

XGBoost can also be used for time series forecasting, although it requires that the time series dataset be transformed into a supervised learning problem first. It also requires the use of a specialized technique for evaluating the model called walk-forward validation, as evaluating the model using k-fold cross validation would result in optimistically biased results.

In this tutorial, you will discover how to develop an XGBoost model for time series forecasting.

After completing this tutorial, you will know:

- XGBoost is an implementation of the gradient boosting ensemble algorithm for classification and regression.
- Time series datasets can be transformed into supervised learning using a sliding-window representation.
- How to fit, evaluate, and make predictions with an XGBoost model for time series forecasting.

**Kick-start your project** with my new book XGBoost With Python, including *step-by-step tutorials* and the *Python source code* files for all examples.

Let’s get started.

**Update Aug/2020**: Fixed bug in the calculation of MAE, updated model config to make better predictions (thanks Kaustav!)

## Tutorial Overview

This tutorial is divided into three parts; they are:

- XGBoost Ensemble
- Time Series Data Preparation
- XGBoost for Time Series Forecasting

## XGBoost Ensemble

XGBoost is short for Extreme Gradient Boosting and is an efficient implementation of the stochastic gradient boosting machine learning algorithm.

The stochastic gradient boosting algorithm, also called gradient boosting machines or tree boosting, is a powerful machine learning technique that performs well or even best on a wide range of challenging machine learning problems.

Tree boosting has been shown to give state-of-the-art results on many standard classification benchmarks.

— XGBoost: A Scalable Tree Boosting System, 2016.

It is an ensemble of decision trees algorithm where new trees fix errors of those trees that are already part of the model. Trees are added until no further improvements can be made to the model.

XGBoost provides a highly efficient implementation of the stochastic gradient boosting algorithm and access to a suite of model hyperparameters designed to provide control over the model training process.

The most important factor behind the success of XGBoost is its scalability in all scenarios. The system runs more than ten times faster than existing popular solutions on a single machine and scales to billions of examples in distributed or memory-limited settings.

— XGBoost: A Scalable Tree Boosting System, 2016.

XGBoost is designed for classification and regression on tabular datasets, although it can be used for time series forecasting.

For more on the gradient boosting and XGBoost implementation, see the tutorial:

First, the XGBoost library must be installed.

You can install it using pip, as follows:

1 |
sudo pip install xgboost |

Once installed, you can confirm that it was installed successfully and that you are using a modern version by running the following code:

1 2 3 |
# xgboost import xgboost print("xgboost", xgboost.__version__) |

Running the code, you should see the following version number or higher.

1 |
xgboost 1.0.1 |

Although the XGBoost library has its own Python API, we can use XGBoost models with the scikit-learn API via the XGBRegressor wrapper class.

An instance of the model can be instantiated and used just like any other scikit-learn class for model evaluation. For example:

1 2 3 |
... # define model model = XGBRegressor() |

Now that we are familiar with XGBoost, let’s look at how we can prepare a time series dataset for supervised learning.

## Time Series Data Preparation

Time series data can be phrased as supervised learning.

Given a sequence of numbers for a time series dataset, we can restructure the data to look like a supervised learning problem. We can do this by using previous time steps as input variables and use the next time step as the output variable.

Let’s make this concrete with an example. Imagine we have a time series as follows:

1 2 3 4 5 6 |
time, measure 1, 100 2, 110 3, 108 4, 115 5, 120 |

We can restructure this time series dataset as a supervised learning problem by using the value at the previous time step to predict the value at the next time-step.

Reorganizing the time series dataset this way, the data would look as follows:

1 2 3 4 5 6 7 |
X, y ?, 100 100, 110 110, 108 108, 115 115, 120 120, ? |

Note that the time column is dropped and some rows of data are unusable for training a model, such as the first and the last.

This representation is called a sliding window, as the window of inputs and expected outputs is shifted forward through time to create new “*samples*” for a supervised learning model.

For more on the sliding window approach to preparing time series forecasting data, see the tutorial:

We can use the shift() function in Pandas to automatically create new framings of time series problems given the desired length of input and output sequences.

This would be a useful tool as it would allow us to explore different framings of a time series problem with machine learning algorithms to see which might result in better-performing models.

The function below will take a time series as a NumPy array time series with one or more columns and transform it into a supervised learning problem with the specified number of inputs and outputs.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
# transform a time series dataset into a supervised learning dataset def series_to_supervised(data, n_in=1, n_out=1, dropnan=True): n_vars = 1 if type(data) is list else data.shape[1] df = DataFrame(data) cols = list() # input sequence (t-n, ... t-1) for i in range(n_in, 0, -1): cols.append(df.shift(i)) # forecast sequence (t, t+1, ... t+n) for i in range(0, n_out): cols.append(df.shift(-i)) # put it all together agg = concat(cols, axis=1) # drop rows with NaN values if dropnan: agg.dropna(inplace=True) return agg.values |

We can use this function to prepare a time series dataset for XGBoost.

For more on the step-by-step development of this function, see the tutorial:

Once the dataset is prepared, we must be careful in how it is used to fit and evaluate a model.

For example, it would not be valid to fit the model on data from the future and have it predict the past. The model must be trained on the past and predict the future.

This means that methods that randomize the dataset during evaluation, like k-fold cross-validation, cannot be used. Instead, we must use a technique called walk-forward validation.

In walk-forward validation, the dataset is first split into train and test sets by selecting a cut point, e.g. all data except the last 12 months is used for training and the last 12 months is used for testing.

If we are interested in making a one-step forecast, e.g. one month, then we can evaluate the model by training on the training dataset and predicting the first step in the test dataset. We can then add the real observation from the test set to the training dataset, refit the model, then have the model predict the second step in the test dataset.

Repeating this process for the entire test dataset will give a one-step prediction for the entire test dataset from which an error measure can be calculated to evaluate the skill of the model.

For more on walk-forward validation, see the tutorial:

The function below performs walk-forward validation.

It takes the entire supervised learning version of the time series dataset and the number of rows to use as the test set as arguments.

It then steps through the test set, calling the *xgboost_forecast()* function to make a one-step forecast. An error measure is calculated and the details are returned for analysis.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
# walk-forward validation for univariate data def walk_forward_validation(data, n_test): predictions = list() # split dataset train, test = train_test_split(data, n_test) # seed history with training dataset history = [x for x in train] # step over each time-step in the test set for i in range(len(test)): # split test row into input and output columns testX, testy = test[i, :-1], test[i, -1] # fit model on history and make a prediction yhat = xgboost_forecast(history, testX) # store forecast in list of predictions predictions.append(yhat) # add actual observation to history for the next loop history.append(test[i]) # summarize progress print('>expected=%.1f, predicted=%.1f' % (testy, yhat)) # estimate prediction error error = mean_absolute_error(test[:, -1], predictions) return error, test[:, 1], predictions |

The *train_test_split()* function is called to split the dataset into train and test sets.

We can define this function below.

1 2 3 |
# split a univariate dataset into train/test sets def train_test_split(data, n_test): return data[:-n_test, :], data[-n_test:, :] |

We can use the *XGBRegressor* class to make a one-step forecast.

The *xgboost_forecast()* function below implements this, taking the training dataset and test input row as input, fitting a model, and making a one-step prediction.

1 2 3 4 5 6 7 8 9 10 11 12 |
# fit an xgboost model and make a one step prediction def xgboost_forecast(train, testX): # transform list into array train = asarray(train) # split into input and output columns trainX, trainy = train[:, :-1], train[:, -1] # fit model model = XGBRegressor(objective='reg:squarederror', n_estimators=1000) model.fit(trainX, trainy) # make a one-step prediction yhat = model.predict([testX]) return yhat[0] |

Now that we know how to prepare time series data for forecasting and evaluate an XGBoost model, next we can look at using XGBoost on a real dataset.

## XGBoost for Time Series Forecasting

In this section, we will explore how to use XGBoost for time series forecasting.

We will use a standard univariate time series dataset with the intent of using the model to make a one-step forecast.

You can use the code in this section as the starting point in your own project and easily adapt it for multivariate inputs, multivariate forecasts, and multi-step forecasts.

We will use the daily female births dataset, that is the monthly births across three years.

You can download the dataset from here, place it in your current working directory with the filename “*daily-total-female-births.csv*“.

The first few lines of the dataset look as follows:

1 2 3 4 5 6 7 |
"Date","Births" "1959-01-01",35 "1959-01-02",32 "1959-01-03",30 "1959-01-04",31 "1959-01-05",44 ... |

First, let’s load and plot the dataset.

The complete example is listed below.

1 2 3 4 5 6 7 8 9 |
# load and plot the time series dataset from pandas import read_csv from matplotlib import pyplot # load dataset series = read_csv('daily-total-female-births.csv', header=0, index_col=0) values = series.values # plot dataset pyplot.plot(values) pyplot.show() |

Running the example creates a line plot of the dataset.

We can see there is no obvious trend or seasonality.

A persistence model can achieve a MAE of about 6.7 births when predicting the last 12 months. This provides a baseline in performance above which a model may be considered skillful.

Next, we can evaluate the XGBoost model on the dataset when making one-step forecasts for the last 12 months of data.

We will use only the previous 6 time steps as input to the model and default model hyperparameters, except we will change the loss to ‘*reg:squarederror*‘ (to avoid a warning message) and use a 1,000 trees in the ensemble (to avoid underlearning).

The complete example is listed below.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 |
# forecast monthly births with xgboost from numpy import asarray from pandas import read_csv from pandas import DataFrame from pandas import concat from sklearn.metrics import mean_absolute_error from xgboost import XGBRegressor from matplotlib import pyplot # transform a time series dataset into a supervised learning dataset def series_to_supervised(data, n_in=1, n_out=1, dropnan=True): n_vars = 1 if type(data) is list else data.shape[1] df = DataFrame(data) cols = list() # input sequence (t-n, ... t-1) for i in range(n_in, 0, -1): cols.append(df.shift(i)) # forecast sequence (t, t+1, ... t+n) for i in range(0, n_out): cols.append(df.shift(-i)) # put it all together agg = concat(cols, axis=1) # drop rows with NaN values if dropnan: agg.dropna(inplace=True) return agg.values # split a univariate dataset into train/test sets def train_test_split(data, n_test): return data[:-n_test, :], data[-n_test:, :] # fit an xgboost model and make a one step prediction def xgboost_forecast(train, testX): # transform list into array train = asarray(train) # split into input and output columns trainX, trainy = train[:, :-1], train[:, -1] # fit model model = XGBRegressor(objective='reg:squarederror', n_estimators=1000) model.fit(trainX, trainy) # make a one-step prediction yhat = model.predict(asarray([testX])) return yhat[0] # walk-forward validation for univariate data def walk_forward_validation(data, n_test): predictions = list() # split dataset train, test = train_test_split(data, n_test) # seed history with training dataset history = [x for x in train] # step over each time-step in the test set for i in range(len(test)): # split test row into input and output columns testX, testy = test[i, :-1], test[i, -1] # fit model on history and make a prediction yhat = xgboost_forecast(history, testX) # store forecast in list of predictions predictions.append(yhat) # add actual observation to history for the next loop history.append(test[i]) # summarize progress print('>expected=%.1f, predicted=%.1f' % (testy, yhat)) # estimate prediction error error = mean_absolute_error(test[:, -1], predictions) return error, test[:, -1], predictions # load the dataset series = read_csv('daily-total-female-births.csv', header=0, index_col=0) values = series.values # transform the time series data into supervised learning data = series_to_supervised(values, n_in=6) # evaluate mae, y, yhat = walk_forward_validation(data, 12) print('MAE: %.3f' % mae) # plot expected vs preducted pyplot.plot(y, label='Expected') pyplot.plot(yhat, label='Predicted') pyplot.legend() pyplot.show() |

Running the example reports the expected and predicted values for each step in the test set, then the MAE for all predicted values.

**Note**: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

We can see that the model performs better than a persistence model, achieving a MAE of about 5.9 births, compared to 6.7 births.

**Can you do better?**

You can test different XGBoost hyperparameters and numbers of time steps as input to see if you can achieve better performance. Share your results in the comments below.

1 2 3 4 5 6 7 8 9 10 11 12 13 |
>expected=42.0, predicted=44.5 >expected=53.0, predicted=42.5 >expected=39.0, predicted=40.3 >expected=40.0, predicted=32.5 >expected=38.0, predicted=41.1 >expected=44.0, predicted=45.3 >expected=34.0, predicted=40.2 >expected=37.0, predicted=35.0 >expected=52.0, predicted=32.5 >expected=48.0, predicted=41.4 >expected=55.0, predicted=46.6 >expected=50.0, predicted=47.2 MAE: 5.957 |

A line plot is created comparing the series of expected values and predicted values for the last 12 months of the dataset.

This gives a geometric interpretation of how well the model performed on the test set.

Once a final XGBoost model configuration is chosen, a model can be finalized and used to make a prediction on new data.

This is called an **out-of-sample forecast**, e.g. predicting beyond the training dataset. This is identical to making a prediction during the evaluation of the model: as we always want to evaluate a model using the same procedure that we expect to use when the model is used to make prediction on new data.

The example below demonstrates fitting a final XGBoost model on all available data and making a one-step prediction beyond the end of the dataset.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 |
# finalize model and make a prediction for monthly births with xgboost from numpy import asarray from pandas import read_csv from pandas import DataFrame from pandas import concat from xgboost import XGBRegressor # transform a time series dataset into a supervised learning dataset def series_to_supervised(data, n_in=1, n_out=1, dropnan=True): n_vars = 1 if type(data) is list else data.shape[1] df = DataFrame(data) cols = list() # input sequence (t-n, ... t-1) for i in range(n_in, 0, -1): cols.append(df.shift(i)) # forecast sequence (t, t+1, ... t+n) for i in range(0, n_out): cols.append(df.shift(-i)) # put it all together agg = concat(cols, axis=1) # drop rows with NaN values if dropnan: agg.dropna(inplace=True) return agg.values # load the dataset series = read_csv('daily-total-female-births.csv', header=0, index_col=0) values = series.values # transform the time series data into supervised learning train = series_to_supervised(values, n_in=6) # split into input and output columns trainX, trainy = train[:, :-1], train[:, -1] # fit model model = XGBRegressor(objective='reg:squarederror', n_estimators=1000) model.fit(trainX, trainy) # construct an input for a new preduction row = values[-6:].flatten() # make a one-step prediction yhat = model.predict(asarray([row])) print('Input: %s, Predicted: %.3f' % (row, yhat[0])) |

Running the example fits an XGBoost model on all available data.

A new row of input is prepared using the last 6 months of known data and the next month beyond the end of the dataset is predicted.

1 |
Input: [34 37 52 48 55 50], Predicted: 42.708 |

## Further Reading

This section provides more resources on the topic if you are looking to go deeper.

### Related Tutorials

- A Gentle Introduction to the Gradient Boosting Algorithm for Machine Learning
- Time Series Forecasting as Supervised Learning
- How to Convert a Time Series to a Supervised Learning Problem in Python
- How To Backtest Machine Learning Models for Time Series Forecasting

## Summary

In this tutorial, you discovered how to develop an XGBoost model for time series forecasting.

Specifically, you learned:

- XGBoost is an implementation of the gradient boosting ensemble algorithm for classification and regression.
- Time series datasets can be transformed into supervised learning using a sliding-window representation.
- How to fit, evaluate, and make predictions with an XGBoost model for time series forecasting.

**Do you have any questions?**

Ask your questions in the comments below and I will do my best to answer.

The result does not really look convincing

Fair enough.

Consider the model a template that you can apply on your own projects.

Excellent explained very nicely

Keep it up

Thanks!

This article doesn’t make a cogent argument for using XGBoost for time-series or time dependent data.

Without any sort of weighting based on time, the algorithm has no way of knowing how to incorporate time – it just looks at isolated points e.g. A yields 400, B yields 510 with no chronological relationship between A and B. The expected vs predicted graph you show clearly indicates that the model fails to establish a proper relationship between time and predictions.

I’ve tried to implement XGBoost in financial forecasting with 2 years historical data, it just doesn’t work well. Sometimes you can get better accuracies with ensembling techniques, but nothing really beats a true time series model. In that case, I’d use the pmdarima package and the auto.arima function is fantastic.

I get that you could use this as an example template, but I think it’s not really instructional until you measure this against a time-series model or apply some sort of time weights to non-time series models to get a clear idea of what options exist.

At worst, this article is misleading. At best, it’s flawed and requires more testing and examples.

I appreciate you putting this out there because it brings up some good questions on how to approach time series problems with some more flexibility, I’d look forward to a more thorough article on this topic.

Thanks for sharing, sorry it does not work for your specific datasets.

I disagree that it is misleading.

I agree with Rahul, in that this does not seem to account for things that time-series models are designed to address, such as seasonality, whether the data is stationary or not, etc.

Sure, only try it on your data if you think it offers some benefit over other methods.

You can engineer some new features that will potentially account for seasonality if you are creative enough.

Agreed!

Dear Dr Jason,

I have a question on single variable data such as “sunspot” data. There are no X values or features. It is called “univariate” as shown in your blog https://machinelearningmastery.com/time-series-datasets-for-machine-learning/. Yes univariate datasets have date information as in dd-mm-yyyy info or it could be derived by an array of x = [i for i in range(len(mydataset))].

Can model evaluation such as train-test-splitting, model, RepeatedClassifiedKFold and cross_val_score be performed on univariate time series with time a feature of X and y the univariate data series, for example sunspot data?

Thank you,

Anthony of Sydney

It is univariate. Date/times are dropped.

Cross-validation is generally invalid for time series data, see this:

https://machinelearningmastery.com/backtest-machine-learning-models-time-series-forecasting/

Dear Dr Jason,

Thank you for averting to the sitehttps://machinelearningmastery.com/backtest-machine-learning-models-time-series-forecasting/.

Despite no cross-validation, nevertheless train/test split is still performed and the sunspot data is used as an example dataset to “experiment” with.

Thank you, it is appreciated.

Anthony of Sydney

Obviously, if you have the birth rate from 1960 and from then on, we can really test how good the model is.

Also, since the time is left out, you cannot treat gappy time series, as often happens for natural phemonena like variable star research.

Thanks for your note.

Hi Jason! do you think we could have a multivariate variation of this?

Great suggestion!

Do you need to detrend and deseasonalize the data when using XGBoost?

Depends. Try with and without and compare the results.

Hi Jason, can the code be modified to make more than a one step prediction at the end of the dataset? If so any tips 🙂

For example could I go further out than

`yhat = model.predict(asarray([row]))`

? I am also running my own dataset, one month of a building electricity usage (kW) on 15 minute intervals… My results are pretty good for the expected & predicted plots.I was just curious about being able to predict more data than 1 minute 15 data point.. Thanks

Good question.

Yes, you could use one of the following wrapper classes:

https://machinelearningmastery.com/multi-output-regression-models-with-python/

Jason thanks for the additional info… For time series application (as I mentioned electricity dataset) would you recommend going for Chained Multioutput Regression tactic? OR the Direct Multioutput Regression as mentioned in the link you sent?

Perhaps test both and discover what works well.

Jason one other question. If I create some plots with the code for expected & predicted analysis. Can I save this model in like a pickle to use on the prediction code? OR would the models be the same parameters between the

and# forecast monthly births with xgboost

`# forecast monthly births with xgboost`

scripts?I believe you can pickle an xgboost model. Perhaps test it to confirm.

Wondering why have you returned test[:,1] in the walk_forward_validation() function, and why is that being used to calculate error? Shouldnt it be test[:,-1]? We are predicting the last column right, hence it should be compared with the last column

You’re right, looks like a typo.

Fixed. Thanks!

Even what you’re returning should be corrected then right?

Correct!

No idea what I was thinking. More coffee is needed…

Thanks for pointing out these dumb errors.

Hi Jason. Thanks for the material.

I think there is an error on error calculus:

mean_absolute_error(test[:, -1], predictions)

If you pay attention on “test[:, -1]” you is notice the array isn’t align with the correct values.

I’m right?

Sorry. I made a confusion. It’s ok!

No problem.

I believe the code is correct.

It is common for models to mostly forecast the previous value as the next value, called a persistence forecast. When plotted, it looks like the forecast is one step behind the observations.

Hi Jason. Thanks for the material. After reading your explanation about xgboost, I want to try to use this method for time series forecasting. You mentioned in the article that this method can be extended to multivariate input. I want to use several parameters to predict the cyclical trend of another correlated parameter. How should I adjust the existing method?

You’re welcome.

The same function for preparing the data can be used directly I believe. Try it and see.

Hi Jason.

Thanks for providing helpful tutorials. I am subscribing your super bundle package, and all of them are very useful for self training.

Wonder if you have solutions for multivariate, multi-timesteps forecast using XGBoost. I could not find if in your book, xgboost_with_python. Thanks !

Thanks!

Good question. I don’t have examples of time series in the xgboost book.

You can adapt the above example to work with multivariate time series data directly. Multi-step could be achieved if the xgboost supports multiple output directly (I think it does) or if you use a multi output regression wrapper class from here:

https://machinelearningmastery.com/multi-output-regression-models-with-python/

Thank you for such a great resource!

Why are you using XGBoost, not Random Forest, for example? Does XGBoost work better for such kind of tasks? If yes, why?

No reason other than many people asked me how to use xgboost for time series.

Hi Jason, thanks for the tutorial . What I want to ask is that each time when I make a prediction for a new data in real case, I need to transform it into supervised learning datasets with all the history data. Is that correct? In my understanding, only this way can get the lag information for the new data. If it’s true, how to improve the performance if I have big volume of history data? Every time I make a prediction, I need to shift all the history data again. Especially if I have multivariate, that would be time consuming. Finding a solution for that case. Thanks

Yes.

Test different amounts of lag, different data transforms, different model configs in order to discover what works best for your dataset.

Hi Jason,

Just to verify/clarify, when the chart is made, “expected” == “actual” data, right? We are comparing predicted values to actual/expected data…

I just wanted to verify the words that expected means actual real data.

It is expected vs predicted, expected is the data in the dataset.

Hi Jason,

Could I use this

`walk_forward_validation`

you demonstrated that predicts one future value with MultiOutputRegressor ?Ultimately I want to predict multiple future values with something like:

`MultiOutputRegressor(XGBRegressor(objective='reg:squarederror', n_estimators=1000))`

Would I need to modify my

`walk_forward_validation`

to reflect my`MultiOutputRegressor`

process that is ultimately my end goal?My

`XGBRegressor(objective='reg:squarederror', n_estimators=1000)`

is the exact same used in my`walk_forward_validation`

and as with the sci kit learn wrapper`MultiOutputRegressor`

that predicts 24 future values. Curious if this matters at all or if the`walk_forward_validation`

would be meaning less because I am using`MultiOutputRegressor`

Hopefully that makes sense! Thank you so much for your posts, the results using this process have been much better than ARIMA or LSTM methods..

Maybe. You may have to experiment.

Jason,

Is there anything wrong with using a neural network with

`MultiOutputRegressor`

?Ive been experimenting with the sci kit learn

`MLPRegressor`

`model = MLPRegressor(max_iter=2000, shuffle=False)`

`multi_output_regr = MultiOutputRegressor(model)`

The thought ran across my head to ask since I notice that all NN for time series forecasting seems to be all about a pattern recognition like LSTM…

I don’t see why not. Perhaps try it and see how you go.

Jason would it be real strange to train an MLP NN (not shuffled data) on a time series dataset with a lot of dummy variables that represent day-of-week & time-of-day, along with an outside air temperature sensor reading (electricity power dataset). Save the model in a pickel file…

If I only have 1 years data, could I ever inverse transform the training dataset to test the model in like a block chain format with calculated dummy variables?

2 questions sorry!

Hard to say, perhaps try it and see.

Great tutorial as always. I noticed that you’re not using xgboost’s early stopping feature – where it compares the training performance to a test set, and stops training more trees if the performance has flattened out.

Is this because you’re unable to alter n_estimators for every step that you walk forward? Basically you’d need to manually set (or do a GridSearchCV) to find the best n_estimators across all walk-forward steps?

Thanks.

Early stopping is hard to do with time series data, e.g. you cannot reasonably define a validation set.

It might be simpler to grid search different numbers of trees on the walk-forward validation test harness.