# A Gentle Introduction to Exponential Smoothing for Time Series Forecasting in Python


Exponential smoothing is a time series forecasting method for univariate data that can be extended to support data with a systematic trend or seasonal component.

It is a powerful forecasting method that may be used as an alternative to the popular Box-Jenkins ARIMA family of methods.

In this tutorial, you will discover the exponential smoothing method for univariate time series forecasting.

After completing this tutorial, you will know:

• What exponential smoothing is and how it is different from other forecasting methods.
• The three main types of exponential smoothing and how to configure them.
• How to implement exponential smoothing in Python.


Let’s get started.

[Image: A Gentle Introduction to Exponential Smoothing for Time Series Forecasting in Python. Photo by Wolfgang Staudt, some rights reserved.]

## Tutorial Overview

This tutorial is divided into 4 parts; they are:

1. What Is Exponential Smoothing?
2. Types of Exponential Smoothing
3. How to Configure Exponential Smoothing
4. Exponential Smoothing in Python

## What Is Exponential Smoothing?

Exponential smoothing is a time series forecasting method for univariate data.

Time series methods like the Box-Jenkins ARIMA family of methods develop a model where the prediction is a weighted linear sum of recent past observations or lags.

Exponential smoothing forecasting methods are similar in that a prediction is a weighted sum of past observations, but the model explicitly uses an exponentially decreasing weight for past observations.

Specifically, past observations are weighted with a geometrically decreasing ratio.

Forecasts produced using exponential smoothing methods are weighted averages of past observations, with the weights decaying exponentially as the observations get older. In other words, the more recent the observation the higher the associated weight.

— Page 171, Forecasting: principles and practice, 2013.
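To make the geometric decay concrete, here is a minimal sketch that computes the weights simple exponential smoothing places on the most recent observations. It assumes the standard form in which the observation k steps back receives weight alpha * (1 - alpha)^k; the function name and the choice of alpha=0.5 are illustrative only.

```python
def smoothing_weights(alpha, n_lags):
    """Weight on the observation k steps back: alpha * (1 - alpha) ** k."""
    return [alpha * (1 - alpha) ** k for k in range(n_lags)]


# Each weight is half of the previous one when alpha is 0.5.
weights = smoothing_weights(alpha=0.5, n_lags=5)
print(weights)  # [0.5, 0.25, 0.125, 0.0625, 0.03125]
```

The most recent observation gets the largest weight, and each older observation gets a fixed fraction of the weight of the one after it.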

Exponential smoothing methods may be considered peers of, and an alternative to, the popular Box-Jenkins ARIMA class of methods for time series forecasting.

Collectively, the methods are sometimes referred to as ETS models, referring to the explicit modeling of Error, Trend and Seasonality.

## Types of Exponential Smoothing

There are three main types of exponential smoothing time series forecasting methods.

They are: a simple method that assumes no systematic structure, an extension that explicitly handles trends, and the most advanced approach, which adds support for seasonality.

### Single Exponential Smoothing

Single Exponential Smoothing, SES for short, also called Simple Exponential Smoothing, is a time series forecasting method for univariate data without a trend or seasonality.

It requires a single parameter, called alpha (a), also called the smoothing factor or smoothing coefficient.

This parameter controls the rate at which the influence of the observations at prior time steps decays exponentially. Alpha is often set to a value between 0 and 1. Large values mean that the model pays attention mainly to the most recent past observations, whereas smaller values mean more of the history is taken into account when making a prediction.

A value close to 1 indicates fast learning (that is, only the most recent values influence the forecasts), whereas a value close to 0 indicates slow learning (past observations have a large influence on forecasts).

— Page 89, Practical Time Series Forecasting with R, 2016.

Hyperparameters:

• Alpha: Smoothing factor for the level.
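As a rough illustration of how alpha drives the forecast, the sketch below implements the level update by hand. It assumes the level is initialized with the first observation, which is only one of several possible choices, and the function name and data are made up for the example.

```python
def simple_exp_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing the whole series."""
    level = series[0]  # assumption: start the level at the first observation
    for value in series[1:]:
        # Blend the new observation with the previous level.
        level = alpha * value + (1 - alpha) * level
    return level


data = [10.0, 12.0, 11.0, 13.0]
print(simple_exp_smoothing(data, alpha=0.5))  # 12.0
print(simple_exp_smoothing(data, alpha=1.0))  # 13.0, only the last value matters
```

With alpha=1.0 the forecast is just the last observation, while values near 0 keep the forecast close to the initial level.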

### Double Exponential Smoothing

Double Exponential Smoothing is an extension to Exponential Smoothing that explicitly adds support for trends in the univariate time series.

In addition to the alpha parameter that controls the smoothing of the level, a second smoothing factor, called beta (b), is added to control the decay of the influence of the change in trend.

The method supports trends that change in different ways: additive or multiplicative, depending on whether the trend is linear or exponential, respectively.

Double Exponential Smoothing with an additive trend is classically referred to as Holt’s linear trend model, named for the developer of the method, Charles Holt.

• Additive Trend: Double Exponential Smoothing with a linear trend.
• Multiplicative Trend: Double Exponential Smoothing with an exponential trend.

For longer range (multi-step) forecasts, the trend may continue on unrealistically. As such, it can be useful to dampen the trend over time.

Dampening means reducing the size of the trend over future time steps down to a straight line (no trend).

The forecasts generated by Holt’s linear method display a constant trend (increasing or decreasing) indefinitely into the future. Even more extreme are the forecasts generated by the exponential trend method […] Motivated by this observation […] introduced a parameter that “dampens” the trend to a flat line some time in the future.

— Page 183, Forecasting: principles and practice, 2013.

As with modeling the trend itself, we can use the same principles in dampening the trend, specifically additively or multiplicatively for a linear or exponential dampening effect. A damping coefficient Phi (p) is used to control the rate of dampening.

• Additive Dampening: Dampen a trend linearly.
• Multiplicative Dampening: Dampen the trend exponentially.

Hyperparameters:

• Alpha: Smoothing factor for the level.
• Beta: Smoothing factor for the trend.
• Trend Type: Additive or multiplicative.
• Dampen Type: Additive or multiplicative.
• Phi: Damping coefficient.
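The effect of beta and phi can be sketched with a hand-written version of the additive-trend method. This assumes the usual additive damped-trend recursions and a very simple initialization from the first two observations; the function name and the data are illustrative, and phi=1.0 corresponds to no damping.

```python
def holt_forecast(series, alpha, beta, horizon, phi=1.0):
    """Additive-trend (Holt) forecasts; phi < 1 damps the trend."""
    # Assumption: initial level and trend come from the first two observations.
    level, trend = series[0], series[1] - series[0]
    for value in series[1:]:
        prev_level = level
        level = alpha * value + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * trend)
    return forecasts


data = [1.0, 2.0, 3.0, 4.0, 5.0]
undamped = holt_forecast(data, alpha=0.5, beta=0.5, horizon=3)
damped = holt_forecast(data, alpha=0.5, beta=0.5, horizon=3, phi=0.5)
print(undamped)  # [6.0, 7.0, 8.0]
print(damped)
```

Without damping the trend is extended at a constant rate, whereas with phi below 1 each additional step adds less than the one before, so the forecasts flatten out.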

### Triple Exponential Smoothing

Triple Exponential Smoothing is an extension of Exponential Smoothing that explicitly adds support for seasonality to the univariate time series.

This method is sometimes called Holt-Winters Exponential Smoothing, named for two contributors to the method: Charles Holt and Peter Winters.

In addition to the alpha and beta smoothing factors, a new parameter is added called gamma (g) that controls the influence on the seasonal component.

As with the trend, the seasonality may be modeled as either an additive or multiplicative process for a linear or exponential change in the seasonality.

• Additive Seasonality: Triple Exponential Smoothing with a linear seasonality.
• Multiplicative Seasonality: Triple Exponential Smoothing with an exponential seasonality.

Triple exponential smoothing is the most advanced variation of exponential smoothing and through configuration, it can also develop double and single exponential smoothing models.

Being an adaptive method, Holt-Winter’s exponential smoothing allows the level, trend and seasonality patterns to change over time.

— Page 95, Practical Time Series Forecasting with R, 2016.

Additionally, to ensure that the seasonality is modeled correctly, the number of time steps in a seasonal period (Period) must be specified. For example, if the series is monthly data and the seasonal pattern repeats each year, then Period=12.

Hyperparameters:

• Alpha: Smoothing factor for the level.
• Beta: Smoothing factor for the trend.
• Gamma: Smoothing factor for the seasonality.
• Trend Type: Additive or multiplicative.
• Dampen Type: Additive or multiplicative.
• Phi: Damping coefficient.
• Seasonality Type: Additive or multiplicative.
• Period: Time steps in seasonal period.
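To show how the three smoothing factors and the period fit together, here is a minimal sketch of the additive-trend, additive-seasonality case. The initialization from the first two seasons is a crude assumption made for brevity, and the function name and data are illustrative rather than taken from any library.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive trend and additive seasonality; returns `horizon` forecasts."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasonal periods")
    # Assumption: crude initial values taken from the first two seasons.
    first = series[:period]
    second = series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) / period - level) / period
    seasonals = [value - level for value in first]

    for t in range(period, len(series)):
        value, season = series[t], seasonals[t % period]
        prev_level, prev_trend = level, trend
        level = alpha * (value - season) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonals[t % period] = (gamma * (value - prev_level - prev_trend)
                                 + (1 - gamma) * season)

    n = len(series)
    # Each forecast reuses the latest seasonal estimate for its position in the cycle.
    return [level + h * trend + seasonals[(n + h - 1) % period]
            for h in range(1, horizon + 1)]


data = [10.0, 20.0, 30.0, 40.0] * 3  # a repeating pattern with Period=4 and no trend
forecast = holt_winters_additive(data, period=4, alpha=0.5, beta=0.5, gamma=0.5, horizon=4)
print(forecast)  # [10.0, 20.0, 30.0, 40.0]
```

On this perfectly periodic series the forecast simply repeats the seasonal pattern; on real data the level, trend and seasonal estimates would all adapt over time.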

## How to Configure Exponential Smoothing

All of the model hyperparameters can be specified explicitly.

This can be challenging for experts and beginners alike.

Instead, it is common to use numerical optimization to search for and find the smoothing coefficients (alpha, beta, gamma, and phi) for the model that result in the lowest error.

[…] a more robust and objective way to obtain values for the unknown parameters included in any exponential smoothing method is to estimate them from the observed data. […] the unknown parameters and the initial values for any exponential smoothing method can be estimated by minimizing the SSE [sum of the squared errors].

— Page 177, Forecasting: principles and practice, 2013.

The parameters that specify the type of change in the trend and seasonality, such as whether they are additive or multiplicative and whether they should be dampened, must be specified explicitly.
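The idea of choosing a coefficient by minimizing the SSE can be sketched for simple exponential smoothing. Note that this uses a plain grid search over candidate alpha values instead of a numerical optimizer, and it assumes the level is initialized with the first observation; the function names and data are illustrative.

```python
def one_step_sse(series, alpha):
    """Sum of squared one-step-ahead errors for simple exponential smoothing."""
    level = series[0]  # assumption: start the level at the first observation
    sse = 0.0
    for value in series[1:]:
        sse += (value - level) ** 2  # the current level is the forecast for this step
        level = alpha * value + (1 - alpha) * level
    return sse


def best_alpha(series, candidates):
    """Return the candidate alpha with the lowest one-step-ahead SSE."""
    return min(candidates, key=lambda a: one_step_sse(series, a))


data = [3.0, 5.0, 9.0, 20.0, 12.0, 17.0, 22.0, 23.0, 51.0, 41.0]
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9
alpha = best_alpha(data, candidates)
print(alpha, one_step_sse(data, alpha))
```

The same approach extends to beta, gamma and phi by searching over combinations of values, although the search space grows quickly, which is one reason an optimizer is normally used.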

## Exponential Smoothing in Python

This section looks at how to implement exponential smoothing in Python.

The implementations of Exponential Smoothing in Python are provided in the Statsmodels Python library.

The implementations are based on the description of the method in Rob Hyndman and George Athanasopoulos’ excellent book “Forecasting: Principles and Practice,” 2013 and their R implementations in their “forecast” package.

### Single Exponential Smoothing

Single Exponential Smoothing or simple smoothing can be implemented in Python via the SimpleExpSmoothing Statsmodels class.

First, an instance of the SimpleExpSmoothing class must be instantiated and passed the training data. The fit() function is then called providing the fit configuration, specifically the alpha value called smoothing_level. If this is not provided or set to None, the model will automatically optimize the value.

This fit() function returns an instance of the HoltWintersResults class that contains the learned coefficients. The forecast() or the predict() function on the result object can be called to make a forecast.

For example:

### Double and Triple Exponential Smoothing

Single, Double and Triple Exponential Smoothing can be implemented in Python using the ExponentialSmoothing Statsmodels class.

First, an instance of the ExponentialSmoothing class must be instantiated, specifying both the training data and some configuration for the model.

Specifically, you must specify the following configuration parameters:

• trend: The type of trend component, as either “add” for additive or “mul” for multiplicative. Modeling the trend can be disabled by setting it to None.
• damped: Whether or not the trend component should be damped, either True or False. In statsmodels 0.12 and later, this argument is named damped_trend.
• seasonal: The type of seasonal component, as either “add” for additive or “mul” for multiplicative. Modeling the seasonal component can be disabled by setting it to None.
• seasonal_periods: The number of time steps in a seasonal period, e.g. 12 for 12 months in a yearly seasonal structure.

The model can then be fit on the training data by calling the fit() function.

This function allows you to either specify the smoothing coefficients of the exponential smoothing model or have them optimized. By default, they are optimized (e.g. optimized=True). These coefficients include:

• smoothing_level (alpha): the smoothing coefficient for the level.
• smoothing_slope (beta): the smoothing coefficient for the trend (named smoothing_trend in statsmodels 0.12 and later).
• smoothing_seasonal (gamma): the smoothing coefficient for the seasonal component.
• damping_slope (phi): the coefficient for the damped trend (named damping_trend in statsmodels 0.12 and later).

Additionally, the fit function can perform basic data preparation prior to modeling; specifically:

• use_boxcox: Whether or not to perform a power transform of the series (True/False) or specify the lambda for the transform. In statsmodels 0.12 and later, this option is set when constructing the model rather than in fit().

The fit() function will return an instance of the HoltWintersResults class that contains the learned coefficients. The forecast() or the predict() function on the result object can be called to make a forecast.


## Summary

In this tutorial, you discovered the exponential smoothing method for univariate time series forecasting.

Specifically, you learned:

• What exponential smoothing is and how it is different from other forecast methods.
• The three main types of exponential smoothing and how to configure them.
• How to implement exponential smoothing in Python.

Do you have any questions?


### 38 Responses to A Gentle Introduction to Exponential Smoothing for Time Series Forecasting in Python

1. Sandeep August 20, 2018 at 10:18 pm #

Hi Jason,

Thank you very much for your post. This is very helpful resources. I would like to know how to install “statsmodels.tsa.holtwinters” as I see that it is throwing error when I ran the command :
from statsmodels.tsa.holtwinters import ExponentialSmoothing

It seems that statsmodels package do not have that command.

ThanK you,
Sandeep

• Jason Brownlee August 21, 2018 at 6:16 am #

It really depends on your platform, for example:

Alternately, try this tutorial:
https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/

• Ehsan E Nasiri August 23, 2018 at 2:15 am #

Hello Jason,

I am working on a forecasting project with a big dataset which includes 15 columns and around 9000 rows. The problem is I have to forecast the result for the next two years base on 14 columns of independent data, and the result should be binary(0,1).
I saw many forecasting problems online, but most of them forecast base on just one column of independent data with no binary result.
Is there any way to guide me or refer me any references to solve the problem?

Ehsan

• Jason Brownlee August 23, 2018 at 6:15 am #

Yes, a neural network can easily forecast multiple variables, perhaps start with an MLP.

• Satakarni October 30, 2018 at 6:56 pm #

Hi Jason

I am finding different results for DES method in R and python. Is Python ETS not a complete implementation as described in Hyndman et al (2008)? R ETS method have way too many flags to control? Kindly clarify
————–

R-Code

——————
> x

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

1 36 78 35 244 25 283 42 6 59 5 47 20

2 0 0 5 38 16 143 14 37 60 2 55 0

3 0

> fit forecast::forecast(fit, h=1)

Point Forecast Lo 80 Hi 80 Lo 95 Hi 95

Feb 3 -2.728456 -96.36635 90.90943 -145.9353 140.4783

>

———————
Python
———————-

dft
Out:
quantity
month_end
2016-01-31 36
2016-02-29 78
2016-03-31 35
2016-04-30 244
2016-05-31 25
2016-06-30 283
2016-07-31 42
2016-08-31 6
2016-09-30 59
2016-10-31 5
2016-11-30 47
2016-12-31 20
2017-01-31 0
2017-02-28 0
2017-03-31 5
2017-04-30 38
2017-05-31 16
2017-06-30 143
2017-07-31 14
2017-08-31 37
2017-09-30 60
2017-10-31 2
2017-11-30 55
2017-12-31 0
holt_r = ets.ExponentialSmoothing(np.abs(dft), trend='additive', damped=False, seasonal=None).fit()
C:\Anaconda\lib\site-packages\statsmodels\tsa\base\tsa_model.py:171: ValueWarning: No frequency information was provided, so inferred frequency M will be used.
% freq, ValueWarning)

holt_r.forecast(1)
Out:
2018-01-31 13.049129
Freq: M, dtype: float64

• Jason Brownlee October 31, 2018 at 6:24 am #

Sorry, I don’t know about the R implementation of ETS.

2. Elie Kawerk August 20, 2018 at 11:52 pm #

Hi Jason,

Hyndman has published a new edition of ‘Forecasting, principles and practice’. It is available free of charge at: https://otexts.org/fpp2/ .

Best,
Elie

• Jason Brownlee August 21, 2018 at 6:17 am #

Thanks.

• Maddy January 2, 2019 at 6:37 am #

Thanks for sharing the link of the book.

• Jason Brownlee January 2, 2019 at 6:45 am #

It is a fantastic book!

3. Karmen August 21, 2018 at 2:25 pm #

Thanks for this – clear, and gentle, with nice follow up resources!

• Jason Brownlee August 22, 2018 at 6:07 am #

You’re welcome!

4. Sandeep August 24, 2018 at 9:16 am #

Thanks for really nice and helpful matter on exponential smoothing.

• Jason Brownlee August 24, 2018 at 2:09 pm #

Thanks!

5. Ahmed Elshami October 6, 2018 at 9:52 am #

Hi Dr. Jason,

Let’s assume I smoothed my whole time series data, then I fit the model and did my prediction.
my question is, should I unsmooth my prediction or not to calculate error?

• Jason Brownlee October 6, 2018 at 11:44 am #

If your goal of smoothing was to make the problem easier to learn, then no change is required.

What would un-smoothing look like exactly? The addition of random noise?

6. Alice December 21, 2018 at 2:23 am #

Hi Jason

I have some questions about possible methods for sequential prediction.

Input y_0=100, y_1=y_0*0.96, y_2=y_1*0.97=y_0*0.96*0.97, y_3=y_2*0.978=y_0*0.96*0.97*0.978

Predict y_k

It looks like that y_k has a dynamic decay factor for the exponential function.
y_k=y_0*((D_k)^(k))

If I use the average rate of change in 0.96,0.97.0.978
then y_k=y_0*(0.96^k)*(((0.97/0.96)+(0.978/0.97))/2)^(1+2+3..k) =y_0*(0.96^k)*(1.009)^(k*(k+1)/2)

then log(y_k)=a+b*k+c*(k^2).

Should I use Triple Exponential Smoothing or LSTM to predict y_k? or is there any other possible methods?

If the input y_0, y_1, y_2 are uncertain. e.g y_0=100,101or 103 y_1=100*0.963, 101*0.964or 103*0.966. Which method should I use to predict y_k (only one value)?

• Jason Brownlee December 21, 2018 at 5:30 am #

Perhaps try a range of methods and discover what works best for your specific dataset.

7. Mridul March 8, 2019 at 4:23 am #

Hi Jason, thanks for this. This is really helpful.

Could you also touch upon Brown’s double or LES model in python?

• Jason Brownlee March 8, 2019 at 7:56 am #

Thanks for the suggestion.

8. Sheetal April 8, 2019 at 3:50 am #

Hi Jason,

When I use statsmodel to run SimpleExpSmoothing and Holtwinters model, I am getting below error.

AttributeError: ‘Holt’ object has no attribute ‘_get_prediction_index’

Here’s my code :
```
from statsmodels.tsa.holtwinters import ExponentialSmoothing, Holt, SimpleExpSmoothing

fit2 = SimpleExpSmoothing(np.asarray(Train['Count']))
fit2._index = pd.to_datetime(Train.index)
pred = fit2.fit()
y_hat_avg['SES'] = pred.forecast(len(valid))
```

Thank you!

• Jason Brownlee April 8, 2019 at 5:58 am #

Sorry to hear that, are you able to confirm that your version of statsmodels is up to date?

9. Jem94 June 29, 2019 at 6:26 pm #

Hi, Jason,

I would like to ask you if there was an iterative function that every time a new data arrives does not require the recalculation of the ExponentialSmoothing, but just add the new data (update the model)?

• Jason Brownlee June 30, 2019 at 9:35 am #

Good question.

I believe you might have to implement it yourself.

10. Amit July 7, 2019 at 8:40 pm #

As beginner in this analytics world, How to get familiar with Statistic terminology ? How best i can make myself comfortable with statistic terminology ?

11. DataNoob2020 August 23, 2019 at 4:11 am #

Hello Jason!
I see you use ExponentialSmoothing in your example to implement Double Exponential. Can you explain why you did not use the Holt api from statsmodels? I have included the link to the model I am referring to. https://www.statsmodels.org/dev/generated/statsmodels.tsa.holtwinters.Holt.html

It looks like both can be used for double. Just wondering why you chose one over the other. Thanks!

• Jason Brownlee August 23, 2019 at 6:34 am #

No big reason, I was going for consistency in the examples.

Do you prefer one over the other? If so, why?

• DataNoob2020 August 23, 2019 at 6:48 am #

Thanks for the reply! I am a noob when it comes to forecasting and only taught myself Python a year and a half ago.

I was using your method and then gave the Holt method a try and it ended up being a disaster in my opinion. Large variances in results when comparing to ExponentialSmoothing with seasonality turned off.

I don’t want to hijack this conversation, but I have a question about holdout forecasts if you don’t mind. I saw in one of your guides that you calculated rmse on actual vs predicted but I believe you only did it for one period. I am currently doing a 6 month hold out forecast and was originally just running my model like:

# model2 = ExponentialSmoothing(data[:-6], trend='add', seasonal=None, damped=False).fit(smoothing_level=0.1, smoothing_slope=0.1, optimized=False)
# fcast2 = model2.forecast(6)

I would then calculate the rmse using the forecasting vs actual values. I was told that this was not best practices as I should be doing the hold out forecast one period at a time. Essentially I would do model2.forecast(1) at data[:-6] and then model3.forecast(1) at data[:-5] and so on and so forth.

If you do not mind, I would appreciate your wisdom!

• Jason Brownlee August 23, 2019 at 2:07 pm #

Good question.

It comes down to how you want to use the model, which then defines how you want to evaluate it.

e.g. is it one step predictions that are most important, then evaluate skill on that. if it is n-step, then you might want to know the average error made on each step over multiple forecasts.

This post may give you some ideas related to walk-forward validation:
https://machinelearningmastery.com/backtest-machine-learning-models-time-series-forecasting/

• DataNoob2020 August 23, 2019 at 11:45 pm #

Thanks! That article was great. It appears the walk-forward validation is the way to go, though running all those DoubleExpos drastically increases the amount of time it takes to run. I am thinking I need to rewrite my DoubleExpo function to use multiprocessing or multithreading.

Do you accept bitcoin donations? Your website has been extremely helpful in my forecasting quest.

• Jason Brownlee August 24, 2019 at 7:52 am #

Nice, yes a custom implementation built for speed would be my path too.

Thanks!

I accept paypal donations, if that is still a thing:
https://machinelearningmastery.com/support/

12. edgar panganiban October 22, 2019 at 2:29 pm #

I want to have a one-step forecast using the following codes

from statsmodels.tsa.holtwinters import HoltWintersResults

model_fit_se = HoltWintersResults.initialize('model_se.pkl',smoothing_level=0.8,smoothing_slope=0.2,optimized=False)
yhat = model_fit_se.forecast()
print('Predicted: %.3f' % yhat)

but I got this error:

TypeError: initialize() missing 2 required positional arguments: ‘model’ and ‘params’

I think its in the parameters parts….how do I fix this

• Jason Brownlee October 23, 2019 at 6:28 am #

I don’t have good advice sorry, perhaps try posting your code and error to stackoverflow?

13. Kenny Shu October 31, 2019 at 12:45 pm #

Hi Jason,

Very Intuitive post! I am wondering if you know how to manipulate the optimal criteria regarding time windows. For example, I want to select a model that optimizes the sum of MSE of the next 12 period data instead of just the next period.

How could I achieve that based on your model?

• Jason Brownlee October 31, 2019 at 1:38 pm #

Thanks.

Yes, you can try a grid search and run your own evaluation on predictions via walk-forward validation.

• Kenny Shu October 31, 2019 at 9:32 pm #

Thanks Jason! I believe that post is a lifesaver for people who are struggling with finding a python function that is equivalent to Hyndman’ ETS function in R (Please correct me if I am wrong)

Just want to make sure that I understand this method correctly:

If I were to minimize the sum of next 12 period’s rmse, should I just make some changes in the function _walk_forward_validation_ to ensure it returns the sum of next 12 period’s rmse?

• Jason Brownlee November 1, 2019 at 5:29 am #

Thanks.

Yes.