
Exponential smoothing is a time series forecasting method for univariate data that can be extended to support data with a systematic trend or seasonal component.

It is a powerful forecasting method that may be used as an alternative to the popular Box-Jenkins ARIMA family of methods.

In this tutorial, you will discover the exponential smoothing method for univariate time series forecasting.

After completing this tutorial, you will know:

- What exponential smoothing is and how it is different from other forecasting methods.
- The three main types of exponential smoothing and how to configure them.
- How to implement exponential smoothing in Python.


Let’s get started.

## Tutorial Overview

This tutorial is divided into 4 parts; they are:

- What Is Exponential Smoothing?
- Types of Exponential Smoothing
- How to Configure Exponential Smoothing
- Exponential Smoothing in Python

## What Is Exponential Smoothing?

Exponential smoothing is a time series forecasting method for univariate data.

Time series methods like the Box-Jenkins ARIMA family of methods develop a model where the prediction is a weighted linear sum of recent past observations or lags.

Exponential smoothing forecasting methods are similar in that a prediction is a weighted sum of past observations, but the model explicitly uses an exponentially decreasing weight for past observations.

Specifically, past observations are weighted with a geometrically decreasing ratio.

Forecasts produced using exponential smoothing methods are weighted averages of past observations, with the weights decaying exponentially as the observations get older. In other words, the more recent the observation the higher the associated weight.

— Page 171, Forecasting: principles and practice, 2013.
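To make the geometric decay concrete, here is a minimal sketch. It assumes the standard simple exponential smoothing form, in which the observation *k* steps in the past receives the weight alpha * (1 - alpha)^k. The function name `smoothing_weights` and the chosen alpha are illustrative only and do not come from any library.

```python
# weights given to past observations under simple exponential smoothing
def smoothing_weights(alpha, n):
    # assumed form: weight for the observation k steps back is alpha * (1 - alpha)^k
    return [alpha * (1 - alpha) ** k for k in range(n)]

# the most recent observation comes first
weights = smoothing_weights(alpha=0.5, n=5)
print(weights)  # [0.5, 0.25, 0.125, 0.0625, 0.03125]
```

Each weight is the previous one multiplied by the constant ratio (1 - alpha), which is what "geometrically decreasing" refers to.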

Exponential smoothing methods may be considered peers of, and an alternative to, the popular Box-Jenkins ARIMA class of methods for time series forecasting.

Collectively, the methods are sometimes referred to as ETS models, referring to the explicit modeling of Error, Trend and Seasonality.

## Types of Exponential Smoothing

There are three main types of exponential smoothing time series forecasting methods.

They are a simple method that assumes no systematic structure, an extension that explicitly handles trends, and the most advanced approach, which adds support for seasonality.

### Single Exponential Smoothing

Single Exponential Smoothing, SES for short, also called Simple Exponential Smoothing, is a time series forecasting method for univariate data without a trend or seasonality.

It requires a single parameter, called *alpha* (*a*), also called the smoothing factor or smoothing coefficient.

This parameter controls the rate at which the influence of the observations at prior time steps decays exponentially. Alpha is often set to a value between 0 and 1. Large values mean that the model pays attention mainly to the most recent past observations, whereas smaller values mean more of the history is taken into account when making a prediction.

A value close to 1 indicates fast learning (that is, only the most recent values influence the forecasts), whereas a value close to 0 indicates slow learning (past observations have a large influence on forecasts).

— Page 89, Practical Time Series Forecasting with R, 2016.

Hyperparameters:

**Alpha**: Smoothing factor for the level.
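The role of *alpha* can be seen by writing the level update out by hand. The sketch below is not the Statsmodels implementation; it assumes the level is initialised with the first observation, and the function name `simple_exp_smoothing` is made up for illustration.

```python
# simple exponential smoothing written out by hand
def simple_exp_smoothing(series, alpha):
    # assumption: initialise the level with the first observation
    level = series[0]
    for value in series[1:]:
        # new level = alpha * latest observation + (1 - alpha) * previous level
        level = alpha * value + (1 - alpha) * level
    # with no trend or seasonality, the forecast for any horizon is the final level
    return level

data = [10.0, 12.0, 11.0, 13.0]
forecast = simple_exp_smoothing(data, alpha=0.5)
print(forecast)  # 12.0
```

Setting `alpha=1.0` returns the last observation (only the most recent value matters), while `alpha=0.0` returns the initial level (new observations are ignored).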

### Double Exponential Smoothing

Double Exponential Smoothing is an extension to Exponential Smoothing that explicitly adds support for trends in the univariate time series.

In addition to the *alpha* parameter that controls the smoothing factor for the level, a second smoothing factor called *beta* (*b*) is added to control the decay of the influence of the change in trend.

The method supports trends that change in two ways, additive and multiplicative, depending on whether the trend is linear or exponential, respectively.

Double Exponential Smoothing with an additive trend is classically referred to as Holt’s linear trend model, named for the developer of the method Charles Holt.

- **Additive Trend**: Double Exponential Smoothing with a linear trend.
- **Multiplicative Trend**: Double Exponential Smoothing with an exponential trend.

For longer range (multi-step) forecasts, the trend may continue on unrealistically. As such, it can be useful to dampen the trend over time.

Dampening means reducing the size of the trend over future time steps down to a straight line (no trend).

The forecasts generated by Holt’s linear method display a constant trend (increasing or decreasing) indefinitely into the future. Even more extreme are the forecasts generated by the exponential trend method […] Motivated by this observation […] introduced a parameter that “dampens” the trend to a flat line some time in the future.

— Page 183, Forecasting: principles and practice, 2013.

As with modeling the trend itself, we can use the same principles in dampening the trend, specifically additively or multiplicatively for a linear or exponential dampening effect. A damping coefficient *Phi* (*p*) is used to control the rate of dampening.

- **Additive Dampening**: Dampen a trend linearly.
- **Multiplicative Dampening**: Dampen the trend exponentially.

Hyperparameters:

- **Alpha**: Smoothing factor for the level.
- **Beta**: Smoothing factor for the trend.
- **Trend Type**: Additive or multiplicative.
- **Dampen Type**: Additive or multiplicative.
- **Phi**: Damping coefficient.
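As a rough illustration of how *alpha*, *beta* and *phi* interact, here is a hand-written sketch of Holt's additive trend method with an optional damping coefficient. It covers only the additive trend case, assumes the commonly used damped-trend update equations, and uses a naive initialisation (first value for the level, first difference for the trend). The function name `holt_forecast` is hypothetical.

```python
# Holt's additive trend method with an optional damping coefficient (phi)
def holt_forecast(series, alpha, beta, phi=1.0, steps=3):
    # assumption: initial level is the first value, initial trend the first difference
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        prev_level = level
        # smooth the level towards the new observation
        level = alpha * value + (1 - alpha) * (prev_level + phi * trend)
        # smooth the trend towards the latest change in level
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts = []
    damp = 0.0
    for h in range(1, steps + 1):
        # phi + phi^2 + ... + phi^h; equals h when phi is 1 (no damping)
        damp += phi ** h
        forecasts.append(level + damp * trend)
    return forecasts

data = [1.0, 2.0, 3.0, 4.0, 5.0]
undamped = holt_forecast(data, alpha=0.5, beta=0.5)
damped = holt_forecast(data, alpha=0.5, beta=0.5, phi=0.8)
print(undamped)  # [6.0, 7.0, 8.0]
print(damped)
```

With `phi=1.0` the forecasts continue the straight line, whereas with `phi` below 1 each step adds a smaller increment than the last, so the forecast flattens out over time.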

### Triple Exponential Smoothing

Triple Exponential Smoothing is an extension of Exponential Smoothing that explicitly adds support for seasonality to the univariate time series.

This method is sometimes called Holt-Winters Exponential Smoothing, named for two contributors to the method: Charles Holt and Peter Winters.

In addition to the alpha and beta smoothing factors, a new parameter is added called *gamma* (*g*) that controls the influence on the seasonal component.

As with the trend, the seasonality may be modeled as either an additive or multiplicative process for a linear or exponential change in the seasonality.

- **Additive Seasonality**: Triple Exponential Smoothing with a linear seasonality.
- **Multiplicative Seasonality**: Triple Exponential Smoothing with an exponential seasonality.

Triple exponential smoothing is the most advanced variation of exponential smoothing and through configuration, it can also develop double and single exponential smoothing models.

Being an adaptive method, Holt-Winter’s exponential smoothing allows the level, trend and seasonality patterns to change over time.

— Page 95, Practical Time Series Forecasting with R, 2016.

Additionally, to ensure that the seasonality is modeled correctly, the number of time steps in a seasonal period (*Period*) must be specified. For example, if the series was monthly data and the seasonal period repeated each year, then Period=12.

Hyperparameters:

- **Alpha**: Smoothing factor for the level.
- **Beta**: Smoothing factor for the trend.
- **Gamma**: Smoothing factor for the seasonality.
- **Trend Type**: Additive or multiplicative.
- **Dampen Type**: Additive or multiplicative.
- **Phi**: Damping coefficient.
- **Seasonality Type**: Additive or multiplicative.
- **Period**: Time steps in seasonal period.
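To show where *gamma* and *Period* enter the calculation, the following is a simplified sketch of the additive variant (additive trend, additive seasonality, no damping). It is not the Statsmodels implementation: the initialisation from the first two seasons is a simple heuristic chosen for illustration, and the function name `holt_winters_additive` is hypothetical.

```python
# additive Holt-Winters (triple exponential smoothing) written out by hand
def holt_winters_additive(series, period, alpha, beta, gamma, steps):
    # assumption: heuristic initialisation, requires at least two full seasons of data
    first = series[:period]
    second = series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) / period - level) / period
    seasonals = [value - level for value in first]
    for t in range(period, len(series)):
        value = series[t]
        season = seasonals[t % period]
        prev_level, prev_trend = level, trend
        # level: smooth the seasonally adjusted observation
        level = alpha * (value - season) + (1 - alpha) * (prev_level + prev_trend)
        # trend: smooth the latest change in level
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # seasonal: smooth the deviation of the observation from the expected level
        seasonals[t % period] = gamma * (value - prev_level - prev_trend) + (1 - gamma) * season
    n = len(series)
    # forecast = level + h * trend + the matching seasonal offset
    return [level + h * trend + seasonals[(n + h - 1) % period] for h in range(1, steps + 1)]

# a purely seasonal toy series with a period of 4 time steps
data = [1.0, 3.0, 2.0, 4.0] * 3
forecasts = holt_winters_additive(data, period=4, alpha=0.5, beta=0.5, gamma=0.5, steps=4)
print(forecasts)
```

On this toy series, which repeats exactly with no trend, the forecasts reproduce the seasonal pattern for the next period.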

## How to Configure Exponential Smoothing

All of the model hyperparameters can be specified explicitly.

This can be challenging for experts and beginners alike.

Instead, it is common to use numerical optimization to search for and find the smoothing coefficients (*alpha*, *beta*, *gamma*, and *phi*) for the model that result in the lowest error.

[…] a more robust and objective way to obtain values for the unknown parameters included in any exponential smoothing method is to estimate them from the observed data. […] the unknown parameters and the initial values for any exponential smoothing method can be estimated by minimizing the SSE [sum of the squared errors].

— Page 177, Forecasting: principles and practice, 2013.

The parameters that specify the type of change in the trend and seasonality, such as whether they are additive or multiplicative and whether they should be dampened, must be specified explicitly.
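The idea of choosing the smoothing coefficients by minimizing the SSE can be sketched for the single-parameter case. In the example below, a crude grid search over candidate *alpha* values stands in for a proper numerical optimizer, the level is assumed to start at the first observation, and the data and the function name `sse_for_alpha` are made up for illustration.

```python
# choose alpha for simple exponential smoothing by minimizing the SSE
def sse_for_alpha(series, alpha):
    # sum of squared one-step-ahead forecast errors
    level = series[0]  # assumption: level initialised with the first observation
    sse = 0.0
    for value in series[1:]:
        error = value - level  # the forecast for this step is the current level
        sse += error ** 2
        level = alpha * value + (1 - alpha) * level
    return sse

# hypothetical data
data = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0]

# crude grid search over alpha in (0, 1), standing in for a numerical optimizer
candidates = [i / 100 for i in range(1, 100)]
best_alpha = min(candidates, key=lambda a: sse_for_alpha(data, a))
print(best_alpha, sse_for_alpha(data, best_alpha))
```

The same principle extends to *beta*, *gamma* and *phi*, although searching several coefficients at once is usually better left to an optimization routine than to a grid.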

## Exponential Smoothing in Python

This section looks at how to implement exponential smoothing in Python.

The implementations of Exponential Smoothing in Python are provided in the Statsmodels Python library.

The implementations are based on the description of the method in Rob Hyndman and George Athanasopoulos’ excellent book “Forecasting: Principles and Practice,” 2013 and their R implementations in their “forecast” package.

### Single Exponential Smoothing

Single Exponential Smoothing or simple smoothing can be implemented in Python via the SimpleExpSmoothing Statsmodels class.

First, an instance of the *SimpleExpSmoothing* class must be instantiated and passed the training data. The *fit()* function is then called providing the fit configuration, specifically the *alpha* value called *smoothing_level*. If this is not provided or set to *None*, the model will automatically optimize the value.

This *fit()* function returns an instance of the *HoltWintersResults* class that contains the learned coefficients. The *forecast()* or the *predict()* function on the result object can be called to make a forecast.

For example:

```python
# single exponential smoothing
...
from statsmodels.tsa.holtwinters import SimpleExpSmoothing
# prepare data
data = ...
# create class
model = SimpleExpSmoothing(data)
# fit model
model_fit = model.fit(...)
# make prediction
yhat = model_fit.predict(...)
```

### Double and Triple Exponential Smoothing

Single, Double and Triple Exponential Smoothing can be implemented in Python using the ExponentialSmoothing Statsmodels class.

First, an instance of the ExponentialSmoothing class must be instantiated, specifying both the training data and some configuration for the model.

Specifically, you must specify the following configuration parameters:

- **trend**: The type of trend component, as either “*add*” for additive or “*mul*” for multiplicative. Modeling the trend can be disabled by setting it to None.
- **damped**: Whether or not the trend component should be damped, either *True* or *False* (this argument is named *damped_trend* in statsmodels 0.12 and later).
- **seasonal**: The type of seasonal component, as either “*add*” for additive or “*mul*” for multiplicative. Modeling the seasonal component can be disabled by setting it to None.
- **seasonal_periods**: The number of time steps in a seasonal period, e.g. 12 for 12 months in a yearly seasonal structure.

The model can then be fit on the training data by calling the *fit()* function.

This function allows you to either specify the smoothing coefficients of the exponential smoothing model or have them optimized. By default, they are optimized (i.e. *optimized=True*). These coefficients include:

- **smoothing_level** (*alpha*): the smoothing coefficient for the level.
- **smoothing_slope** (*beta*): the smoothing coefficient for the trend (named *smoothing_trend* in statsmodels 0.12 and later).
- **smoothing_seasonal** (*gamma*): the smoothing coefficient for the seasonal component.
- **damping_slope** (*phi*): the coefficient for the damped trend (named *damping_trend* in statsmodels 0.12 and later).

Additionally, the fit function can perform basic data preparation prior to modeling; specifically:

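- **use_boxcox**: Whether or not to perform a power transform of the series (True/False) or specify the lambda for the transform. Note that in statsmodels 0.12 and later, this argument is given when constructing the model rather than when calling *fit()*.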

The *fit()* function will return an instance of the *HoltWintersResults* class that contains the learned coefficients. The *forecast()* or the *predict()* function on the result object can be called to make a forecast.

```python
# double or triple exponential smoothing
...
from statsmodels.tsa.holtwinters import ExponentialSmoothing
# prepare data
data = ...
# create class
model = ExponentialSmoothing(data, ...)
# fit model
model_fit = model.fit(...)
# make prediction
yhat = model_fit.predict(...)
```

## Further Reading

This section provides more resources on the topic if you are looking to go deeper.

### Books

- Chapter 7 Exponential smoothing, Forecasting: principles and practice, 2013.
- Section 6.4. Introduction to Time Series Analysis, Engineering Statistics Handbook, 2012.
- Practical Time Series Forecasting with R, 2016.

### API

- Statsmodels Time Series analysis tsa
- statsmodels.tsa.holtwinters.SimpleExpSmoothing API
- statsmodels.tsa.holtwinters.ExponentialSmoothing API
- statsmodels.tsa.holtwinters.HoltWintersResults API
- forecast: Forecasting Functions for Time Series and Linear Models R package

## Summary

In this tutorial, you discovered the exponential smoothing method for univariate time series forecasting.

Specifically, you learned:

- What exponential smoothing is and how it is different from other forecast methods.
- The three main types of exponential smoothing and how to configure them.
- How to implement exponential smoothing in Python.

Do you have any questions?

Ask your questions in the comments below and I will do my best to answer.

Hi Jason,

Thank you very much for your post. This is a very helpful resource. I would like to know how to install “statsmodels.tsa.holtwinters”, as it throws an error when I run the command:

```python
from statsmodels.tsa.holtwinters import ExponentialSmoothing
```

It seems that the statsmodels package does not have that command.

Could you please help me get that command working?

Thank you,

Sandeep

It really depends on your platform, for example:

Alternately, try this tutorial:

https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/

Hello Jason,

I am working on a forecasting project with a big dataset which includes 15 columns and around 9000 rows. The problem is that I have to forecast the result for the next two years based on 14 columns of independent data, and the result should be binary (0, 1).

I saw many forecasting problems online, but most of them forecast based on just one column of independent data with no binary result.

Is there any way to guide me or refer me any references to solve the problem?

Thank you in advance,

Ehsan

Yes, a neural network can easily forecast multiple variables, perhaps start with an MLP.

Hi Jason

Thanks for your post.

I am finding different results for the DES method in R and Python. Is the Python ETS not a complete implementation as described in Hyndman et al. (2008)? R's ETS method has a great many flags to control. Kindly clarify.

R code:

```
> x
  Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1  36  78  35 244  25 283  42   6  59   5  47  20
2   0   0   5  38  16 143  14  37  60   2  55   0
3   0
> fit forecast::forecast(fit, h=1)
      Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
Feb 3      -2.728456 -96.36635 90.90943 -145.9353 140.4783
>
```

Python:

```
dft
Out[42]:
            quantity
month_end
2016-01-31        36
2016-02-29        78
2016-03-31        35
2016-04-30       244
2016-05-31        25
2016-06-30       283
2016-07-31        42
2016-08-31         6
2016-09-30        59
2016-10-31         5
2016-11-30        47
2016-12-31        20
2017-01-31         0
2017-02-28         0
2017-03-31         5
2017-04-30        38
2017-05-31        16
2017-06-30       143
2017-07-31        14
2017-08-31        37
2017-09-30        60
2017-10-31         2
2017-11-30        55
2017-12-31         0

holt_r = ets.ExponentialSmoothing(np.abs(dft), trend='additive', damped=False, seasonal=None).fit()
C:\Anaconda\lib\site-packages\statsmodels\tsa\base\tsa_model.py:171: ValueWarning: No frequency information was provided, so inferred frequency M will be used.
  % freq, ValueWarning)

holt_r.forecast(1)
Out[44]:
2018-01-31    13.049129
Freq: M, dtype: float64
```

Sorry, I don’t know about the R implementation of ETS.

Hi Jason,

Hyndman has published a new edition of ‘Forecasting, principles and practice’. It is available free of charge at: https://otexts.org/fpp2/ .

Best,

Elie

Thanks.

Thanks for sharing the link of the book.

It is a fantastic book!

Thanks for this – clear, and gentle, with nice follow up resources!

You’re welcome!

Thanks for really nice and helpful matter on exponential smoothing.

Thanks!

Hi Dr. Jason,

I was wondering about something.

Let’s assume I smoothed my whole time series data, then I fit the model and did my prediction.

my question is, should I unsmooth my prediction or not to calculate error?

If your goal of smoothing was to make the problem easier to learn, then no change is required.

What would un-smoothing look like exactly? The addition of random noise?

Hi Jason

I have some questions about possible methods for sequential prediction.

Input y_0=100, y_1=y_0*0.96, y_2=y_1*0.97=y_0*0.96*0.97, y_3=y_2*0.978=y_0*0.96*0.97*0.978

Predict y_k

It looks like that y_k has a dynamic decay factor for the exponential function.

y_k=y_0*((D_k)^(k))

If I use the average rate of change in 0.96, 0.97, 0.978

then y_k=y_0*(0.96^k)*(((0.97/0.96)+(0.978/0.97))/2)^(1+2+3..k) =y_0*(0.96^k)*(1.009)^(k*(k+1)/2)

then log(y_k)=a+b*k+c*(k^2).

Should I use Triple Exponential Smoothing or LSTM to predict y_k? or is there any other possible methods?

If the input y_0, y_1, y_2 are uncertain. e.g y_0=100,101or 103 y_1=100*0.963, 101*0.964or 103*0.966. Which method should I use to predict y_k (only one value)?

Perhaps try a range of methods and discover what works best for your specific dataset.

Hi Jason, thanks for this. This is really helpful.

Could you also touch upon Brown’s double or LES model in python?

Thanks for the suggestion.

Hi Jason,

When I use statsmodels to run the SimpleExpSmoothing and Holt-Winters models, I am getting the error below.

```
AttributeError: 'Holt' object has no attribute '_get_prediction_index'
```

Here’s my code:

```python
from statsmodels.tsa.holtwinters import ExponentialSmoothing,Holt,SimpleExpSmoothing

fit2 = SimpleExpSmoothing(np.asarray(Train['Count']))
fit2._index = pd.to_datetime(Train.index)
pred = fit2.fit()
y_hat_avg['SES'] = pred.forecast(len(valid))
```

Thank you!

Sorry to hear that, are you able to confirm that your version of statsmodels is up to date?

Hi, Jason,

I would like to ask you if there was an iterative function that every time a new data arrives does not require the recalculation of the ExponentialSmoothing, but just add the new data (update the model)?

Good question.

I believe you might have to implement it yourself.

As a beginner in this analytics world, how can I get familiar with statistical terminology? How best can I make myself comfortable with it?

Perhaps start here:

https://machinelearningmastery.com/start-here/#statistical_methods

Hello Jason!

I see you use ExponentialSmoothing in your example to implement Double Exponential. Can you explain why you did not use the Holt api from statsmodels? I have included the link to the model I am referring to. https://www.statsmodels.org/dev/generated/statsmodels.tsa.holtwinters.Holt.html

It looks like both can be used for double. Just wondering why you chose one over the other. Thanks!

No big reason, I was going for consistency in the examples.

Do you prefer one over the other? If so, why?

Thanks for the reply! I am a noob when it comes to forecasting and only taught myself Python a year and a half ago.

I was using your method and then gave the Holt method a try and it ended up being a disaster in my opinion. Large variances in results when comparing to ExponentialSmoothing with seasonality turned off.

I don’t want to hijack this conversation, but I have a question about holdout forecasts if you don’t mind. I saw in one of your guides that you calculated rmse on actual vs predicted but I believe you only did it for one period. I am currently doing a 6 month hold out forecast and was originally just running my model like:

```python
# model2 = ExponentialSmoothing(data[:-6], trend='add', seasonal=None, damped=False).fit(smoothing_level=0.1, smoothing_slope=0.1, optimized=False)
# fcast2 = model2.forecast(6)
```

I would then calculate the rmse using the forecasting vs actual values. I was told that this was not best practices as I should be doing the hold out forecast one period at a time. Essentially I would do model2.forecast(1) at data[:-6] and then model3.forecast(1) at data[:-5] and so on and so forth.

If you do not mind, I would appreciate your wisdom!

Good question.

It comes down to how you want to use the model, which defines how you want to evaluate it.

For example, if one-step predictions are most important, then evaluate skill on those. If it is n-step predictions, then you might want to know the average error made on each step over multiple forecasts.

This post may give you some ideas related to walk-forward validation:

https://machinelearningmastery.com/backtest-machine-learning-models-time-series-forecasting/

Thanks! That article was great. It appears the walk-forward validation is the way to go, though running all those DoubleExpos drastically increases the amount of time it takes to run. I am thinking I need to rewrite my DoubleExpo function to use multiprocessing or multithreading.

Do you accept bitcoin donations? Your website has been extremely helpful in my forecasting quest.

Nice, yes a custom implementation built for speed would be my path too.

Thanks!

I accept paypal donations, if that is still a thing:

https://machinelearningmastery.com/support/

I want to make a one-step forecast using the following code:

```python
from statsmodels.tsa.holtwinters import HoltWintersResults

model_fit_se = HoltWintersResults.initialize('model_se.pkl', smoothing_level=0.8, smoothing_slope=0.2, optimized=False)
yhat = model_fit_se.forecast()[0]
print('Predicted: %.3f' % yhat)
```

but I got this error:

```
TypeError: initialize() missing 2 required positional arguments: 'model' and 'params'
```

I think it is in the parameters part. How do I fix this?

I don’t have good advice sorry, perhaps try posting your code and error to stackoverflow?

Hi Jason,

Very Intuitive post! I am wondering if you know how to manipulate the optimal criteria regarding time windows. For example, I want to select a model that optimizes the sum of MSE of the next 12 period data instead of just the next period.

How could I achieve that based on your model?

Thanks ahead!

Thanks.

Yes, you can try a grid search and run your own evaluation on predictions via walk-forward validation.

I give an example:

https://machinelearningmastery.com/how-to-grid-search-triple-exponential-smoothing-for-time-series-forecasting-in-python/

Thanks Jason! I believe that post is a lifesaver for people who are struggling to find a Python function that is equivalent to Hyndman’s ETS function in R (please correct me if I am wrong).

Just want to make sure that I understand this method correctly:

If I were to minimize the sum of next 12 period’s rmse, should I just make some changes in the function _walk_forward_validation_ to ensure it returns the sum of next 12 period’s rmse?

Thanks.

Yes.