A Gentle Introduction to Exponential Smoothing for Time Series Forecasting in Python

Exponential smoothing is a time series forecasting method for univariate data that can be extended to support data with a systematic trend or seasonal component.

It is a powerful forecasting method that may be used as an alternative to the popular Box-Jenkins ARIMA family of methods.

In this tutorial, you will discover the exponential smoothing method for univariate time series forecasting.

After completing this tutorial, you will know:

  • What exponential smoothing is and how it is different from other forecasting methods.
  • The three main types of exponential smoothing and how to configure them.
  • How to implement exponential smoothing in Python.

Let’s get started.

A Gentle Introduction to Exponential Smoothing for Time Series Forecasting in Python

A Gentle Introduction to Exponential Smoothing for Time Series Forecasting in Python
Photo by Wolfgang Staudt, some rights reserved.

Tutorial Overview

This tutorial is divided into 4 parts; they are:

  1. What Is Exponential Smoothing?
  2. Types of Exponential Smoothing
  3. How to Configure Exponential Smoothing
  4. Exponential Smoothing in Python

What Is Exponential Smoothing?

Exponential smoothing is a time series forecasting method for univariate data.

Time series methods like the Box-Jenkins ARIMA family of methods develop a model where the prediction is a weighted linear sum of recent past observations or lags.

Exponential smoothing forecasting methods are similar in that a prediction is a weighted sum of past observations, but the model explicitly uses an exponentially decreasing weight for past observations.

Specifically, past observations are weighted with a geometrically decreasing ratio.

Forecasts produced using exponential smoothing methods are weighted averages of past observations, with the weights decaying exponentially as the observations get older. In other words, the more recent the observation the higher the associated weight.

— Page 171, Forecasting: principles and practice, 2013.

Exponential smoothing methods may be considered as peers and an alternative to the popular Box-Jenkins ARIMA class of methods for time series forecasting.

Collectively, the methods are sometimes referred to as ETS models, referring to the explicit modeling of Error, Trend and Seasonality.

Types of Exponential Smoothing

There are three main types of exponential smoothing time series forecasting methods.

A simple method that assumes no systematic structure, an extension that explicitly handles trends, and the most advanced approach that add support for seasonality.

Single Exponential Smoothing

Single Exponential Smoothing, SES for short, also called Simple Exponential Smoothing, is a time series forecasting method for univariate data without a trend or seasonality.

It requires a single parameter, called alpha (a), also called the smoothing factor or smoothing coefficient.

This parameter controls the rate at which the influence of the observations at prior time steps decay exponentially. Alpha is often set to a value between 0 and 1. Large values mean that the model pays attention mainly to the most recent past observations, whereas smaller values mean more of the history is taken into account when making a prediction.

A value close to 1 indicates fast learning (that is, only the most recent values influence the forecasts), whereas a value close to 0 indicates slow learning (past observations have a large influence on forecasts).

— Page 89, Practical Time Series Forecasting with R, 2016.

Hyperparameters:

  • Alpha: Smoothing factor for the level.

Double Exponential Smoothing

Double Exponential Smoothing is an extension to Exponential Smoothing that explicitly adds support for trends in the univariate time series.

In addition to the alpha parameter for controlling smoothing factor for the level, an additional smoothing factor is added to control the decay of the influence of the change in trend called beta (b).

The method supports trends that change in different ways: an additive and a multiplicative, depending on whether the trend is linear or exponential respectively.

Double Exponential Smoothing with an additive trend is classically referred to as Holt’s linear trend model, named for the developer of the method Charles Holt.

  • Additive Trend: Double Exponential Smoothing with a linear trend.
  • Multiplicative Trend: Double Exponential Smoothing with an exponential trend.

For longer range (multi-step) forecasts, the trend may continue on unrealistically. As such, it can be useful to dampen the trend over time.

Dampening means reducing the size of the trend over future time steps down to a straight line (no trend).

The forecasts generated by Holt’s linear method display a constant trend (increasing or decreasing) indecently into the future. Even more extreme are the forecasts generated by the exponential trend method […] Motivated by this observation […] introduced a parameter that “dampens” the trend to a flat line some time in the future.

— Page 183, Forecasting: principles and practice, 2013.

As with modeling the trend itself, we can use the same principles in dampening the trend, specifically additively or multiplicatively for a linear or exponential dampening effect. A damping coefficient Phi (p) is used to control the rate of dampening.

  • Additive Dampening: Dampen a trend linearly.
  • Multiplicative Dampening: Dampen the trend exponentially.

Hyperparameters:

  • Alpha: Smoothing factor for the level.
  • Beta: Smoothing factor for the trend.
  • Trend Type: Additive or multiplicative.
  • Dampen Type: Additive or multiplicative.
  • Phi: Damping coefficient.

Triple Exponential Smoothing

Triple Exponential Smoothing is an extension of Exponential Smoothing that explicitly adds support for seasonality to the univariate time series.

This method is sometimes called Holt-Winters Exponential Smoothing, named for two contributors to the method: Charles Holt and Peter Winters.

In addition to the alpha and beta smoothing factors, a new parameter is added called gamma (g) that controls the influence on the seasonal component.

As with the trend, the seasonality may be modeled as either an additive or multiplicative process for a linear or exponential change in the seasonality.

  • Additive Seasonality: Triple Exponential Smoothing with a linear seasonality.
  • Multiplicative Seasonality: Triple Exponential Smoothing with an exponential seasonality.

Triple exponential smoothing is the most advanced variation of exponential smoothing and through configuration, it can also develop double and single exponential smoothing models.

Being an adaptive method, Holt-Winter’s exponential smoothing allows the level, trend and seasonality patterns to change over time.

— Page 95, Practical Time Series Forecasting with R, 2016.

Additionally, to ensure that the seasonality is modeled correctly, the number of time steps in a seasonal period (Period) must be specified. For example, if the series was monthly data and the seasonal period repeated each year, then the Period=12.

Hyperparameters:

  • Alpha: Smoothing factor for the level.
  • Beta: Smoothing factor for the trend.
  • Gamma: Smoothing factor for the seasonality.
  • Trend Type: Additive or multiplicative.
  • Dampen Type: Additive or multiplicative.
  • Phi: Damping coefficient.
  • Seasonality Type: Additive or multiplicative.
  • Period: Time steps in seasonal period.

How to Configure Exponential Smoothing

All of the model hyperparameters can be specified explicitly.

This can be challenging for experts and beginners alike.

Instead, it is common to use numerical optimization to search for and fund the smoothing coefficients (alpha, beta, gamma, and phi) for the model that result in the lowest error.

[…] a more robust and objective way to obtain values for the unknown parameters included in any exponential smoothing method is to estimate them from the observed data. […] the unknown parameters and the initial values for any exponential smoothing method can be estimated by minimizing the SSE [sum of the squared errors].

— Page 177, Forecasting: principles and practice, 2013.

The parameters that specify the type of change in the trend and seasonality, such as weather they are additive or multiplicative and whether they should be dampened, must be specified explicitly.

Exponential Smoothing in Python

This section looks at how to implement exponential smoothing in Python.

The implementations of Exponential Smoothing in Python are provided in the Statsmodels Python library.

The implementations are based on the description of the method in Rob Hyndman and George Athana­sopou­los’ excellent book “Forecasting: Principles and Practice,” 2013 and their R implementations in their “forecast” package.

Single Exponential Smoothing

Single Exponential Smoothing or simple smoothing can be implemented in Python via the SimpleExpSmoothing Statsmodels class.

First, an instance of the SimpleExpSmoothing class must be instantiated and passed the training data. The fit() function is then called providing the fit configuration, specifically the alpha value called smoothing_level. If this is not provided or set to None, the model will automatically optimize the value.

This fit() function returns an instance of the HoltWintersResults class that contains the learned coefficients. The forecast() or the predict() function on the result object can be called to make a forecast.

For example:

Double and Triple Exponential Smoothing

Single, Double and Triple Exponential Smoothing can be implemented in Python using the ExponentialSmoothing Statsmodels class.

First, an instance of the ExponentialSmoothing class must be instantiated, specifying both the training data and some configuration for the model.

Specifically, you must specify the following configuration parameters:

  • trend: The type of trend component, as either “add” for additive or “mul” for multiplicative. Modeling the trend can be disabled by setting it to None.
  • damped: Whether or not the trend component should be damped, either True or False.
  • seasonal: The type of seasonal component, as either “add” for additive or “mul” for multiplicative. Modeling the seasonal component can be disabled by setting it to None.
  • seasonal_periods: The number of time steps in a seasonal period, e.g. 12 for 12 months in a yearly seasonal structure (more here).

The model can then be fit on the training data by calling the fit() function.

This function allows you to either specify the smoothing coefficients of the exponential smoothing model or have them optimized. By default, they are optimized (e.g. optimized=True). These coefficients include:

  • smoothing_level (alpha): the smoothing coefficient for the level.
  • smoothing_slope (beta): the smoothing coefficient for the trend.
  • smoothing_seasonal (gamma): the smoothing coefficient for the seasonal component.
  • damping_slope (phi): the coefficient for the damped trend.

Additionally, the fit function can perform basic data preparation prior to modeling; specifically:

  • use_boxcox: Whether or not to perform a power transform of the series (True/False) or specify the lambda for the transform.

The fit() function will return an instance of the HoltWintersResults class that contains the learned coefficients. The forecast() or the predict() function on the result object can be called to make a forecast.

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Books

API

Articles

Summary

In this tutorial, you discovered the exponential smoothing method for univariate time series forecasting.

Specifically, you learned:

  • What exponential smoothing is and how it is different from other forecast methods.
  • The three main types of exponential smoothing and how to configure them.
  • How to implement exponential smoothing in Python.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

Want to Develop Time Series Forecasts with Python?

Introduction to Time Series Forecasting With Python

Develop Your Own Forecasts in Minutes

...with just a few lines of python code

Discover how in my new Ebook:
Introduction to Time Series Forecasting With Python

It covers self-study tutorials and end-to-end projects on topics like:
Loading data, visualization, modeling, algorithm tuning, and much more...

Finally Bring Time Series Forecasting to
Your Own Projects

Skip the Academics. Just Results.

Click to learn more.

14 Responses to A Gentle Introduction to Exponential Smoothing for Time Series Forecasting in Python

  1. Sandeep August 20, 2018 at 10:18 pm #

    Hi Jason,

    Thank you very much for your post. This is very helpful resources. I would like to know how to install “statsmodels.tsa.holtwinters” as I see that it is throwing error when I ran the command :
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    It seems that statsmodels package do not have that command.
    Could you please help me in working that command?

    ThanK you,
    Sandeep

    • Jason Brownlee August 21, 2018 at 6:16 am #

      It really depends on your platform, for example:

      Alternately, try this tutorial:
      https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/

      • Ehsan E Nasiri August 23, 2018 at 2:15 am #

        Hello Jason,

        I am working on a forecasting project with a big dataset which includes 15 columns and around 9000 rows. The problem is I have to forecast the result for the next two years base on 14 columns of independent data, and the result should be binary(0,1).
        I saw many forecasting problems online, but most of them forecast base on just one column of independent data with no binary result.
        Is there any way to guide me or refer me any references to solve the problem?

        Thank you in advance,
        Ehsan

        • Jason Brownlee August 23, 2018 at 6:15 am #

          Yes, a neural network can easily forecast multiple variables, perhaps start with an MLP.

    • Satakarni October 30, 2018 at 6:56 pm #

      Hi Jason

      Thanks for your post.
      I am finding different results for DES method in R and python. Is Python ETS not a complete implementation as described in Hyndman et al (2008)? R ETS method have way too many flags to control? Kindly clarify
      ————–

      R-Code

      ——————
      > x

      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

      1 36 78 35 244 25 283 42 6 59 5 47 20

      2 0 0 5 38 16 143 14 37 60 2 55 0

      3 0

      > fit forecast::forecast(fit, h=1)

      Point Forecast Lo 80 Hi 80 Lo 95 Hi 95

      Feb 3 -2.728456 -96.36635 90.90943 -145.9353 140.4783

      >

      ———————
      Python
      ———————-

      dft
      Out[42]:
      quantity
      month_end
      2016-01-31 36
      2016-02-29 78
      2016-03-31 35
      2016-04-30 244
      2016-05-31 25
      2016-06-30 283
      2016-07-31 42
      2016-08-31 6
      2016-09-30 59
      2016-10-31 5
      2016-11-30 47
      2016-12-31 20
      2017-01-31 0
      2017-02-28 0
      2017-03-31 5
      2017-04-30 38
      2017-05-31 16
      2017-06-30 143
      2017-07-31 14
      2017-08-31 37
      2017-09-30 60
      2017-10-31 2
      2017-11-30 55
      2017-12-31 0
      holt_r = ets.ExponentialSmoothing(np.abs(dft), trend=’additive’, damped=False, seasonal=None).fit()
      C:\Anaconda\lib\site-packages\statsmodels\tsa\base\tsa_model.py:171: ValueWarning: No frequency information was provided, so inferred frequency M will be used.
      % freq, ValueWarning)

      holt_r.forecast(1)
      Out[44]:
      2018-01-31 13.049129
      Freq: M, dtype: float64

      • Jason Brownlee October 31, 2018 at 6:24 am #

        Sorry, I don’t know about the R implementation of ETS.

  2. Elie Kawerk August 20, 2018 at 11:52 pm #

    Hi Jason,

    Hyndman has published a new edition of ‘Forecasting, principles and practice’. It is available free of charge at: https://otexts.org/fpp2/ .

    Best,
    Elie

  3. Karmen August 21, 2018 at 2:25 pm #

    Thanks for this – clear, and gentle, with nice follow up resources!

  4. Sandeep August 24, 2018 at 9:16 am #

    Thanks for really nice and helpful matter on exponential smoothing.

  5. Ahmed Elshami October 6, 2018 at 9:52 am #

    Hi Dr. Jason,

    I was wondering about something.

    Let’s assume I smoothed my whole time series data, then I fit the model and did my prediction.
    my question is, should I unsmooth my prediction or not to calculate error?

    • Jason Brownlee October 6, 2018 at 11:44 am #

      If your goal of smoothing was to make the problem easier to learn, then no change is required.

      What would un-smoothing look like exactly? The addition of random noise?

Leave a Reply