11 Classical Time Series Forecasting Methods in Python (Cheat Sheet)

Machine learning methods can be used for classification and forecasting on time series problems.

Before exploring machine learning methods for time series, it is a good idea to ensure you have exhausted classical linear time series forecasting methods. Classical methods may focus on linear relationships, but they are sophisticated and perform well on a wide range of problems, provided that your data is suitably prepared and the method is well configured.

In this post, you will discover a suite of classical methods for time series forecasting that you can test on your forecasting problem before exploring machine learning methods.

The post is structured as a cheat sheet: it gives you just enough information on each method to get started with a working code example, plus pointers to where you can find more information on the method.

All code examples are in Python and use the Statsmodels library. The APIs for this library can be tricky for beginners (trust me!), so having a working code example as a starting point will greatly accelerate your progress.

This is a large post; you may want to bookmark it.

Let’s get started.


Overview

This cheat sheet demonstrates 11 different classical time series forecasting methods; they are:

  1. Autoregression (AR)
  2. Moving Average (MA)
  3. Autoregressive Moving Average (ARMA)
  4. Autoregressive Integrated Moving Average (ARIMA)
  5. Seasonal Autoregressive Integrated Moving-Average (SARIMA)
  6. Seasonal Autoregressive Integrated Moving-Average with Exogenous Regressors (SARIMAX)
  7. Vector Autoregression (VAR)
  8. Vector Autoregression Moving-Average (VARMA)
  9. Vector Autoregression Moving-Average with Exogenous Regressors (VARMAX)
  10. Simple Exponential Smoothing (SES)
  11. Holt-Winters Exponential Smoothing (HWES)

Did I miss your favorite classical time series forecasting method?
Let me know in the comments below.

Each method is presented in a consistent manner.

This includes:

  • Description. A short and precise description of the technique.
  • Python Code. A short working example of fitting the model and making a prediction in Python.
  • More Information. References for the API and the algorithm.

Each code example is demonstrated on a simple contrived dataset that may or may not be appropriate for the method. Replace the contrived dataset with your data in order to test the method.

Remember: each method will require tuning to your specific problem. In many cases, I already have examples on the blog of how to configure and even grid search parameters; try the search function.

If you find this cheat sheet useful, please let me know in the comments below.

Autoregression (AR)

The autoregression (AR) method models the next step in the sequence as a linear function of the observations at prior time steps.

The notation for the model involves specifying the order of the model p as a parameter to the AR function, e.g. AR(p). For example, AR(1) is a first-order autoregression model.

The method is suitable for univariate time series without trend and seasonal components.

Python Code

More Information

Moving Average (MA)

The moving average (MA) method models the next step in the sequence as a linear function of the residual errors from a mean process at prior time steps.

A moving average model is different from calculating the moving average of the time series.

The notation for the model involves specifying the order of the model q as a parameter to the MA function, e.g. MA(q). For example, MA(1) is a first-order moving average model.

The method is suitable for univariate time series without trend and seasonal components.

Python Code

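We can use the ARIMA class to create an MA model by setting a zeroth-order AR model and no differencing. We must specify the order of the MA model in the order argument, e.g. order=(0, 0, q). Older Statsmodels releases provided a separate ARMA class for this, which has since been removed.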

More Information

Autoregressive Moving Average (ARMA)

The Autoregressive Moving Average (ARMA) method models the next step in the sequence as a linear function of the observations and residual errors at prior time steps.

It combines both Autoregression (AR) and Moving Average (MA) models.

The notation for the model involves specifying the order for the AR(p) and MA(q) models as parameters to an ARMA function, e.g. ARMA(p, q). An ARMA model can also be used to develop AR or MA models.

The method is suitable for univariate time series without trend and seasonal components.

Python Code

More Information

Autoregressive Integrated Moving Average (ARIMA)

The Autoregressive Integrated Moving Average (ARIMA) method models the next step in the sequence as a linear function of the differenced observations and residual errors at prior time steps.

It combines both Autoregression (AR) and Moving Average (MA) models as well as a differencing pre-processing step of the sequence to make the sequence stationary, called integration (I).

The notation for the model involves specifying the order for the AR(p), I(d), and MA(q) models as parameters to an ARIMA function, e.g. ARIMA(p, d, q). An ARIMA model can also be used to develop AR, MA, and ARMA models.

The method is suitable for univariate time series with trend and without seasonal components.

Python Code

More Information

Seasonal Autoregressive Integrated Moving-Average (SARIMA)

The Seasonal Autoregressive Integrated Moving Average (SARIMA) method models the next step in the sequence as a linear function of the differenced observations, errors, differenced seasonal observations, and seasonal errors at prior time steps.

It combines the ARIMA model with the ability to perform the same autoregression, differencing, and moving average modeling at the seasonal level.

The notation for the model involves specifying the order for the AR(p), I(d), and MA(q) models as parameters to an ARIMA function and AR(P), I(D), MA(Q) and m parameters at the seasonal level, e.g. SARIMA(p, d, q)(P, D, Q)m where “m” is the number of time steps in each season (the seasonal period). A SARIMA model can be used to develop AR, MA, ARMA and ARIMA models.

The method is suitable for univariate time series with trend and/or seasonal components.

Python Code

More Information

Seasonal Autoregressive Integrated Moving-Average with Exogenous Regressors (SARIMAX)

The Seasonal Autoregressive Integrated Moving-Average with Exogenous Regressors (SARIMAX) is an extension of the SARIMA model that also includes the modeling of exogenous variables.

Exogenous variables are also called covariates and can be thought of as parallel input sequences that have observations at the same time steps as the original series. The primary series may be referred to as endogenous data to contrast it with the exogenous sequence(s). The observations for exogenous variables are included in the model directly at each time step and are not modeled in the same way as the primary endogenous sequence (e.g. as an AR, MA, etc. process).

The SARIMAX method can also be used to model the subsumed models with exogenous variables, such as ARX, MAX, ARMAX, and ARIMAX.

The method is suitable for univariate time series with trend and/or seasonal components and exogenous variables.

Python Code

More Information

Vector Autoregression (VAR)

The Vector Autoregression (VAR) method models the next step in each time series using an AR model. It is the generalization of AR to multiple parallel time series, i.e. multivariate time series.

The notation for the model involves specifying the order for the AR(p) model as parameters to a VAR function, e.g. VAR(p).

The method is suitable for multivariate time series without trend and seasonal components.

Python Code

More Information

Vector Autoregression Moving-Average (VARMA)

The Vector Autoregression Moving-Average (VARMA) method models the next step in each time series using an ARMA model. It is the generalization of ARMA to multiple parallel time series, i.e. multivariate time series.

The notation for the model involves specifying the order for the AR(p) and MA(q) models as parameters to a VARMA function, e.g. VARMA(p, q). A VARMA model can also be used to develop VAR or VMA models.

The method is suitable for multivariate time series without trend and seasonal components.

Python Code

More Information

Vector Autoregression Moving-Average with Exogenous Regressors (VARMAX)

The Vector Autoregression Moving-Average with Exogenous Regressors (VARMAX) is an extension of the VARMA model that also includes the modeling of exogenous variables. It is a multivariate version of the ARMAX method.

Exogenous variables are also called covariates and can be thought of as parallel input sequences that have observations at the same time steps as the original series. The primary series are referred to as endogenous data to contrast them with the exogenous sequence(s). The observations for exogenous variables are included in the model directly at each time step and are not modeled in the same way as the primary endogenous sequences (e.g. as an AR, MA, etc. process).

The VARMAX method can also be used to model the subsumed models with exogenous variables, such as VARX and VMAX.

The method is suitable for multivariate time series without trend and seasonal components, with exogenous variables.

Python Code

More Information

Simple Exponential Smoothing (SES)

The Simple Exponential Smoothing (SES) method models the next time step as an exponentially weighted linear function of observations at prior time steps.

The method is suitable for univariate time series without trend and seasonal components.

Python Code

More Information

Holt-Winters Exponential Smoothing (HWES)

The Holt-Winters Exponential Smoothing (HWES) method, also called Triple Exponential Smoothing, models the next time step as an exponentially weighted linear function of observations at prior time steps, taking trends and seasonality into account.

The method is suitable for univariate time series with trend and/or seasonal components.

Python Code

More Information

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Summary

In this post, you discovered a suite of classical time series forecasting methods that you can test and tune on your time series dataset.

Did I miss your favorite classical time series forecasting method?
Let me know in the comments below.

Did you try any of these methods on your dataset?
Let me know about your findings in the comments.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

54 Responses to 11 Classical Time Series Forecasting Methods in Python (Cheat Sheet)

  1. Adriena Welch August 6, 2018 at 3:20 pm #

    Hi Jason, thanks for such an excellent and comprehensive post on time series. I sincerely appreciate your effort. As you ask for the further topic, just wondering if I can request you for a specific topic I have been struggling to get an output. It’s about Structural Dynamic Factor model ( SDFM) by Barigozzi, M., Conti, A., and Luciani, M. (Do euro area countries respond asymmetrically to the common monetary policy) and Mario Forni Luca Gambetti (The Dynamic Effects of Monetary Policy: A Structural Factor Model Approach). Would it be possible for you to go over and estimate these two models using Python or R? It’s just a request from me and sorry if it doesn’t go with your interest.

    • Jason Brownlee August 7, 2018 at 6:23 am #

      Thanks for the suggestion. I’ve not heard of that method before.

  2. Kamal Singh August 6, 2018 at 6:19 pm #

    I am working on time series prediction with neural networks and SVR. I want to do this in MATLAB from scratch. Can you give me references for this material?
    Thank you in advance

    • Jason Brownlee August 7, 2018 at 6:26 am #

      Sorry, I don’t have any materials for matlab, it is only really used in universities.

  3. Catalin August 6, 2018 at 8:50 pm #

    Hi Jason! From which editor do you import the Python code into the webpage of your article? Or what kind of container is that windowed control used to display the Python code?

  4. Mike August 7, 2018 at 2:28 am #

    Thanks for all the things to try!

    I recently stumbled over some tasks where the classic algorithms like linear regression or decision trees outperformed even sophisticated NNs. Especially when boosted or averaged out with each other.

    Maybe it’s time to try the same with time series forecasting as I’m not getting good results for some tasks with an LSTM.

    • Jason Brownlee August 7, 2018 at 6:30 am #

      Always start with simple methods before trying more advanced methods.

      The complexity of advanced methods must be justified by additional predictive skill.

  5. Elie Kawerk August 7, 2018 at 2:36 am #

    Hi Jason,

    Thanks for this nice post!

    You’ve imported the sin function from math many times but have not used it.

    I’d like to see more posts about GARCH, ARCH and co-integration models.

    Best,
    Elie

    • Jason Brownlee August 7, 2018 at 6:30 am #

      Thanks, fixed.

      I have a post on ARCH (and friends) scheduled.

  6. Elie Kawerk August 7, 2018 at 2:38 am #

    Will you consider writing a follow-up book on advanced time-series models soon?

    • Jason Brownlee August 7, 2018 at 6:32 am #

      Yes, it is written. I am editing it now. The title will be “Deep Learning for Time Series Forecasting”.

      CNNs are amazing at time series, and CNNs + LSTMs together are really great.

      • Elie Kawerk August 7, 2018 at 6:40 am #

        will the new book cover classical time-series models like VAR, GARCH, ..?

        • Jason Brownlee August 7, 2018 at 2:29 pm #

          The focus is deep learning (MLP, CNN and LSTM) with tutorials on how to get the most from classical methods (Naive, SARIMA, ETS) before jumping into deep learning methods. I hope to have it done by the end of the month.

          • Elie Kawerk August 7, 2018 at 5:02 pm #

            This is great news! Don’t you think that R is better suited than Python for classical time-series models?

          • Jason Brownlee August 8, 2018 at 6:15 am #

            Perhaps generally, but not if you are building a system for operational use. I think Python is a better fit.

          • Dark7wind August 9, 2018 at 7:16 am #

            Great to hear this news. May I ask if the book also covers the topics of multivariate and multi-step forecasting?

          • Jason Brownlee August 9, 2018 at 7:34 am #

            Yes, there are many chapters on multi-step and most chapters work with multivariate data.

      • Søren August 7, 2018 at 10:27 pm #

        Sounds amazing that you finally 😉 are getting the new book out on time-series models – when will it be available to buy?

        • Jason Brownlee August 8, 2018 at 6:20 am #

          Thanks. I hope by the end of the month or soon after.

  7. Arun Mishra August 10, 2018 at 5:25 am #

    I use Prophet.
    https://facebook.github.io/prophet/docs/quick_start.html

    Also, sometimes Fast Fourier Transforms give good results.

    • Jason Brownlee August 10, 2018 at 6:21 am #

      Thanks.

      • AJ Rader August 16, 2018 at 7:11 am #

        I would second the use of prophet, especially in the context of shock events — this is where this approach has a unique advantage.

  8. Ravi Rokhade August 10, 2018 at 5:19 pm #

    What are the typical application domains of these algorithms?

  9. Alberto Garcia Galindo August 11, 2018 at 12:14 am #

    Hi Jason!
    Firstly I congratulate you for your blog. It is helping me a lot in my final work on my bachelor’s degree in Statistics!
    What are the assumptions for forecasting time series using machine learning algorithms? For example, must the series be stationary? Thanks!

    • Jason Brownlee August 11, 2018 at 6:11 am #

      Gaussian error, but they work anyway if you violate assumptions.

      The methods like SARIMA/ETS try to make the series stationary as part of modeling (e.g. differencing).

      You may want to look at power transforms to make data more Gaussian.

  10. Neeraj August 12, 2018 at 4:55 pm #

    Hi Jason
    I’m interested in forecasting temperatures.
    I’m provided with previous temperature data.
    Can you suggest the procedure I should follow to solve this problem?

    • Jason Brownlee August 13, 2018 at 6:15 am #

      Yes, a SARIMA model would be a great place to start.

  11. Den August 16, 2018 at 12:15 am #

    Hey Jason,

    Cool stuff as always. Kudos to you for making me a ML genius!

    Real quick:
    How would you combine VARMAX with an SVR in python?

    Elaboration.
    Right now I am trying to predict a y-value, and have x1…xn variables.
    The tricky part is, the rows are grouped.
    So, for example.

    If the goal is to predict the price of a certain car in the 8th year, and I have data for 1200 cars, and for each car I have x11_xnm –> y1_xm data (meaning that let’s say car_X has data until m=10 years and car_X2 has data until m=3 years, for example).

    First I divide the data with the 80/20 split, trainset/testset, here the first challenge arises. How to make the split?? I chose to split the data based on the car name, then for each car I gathered the data for year 1 to m. (If this approach is wrong, please tell me) The motivation behind this, is that the 80/20 could otherwise end up with data of all the cars of which some would have all the years and others would have none of the years. aka a very skewed distribution.

    Then I create a model using an SVR, with some parameters.
    And then I try to predict the y-values of a certain car. (value in year m)

    However, I do not feel as if I am using the time in my prediction. Therefore, I turned to VARMAX.

    Final question(s).
    How do you make a time series prediction if you have multiple groups [in this case 1200 cars, each of which have a variable number of years(rows)] to make the model from?
    Am I doing right by using the VARMAX or could you tell me a better approach?

    Sorry for the long question and thank you for your patience!

    Best,

    Den

    • Jason Brownlee August 16, 2018 at 6:09 am #

      You can try model per group or across groups. Try both and see what works best.

      Compare a suite of ml methods to varmax and use what performs the best on your dataset.

  12. Petrônio Cândido August 16, 2018 at 6:36 am #

    Hi Jason!

    Excellent post! I also would like to invite you to know the Fuzzy Time Series, which are data driven, scalable and interpretable methods to analyze and forecast time series data. I have recently published a python library for that on http://petroniocandido.github.io/pyFTS/ .

    All feedback is welcome! Thanks in advance!

  13. Chris Phillips August 30, 2018 at 8:19 am #

    Hi Jason,

    Thank you so much for the many code examples on your site. I am wondering if you can help an amateur like me with something.

    When I pull data from our database, I generally do it for multiple SKU’s at the same time into a large table. Considering that there are thousands of unique SKU’s in the table, is there a methodology you would recommend for generating a forecast for each individual SKU? My initial thought is to run a loop and say something to the effect of: For each in SKU run…Then the VAR Code or the SARIMA code.

    Ideally I’d love to use SARIMA, as I think this works the best for the data I am looking to forecast, but if that is only available to one SKU at a time and VAR is not constrained by this, it will work as well. If there is a better methodology that you know of for these, I would gladly take this advice as well!

    Thank you so much!

  14. Eric September 6, 2018 at 6:32 am #

    Great post. I’m currently investigating a state space approach to forecasting. Dynamic Linear Modeling using a Kálmán Filter algorithm (West, Hamilton). There is a python package, pyDLM, that looks promising, but it would be great to hear your thoughts on this package and this approach.

    • Jason Brownlee September 6, 2018 at 2:07 pm #

      Sounds good, I hope to cover state space methods in the future. To be honest, I’ve had limited success but also limited exposure with the methods.

      Not familiar with the lib. Let me know how you go with it.

  15. Roberto Tomás September 27, 2018 at 7:38 am #

    Hi Jason, I noticed using VARMAX that I had to remove seasonality — enforcing stationarity .. now I have test and predictions data that I cannot plot (I can, but it doesn’t look right _at all_). I’m wondering if there are any built-ins that handle translation to and from seasonality for me? My notebook is online: https://nbviewer.jupyter.org/github/robbiemu/location-metric-data/blob/master/appData%20and%20locationData.ipynb

  16. Sara October 2, 2018 at 7:36 am #

    Thanks for your great tutorial posts. This one was very helpful. I am wondering if there is any method that is suitable for multivariate time series with trend and/or seasonal components?

    • Jason Brownlee October 2, 2018 at 11:03 am #

      Yes, you can try MLPs, CNNs and LSTMs.

      You can experiment with each with and without data prep to make the series stationary.

      • Sara October 3, 2018 at 1:48 am #

        Thanks for your response. I also have another question that I would appreciate your help with.
        I have a dataset which includes multiple time series variables which are not stationary and seems that these variables are not dependent on each other. I tried ARIMA for each variable column, also VAR for the pair of variables, I expected to get better result with ARIMA model (for non-stationarity of time series) but VAR provides much better prediction. Do you have any thought why?

        • Jason Brownlee October 3, 2018 at 6:20 am #

          No, go with the method that gives the best performance.

  17. Eric October 17, 2018 at 9:52 am #

    Hi Jason,

    In the (S/V)ARIMAX procedure, should I check to see if my exogenous regressors are stationary and difference them if necessary before fitting?

    Y = data2 = [x + random() for x in range(101, 200)]
    X = data1 = [x + random() for x in range(1, 100)]

    If I don’t, then I can’t tell if a change in X is related to a change in Y, or if they are both just trending with time. The time trend dominates as 0 <= random() <= 1

    In R, Hyndman recommends "[differencing] all variables first as estimation of a model with non-stationary errors is not consistent and can lead to “spurious regression”".

    https://robjhyndman.com/hyndsight/arimax/

    Does SARIMAX handle this automatically or flag me if I have non-stationary regressors?

    Thanks

    • Jason Brownlee October 17, 2018 at 2:27 pm #

      No, the library will not do this for you. Differencing is only performed on the provided series, not the exogenous variables.

      Perhaps try with and without and use the approach that results in the lowest forecast error for your specific dataset.

  18. Andrew K October 23, 2018 at 9:09 am #

    Hi Jason,

    Thank you for this wonderful tutorial.

    I do have a question regarding data that isn’t continuous, for example, data that can only be measured during daylight hours. How would you approach a time series analysis (forecasting) with data that has this behavior? Fill non-daylight hour data with 0’s or nan’s?

    Thanks.

  19. Khalifa Ali October 23, 2018 at 4:48 pm #

    Hey..
    Kindly help us in making hybrid forecasting techniques,
    that is, using two forecasting techniques and making a hybrid technique from them.
    For example, you may use any two of the techniques mentioned above and make a hybrid technique from them.
    Thanks.

    • Jason Brownlee October 24, 2018 at 6:25 am #

      Sure, what problem are you having with using multiple methods exactly?

  20. Mohammad Alzyout October 31, 2018 at 6:25 pm #

    Thank you for your excellent and clear tutorial.

    I wondered which is the best way to forecast the next second Packet Error Rate in DSRC network for safety messages exchange between vehicles to decide the best distribution over Access Categories of EDCA.

    I hesitated to choose between LSTM or ARMA methodology.

    Could you please guide me to the better of the two methods?

    Kindly note that I’m a beginner in both methods and want to decide on the best one to study in depth, because I don’t have enough time to learn both, especially as I think they come from different backgrounds.

    Thank you in advance.

    Best regards,
    Mohammad.

    • Jason Brownlee November 1, 2018 at 6:03 am #

      I recommend testing a suite of methods in order to discover what works best for your specific problem.

  21. Jawad November 8, 2018 at 12:33 am #

    Hi Jason,
    Thanks for the great post. I have two questions. First, is there a way to calculate confidence intervals in HWES? I could not find any way in the documentation. Second, do we have something like ‘nnetar’, R’s neural network package for time series forecasting, available in Python?
    Regards
