How to Model Volatility with ARCH and GARCH for Time Series Forecasting in Python

A change in the variance or volatility over time can cause problems when modeling time series with classical methods like ARIMA.

The ARCH or Autoregressive Conditional Heteroskedasticity method provides a way to model a change in variance in a time series that is time dependent, such as increasing or decreasing volatility. An extension of this approach named GARCH or Generalized Autoregressive Conditional Heteroskedasticity allows the method to support changes in the time dependent volatility, such as increasing and decreasing volatility in the same series.

In this tutorial, you will discover the ARCH and GARCH models for predicting the variance of a time series.

After completing this tutorial, you will know:

  • The problem with variance in a time series and the need for ARCH and GARCH models.
  • How to configure ARCH and GARCH models.
  • How to implement ARCH and GARCH models in Python.

Let’s get started.

How to Develop ARCH and GARCH Models for Time Series Forecasting in Python

Photo by Murray Foubister, some rights reserved.

Tutorial Overview

This tutorial is divided into five parts; they are:

  1. Problem with Variance
  2. What Is an ARCH Model?
  3. What Is a GARCH Model?
  4. How to Configure ARCH and GARCH Models
  5. ARCH and GARCH Models in Python

Problem with Variance

Autoregressive models can be developed for univariate time series data that is stationary (AR), has a trend (ARIMA), and has a seasonal component (SARIMA).

One aspect of a univariate time series that these autoregressive models do not model is a change in the variance over time.

Classically, a time series with modest changes in variance can sometimes be adjusted using a power transform, such as by taking the log or using a Box-Cox transform.
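As a rough illustration of the log transform, the sketch below builds a hypothetical positive series whose noise scales with its level (the series, its growth rate, and the helper half_spread_ratio() are all assumptions made for this example). It compares the spread of step-to-step changes in the second half of the series against the first half, before and after taking the log.

```python
from math import exp, log
from random import gauss, seed
from statistics import pstdev

seed(1)
# hypothetical positive series: the noise is multiplicative, so the spread grows with the level
series = [exp(0.03 * t + gauss(0, 0.2)) for t in range(200)]
# log transform turns the multiplicative noise into additive noise
logged = [log(x) for x in series]

def half_spread_ratio(values):
    """Spread of step-to-step changes in the second half divided by the first half."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    mid = len(diffs) // 2
    return pstdev(diffs[mid:]) / pstdev(diffs[:mid])

raw_ratio = half_spread_ratio(series)
log_ratio = half_spread_ratio(logged)
print('raw: %.1f, log: %.1f' % (raw_ratio, log_ratio))
```

A ratio near 1 suggests a roughly constant spread. The raw series should give a ratio well above 1, while the logged series should sit close to 1.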

There are some time series where the variance changes consistently over time. In the context of a time series in the financial domain, this would be called increasing and decreasing volatility.

In time series where the variance is increasing in a systematic way, such as an increasing trend, this property of the series is called heteroskedasticity. It’s a fancy word from statistics that means changing or unequal variance across the series.

If the change in variance can be correlated over time, then it can be modeled using an autoregressive process, such as ARCH.

What Is an ARCH Model?

Autoregressive Conditional Heteroskedasticity, or ARCH, is a method that explicitly models the change in variance over time in a time series.

Specifically, an ARCH method models the variance at a time step as a function of the residual errors from a mean process (e.g. a zero mean).

The ARCH process introduced by Engle (1982) explicitly recognizes the difference between the unconditional and the conditional variance allowing the latter to change over time as a function of past errors.

Generalized autoregressive conditional heteroskedasticity, 1986.

A lag parameter must be specified to define the number of prior residual errors to include in the model. Using the notation of the GARCH model (discussed later), we can refer to this parameter as “q”. Originally, this parameter was called “p”, and is also called “p” in the arch Python package used later in this tutorial.

  • q: The number of lagged squared residual errors to include in the ARCH model.

A generally accepted notation for an ARCH model is to specify the ARCH() function with the q parameter ARCH(q); for example, ARCH(1) would be a first order ARCH model.
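Written out, an ARCH(q) model is commonly expressed as below. The symbols are the conventional ones rather than anything defined in this tutorial: ε_t is the residual error at time t, σ_t² is its conditional variance, z_t is a zero-mean, unit-variance noise term, and ω and α_i are the coefficients to be estimated.

```latex
\epsilon_t = \sigma_t z_t, \qquad z_t \sim \text{i.i.d.}(0, 1)

\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \, \epsilon_{t-i}^2,
\qquad \omega > 0, \quad \alpha_i \ge 0
```

For ARCH(1), the sum has a single term, so the variance at time t depends only on the previous squared residual error.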

The approach expects the series to be stationary, other than the change in variance, meaning it does not have a trend or seasonal component. An ARCH model is used to predict the variance at future time steps.

[ARCH] are mean zero, serially uncorrelated processes with nonconstant variances conditional on the past, but constant unconditional variances. For such processes, the recent past gives information about the one-period forecast variance.

Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation, 1982.

In practice, this can be used to model the expected variance on the residuals after another autoregressive model has been used, such as an ARMA or similar.

The model should only be applied to a prewhitened residual series {e_t} that is uncorrelated and contains no trends or seasonal changes, such as might be obtained after fitting a satisfactory SARIMA model.

— Page 148, Introductory Time Series with R, 2009.

What Is a GARCH Model?

Generalized Autoregressive Conditional Heteroskedasticity, or GARCH, is an extension of the ARCH model that incorporates a moving average component together with the autoregressive component.

Specifically, the model includes lag variance terms (e.g. the observations if modeling the white noise residual errors of another process), together with lag residual errors from a mean process.

The introduction of a moving average component allows the model to capture both the conditional change in variance over time and changes in the time-dependent variance. Examples include conditional increases and decreases in variance.

As such, the model introduces a new parameter “p” that describes the number of lag variance terms:

  • p: The number of lag variances to include in the GARCH model.
  • q: The number of lag residual errors to include in the GARCH model.

A generally accepted notation for a GARCH model is to specify the GARCH() function with the p and q parameters GARCH(p, q); for example GARCH(1, 1) would be a first order GARCH model.

A GARCH model subsumes ARCH models, where a GARCH(0, q) is equivalent to an ARCH(q) model.
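In the same conventional notation (ω, α_i and β_j are assumed coefficient names, not symbols defined in this tutorial), a GARCH(p, q) model adds p lagged conditional variances to the q lagged squared residual errors:

```latex
\sigma_t^2 = \omega
  + \sum_{i=1}^{q} \alpha_i \, \epsilon_{t-i}^2
  + \sum_{j=1}^{p} \beta_j \, \sigma_{t-j}^2
```

Setting p = 0 removes the second sum and leaves the ARCH(q) form, which is the sense in which GARCH(0, q) is equivalent to ARCH(q).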

For p = 0 the process reduces to the ARCH(q) process, and for p = q = 0 ε_t is simply white noise. In the ARCH(q) process the conditional variance is specified as a linear function of past sample variances only, whereas the GARCH(p, q) process allows lagged conditional variances to enter as well. This corresponds to some sort of adaptive learning mechanism.

Generalized autoregressive conditional heteroskedasticity, 1986.

As with ARCH, GARCH predicts the future variance and expects that the series is stationary, other than the change in variance, meaning it does not have a trend or seasonal component.

How to Configure ARCH and GARCH Models

The configuration for an ARCH model is best understood in the context of ACF and PACF plots of the variance of the time series.

This can be achieved by subtracting the mean from each observation in the series and squaring the result, or just squaring the observation if you’re already working with white noise residuals from another model.
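A minimal sketch of this step is shown below. The function name squared_deviations() and its already_zero_mean flag are hypothetical names chosen for this example.

```python
def squared_deviations(series, already_zero_mean=False):
    """Return the squared deviation of each observation from the series mean.

    Set already_zero_mean=True for white noise residuals from another model,
    in which case the observations are simply squared.
    """
    mean = 0.0 if already_zero_mean else sum(series) / len(series)
    return [(x - mean) ** 2 for x in series]

# the mean of this toy series is 2.0
print(squared_deviations([1.0, 2.0, 3.0]))  # [1.0, 0.0, 1.0]
```

The resulting series can then be passed to ACF and PACF plotting functions in place of the raw observations.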

If a correlogram appears to be white noise […], then volatility can be detected by looking at the correlogram of the squared values since the squared values are equivalent to the variance (provided the series is adjusted to have a mean of zero).

— Pages 146-147, Introductory Time Series with R, 2009.

The ACF and PACF plots can then be interpreted to estimate values for p and q, in a similar way as is done for the ARMA model.

ARCH and GARCH Models in Python

In this section, we will look at how we can develop ARCH and GARCH models in Python using the arch library.

First, let’s prepare a dataset we can use for these examples.

Test Dataset

We can create a dataset with a controlled model of variance.

The simplest case would be a series of random noise where the mean is zero and the variance starts at 0.0 and steadily increases.

We can achieve this in Python using the gauss() function that generates a Gaussian random number with the specified mean and standard deviation.

We can plot the dataset to get an idea of how the linear change in variance looks. The complete example is listed below.
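One way to write this is sketched below. The series length of 100, the step of 0.01 per time step, and the seed are assumptions chosen for this sketch, and plot_series() is a hypothetical helper that requires matplotlib.

```python
# create a simple white noise series with increasing variance
from random import gauss, seed

# seed the pseudorandom number generator so the series is reproducible
seed(1)
# the standard deviation passed to gauss() grows linearly from 0.0 to 0.99
data = [gauss(0, i * 0.01) for i in range(0, 100)]

def plot_series(values):
    """Draw a line plot of the series (matplotlib is imported only when called)."""
    from matplotlib import pyplot
    pyplot.plot(values)
    pyplot.show()
```

Calling plot_series(data) draws the line plot. Note that gauss() takes a standard deviation, so the variance of each observation is the square of the value passed in.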

Running the example creates and plots the dataset. We can see the clear change in variance over the course of the series.

Line Plot of Dataset with Increasing Variance

Autocorrelation

We know there is an autocorrelation in the variance of the contrived dataset.

Nevertheless, we can look at an autocorrelation plot to confirm this expectation. The complete example is listed below.
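The plot itself would usually come from plot_acf() in statsmodels.graphics.tsaplots, applied to the squared observations. The sketch below instead computes the sample autocorrelation values by hand so that it needs only the standard library. The sample_acf() helper and the choice of 20 lags are assumptions made for this example.

```python
from random import gauss, seed

# recreate the contrived dataset (same assumptions as the dataset sketch)
seed(1)
data = [gauss(0, i * 0.01) for i in range(0, 100)]
# the mean is zero by construction, so squaring gives the variance proxy
squared_data = [x ** 2 for x in data]

def sample_acf(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

acf = sample_acf(squared_data, 20)
for lag, value in enumerate(acf):
    print(lag, round(value, 3))
```

Lag 0 is always 1.0. The values at the following lags are the ones to inspect when choosing the lag parameter.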

Running the example creates an autocorrelation plot of the squared observations. We see significant positive correlation in variance out to perhaps 15 lag time steps.

This might make a reasonable value for the parameter in the ARCH model.

Autocorrelation Plot of Data with Increasing Variance

ARCH Model

Developing an ARCH model involves three steps:

  1. Define the model
  2. Fit the model
  3. Make a forecast

Before fitting and forecasting, we can split the dataset into a train and test set so that we can fit the model on the train and evaluate its performance on the test set.
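A short sketch of the split, holding back the last 10 observations as the test set. The dataset is recreated with the same assumed parameters as earlier so the snippet runs on its own.

```python
from random import gauss, seed

seed(1)
data = [gauss(0, i * 0.01) for i in range(0, 100)]

# hold back the last n_test observations for evaluation
n_test = 10
train, test = data[:-n_test], data[-n_test:]
print(len(train), len(test))  # 90 10
```

The split keeps the time order intact, which matters because the model forecasts forward from the end of the training data.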

A model can be defined by calling the arch_model() function. We can specify a model for the mean of the series: in this case mean='Zero' is an appropriate model. We can then specify the model for the variance: in this case vol='ARCH'. We can also specify the lag parameter for the ARCH model: in this case p=15.

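Note that in the arch library the names of the p and q parameters are reversed relative to the notation used above: p sets the number of lagged squared residual errors (the ARCH terms) and q sets the number of lagged variances (the GARCH terms).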

The model can be fit on the data by calling the fit() function. There are many options on this function, although the defaults are good enough for getting started. This will return a fitted model.

Finally, we can make a prediction by calling the forecast() function on the fitted model. We can specify the horizon for the forecast.

In this case, we will predict the variance for the last 10 time steps of the dataset, and withhold them from the training of the model.

We can tie all of this together; the complete example is listed below.

Running the example defines and fits the model then predicts the variance for the last 10 time steps of the dataset.

A line plot is created comparing the series of expected variance to the predicted variance. Although the model was not tuned, the predicted variance looks reasonable.

Line Plot of Expected Variance to Predicted Variance using ARCH

GARCH Model

We can fit a GARCH model just as easily using the arch library.

The arch_model() function can specify a GARCH model instead of an ARCH model with vol='GARCH', along with the lag parameters for both the residual and variance terms.

The dataset may not be a good fit for a GARCH model given the linearly increasing variance. Nevertheless, the complete example is listed below.

A plot of the expected and predicted variance is shown below.

Line Plot of Expected Variance to Predicted Variance using GARCH

Summary

In this tutorial, you discovered the ARCH and GARCH models for predicting the variance of a time series.

Specifically, you learned:

  • The problem with variance in a time series and the need for ARCH and GARCH models.
  • How to configure ARCH and GARCH models.
  • How to implement ARCH and GARCH models in Python.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

15 Responses to How to Model Volatility with ARCH and GARCH for Time Series Forecasting in Python

  1. Elie Kawerk August 24, 2018 at 6:26 am #

    Hi Jason,

    You mentioned the need for PACF but you haven’t plotted it, isn’t PACF needed to determine q?

    Best,
    Elie K

  2. Anthony The Koala August 25, 2018 at 11:46 pm #

    When you talk about predicting the variance of the model, isn’t the variance the square of std deviation? But in line 9 of your code for either the ARCH or GARCH model, you generated gaussian numbers with a mean and std dev, not a mean and std dev**2.
    Could you please elaborate on std dev, std dev**2 and variance in the context of this page.
    Thank you
    Anthony of Sydney

    • Jason Brownlee August 26, 2018 at 6:29 am #

      The contrived sample problem is just a context for the code demo. Don’t read too much into it.

  3. Sebastian August 26, 2018 at 2:34 am #

    Hi, Jason.
    I think your work with this blog is great!

    I have a conceptual question. So ARCH and GARCH are not useful in order to predict or forecast the following data values in a time series, but to forecast the variance that future data might have instead? i.e. in stock pricing forecasting, these methods wouldn’t show the future prices, but instead they would show the variance those future prices might have implied?

    Thanks a lot,

    Sebastián from Colombia.

    • Jason Brownlee August 26, 2018 at 6:30 am #

      Think of them more as a model of the variability of the series.

  4. Dhineshkumar R September 3, 2018 at 5:46 am #

    Hi Jason,

    Thanks for this great blog again.

    It would be helpful if you could tell me as to why we find ACF of Squared residuals and not the ACF of just residuals?

    Thanks.

    • Jason Brownlee September 3, 2018 at 6:17 am #

      It is mentioned in the post.

      The squared residuals are equivalent to the variance (e.g. the thing we are modeling).

  5. Lesego September 13, 2018 at 9:33 pm #

    Trying to follow the tutorial but can’t get past the step of importing the arch_model module.

    I get the error “No module named 'arch'” and I can’t find a solution online to fix it. Any help?

  6. yogesh October 10, 2018 at 5:03 am #

    Hi Jason, I wonder, if this is not used for predicting or forecasting future values, why would anyone use it for variance?
    I am quite confused, please clarify.

    • Jason Brownlee October 10, 2018 at 6:17 am #

      ARCH models are only useful when you want to forecast volatility, not a value.

      • yogesh October 23, 2018 at 5:56 pm #

        OK thanks, but is there any chance we can use those variances to predict a value in any way?

  7. Sathya October 11, 2018 at 10:46 pm #

    Hi Jason, how did you come up with p=15? Isn’t it supposed to be the same way you look into ACF and PACF plots and then decide p and q? In that case, typically, p and q should be less than 5, right?

    • Jason Brownlee October 12, 2018 at 6:40 am #

      From the post:

      We see significant positive correlation in variance out to perhaps 15 lag time steps.
