A Gentle Introduction to Autocorrelation and Partial Autocorrelation

Autocorrelation and partial autocorrelation plots are heavily used in time series analysis and forecasting.

These plots graphically summarize the strength of the relationship between an observation in a time series and observations at prior time steps. The difference between autocorrelation and partial autocorrelation can be confusing for beginners to time series forecasting.

In this tutorial, you will discover how to calculate and plot autocorrelation and partial autocorrelation plots with Python.

After completing this tutorial, you will know:

  • How to plot and review the autocorrelation function for a time series.
  • How to plot and review the partial autocorrelation function for a time series.
  • The difference between autocorrelation and partial autocorrelation functions for time series analysis.

Let’s get started.

Minimum Daily Temperatures Dataset

This dataset describes the minimum daily temperatures over 10 years (1981-1990) in the city of Melbourne, Australia.

The units are in degrees Celsius and there are 3,650 observations. The source of the data is credited as the Australian Bureau of Meteorology.

Learn more and download the dataset from Data Market.

Download the dataset and place it in your current working directory with the filename “daily-minimum-temperatures.csv”.

Note: The downloaded file contains some question mark (“?”) characters that must be removed before you can use the dataset. Open the file in a text editor and remove the “?” characters. Also remove any footer information in the file.

The example below will load the Minimum Daily Temperatures and graph the time series.
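A minimal sketch of such an example follows. It assumes the cleaned file has a header row, with the date in the first column and the temperature in the second. The load_series() helper and the CSV_PATH name are illustrative and not part of any library.

```python
from pathlib import Path

from matplotlib import pyplot
from pandas import read_csv

# assumed location of the cleaned dataset file
CSV_PATH = Path("daily-minimum-temperatures.csv")


def load_series(path):
    """Load the CSV as a Pandas Series indexed by date (illustrative helper)."""
    # header=0: first row holds column names; index_col=0: dates become the index
    frame = read_csv(path, header=0, index_col=0, parse_dates=True)
    # collapse the single remaining column into a Series
    return frame.squeeze("columns")


if CSV_PATH.exists():  # skipped when the file has not been downloaded yet
    series = load_series(CSV_PATH)
    print(series.head())
    series.plot()
    pyplot.show()
```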

Running the example loads the dataset as a Pandas Series and creates a line plot of the time series.

Minimum Daily Temperatures Dataset Plot

Correlation and Autocorrelation

Statistical correlation summarizes the strength of the relationship between two variables.

We can assume the distribution of each variable fits a Gaussian (bell curve) distribution. If this is the case, we can use the Pearson’s correlation coefficient to summarize the correlation between the variables.

The Pearson’s correlation coefficient is a number between -1 and 1 that describes a negative or positive correlation respectively. A value of zero indicates no correlation.

We can calculate the correlation between time series observations and observations at previous time steps, called lags. Because the correlation of the time series observations is calculated with values of the same series at previous times, this is called a serial correlation, or an autocorrelation.
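To make the idea of a lag concrete, here is a small sketch that pairs each observation with the observation a fixed number of steps earlier and computes Pearson's correlation on those pairs. The lag_correlation() helper and the toy wave are assumptions made for illustration. Library ACF estimators normalize slightly differently (for example, they use the overall mean of the series), so the values will not match plot_acf() exactly, especially at large lags.

```python
import numpy as np


def lag_correlation(values, lag):
    """Pearson correlation between x[t] and x[t - lag] (illustrative helper)."""
    x = np.asarray(values, dtype=float)
    if lag == 0:
        return 1.0  # a series is perfectly correlated with itself
    # pair each observation with the one `lag` steps before it
    current, lagged = x[lag:], x[:-lag]
    return float(np.corrcoef(current, lagged)[0, 1])


# toy series: a noise-free wave that repeats every 20 time steps
t = np.arange(200)
wave = np.sin(2 * np.pi * t / 20)

for lag in (1, 5, 10, 20):
    print(lag, round(lag_correlation(wave, lag), 3))
```

For this wave, the correlation is strongly negative at half a period (lag 10) and strongly positive again at a full period (lag 20).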

A plot of the autocorrelation of a time series by lag is called the AutoCorrelation Function, or the acronym ACF. This plot is sometimes called a correlogram or an autocorrelation plot.

Below is an example of calculating and plotting the autocorrelation plot for the Minimum Daily Temperatures using the plot_acf() function from the statsmodels library.

Running the example creates a 2D plot showing the lag value along the x-axis and the correlation on the y-axis between -1 and 1.

Confidence intervals are drawn as a cone. By default, this is set to a 95% confidence interval, suggesting that correlation values outside of this cone are very likely a correlation and not a statistical fluke.

Autocorrelation Plot of the Minimum Daily Temperatures Dataset

Plotting all lag values, as above, makes the plot noisy.

We can limit the number of lags on the x-axis to 50 to make the plot easier to read.

Autocorrelation Plot With Fewer Lags of the Minimum Daily Temperatures Dataset

Partial Autocorrelation Function

A partial autocorrelation is a summary of the relationship between an observation in a time series and observations at prior time steps, with the relationships of intervening observations removed.

The partial autocorrelation at lag k is the correlation that results after removing the effect of any correlations due to the terms at shorter lags.

— Page 81, Section 4.5.6 Partial Autocorrelations, Introductory Time Series with R.

The autocorrelation between an observation and an observation at a prior time step comprises both the direct correlation and indirect correlations. These indirect correlations are a linear function of the correlation of the observation with observations at intervening time steps.

It is these indirect correlations that the partial autocorrelation function seeks to remove. Without going into the math, this is the intuition for the partial autocorrelation.
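One way to make this intuition concrete, sketched here with NumPy only: regress the intervening observations out of both the current and the distant observation, then correlate what is left. The helper names and the synthetic series are assumptions for illustration. This is not necessarily the estimator a given library uses.

```python
import numpy as np


def lag_correlation(values, lag):
    """Plain correlation between x[t] and x[t - lag]."""
    x = np.asarray(values, dtype=float)
    return float(np.corrcoef(x[lag:], x[:-lag])[0, 1])


def partial_lag_correlation(values, lag):
    """Correlation between x[t] and x[t - lag] after the observations
    in between have been regressed out of both (illustrative helper)."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    if lag == 1:
        return lag_correlation(x, 1)  # nothing in between to remove
    current, distant = x[lag:], x[: n - lag]
    # columns: an intercept, then x[t-1], ..., x[t-lag+1]
    between = np.column_stack(
        [np.ones(n - lag)] + [x[lag - j : n - j] for j in range(1, lag)]
    )
    # keep only the part the intervening observations cannot explain
    resid_current = current - between @ np.linalg.lstsq(between, current, rcond=None)[0]
    resid_distant = distant - between @ np.linalg.lstsq(between, distant, rcond=None)[0]
    return float(np.corrcoef(resid_current, resid_distant)[0, 1])


# synthetic series where each value depends directly only on the previous one
rng = np.random.default_rng(1)
noise = rng.normal(size=5000)
chain = np.zeros(5000)
for t in range(1, 5000):
    chain[t] = 0.8 * chain[t - 1] + noise[t]

print("lag 2 correlation:        ", round(lag_correlation(chain, 2), 3))
print("lag 2 partial correlation:", round(partial_lag_correlation(chain, 2), 3))
```

In this series the lag 2 correlation is sizeable because it is inherited through lag 1, while the partial correlation at lag 2 is close to zero once lag 1 is accounted for.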

The example below calculates and plots a partial autocorrelation function for the first 50 lags in the Minimum Daily Temperatures dataset using the plot_pacf() function from the statsmodels library.

Running the example creates a 2D plot of the partial autocorrelation for the first 50 lags.

Partial Autocorrelation Plot of the Minimum Daily Temperatures Dataset

Intuition for ACF and PACF Plots

Plots of the autocorrelation function and the partial autocorrelation function for a time series tell a very different story.

We can use the intuition for ACF and PACF above to explore some thought experiments.

Autoregression Intuition

Consider a time series that was generated by an autoregression (AR) process with a lag of k.

We know that the ACF describes the autocorrelation between an observation and another observation at a prior time step that includes direct and indirect dependence information.

This means we would expect the ACF for the AR(k) time series to be strong to a lag of k and the inertia of that relationship would carry on to subsequent lag values, trailing off at some point as the effect was weakened.

We know that the PACF only describes the direct relationship between an observation and its lag. This would suggest that there would be no correlation for lag values beyond k.

This is exactly the expectation of the ACF and PACF plots for an AR(k) process.

Moving Average Intuition

Consider a time series that was generated by a moving average (MA) process with a lag of k.

Remember that the moving average process is an autoregression model of the time series of residual errors from prior predictions. Another way to think about the moving average model is that it corrects future forecasts based on errors made on recent forecasts.

We would expect the ACF for the MA(k) process to show a strong correlation with recent values up to the lag of k, then a sharp decline to low or no correlation. By definition, this is how the process was generated.

For the PACF, we would expect the plot to show a strong relationship to the lag and a trailing off of correlation from the lag onwards.

Again, this is exactly the expectation of the ACF and PACF plots for an MA(k) process.

Summary

In this tutorial, you discovered how to calculate autocorrelation and partial autocorrelation plots for time series data with Python.

Specifically, you learned:

  • How to calculate and create an autocorrelation plot for time series data.
  • How to calculate and create a partial autocorrelation plot for time series data.
  • The difference and intuition for interpreting ACF and PACF plots.

Do you have any questions about this tutorial?
Ask your questions in the comments below and I will do my best to answer.

54 Responses to A Gentle Introduction to Autocorrelation and Partial Autocorrelation

  1. Yasha February 7, 2017 at 1:58 am #

    Unable to download the data set. Throws an error

  2. Yasha February 8, 2017 at 5:25 pm #

    Thanks 🙂

  3. Yasha February 9, 2017 at 6:26 pm #

    Well I have I think a kind of stupid question. I am running the commands on Anaconda Prompt. When I hit the line series.plot() I get the error no integer in the data frame. Well I have checked the data set and it does contain integer values. Anything specific I am missing ?

    • Jason Brownlee February 10, 2017 at 9:51 am #

      Hi Yasha,

      Double check the data file, open in a text editor and make sure there is no footer data.

      Also, find and delete the “?” characters in the file.

      If you still get an error, let me know, paste it in the comments.

  4. Thibault July 27, 2017 at 12:16 am #

    Hi Jason,

    I am currently using a MLP model with sliding window in order to forecast future values of a time series. One hyper parameter that seems to be crucial is the size of the window (the number of input neurons for the net). I was wondering if this value must be derived by studying the ACF/PACF plots of the series (as it is usually done with ARIMA models) ? I struggle to find documentation which explicitly discusses about this hyper parameter.

    Thanks in advance

    • Jason Brownlee July 27, 2017 at 8:09 am #

      I would recommend an ACF/PACF analysis.

      I would also recommend grid searching window sizes.

  5. Jai August 15, 2017 at 3:15 pm #

    I think there is an error pointing to the wrong dataset in this post / tutorial.

    https://datamarket.com/data/set/2328/daily-rainfall-in-melbourne-australia-1981-1990#!ds=2328&display=line

  6. Kashif August 16, 2017 at 5:09 pm #

    Hi Sir, I am reposting this question, because I have already read this blog as well but I still have confusion.
    I am applying ARIMA model on my CDR dataset. I have checked that my data is non stationary (Augmented Dickey-Fuller test)
    ADF Statistic: -1.569036
    p-value: 0.499127
    Critical Values:
    1%: -3.478
    10%: -2.578
    5%: -2.882
    Then I plotted the ACF; it showed that my first 15 lag values have an autocorrelation value greater than 0.5, so I set the ‘p’ parameter to 15 (is p = 15 correct?), and ‘d’ is 1 for stationarity. Could you please guide me on how to find the value of the moving average MA(q) q parameter?
    Can I determine the value of the ‘q’ parameter by visualizing the ACF plot?
    Thanks

  7. Kashif August 19, 2017 at 4:10 am #

    Hi Sir
    As I told you in my previous question, I am applying ARIMA model on my CDR dataset. I have determined that my series is stationary by Dickey-Fuller test,
    Then I plotted ACF and PACF. In ACF plot my first 12 lags have AC value greater than 0.5, and in PACF plot first two lags have PAC value greater than 0.5, first 4 lags have PAC values greater than or equal to 0.2. Hence I have ‘p’ equal to 12, ‘q’ equal to 4 and d is 1 for stationarity (are these parameter values correct?).
    Secondly, when I set these values ARIMA(12,1,4), I have following error,
    “ValueError: The computed initial AR coefficients are not stationary
    You should induce stationarity, choose a different model order, or you can
    pass your own start_params.”
    and when I set ARIMA(12,1,2), I have same error. why ?
    Third, when I use grid-search-arima, it shows best ARIMA(1,0,1) why ?
    I have read your many blogs but still confused.
    Could you please guide me ?
    Thanks

    • Jason Brownlee August 19, 2017 at 6:25 am #

      No idea, perhaps try a large d?

      Perhaps try preparing your data manually?

  8. Fero February 14, 2018 at 10:57 pm #

    Great explanation. Thank you!

  9. Elmar Bucher March 2, 2018 at 9:28 am #

    Thank you! That was a nice introduction to enter the subject.
    best, bue

  10. Jordan Stein May 5, 2018 at 4:45 am #

    Good Stuff! Nice succinct tutorial.

  11. Hanif Han June 20, 2018 at 2:10 pm #

    Hi Jason,

    Since i can’t show the python chart using SSMS, i want to save the ACF & PACF as a png file Instead of showing it using pyplot.show().

    Would you please show me the way to do that? Thanks before 🙂

  12. BIJU OOMMEN June 24, 2018 at 5:56 am #

    Thanks Jason for the article. I am new to machine learning and have very basic statistical knowledge. So I am not able interpret the graphs.
    The first graphs makes sense. But what does the other two graphs tell me “Autocorrelation Plot of the Minimum Daily Temperatures Dataset” and “Autocorrelation Plot With Fewer Lags of the Minimum Daily Temperatures Dataset” .
    Thanks in advance for your help

    • Jason Brownlee June 24, 2018 at 7:36 am #

      Perhaps check the section titled “Partial Autocorrelation Function”.

      Does that help?

  13. Felixis Felix July 17, 2018 at 4:57 pm #

    Thanks a million Jason.

  14. Davood July 19, 2018 at 7:13 am #

    Thanks a lot Jason. What you present in your blog is really precious. What does the confidence interval represent? I don’t fully understand why those ACF values outside the confidence interval are relevant! Could you please explain?

  15. Nikos August 14, 2018 at 8:56 pm #

    Hi Jason,

    I was wondering why the confidence intervals above are cone-shaped instead of “normal” horizontal lines.
    Do you perhaps know the theoretical justification behind this?

    Best,
    Nikos

  16. Yifan August 25, 2018 at 7:38 am #

    Hi Jason,
    Could you explain in more basic terms what the moving average model does? How are the predictions made in the moving average model? I ask because you said the moving average model is a autoregression model on the errors in the predictions.

  17. Anil October 1, 2018 at 5:17 pm #

    Hi Jason,

    Can you please explain how to find p, q and d in the best way?

  18. Soukaina October 14, 2018 at 9:48 pm #

    Hi Jason, thank you for the tutorial, i just want to ask about choosing the p coefficient i have displayed the first 50 lags of my series and i find that the max value of correlation is on the lag 47 but when i tested it it show me an error message(
    ValueError: The computed initial AR coefficients are not stationary
    You should induce stationarity, choose a different model order, or you can
    pass your own start_params.)
    so whats the best method to choose p and how we construct the input list of p values of the method of grid search
    thank you

  19. Andy November 12, 2018 at 6:42 pm #

    Hi,

    when I try to compute autocorrelation in Python “by hand”:

    val = pd.DataFrame(series.values)
    lag = 2200
    shifted = val.shift(lag)
    dataframe = pd.concat([shifted, val], axis = 1)
    dataframe.columns = ['t', 't+lag']
    result = dataframe.corr()
    print(result)

    for lag = 2200 I get corr = 0.554, while the autocorrelation plotted by plot_acf 1. decreases with lag and 2. is at the level of 0.25 for lag = 2200. Generally, my plot of correlations computed by Python differs significantly from plot_acf or autocorrelation.plot.

    What’s wrong with my computations? I just followed Your snippet from “Time Series Forecasting with Python” > p. 190, Listing 22.2

    Many thanks for the explanation in advance 🙂

    Yours,

    Andy

    • Jason Brownlee November 13, 2018 at 5:44 am #

      The difference is that the ACF is calculating the correlation of each example with each t-1 and t-2, etc.

      You have only calculated the correlation for t-1.

  20. BB January 17, 2019 at 12:32 pm #

    Thanks for this post but I am still not getting the intuition of ACF & PACF.

    In the example you have considered for temperatures (especially the last 50 days (lags)), for the ACF there is strong correlation while the PACF trails off immediately, which means the indirect measure is strong, so the ACF trails off slowly while the direct measure is not that strong. If my understanding is right, what would I conclude?

  21. Robert February 28, 2019 at 8:56 pm #

    My compliments on a very clear explanation.

  22. Sarah March 3, 2019 at 11:39 pm #

    Hi, first thanks for your all work. The point that i could not understand is how do we decide the order of arima model by using acf and pacf graphs?
    Thank you.

  23. Mike Schoeffler April 12, 2019 at 11:40 am #

    I get why the autocorrelation starts at 1, gradually goes negative when you move 182 days away and then maxes out at multiples of 365.

    But why does it gradually head to zero? You would expect two years out to have the same autocorrelation as one year.

    This same pattern is found with autocorrelation of sine waves too. No matter how many cycles, the autocorrelation alternates and heads to zero.

    What’s going on? Is the code filling unknown data (example: t-500 for the first data point) with zeros?

    • Jason Brownlee April 12, 2019 at 2:46 pm #

      Great question.

      The autocorrelation at 3 years is less than 2 years for the same day because obs at year 2 have more in common with year 3 than now. This makes sense.

      We want to remove this effect, and this is what the PACF achieves.

      • Mike Schoeffler April 13, 2019 at 3:23 am #

        Why would April 12, 2019 temperatures have more in common with April 12, 2018 than 2017? And why would they have nothing in common with temperatures on April 12, 2009?

        Even worse, why would they have more in common with 2009 if the beginning of the data set is 1999?

        Thanks for the response – very much appreciate puzzling through this with you.

        • Jason Brownlee April 13, 2019 at 6:43 am #

          Great questions!

          Typically in time series there is a strong linear dependence between “close” observations. For seasonal data, “close” means one or more cycles away.

          If we look at it from a statistical perspective, most of the dependence can be explained by one or two close observations, with very little contribution from observations beyond that.

          This makes sense, as we would expect data from today to be like yesterday more than last week. That last year is more like this year than 2 years ago. At least on average.

          Does that help?

          • Michael Schoeffler April 13, 2019 at 8:35 am #

            I’m still confused, but appreciate your patience.

            I get how today is more like yesterday than last week. It’s springtime and it was colder last week.

            I’m just confused about why the last April would be much more similar than the year before. Seems like roughly the same weather in April for thousands of years. But this shows zero similarity at the beginning of the time series.

            I checked this by generating a sine curve, where the pattern repeats precisely with every cycle – and still the same results.

            I’m missing something about what this is doing.

          • Jason Brownlee April 13, 2019 at 1:46 pm #

            Yes, but a series can, and often does, have general trends up or down across cycles.

            Again, we are not talking about absolute difference/similarity across obs. Instead, we are talking about how much of the signal can be explained (or is dependent in a simple linear model). We can explain a lot with the prior observation in the cycle, with diminishing returns after that.

  24. Michael Schoeffler April 14, 2019 at 10:18 pm #

    Maybe I’m starting to understand?

    Take the sine cycle (repeats regularly, similar to temps but without any noise), which displays the same cycle gradually declining to zero in autocorrelation.

    Let’s say autocorrelation is only looking at information that can’t be explained by prior observations (makes perfect sense). I’m thinking it should fall to zero after one cycle (365 days).

    Apologies if this is repetitive. I’m trying to understand because I believe a good understanding of this tool will be very helpful in a problem I’m trying to solve.
