Autocorrelation and partial autocorrelation plots are heavily used in time series analysis and forecasting.

These plots graphically summarize the strength of the relationship between an observation in a time series and observations at prior time steps. The difference between autocorrelation and partial autocorrelation can be difficult and confusing for beginners to time series forecasting.

In this tutorial, you will discover how to calculate and plot autocorrelation and partial autocorrelation plots with Python.

After completing this tutorial, you will know:

- How to plot and review the autocorrelation function for a time series.
- How to plot and review the partial autocorrelation function for a time series.
- The difference between autocorrelation and partial autocorrelation functions for time series analysis.

Let’s get started.

## Minimum Daily Temperatures Dataset

This dataset describes the minimum daily temperatures over 10 years (1981-1990) in the city of Melbourne, Australia.

The units are in degrees Celsius and there are 3,650 observations. The source of the data is credited as the Australian Bureau of Meteorology.

Learn more and download the dataset from Data Market.

Download the dataset and place it in your current working directory with the filename “*daily-minimum-temperatures.csv*”.

**Note**: The downloaded file contains some question mark (“?”) characters that must be removed before you can use the dataset. Open the file in a text editor and remove the “?” characters. Also remove any footer information in the file.

The example below will load the Minimum Daily Temperatures and graph the time series.

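```python
from pandas import read_csv
from matplotlib import pyplot

# Series.from_csv() was removed from pandas; read_csv() plus squeeze() gives the same Series.
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
series.plot()
pyplot.show()
```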

Running the example loads the dataset as a Pandas Series and creates a line plot of the time series.


## Correlation and Autocorrelation

Statistical correlation summarizes the strength of the relationship between two variables.

If we assume that each variable follows a Gaussian (bell curve) distribution, we can use Pearson’s correlation coefficient to summarize the correlation between the variables.

Pearson’s correlation coefficient is a number between -1 and 1, where negative values describe a negative correlation and positive values describe a positive correlation. A value of zero indicates no correlation.

We can calculate the correlation between time series observations and observations at previous time steps, called lags. Because the correlation of the time series observations is calculated with values of the same series at previous times, this is called a serial correlation, or an autocorrelation.

A plot of the autocorrelation of a time series by lag is called the **A**uto**C**orrelation **F**unction, or the acronym ACF. This plot is sometimes called a correlogram or an autocorrelation plot.
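As a minimal sketch of what an autocorrelation at a given lag measures, the snippet below computes the Pearson correlation between a series and a lagged copy of itself. It uses a synthetic seasonal series as a stand-in for real data (an assumption for illustration), and the helper name `lag_correlation` is hypothetical, not part of any library.

```python
import numpy as np

def lag_correlation(values, lag):
    """Pearson correlation between the series and itself shifted by `lag` steps."""
    values = np.asarray(values, dtype=float)
    if lag == 0:
        return 1.0
    # Pair each observation with the one `lag` steps earlier.
    return float(np.corrcoef(values[lag:], values[:-lag])[0, 1])

# Synthetic stand-in: a yearly cycle plus noise (not the temperature data).
rng = np.random.default_rng(1)
t = np.arange(3650)
values = 10 + 5 * np.sin(2 * np.pi * t / 365) + rng.normal(scale=1.0, size=t.size)

# Evaluating the helper over a range of lags gives the points of a correlogram.
for lag in (1, 7, 182, 365):
    print(lag, round(lag_correlation(values, lag), 3))
```

Library routines such as plot_acf() use a closely related estimator based on the mean and variance of the whole series, so their values can differ somewhat from this pairwise version, especially at large lags.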

Below is an example of calculating and plotting the autocorrelation plot for the Minimum Daily Temperatures using the plot_acf() function from the statsmodels library.

```python
from pandas import read_csv
from matplotlib import pyplot
from statsmodels.graphics.tsaplots import plot_acf

# Load the dataset as a Series indexed by date.
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
plot_acf(series)
pyplot.show()
```

Running the example creates a 2D plot showing the lag value along the x-axis and the correlation on the y-axis between -1 and 1.

Confidence intervals are drawn as a cone. By default, this is set to a 95% confidence interval, suggesting that correlation values outside of this cone are very likely a correlation and not a statistical fluke.

By default, all lag values are plotted, which makes the plot noisy.

We can limit the number of lags on the x-axis to 50 to make the plot easier to read.

## Partial Autocorrelation Function

A partial autocorrelation is a summary of the relationship between an observation in a time series and observations at prior time steps, with the relationships of intervening observations removed.

The partial autocorrelation at lag k is the correlation that results after removing the effect of any correlations due to the terms at shorter lags.

— Page 81, Section 4.5.6 Partial Autocorrelations, Introductory Time Series with R.

The autocorrelation for an observation and an observation at a prior time step is comprised of both the direct correlation and indirect correlations. These indirect correlations are a linear function of the correlation of the observation, with observations at intervening time steps.

It is these indirect correlations that the partial autocorrelation function seeks to remove. Without going into the math, this is the intuition for the partial autocorrelation.
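The idea of removing the effect of intervening observations can be sketched numerically. Assuming a synthetic AR(1) series (chosen for illustration, not taken from the dataset), the snippet below compares the plain lag-2 correlation with the lag-2 correlation that remains after the lag-1 values have been regressed out of both sides. The helper name `residuals` is hypothetical.

```python
import numpy as np

def residuals(y, x):
    """Residuals of a least-squares fit of y on x (with an intercept)."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

# Synthetic AR(1) series: each value depends directly only on the previous one.
rng = np.random.default_rng(42)
n, phi = 5000, 0.7
x = np.zeros(n)
for i in range(1, n):
    x[i] = phi * x[i - 1] + rng.normal()

current, lag1, lag2 = x[2:], x[1:-1], x[:-2]

# Plain autocorrelation at lag 2 includes the indirect path through lag 1.
acf_lag2 = np.corrcoef(current, lag2)[0, 1]

# Partial autocorrelation at lag 2: correlate what is left of x[t] and x[t-2]
# after the linear effect of x[t-1] has been removed from both.
pacf_lag2 = np.corrcoef(residuals(current, lag1), residuals(lag2, lag1))[0, 1]

print("lag-2 autocorrelation:        ", round(acf_lag2, 3))
print("lag-2 partial autocorrelation:", round(pacf_lag2, 3))
```

For this process the lag-2 autocorrelation is clearly non-zero (roughly phi squared), while the partial value is close to zero, because the only link between x[t] and x[t-2] runs through x[t-1].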

The example below calculates and plots a partial autocorrelation function for the first 50 lags in the Minimum Daily Temperatures dataset using the plot_pacf() from the statsmodels library.

```python
from pandas import read_csv
from matplotlib import pyplot
from statsmodels.graphics.tsaplots import plot_pacf

# Load the dataset as a Series indexed by date.
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
plot_pacf(series, lags=50)
pyplot.show()
```

Running the example creates a 2D plot of the partial autocorrelation for the first 50 lags.

## Intuition for ACF and PACF Plots

Plots of the autocorrelation function and the partial autocorrelation function for a time series tell a very different story.

We can use the intuition for ACF and PACF above to explore some thought experiments.

### Autoregression Intuition

Consider a time series that was generated by an autoregression (AR) process with a lag of *k*.

We know that the ACF describes the autocorrelation between an observation and another observation at a prior time step that includes direct and indirect dependence information.

This means we would expect the ACF for the AR(k) time series to be strong to a lag of k and the inertia of that relationship would carry on to subsequent lag values, trailing off at some point as the effect was weakened.

We know that the PACF only describes the direct relationship between an observation and its lag. This would suggest that there would be no correlation for lag values beyond *k*.

This is exactly the expectation of the ACF and PACF plots for an AR(k) process.

### Moving Average Intuition

Consider a time series that was generated by a moving average (MA) process with a lag of *k*.

Remember that the moving average process is an autoregression model of the time series of residual errors from prior predictions. Another way to think about the moving average model is that it corrects future forecasts based on errors made on recent forecasts.

We would expect the ACF for the MA(k) process to show a strong correlation with recent values up to the lag of k, then a sharp decline to low or no correlation. By definition, this is how the process was generated.

For the PACF, we would expect the plot to show a strong relationship to the lag and a trailing off of correlation from the lag onwards.

Again, this is exactly the expectation of the ACF and PACF plots for an MA(k) process.

## Further Reading

This section provides some resources for further reading about autocorrelation and partial autocorrelation for time series.

- Correlation and dependence on Wikipedia
- Autocorrelation on Wikipedia
- Correlogram on Wikipedia
- Partial autocorrelation function on Wikipedia
- Section 3.2.5 Partial Autocorrelation function, Page 64, Time Series Analysis: Forecasting and Control.

## Summary

In this tutorial, you discovered how to calculate autocorrelation and partial autocorrelation plots for time series data with Python.

Specifically, you learned:

- How to calculate and create an autocorrelation plot for time series data.
- How to calculate and create a partial autocorrelation plot for time series data.
- The difference and intuition for interpreting ACF and PACF plots.

Do you have any questions about this tutorial?

Ask your questions in the comments below and I will do my best to answer.

Unable to download the data set. Throws an error

Sorry, I have fixed the link.

Thanks 🙂

You’re welcome Yasha.

Well I have I think a kind of stupid question. I am running the commands on Anaconda Prompt. When I hit the line series.plot() I get the error no integer in the data frame. Well I have checked the data set and it does contain integer values. Anything specific I am missing ?

Hi Yasha,

Double check the data file, open in a text editor and make sure there is no footer data.

Also, find and delete the “?” characters in the file.

If you still get an error, let me know, paste it in the comments.

Hi Jason,

I am currently using a MLP model with sliding window in order to forecast future values of a time series. One hyper parameter that seems to be crucial is the size of the window (the number of input neurons for the net). I was wondering if this value must be derived by studying the ACF/PACF plots of the series (as it is usually done with ARIMA models) ? I struggle to find documentation which explicitly discusses about this hyper parameter.

Thanks in advance

I would recommend an ACF/PACF analysis.

I would also recommend grid searching window sizes.

I think there is an error: this post / tutorial points to the wrong dataset.

https://datamarket.com/data/set/2328/daily-rainfall-in-melbourne-australia-1981-1990#!ds=2328&display=line

You’re right, I’ve fixed it.

Get the data here:

https://datamarket.com/data/set/2324/daily-minimum-temperatures-in-melbourne-australia-1981-1990#!ds=2324&display=line

Hi Jason,

I searched through your posts / tutorials but could not locate an example that uses rainfall history. Do you have such a post on how to predict / forecast it?

I have many posts on how to make predictions, for example:

http://machinelearningmastery.com/make-sample-forecasts-arima-python/

Hi Sir, I am reposting this question, because I have already read this blog as well but I still have confusion.

I am applying ARIMA model on my CDR dataset. I have checked that my data is non stationary (Augmented Dickey-Fuller test)

ADF Statistic: -1.569036

p-value: 0.499127

Critical Values:

1%: -3.478

10%: -2.578

5%: -2.882

Then I plotted the ACF; it showed that my first 15 lag values have an autocorrelation value greater than 0.5, so I set the ‘p’ parameter to 15 (is p = 15 correct?), and ‘d’ is 1 for stationarity. Could you please guide me on how to find the value of the moving average MA(q) q parameter?

Can I determine the value of the ‘q’ parameter by visualizing the ACF plot?

Thanks

Yes, review the PACF plot and the ACF plots together and read the section “Moving Average Intuition”.

If that still does not help, try a grid search:

http://machinelearningmastery.com/grid-search-arima-hyperparameters-with-python/

Hi Jason,

Is it possible to actually make the review of PACF and ACF plot automatically?

Perhaps. Alternately, you can grid search p/q values for your model instead of using an analysis (my preferred approach).

Hi Sir

As I told you in my previous question, I am applying an ARIMA model on my CDR dataset. I have determined that my series is stationary using the Dickey-Fuller test.

Then I plotted the ACF and PACF. In the ACF plot my first 12 lags have an AC value greater than 0.5, and in the PACF plot the first two lags have a PAC value greater than 0.5 and the first 4 lags have PAC values greater than or equal to 0.2. Hence I have ‘p’ equal to 12, ‘q’ equal to 4 and d is 1 for stationarity (are these parameter values correct?).

Secondly, when I set these values ARIMA(12,1,4), I have following error,

“ValueError: The computed initial AR coefficients are not stationary

You should induce stationarity, choose a different model order, or you can

pass your own start_params.”

and when I set ARIMA(12,1,2), I have same error. why ?

Third, when I use grid-search-arima, it shows best ARIMA(1,0,1) why ?

I have read your many blogs but still confused.

Could you please guide me ?

Thanks

No idea, perhaps try a large d?

Perhaps try preparing your data manually?

Great explanation. Thank you!

I’m glad it helped.

Thank you! That was a nice introduction to the subject.

best, bue

Thanks Elmar.

Good Stuff! Nice succinct tutorial.

Thanks.

Hi Jason,

Since I can’t show the Python chart using SSMS, I want to save the ACF and PACF plots as png files instead of showing them using pyplot.show().

Would you please show me the way to do that? Thanks in advance 🙂

You can use:

Thanks Jason for the article. I am new to machine learning and have very basic statistical knowledge. So I am not able interpret the graphs.

The first graphs makes sense. But what does the other two graphs tell me “Autocorrelation Plot of the Minimum Daily Temperatures Dataset” and “Autocorrelation Plot With Fewer Lags of the Minimum Daily Temperatures Dataset” .

Thanks in advance for your help

Perhaps check the section titled “Partial Autocorrelation Function”.

Does that help?

Thanks a million Jason.

You’re welcome.

Thanks a lot Jason. What you present in your blog is really precious. What does the confidence interval represent? I don’t fully understand why those ACF values outside the confidence interval are relevant! Could you please explain?

You can learn more about the confidence interval here:

https://machinelearningmastery.com/confidence-intervals-for-machine-learning/

Hi Jason,

I was wondering why the confidence intervals above are cone-shaped instead of “normal” horizontal lines.

Do you perhaps know the theoretical justification behind this?

Best,

Nikos

Your guess is as good as mine.
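One commonly cited explanation, offered here as a sketch rather than a definitive account: the statsmodels documentation describes the band for raw data as based on Bartlett's formula, under which the standard error at lag k grows with the squared autocorrelations at shorter lags. The snippet below illustrates that formula on a synthetic AR(1) series; the helper names are hypothetical and this is not a copy of the statsmodels implementation.

```python
import numpy as np

def sample_acf(values, nlags):
    """Sample autocorrelations r_0..r_nlags (full-series mean, divide by n)."""
    x = np.asarray(values, dtype=float) - np.mean(values)
    denom = np.sum(x * x)
    return np.array([np.sum(x[k:] * x[:x.size - k]) / denom for k in range(nlags + 1)])

def bartlett_standard_errors(r, n):
    """Approximate standard error of r_k for k = 1..len(r)-1 (Bartlett's formula)."""
    # Var(r_k) ~= (1 + 2 * sum_{j=1}^{k-1} r_j^2) / n
    cumulative = np.concatenate([[0.0], np.cumsum(r[1:-1] ** 2)])
    return np.sqrt((1.0 + 2.0 * cumulative) / n)

# Synthetic, strongly autocorrelated series (illustration only).
rng = np.random.default_rng(5)
n = 1000
x = np.zeros(n)
for i in range(1, n):
    x[i] = 0.8 * x[i - 1] + rng.normal()

r = sample_acf(x, 40)
se = bartlett_standard_errors(r, n)

# Half-width of an approximate 95% band at the first and last lag.
print("lag 1: ", round(1.96 * se[0], 3))
print("lag 40:", round(1.96 * se[-1], 3))
```

At lag 1 the band matches the flat 1.96 / sqrt(n) lines, and it widens at later lags as the squared autocorrelations accumulate, which produces the cone shape.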

Hi Jason,

Could you explain in more basic terms what the moving average model does? How are the predictions made in the moving average model? I ask because you said the moving average model is a autoregression model on the errors in the predictions.

MA is an autoregression of the error component from each prior observation.

You can learn more here:

https://machinelearningmastery.com/model-residual-errors-correct-time-series-forecasts-python/

Hi Jason,

Can you please explain the best way to find p, q and d?

Yes, use a grid search:

https://machinelearningmastery.com/grid-search-arima-hyperparameters-with-python/

Hi Jason, thank you for the tutorial. I just want to ask about choosing the p coefficient. I displayed the first 50 lags of my series and found that the maximum correlation is at lag 47, but when I tested it, it showed me an error message (

ValueError: The computed initial AR coefficients are not stationary

You should induce stationarity, choose a different model order, or you can

pass your own start_params.)

So what’s the best method to choose p, and how do we construct the input list of p values for the grid search method?

thank you

I recommend a grid search, here’s an example:

https://machinelearningmastery.com/grid-search-arima-hyperparameters-with-python/

Hi,

when I try to compute autocorrelation in Python “by hand”:

```python
val = pd.DataFrame(series.values)
lag = 2200
shifted = val.shift(lag)
dataframe = pd.concat([shifted, val], axis = 1)
dataframe.columns = ['t', 't+lag']
result = dataframe.corr()
print(result)
```

For lag = 2200 I get corr = 0.554, while the autocorrelation plotted by plot_acf 1. decreases with lag and 2. is at the level of 0.25 for lag = 2200. Generally, my plot of correlations computed by Python differs significantly from plot_acf or autocorrelation_plot.

What’s wrong with my computations? I just followed your snippet from “Time Series Forecasting with Python”, p. 190, Listing 22.2.

Many thanks for the explanation in advance 🙂

Yours,

Andy

The difference is that the ACF is calculating the correlation of each example with each t-1 and t-2, etc.

You have only calculated the correlation for t-1.
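Another factor that may contribute to a gap like this, offered as a sketch rather than a diagnosis of the exact numbers above: a pairwise correlation of the overlapping values and the conventional sample ACF are different estimators. The conventional estimator uses the mean of the whole series and divides the lag-k sum, which contains only n - k products, by the full-series sum of squares, so the estimate shrinks toward zero as the lag grows. The noise-free sine wave below is an assumption chosen to make the effect easy to see.

```python
import numpy as np

n, period, lag = 3650, 365, 2190  # ten cycles; the lag spans six of them
t = np.arange(n)
x = np.sin(2 * np.pi * t / period)

# Pairwise Pearson correlation of the overlapping parts (the shift-and-corr approach).
pairwise = np.corrcoef(x[lag:], x[:-lag])[0, 1]

# Conventional sample ACF: full-series mean, lag-k sum over full-series sum of squares.
centered = x - x.mean()
standard = np.sum(centered[lag:] * centered[:-lag]) / np.sum(centered ** 2)

print("pairwise correlation:", round(pairwise, 3))
print("sample ACF estimate: ", round(standard, 3))
print("(n - lag) / n:       ", round((n - lag) / n, 3))
```

For a perfectly periodic signal the pairwise value stays at 1, while the conventional estimate equals (n - lag) / n, here 0.4, which is consistent with a correlogram that decays gradually toward zero even when the pattern repeats exactly.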

Thanks for this post, but I am still not getting the intuition of ACF and PACF.

In the example you have considered for temperatures (especially the last 50 days (lags)), the ACF shows a strong correlation while the PACF trails off immediately, which means the indirect measure is strong, so the ACF trails off slowly while the direct measure is not that strong. If my understanding is right, what would I conclude?

My compliments on a very clear explanation.

Thanks.

Hi, first thanks for your all work. The point that i could not understand is how do we decide the order of arima model by using acf and pacf graphs?

Thank you.

I explain more on how to interpret them here:

https://machinelearningmastery.com/gentle-introduction-box-jenkins-method-time-series-forecasting/

Also, I show how to grid search values (perhaps more reliable) here:

https://machinelearningmastery.com/grid-search-arima-hyperparameters-with-python/

I get why the autocorrelation starts at 1, gradually goes negative when you move 182 days away and then maxes out at multiples of 365.

But why does it gradually head to zero? You would expect two years out to have the same autocorrelation as one year.

This same pattern is found with autocorrelation of sine waves too. No matter how many cycles, the autocorrelation alternates and heads to zero.

What’s going on? Is the code filling unknown data (example: t-500 for the first data point) with zeros?

Great question.

The autocorrelation at 3 years is less than 2 years for the same day because obs at year 2 have more in common with year 3 than now. This makes sense.

We want to remove this effect, and this is what the PACF achieves.

Why would April 12, 2019 temperatures have more in common with April 12, 2018 than 2017? And why would they have nothing in common with temperatures on April 12, 2009?

Even worse, why would they have more in common with 2009 if the beginning of the data set is 1999?

Thanks for the response – very much appreciate puzzling through this with you.

Great questions!

Typically in time series there is a strong linear dependence between “close” observations. For seasonal data, “close” means one or more cycles away.

If we look at it from a statistical perspective, most of the dependence can be explained by one or two close observations, with very little contribution from observations beyond that.

This makes sense, as we would expect data from today to be like yesterday more than last week. That last year is more like this year than 2 years ago. At least on average.

Does that help?

I’m still confused, but appreciate your patience.

I get how today is more like yesterday than last week. It’s springtime and it was colder last week.

I’m just confused about why the last April would be much more similar than the year before. Seems like roughly the same weather in April for thousands of years. But this shows zero similarity at the beginning of the time series.

I checked this by generating a sine curve, where the pattern repeats precisely with every cycle – and still the same results.

I’m missing something about what this is doing.

Yes, but a series can, and often does, have general trends up or down across cycles.

Again, we are not talking about absolute difference/similarity across obs. Instead, we are talking about how much of the signal can be explained (or is dependent in a simple linear model). We can explain a lot with the prior observation in the cycle, with diminishing returns after that.

Maybe I’m starting to understand?

Take the sine cycle (repeats regularly, similar to temps but without any noise), which displays the same cycle gradually declining to zero in autocorrelation.

Let’s say autocorrelation is only looking at information that can’t be explained by prior observations (makes perfect sense). I’m thinking it should fall to zero after one cycle (365 days).

Apologies if this is repetitive. I’m trying to understand because I believe a good understanding of this tool will be very helpful in a problem I’m trying to solve.