How to Save an ARIMA Time Series Forecasting Model in Python

By Jason Brownlee on December 10, 2020 in Time Series

The Autoregressive Integrated Moving Average Model, or ARIMA, is a popular linear model for time series analysis and forecasting.

The statsmodels library provides an implementation of ARIMA for use in Python. ARIMA models can be saved to file for later use in making predictions on new data. There is a bug in the current version of the statsmodels library that prevents saved models from being loaded.

In this tutorial, you will discover how to diagnose and work around this issue.

Let’s get started.

Updated Apr/2019: Updated the link to dataset.
Updated Aug/2019: Updated data loading to use new API.

NOTE: The bug discussed in this tutorial appears to
have been fixed in statsmodels version 0.12.1.

Daily Female Births Dataset

First, let’s look at a standard time series dataset we can use to understand the problem with the statsmodels ARIMA implementation.

This Daily Female Births dataset describes the number of daily female births in California in 1959.

The units are a count and there are 365 observations. The source of the dataset is credited to Newton (1988).

Download the dataset and place it in your current working directory with the filename “daily-total-female-births.csv”.

The code snippet below will load and plot the dataset.

from pandas import read_csv
from matplotlib import pyplot
series = read_csv('daily-total-female-births.csv', header=0, index_col=0)
series.plot()
pyplot.show()

Running the example loads the dataset as a pandas DataFrame with a single column, then shows a line plot of the data.

Daily Total Female Births Plot

Python Environment

Confirm you are using the latest version of the statsmodels library.

You can do that by running the script below:

import statsmodels
print('statsmodels: %s' % statsmodels.__version__)

Running the script should produce a result showing statsmodels 0.6 or 0.6.1.

statsmodels: 0.6.1

You can use either Python 2 or 3.

NOTE: The bug discussed in this tutorial appears to
have been fixed in statsmodels version 0.12.1.
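
If you want a script to decide for itself whether the workaround is needed, one option is to compare the installed version against the release named in the note above. The sketch below is only an illustration: the helper names parse_version and needs_workaround are hypothetical, and the 0.12.1 threshold is taken from the note, which itself only says the bug "appears to" be fixed there.

```python
def parse_version(version):
    """Turn a version string such as '0.12.1' into a tuple of ints.

    Parsing stops at the first piece that does not start with a digit,
    so a suffix such as '.dev0' is ignored.
    """
    parts = []
    for piece in version.split('.'):
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == '':
            break
        parts.append(int(digits))
    return tuple(parts)


def needs_workaround(version, fixed_in='0.12.1'):
    """Return True if version is older than the assumed fixed release."""
    return parse_version(version) < parse_version(fixed_in)


print(needs_workaround('0.6.1'))
print(needs_workaround('0.12.1'))
```

In a real script you would pass statsmodels.__version__ to needs_workaround() and apply the monkey patch only when it returns True.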

ARIMA Model Save Bug

We can easily train an ARIMA model on the Daily Female Births dataset.

The code snippet below trains an ARIMA(1,1,1) on the dataset.

The model.fit() function returns an ARIMAResults object on which we can call save() to save the model to file and load() to later load it.

from pandas import read_csv
from statsmodels.tsa.arima_model import ARIMA
from statsmodels.tsa.arima_model import ARIMAResults
# load data
series = read_csv('daily-total-female-births.csv', header=0, index_col=0)
# prepare data
X = series.values
X = X.astype('float32')
# fit model
model = ARIMA(X, order=(1,1,1))
model_fit = model.fit()
# save model
model_fit.save('model.pkl')
# load model
loaded = ARIMAResults.load('model.pkl')

Running this example will train the model and save it to file without problem.

An error will be reported when you try to load the model from file.

Traceback (most recent call last):
  File "...", line 16, in <module>
    loaded = ARIMAResults.load('model.pkl')
  File ".../site-packages/statsmodels/base/model.py", line 1529, in load
    return load_pickle(fname)
  File ".../site-packages/statsmodels/iolib/smpickle.py", line 41, in load_pickle
    return cPickle.load(fin)
TypeError: __new__() takes at least 3 arguments (1 given)

Specifically, note the line:

TypeError: __new__() takes at least 3 arguments (1 given)

We have reproduced the error, so how do we fix it?

ARIMA Model Save Bug Workaround

Zae Myung Kim discovered this bug in September 2016 and reported the fault.

You can read all about it here:

BUG: Implemented __getnewargs__() method for unpickling

The bug occurs because a function required by pickle (the library used to serialize Python objects) has not been defined in statsmodels.

Specifically, a __getnewargs__ function that returns the arguments needed to construct the object must be defined on the ARIMA class before the model is saved.
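
To see the mechanism in isolation, here is a minimal sketch that uses only the standard library. ToyModel is a hypothetical stand-in, not the statsmodels class: its __new__ requires arguments, so pickle can save an instance but cannot rebuild it until __getnewargs__ tells it which arguments to pass. The exact error text depends on your Python version.

```python
import pickle


class ToyModel:
    """Hypothetical stand-in for a class whose __new__ requires arguments."""

    def __new__(cls, endog, order):
        return super().__new__(cls)

    def __init__(self, endog, order):
        self.endog = endog
        self.order = order


def round_trip(obj):
    """Serialize obj to bytes and immediately rebuild it."""
    return pickle.loads(pickle.dumps(obj))


model = ToyModel([35.0, 32.0, 30.0], (1, 1, 1))

# Without __getnewargs__, pickle rebuilds the object by calling
# ToyModel.__new__(ToyModel) with no further arguments, which fails.
try:
    round_trip(model)
    load_failed = False
except TypeError as err:
    load_failed = True
    print('load failed:', err)


# Tell pickle which arguments to pass to __new__ when loading.
def __getnewargs__(self):
    return (self.endog, self.order)


ToyModel.__getnewargs__ = __getnewargs__

# The object must be pickled again, because the arguments are
# recorded at save time, not at load time.
restored = round_trip(model)
print(restored.endog, restored.order)
```

Note that the failure only shows up when loading, which matches the behaviour described above: saving succeeds and the error appears later.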

We can work around this issue. The fix involves two things:

Defining an implementation of the __getnewargs__ function suitable for the ARIMA object.
Adding the new function to ARIMA.

Thankfully, Zae Myung Kim provided an example of the function in his bug report so we can just use that directly:

def __getnewargs__(self):
	return ((self.endog),(self.k_lags, self.k_diff, self.k_ma))

Python allows us to monkey patch an object, even one from a library like statsmodels.

We can define a new function on an existing object using assignment.

We can do this for the __getnewargs__ function on the ARIMA object as follows:

ARIMA.__getnewargs__ = __getnewargs__
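
If monkey patching is new to you, the small sketch below shows the same assignment trick on a toy class. Forecaster and describe are hypothetical names used only for illustration. The point to notice is that objects created before the patch also gain the new method, because methods are looked up on the class at call time.

```python
class Forecaster:
    """Hypothetical class standing in for one imported from a library."""

    def __init__(self, order):
        self.order = order


# Create an object before the class is patched.
existing = Forecaster((1, 1, 1))
had_method_before = hasattr(existing, 'describe')


# An ordinary function that takes the instance as its first argument.
def describe(self):
    return 'order=%s' % (self.order,)


# Attach the function to the class by assignment.
Forecaster.describe = describe

print(had_method_before)
print(existing.describe())
```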

The complete example of training, saving, and loading an ARIMA model in Python with the monkey patch is listed below:

from pandas import read_csv
from statsmodels.tsa.arima_model import ARIMA
from statsmodels.tsa.arima_model import ARIMAResults

# monkey patch around bug in ARIMA class
def __getnewargs__(self):
	return ((self.endog),(self.k_lags, self.k_diff, self.k_ma))
ARIMA.__getnewargs__ = __getnewargs__

# load data
series = read_csv('daily-total-female-births.csv', header=0, index_col=0)
# prepare data
X = series.values
X = X.astype('float32')
# fit model
model = ARIMA(X, order=(1,1,1))
model_fit = model.fit()
# save model
model_fit.save('model.pkl')
# load model
loaded = ARIMAResults.load('model.pkl')

Running the example now successfully loads the model without error.

Summary

In this post, you discovered how to work around a bug in the statsmodels ARIMA implementation that prevented you from saving and loading an ARIMA model to and from file.

You discovered how to write a monkey patch to work around the bug and how to demonstrate that it has indeed been fixed.

Did you use this workaround on your project?
Let me know about it in the comments below.

30 Responses to How to Save an ARIMA Time Series Forecasting Model in Python

n1k31t4 June 3, 2017 at 11:09 pm #

Hi Jason, thanks for the workaround.
I am using statsmodels 0.8.0 and have been trying to save a dictionary of fitted models using pickle itself, i.e. not using the built-in method. I am still getting the error you mention above about the ARIMA object not having a dates attribute. Why is it trying to look for a dates attribute anyway?
Can you think of a workaround for this case?

- Jason Brownlee June 4, 2017 at 7:53 am #
  
  Did you try the workaround?
  
Veeral June 10, 2017 at 6:56 am #

Hi Dr. Brownlee,
Thank you for sharing. I tried to adjust this workaround so that it can work for seasonal ARIMA modeling but I am still having issues. I would like to save the model into a pickle file but I keep receiving the following error: “Type error: can’t pickle statsmodels.tsa.statespace._statespace.dStatespace objects”

Do you have an idea of how I can go about working around this issue? Here is some of the code I am working with.

mod = sm.tsa.statespace.SARIMAX(
    timeseries,
    order=(1, 0, 1),
    seasonal_order=(0, 1, 1, 28),
    enforce_stationarity=False,
    enforce_invertibility=False
)

self.trained_model = mod.fit()
self.trained_model.save('call model.p')

Thanks!

- Jason Brownlee June 10, 2017 at 8:31 am #
  
  Sorry, I have not tried.
  
  Perhaps reach out to the developer that discovered the workaround, he may have some ideas about SARIMAX.
  
- Ankit July 1, 2017 at 6:24 am #
  
  Hi Veeral,
  
  Were you able to find the fix for your issue? I am also facing the same issue with SARIMA model.
  
- Jay B September 21, 2017 at 12:36 am #
  
  Hey everyone,
  I am also having the same problem with SARIMAX models; has anyone found a solution?
  
Ankit Tripathi June 2, 2018 at 8:53 pm #

Hey Jason,
Thanks for the article.

I am using ARIMA to fit values and save it as a pickle file. Post that, the pickle file is used to get out of sample predictions. However, while getting sample predictions I am getting the following error: Cannot cast ufunc subtract output from dtype(‘float64’) to dtype(‘int64’) with casting rule ‘same_kind’. Do you have any idea about the cause?

- Jason Brownlee June 3, 2018 at 6:22 am #
  
  I have not seen this sorry.
  
  Perhaps try posting to stackoverflow?
  
Ankit Tripathi June 13, 2018 at 5:22 pm #

Hey Jason, thanks for the awesome post! I followed this and saved a pickle file with a certain ARIMA order. The model was successfully fit and then saved as pickle file. However, when I load the pickle file to get out of sample forecasts, I get null values for any number of steps. Do you have any idea about this issue?

- Ankit Tripathi June 13, 2018 at 9:38 pm #
  
  So basically, after fitting the model, model_fit.summary() throws an error. I know that this is not a platform for solving code errors, but I think it is a very general case if the model summary throws an error after fitting the model!
  
  - Jason Brownlee June 14, 2018 at 6:02 am #
    
    That is odd. Perhaps comment out that line?
    
- Jason Brownlee June 14, 2018 at 5:58 am #
  
  Sorry, I have not seen this issue. Perhaps ensure that you are using the most recent version of Statsmodels?
  
Ankit Tripathi June 14, 2018 at 6:36 pm #

Jason, the problem is that, after reading the pickle file, out-of-sample forecasts give null values. I think this is probably a bug. I will try to report it on Git. Thanks.

- Jason Brownlee June 15, 2018 at 6:44 am #
  
  That is a shame.
  
  I have some examples in the blog of using the coefficients directly, without the wrapper class. Perhaps that would be a good workaround?
  
  - sanjie October 9, 2018 at 9:17 pm #
    
    hello Jason,
    thanks for your article. I use statsmodels 0.9.0, and this bug does not occur there.
    
    - Jason Brownlee October 10, 2018 at 6:08 am #
      
      Glad to hear it!
      
  - Archana Dwivedi August 30, 2020 at 2:31 am #
    
    Hello Jason,
    Do you have any idea about Azure ML pipelines? There is a “create python model” module there, but I do not understand how it works or how the parameters are passed. Thanks in advance.
    
    - Jason Brownlee August 30, 2020 at 6:45 am #
      
      Sorry, I have never used the MS platform.
      
karim May 22, 2019 at 3:52 am #

Hello, thanks for your nice post. But I am confused about how to use the loaded model to predict the future. For example, here is the code (which may also be taken from one of your tutorials, as I am learning all of this with your resources):

Actual = [x for x in train_set]
Predictions = list()

# Function that calls ARIMA model to fit and forecast the data
def StartARIMAForecasting(Actual, P, D, Q):
    model = ARIMA(Actual, order=(P, D, Q))
    model_fit = model.fit(disp=0)
    prediction = model_fit.forecast()[0]
    return prediction

for timepoint in range(len(test_set)):
    ActualValue = test_set[timepoint]
    # forecast value
    Prediction = StartARIMAForecasting(Actual, 2, 1, 2)
    print('Actual=%f, Predicted=%f' % (ActualValue, Prediction))
    # add it to the list
    Predictions.append(Prediction)
    Actual.append(ActualValue)

Now, where can I write the line to save the model, and after saving and loading, what is the correct line to make the prediction? I am eagerly waiting for the solution. 🙂

N.B.: In my dataset, I have used datetime as the index.

- Jason Brownlee May 22, 2019 at 8:12 am #
  
  Good question.
  
  The model is saved after it is fit. It can then be loaded to make a prediction.
  
  Perhaps this will help:
  https://machinelearningmastery.com/make-sample-forecasts-arima-python/
  
  - karim May 22, 2019 at 8:53 am #
    
    Thank you. I will check it now. Another question came in my mind. Is it possible to use ARIMA for multivariate and multistep forecasting? Till now I have only found ARIMA for Univariate Time series.
    
    - Jason Brownlee May 22, 2019 at 2:31 pm #
      
      Yes, it is called VAR or VARIMA, see this post:
      https://machinelearningmastery.com/time-series-forecasting-methods-in-python-cheat-sheet/
      
  - manon December 11, 2020 at 4:17 am #
    
    Hi
    
    It is not clear to me how to use the model once loaded, even after looking at the other post at https://machinelearningmastery.com/make-sample-forecasts-arima-python/.
    
    Can you make an example?
    
    - Jason Brownlee December 11, 2020 at 6:41 am #
      
      Sure, what problem are you having exactly? Perhaps I can provide some tips.
      
MK February 13, 2021 at 11:58 pm #

Hi, I have been trying your saved LSTM model code with my deployment for time series forecasting for quite a long time. Now I am facing a problem that has had me stuck for a while. Can you please share some tips?

below is my code:

model.py

from pandas import read_csv
from statsmodels.tsa.arima_model import ARIMA
from statsmodels.tsa.arima_model import ARIMAResults
import warnings
warnings.filterwarnings("ignore")

# monkey patch around bug in ARIMA class
def __getnewargs__(self):
    return ((self.endog),(self.k_lags, self.k_diff, self.k_ma))
ARIMA.__getnewargs__ = __getnewargs__

# load data
series = read_csv('Sales.csv', header=0, index_col=0)
# prepare data
X = series.values
X = X.astype('float32')
# fit model
model = ARIMA(X, order=(1,1,1))
model_fit = model.fit()
# save model
model_fit.save('model.pkl')

app.py

from flask import Flask, make_response, request, render_template
import io
from io import StringIO
import csv
import pandas as pd
import numpy as np
import pickle
import os
from keras.models import load_model
from statsmodels.tsa.arima_model import ARIMAResults

app = Flask(__name__)

def transform(text_file_contents):
    return text_file_contents.replace("=", ",")

@app.route('/')
def form():
    return """

Let’s TRY to Predict..

Insert your CSV file and then download the Result

Predict

"""

@app.route('/transform', methods=["POST"])
def transform_view():
    if request.method == 'POST':
        f = request.files['data_file']
        if not f:
            return "No file"

        stream = io.StringIO(f.stream.read().decode("UTF8"), newline=None)
        csv_input = csv.reader(stream)
        #print("file contents: ", file_contents)
        #print(type(file_contents))
        print(csv_input)
        for row in csv_input:
            print(row)

        stream.seek(0)
        result = transform(stream.read())

        df = pd.read_csv(StringIO(result), usecols=[1])

        # load the model from disk
        model = ARIMAResults.load('model.pkl')
        dataset = df.values
        dataset = dataset.astype('float32')
        dataset = np.reshape(dataset, (-1, 1))
        df = model.predict(dataset)

        response = make_response(df.to_csv())
        response.headers["Content-Disposition"] = "attachment; filename=result.csv"
        return response

if __name__ == "__main__":
    app.run(debug=True, port = 9000, host = "localhost")

- Jason Brownlee February 14, 2021 at 5:09 am #
  
  Sorry, I don’t have the capacity to review/debug your code, perhaps you can summarize the specific issue you are having?
  
  - MK February 14, 2021 at 3:22 pm #
    
    KeyError: ‘The start argument could not be matched to a location related to the index of the data.’
    
    - Jason Brownlee February 15, 2021 at 5:43 am #
      
      The error suggests that wherever you are specifying a range of data, such as to the predict function, the index for the starting point of the range is not valid for your dataset.
      
      Perhaps you can print the length of your dataset and the start index you’re trying to use and compare them directly.
      
JG January 20, 2022 at 7:15 pm #

Error encountered. Can you help with this?

---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
in
3 from pandas import read_csv
4 from pandas import DataFrame
----> 5 from statsmodels.tsa.arima.model import ARIMA
6 from matplotlib import pyplot
7 # load dataset

ModuleNotFoundError: No module named 'statsmodels.tsa.arima'

- James Carmichael January 21, 2022 at 9:50 am #
  
  Hello JG…The following post will hopefully help you resolve the issue. Essentially you need to ensure that statsmodels is installed.
  
  https://stackoverflow.com/questions/69047074/no-module-named-statsmodels-tsa-arima-in-colab-but-not-in-pycharm
  
