The goal of developing an LSTM model is a final model that you can use on your sequence prediction problem.

In this post, you will discover how to finalize your model and use it to make predictions on new data.

After completing this post, you will know:

- How to train a final LSTM model.
- How to save your final LSTM model, and later load it again.
- How to make predictions on new data.

Let’s get started.

## Step 1. Train a Final Model

### What Is a Final LSTM Model?

A final LSTM model is one that you use to make predictions on new data.

That is, given new examples of input data, you want to use the model to predict the expected output. This may be a classification (assign a label) or a regression (a real value).

The goal of your sequence prediction project is to arrive at a final model that performs the best, where “best” is defined by:

- **Data**: the historical data that you have available.
- **Time**: the time you have to spend on the project.
- **Procedure**: the data preparation steps, algorithm or algorithms, and the chosen algorithm configurations.

In your project, you gather the data, spend the time you have, and discover the data preparation procedures, algorithm to use, and how to configure it.

The final model is the pinnacle of this process, the end you seek in order to start actually making predictions.

There is no such thing as a perfect model. There is only the best model that you were able to discover.

### How to Finalize an LSTM Model

You finalize a model by fitting the chosen LSTM architecture and configuration on all of your data.

There is no train and test split and no cross-validation folds. Put all of the data back together into one large training dataset and fit your model.

That’s it.
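As a minimal sketch of putting the data back together, assume your earlier evaluation left you with arrays named `X_train`, `X_test`, `y_train`, and `y_test`. These names and the values below are hypothetical placeholders. The splits can be stacked with NumPy before the final fit:

```python
from numpy import array, concatenate

# hypothetical splits left over from model evaluation (placeholder values)
X_train = array([0.0, 0.1, 0.2]).reshape((3, 1, 1))
y_train = array([0.1, 0.2, 0.3])
X_test = array([0.3, 0.4]).reshape((2, 1, 1))
y_test = array([0.4, 0.5])

# put all of the data back together, keeping the original order
X_all = concatenate((X_train, X_test), axis=0)
y_all = concatenate((y_train, y_test), axis=0)

# the final model would then be fit on X_all and y_all
print(X_all.shape, y_all.shape)
```

Keeping the original order matters for sequence data, which is why the test portion is appended after the training portion rather than shuffled in.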

With the finalized model, you can:

- Save the model for later or operational use.
- Load the model and make predictions on new data.

For more on training a final model, see the post How to Train a Final Machine Learning Model, listed under Further Reading.

## Step 2. Save Your Final Model

Keras provides an API to allow you to save your model to file.

The model is saved in HDF5 file format that efficiently stores large arrays of numbers on disk. You will need to confirm that you have the h5py Python library installed. It can be installed as follows:

```
sudo pip install h5py
```
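One way to confirm that the library is available before you try to save a model is sketched below. It uses only the Python standard library, and the variable name `h5py_installed` is just for illustration.

```python
from importlib.util import find_spec

# find_spec returns None when a top-level module cannot be found
h5py_installed = find_spec('h5py') is not None
print('h5py installed:', h5py_installed)
```

If this reports False, install the library as shown above before calling save().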

You can save a fit Keras model to file using the save() function on the model.

For example:

```python
# define model
model = Sequential()
model.add(LSTM(...))
# compile model
model.compile(...)
# fit model
model.fit(...)
# save model to single file
model.save('lstm_model.h5')
```

This single file will contain the model architecture and weights. It also includes the specification of the chosen loss and optimization algorithm so that you can resume training.

The model can be loaded again (from a different script in a different Python session) using the load_model() function.

```python
from keras.models import load_model
# load model from single file
model = load_model('lstm_model.h5')
# make predictions
yhat = model.predict(X, verbose=0)
print(yhat)
```

Below is a complete example of fitting an LSTM model, saving it to a single file and later loading it again. Although the loading of the model is in the same script, this section may be run from another script in another Python session. Running the example saves the model to the file lstm_model.h5.

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from numpy import array
from keras.models import load_model

# return training data
def get_train():
    seq = [[0.0, 0.1], [0.1, 0.2], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]
    seq = array(seq)
    X, y = seq[:, 0], seq[:, 1]
    X = X.reshape((len(X), 1, 1))
    return X, y

# define model
model = Sequential()
model.add(LSTM(10, input_shape=(1,1)))
model.add(Dense(1, activation='linear'))
# compile model
model.compile(loss='mse', optimizer='adam')
# fit model
X,y = get_train()
model.fit(X, y, epochs=300, shuffle=False, verbose=0)
# save model to single file
model.save('lstm_model.h5')

# snip...
# later, perhaps run from another script

# load model from single file
model = load_model('lstm_model.h5')
# make predictions
yhat = model.predict(X, verbose=0)
print(yhat)
```

For more on saving and loading your Keras model, see the post Save and Load Your Keras Deep Learning Models, listed under Further Reading.

## Step 3. Make Predictions on New Data

After you have finalized your model and saved it to file, you can load it and use it to make predictions.

For example:

- On a sequence regression problem, this may be the prediction of the real value at the next time step.
- On a sequence classification problem, this may be a class outcome for a given input sequence.

Or it may be any other variation based on the specifics of your sequence prediction problem. You would like an outcome from your model (yhat) given an input sequence (X) where the true outcome for the sequence (y) is currently unknown.

You may be interested in making predictions in a production environment, as the backend to an interface, or manually. It really depends on the goals of your project.

Any data preparation performed on your training data prior to fitting your final model must also be applied to any new data prior to making predictions.
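For instance, if you rescaled your inputs before training, the same coefficients must be reused on new data. The sketch below assumes a simple min-max scaling; the helper name `scale` and all values are hypothetical and only for illustration.

```python
from numpy import array

# hypothetical training inputs that were used to fit the final model
train = array([10.0, 20.0, 30.0, 40.0, 50.0])

# learn the scaling coefficients from the training data only
train_min, train_max = train.min(), train.max()

def scale(values):
    # apply the training-time min-max scaling to any data
    return (values - train_min) / (train_max - train_min)

# new data is scaled with the saved coefficients, not with its own min and max
new = array([20.0, 60.0])
new_scaled = scale(new)
print(new_scaled)
```

Note that a new value above the training maximum (60.0 here) scales to a value above 1.0. In a real project you would save the coefficients alongside the model so they are available at prediction time.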

Predicting is the easy part.

It involves taking the prepared input data (X) and calling one of the Keras prediction methods on the loaded model.

Remember that the input for making a prediction (X) comprises only the input sequence data required to make that prediction, not all prior training data. In the case of predicting the next value in one sequence, the input sequence would be 1 sample with the fixed number of time steps and features used when you defined and fit your model.
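As a small sketch, assume a model defined with input_shape=(1, 1), as in the complete example in this post. A single new observation is then shaped as one sample with one time step and one feature. The value 0.5 is an arbitrary placeholder.

```python
from numpy import array

# one new observation for a model fit with input_shape=(1, 1)
new_value = 0.5

# Keras LSTMs expect input shaped [samples, time steps, features]
X = array([new_value]).reshape((1, 1, 1))
print(X.shape)
```

A model fit with a different number of time steps or features needs those same dimensions in the second and third positions of the shape.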

For example, a raw prediction in the shape and scale of the activation function of the output layer can be made by calling the predict() function on the model:

```python
X = ...
model = ...
yhat = model.predict(X)
```

The prediction of a class index can be made by calling the predict_classes() function on the model.

```python
X = ...
model = ...
yhat = model.predict_classes(X)
```

The prediction of probabilities can be made by calling the predict_proba() function on the model.

```python
X = ...
model = ...
yhat = model.predict_proba(X)
```
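Note that predict_classes() and predict_proba() exist only on Sequential models and have been removed from more recent Keras releases, so they may be missing in your installation. As a hedged sketch, assuming a softmax output layer, the same information can be recovered from predict() alone. The `probs` array below is a hypothetical stand-in for the value returned by model.predict(X).

```python
from numpy import array, argmax

# hypothetical stand-in for model.predict(X) with a 3-class softmax output
probs = array([[0.1, 0.7, 0.2],
               [0.8, 0.1, 0.1]])

# class index for each sample: the column with the highest probability
classes = argmax(probs, axis=-1)
print(classes)
```

With a softmax output, the array returned by predict() already holds the class probabilities. For a single sigmoid output, thresholding is the usual equivalent, for example (probs > 0.5).astype('int32').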

For more on the life-cycle of your Keras model, see the post The 5 Step Life-Cycle for Long Short-Term Memory Models in Keras, listed under Further Reading.

## Further Reading

This section provides more resources on the topic if you are looking to go deeper.

### Posts

- How to Train a Final Machine Learning Model
- Save and Load Your Keras Deep Learning Models
- The 5 Step Life-Cycle for Long Short-Term Memory Models in Keras

## Summary

In this post, you discovered how to finalize your model and use it to make predictions on new data.

Specifically, you learned:

- How to train a final LSTM model.
- How to save your final LSTM model, and later load it again.
- How to make predictions on new data.

Do you have any questions?

Ask your questions in the comments below and I will do my best to answer.

Thanks Jason

One question: why should we finalize the model on the whole dataset? We would change the weights again, right? The model that was trained on the training data is the one we tested on unseen data (the test set). Couldn’t the new model (trained on all data) be worse, or overfitted?

Learn more about how and why to finalize models here:

https://machinelearningmastery.com/train-final-machine-learning-model/

I address this exact concern.

I tried both keras and tensorflow. Tensorflow has more features.

It does, but it is much harder to use.

thanks, Jason.

I ran your sample code, but got the result below, which is not what I expected.

```
… …
Epoch 290/300
6/6 [==============================] - 0s - loss: 0.0155
Epoch 295/300
6/6 [==============================] - 0s - loss: 0.0153
Epoch 296/300
6/6 [==============================] - 0s - loss: 0.0153
Epoch 297/300
6/6 [==============================] - 0s - loss: 0.0152
Epoch 298/300
6/6 [==============================] - 0s - loss: 0.0152
Epoch 299/300
6/6 [==============================] - 0s - loss: 0.0152
Epoch 300/300
6/6 [==============================] - 0s - loss: 0.0151
[[ 0.28978038]
 [ 0.31878966]
 [ 0.3477335 ]
 [ 0.37631655]
 [ 0.4042924 ]
 [ 0.43146992]]
```

Suppose the inputs X are 0, 0.1, 0.2, 0.3, 0.4, 0.5; then the predicted values of y should be close to 0.1, 0.2, … 0.6. But the results are values such as 0.28978038 … 0.43146992.

Can you look into it?

What is the problem exactly?

Hi, I want to predict for a whole record of shape (160, 72) for a single time step. How should I shape my NumPy array of test features? To be clear, I trained my model on trainX with shape (235, 1, 72) and trainY with shape (235,). Now I want to predict a single time step, but for 160 rows. How do I do that?

See this post on how to reshape data for LSTMs:

https://machinelearningmastery.com/reshape-input-data-long-short-term-memory-networks-keras/

Hi Jason,

When calling model.predict, I get some inconsistent outputs. Does that mean my model is wrong? As far as I know, inconsistent outputs are possible if I re-fit the model, but not during prediction. Am I missing something? I hope to get some comments from you. Thank you very much.

```python
from numpy.random import seed
seed(1)
from tensorflow import set_random_seed
set_random_seed(2)
```

I have also set the seeds before running them.

It looks like you’re doing all the right things.

LSTMs are a stochastic algorithm; this post will shed light on the issue:

https://machinelearningmastery.com/randomness-in-machine-learning/

This post has advice on how to get reproducible results with Keras:

https://machinelearningmastery.com/reproducible-results-neural-networks-keras/

Thank you so much for the advice. I will think about it more.

Hi Jason,

I am so grateful for the post you shared with us.

I want to load three finalized models (an RNN, a CNN, and an LSTM), which have already been saved in Keras, in one script and use them in an ensemble that averages their predictions. Is it necessary to use a Dask DataFrame to load multiple saved models? Or does loading multiple finalized models use the same commands as loading one?

Thank you in advance for taking the time to reply.

A Pandas DataFrame is not required. Each model can be saved and loaded to and from separate files and used in an ensemble.

Hi Jason,

I really appreciate this helpful tutorial.

I train and save a finalized CNN model in one script, then load it in a different script using this command: load_model('cnn_model.h5').

I have a test dataset without any labels, and I want to get the probability of each sample belonging to each class with model_cnn_final.predict_proba, but it gave me this error: AttributeError: 'Model' object has no attribute 'predict_proba'. When I used yhat = model.predict_classes(X), it gave me this error: 'Model' object has no attribute 'predict_classes'.

I used the command yhat = model.predict(X) and it worked fine.

What is the problem with these commands that causes the errors?

How can I fix the errors?

I believe these methods are only supported on Sequential models, you may be using the functional Model API. In that case, you may be limited to the predict() function alone, which will return probabilities in the case of a softmax activation function in the output layer.

Hi

This is super good Jason!

Thanks for your writing 🙂

I’m glad it helps.

Hi Jason,

Thank you for this really helpful tutorial.

I have a question. I think I am missing something about making predictions. I don’t understand what the input to model.predict(X) should be to predict new data. Let’s say I have one year of data (sampled every hour) and I want to predict the following week. What should my X be?

It depends on how you define your model.

If the model expects 1 year of data to make a 1 week forecast, then you must provide 1 year of data as X.

Does that help?

It does, but I still don’t get it. I am following your tutorial on Multivariate Time Series Forecasting (https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/) and I can’t find where you are defining these parameters of the model.