How to Make Predictions with Long Short-Term Memory Models in Keras

The goal of developing an LSTM model is a final model that you can use on your sequence prediction problem.

In this post, you will discover how to finalize your model and use it to make predictions on new data.

After completing this post, you will know:

  • How to train a final LSTM model.
  • How to save your final LSTM model, and later load it again.
  • How to make predictions on new data.

Let’s get started.

How to Make Predictions with Long Short-Term Memory Models with Keras
Photo by damon jah, some rights reserved.

Step 1. Train a Final Model

What Is a Final LSTM Model?

A final LSTM model is one that you use to make predictions on new data.

That is, given new examples of input data, you want to use the model to predict the expected output. This may be a classification (assign a label) or a regression (a real value).

The goal of your sequence prediction project is to arrive at a final model that performs the best, where “best” is defined by:

  • Data: the historical data that you have available.
  • Time: the time you have to spend on the project.
  • Procedure: the data preparation steps, algorithm or algorithms, and the chosen algorithm configurations.

In your project, you gather the data, spend the time you have, and discover the data preparation procedures, algorithm to use, and how to configure it.

The final model is the pinnacle of this process, the end you seek in order to start actually making predictions.

There is no such thing as a perfect model. There is only the best model that you were able to discover.

How to Finalize an LSTM Model?

You finalize a model by applying the chosen LSTM architecture and configuration to all of your data.

There is no train and test split and no cross-validation folds. Put all of the data back together into one large training dataset and fit your model.

That’s it.
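
As a rough sketch of that step, the snippet below puts a train/test split back together with NumPy. The array names, sizes, and random values are hypothetical placeholders, not data from this post; the only point is that the recombined arrays keep the [samples, time steps, features] layout the LSTM was configured for.

```python
import numpy as np

# Hypothetical splits left over from model evaluation.
# Inputs use the LSTM layout [samples, time steps, features].
X_train, y_train = np.random.rand(80, 3, 1), np.random.rand(80)
X_test, y_test = np.random.rand(20, 3, 1), np.random.rand(20)

# Put the splits back together, train first so any time order is preserved.
X_all = np.concatenate((X_train, X_test), axis=0)
y_all = np.concatenate((y_train, y_test), axis=0)

print(X_all.shape, y_all.shape)
# The final model would then be fit on X_all and y_all
# with the configuration chosen during evaluation.
```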

With the finalized model, you can:

  • Save the model for later or operational use.
  • Load the model and make predictions on new data.

For more on training a final model, see the post:

Step 2. Save Your Final Model

Keras provides an API to allow you to save your model to file.

The model is saved in HDF5 file format that efficiently stores large arrays of numbers on disk. You will need to confirm that you have the h5py Python library installed. It can be installed as follows:
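
The library is commonly installed with pip, for example by running pip install h5py, although the exact command depends on how your Python environment is managed. A small, hedged way to confirm that it is available from Python (the variable name is only illustrative):

```python
import importlib.util

# True if the h5py package can be found in the current environment
h5py_installed = importlib.util.find_spec("h5py") is not None
print("h5py installed:", h5py_installed)
```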

You can save a fitted Keras model to file using the save() function on the model.

For example:
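
A minimal sketch of the call, assuming a small stand-in model. The layer sizes, toy data, and epoch count are illustrative choices, and the loss is written in its long form, mean_squared_error, as a cautious choice for reloading across Keras versions.

```python
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM, Dense

# toy data: 5 samples, 1 time step, 1 feature
X = array([0.0, 0.1, 0.2, 0.3, 0.4]).reshape((5, 1, 1))
y = array([0.1, 0.2, 0.3, 0.4, 0.5])

# small stand-in model
model = Sequential()
model.add(LSTM(10, input_shape=(1, 1)))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=10, shuffle=False, verbose=0)

# save architecture, weights, and compile settings to a single HDF5 file
model.save('lstm_model.h5')
```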

This single file will contain the model architecture and weights. It also includes the specification of the chosen loss and optimization algorithm so that you can resume training.

The model can be loaded again (from a different script in a different Python session) using the load_model() function.

Below is a complete example of fitting an LSTM model, saving it to a single file and later loading it again. Although the loading of the model is in the same script, this section may be run from another script in another Python session. Running the example saves the model to the file lstm_model.h5.
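
A sketch of such an example follows, reconstructed from the snippet a reader quotes in the comments below, so treat it as an approximation rather than the original listing. The variable names yhat_before and loaded, and the long-form loss name, are additions made here for illustration.

```python
from numpy import array
from numpy import allclose
from keras.models import Sequential
from keras.models import load_model
from keras.layers import LSTM
from keras.layers import Dense

# return training data: each input value is paired with the next value
def get_train():
    seq = [[0.0, 0.1], [0.1, 0.2], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]
    seq = array(seq)
    X, y = seq[:, 0], seq[:, 1]
    # reshape input to [samples, time steps, features]
    X = X.reshape((len(X), 1, 1))
    return X, y

# define model
model = Sequential()
model.add(LSTM(10, input_shape=(1, 1)))
model.add(Dense(1, activation='linear'))
# compile model
model.compile(loss='mean_squared_error', optimizer='adam')
# fit model
X, y = get_train()
model.fit(X, y, epochs=300, shuffle=False, verbose=0)
# predictions from the in-memory model, kept for comparison
yhat_before = model.predict(X, verbose=0)
# save model to single file
model.save('lstm_model.h5')

# later, perhaps run from another script

# load model from single file
loaded = load_model('lstm_model.h5')
# make predictions
yhat = loaded.predict(X, verbose=0)
print(yhat)
# the loaded model should reproduce the original predictions
print(allclose(yhat, yhat_before, atol=1e-6))
```

The exact predicted values vary from run to run because the weights are initialized randomly.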

For more on saving and loading your Keras model, see the post:

Step 3. Make Predictions on New Data

After you have finalized your model and saved it to file, you can load it and use it to make predictions.

For example:

  • On a sequence regression problem, this may be the prediction of the real value at the next time step.
  • On a sequence classification problem, this may be a class outcome for a given input sequence.

Or it may be any other variation based on the specifics of your sequence prediction problem. You would like an outcome from your model (yhat) given an input sequence (X) where the true outcome for the sequence (y) is currently unknown.

You may be interested in making predictions in a production environment, as the backend to an interface, or manually. It really depends on the goals of your project.

Any data preparation performed on your training data prior to fitting your final model must also be applied to any new data prior to making predictions.
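
To make this concrete, here is a hedged sketch using simple min-max scaling written with NumPy. The numbers and the helper names scale and invert_scale are made up for illustration; the point is that the coefficients come from the training data only and are reused unchanged on new data.

```python
import numpy as np

# hypothetical raw training observations
train = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# scaling coefficients are learned from the training data only
train_min, train_max = train.min(), train.max()

def scale(values):
    # map values to [0, 1] using the training coefficients
    return (values - train_min) / (train_max - train_min)

def invert_scale(values):
    # map scaled values (e.g. predictions) back to the original units
    return values * (train_max - train_min) + train_min

X_train = scale(train)

# new observations are scaled with the same coefficients, not their own
new_obs = np.array([35.0, 60.0])
X_new = scale(new_obs)
# 35 maps to 0.625; 60 maps to 1.25, above 1 because it exceeds the training maximum
print(X_new)
```

In practice the coefficients (or the fitted scaler object) would be saved alongside the model so the same transform is available when predictions are made later.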

Predicting is the easy part.

It involves taking the prepared input data (X) and calling one of the Keras prediction methods on the loaded model.

Remember that the input for making a prediction (X) comprises only the input sequence data required to make that prediction, not all prior training data. In the case of predicting the next value in one sequence, the input sequence would be 1 sample with the fixed number of time steps and features used when you defined and fit your model.
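
A short sketch of preparing one such sample, assuming a hypothetical model that was fit with 3 time steps and 1 feature (the values and variable names are illustrative):

```python
import numpy as np

# configuration assumed to match what the model was fit with
n_steps, n_features = 3, 1

# the most recent observations needed for a single prediction
recent = [0.3, 0.4, 0.5]

# reshape to [samples, time steps, features] with exactly 1 sample
X_new = np.array(recent).reshape((1, n_steps, n_features))
print(X_new.shape)  # (1, 3, 1)
```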

For example, a raw prediction in the shape and scale of the activation function of the output layer can be made by calling the predict() function on the model:
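
A hedged sketch of such a call is shown here. The small model fit on toy data stands in for a finalized, loaded model; its configuration and the new input value are assumptions for illustration.

```python
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM, Dense

# stand-in for a finalized model: fit on a toy next-value problem
X = array([0.0, 0.1, 0.2, 0.3, 0.4]).reshape((5, 1, 1))
y = array([0.1, 0.2, 0.3, 0.4, 0.5])
model = Sequential()
model.add(LSTM(10, input_shape=(1, 1)))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=100, shuffle=False, verbose=0)

# one new input sequence: 1 sample, 1 time step, 1 feature
X_new = array([0.5]).reshape((1, 1, 1))

# raw prediction in the scale of the linear output layer
yhat = model.predict(X_new, verbose=0)
print(yhat.shape)  # (1, 1)
print(yhat)
```

The result has one row per input sample and one column per output unit, so a single sample with a single output unit gives a 1x1 array.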

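The prediction of a class index can be made by calling the predict_classes() function on the model.

The prediction of probabilities can be made by calling the predict_proba() function on the model.

Both helpers are available only on Sequential models, and only in older Keras releases. Where they are missing, predict() returns the probabilities directly when the output layer uses a softmax or sigmoid activation, and the class index is the position of the largest probability.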
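
As a hedged illustration of working from predict() output alone, which helps wherever predict_classes() and predict_proba() are unavailable (for example on models built with the functional API), the sketch below converts probabilities to class indexes with NumPy. The probability arrays are made-up stand-ins for what predict() would return, not output from a real model.

```python
import numpy as np

# hypothetical predict() output for 3 samples from a 4-class softmax layer
probs = np.array([[0.10, 0.70, 0.10, 0.10],
                  [0.60, 0.20, 0.10, 0.10],
                  [0.05, 0.05, 0.10, 0.80]])

# class index = position of the largest probability in each row
class_index = np.argmax(probs, axis=-1)
print(class_index)  # [1 0 3]

# hypothetical predict() output from a single sigmoid unit (binary problem)
p = np.array([[0.2], [0.9]])

# threshold at 0.5 to get a 0/1 class label
labels = (p > 0.5).astype("int32")
print(labels.ravel())  # [0 1]
```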

For more on the life-cycle of your Keras model, see the post:

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Summary

In this post, you discovered how to finalize your model and use it to make predictions on new data.

Specifically, you learned:

  • How to train a final LSTM model.
  • How to save your final LSTM model, and later load it again.
  • How to make predictions on new data.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

52 Responses to How to Make Predictions with Long Short-Term Memory Models in Keras

  1. Klaas Brau August 30, 2017 at 2:03 am #

    Thanks Jason

    One question. Why should we finalize the model on the whole data. We would change the weights again right? The model which was trained on the training data is the one we tested on unseen data (test set). The new model (trained on all data) could be worse, overfitted,… not?

  2. ketan September 14, 2017 at 7:02 pm #

    I tried both keras and tensorflow. Tensorflow has more features.

  3. tieliu October 13, 2017 at 6:53 pm #

    thanks, Jason.

    I ran your sample code, but found the result as below, which seems not expected.

    … …
    Epoch 290/300
    6/6 [==============================] – 0s – loss: 0.0155
    Epoch 295/300
    6/6 [==============================] – 0s – loss: 0.0153
    Epoch 296/300
    6/6 [==============================] – 0s – loss: 0.0153
    Epoch 297/300
    6/6 [==============================] – 0s – loss: 0.0152
    Epoch 298/300
    6/6 [==============================] – 0s – loss: 0.0152
    Epoch 299/300
    6/6 [==============================] – 0s – loss: 0.0152
    Epoch 300/300
    6/6 [==============================] – 0s – loss: 0.0151
    [[ 0.28978038]
    [ 0.31878966]
    [ 0.3477335 ]
    [ 0.37631655]
    [ 0.4042924 ]
    [ 0.43146992]]

    Suppose the inputs X are 0, 0.1, 0.2, 0.3, 0.4, 0.5; then the predicted values of y should be close to 0.1, 0.2, … 0.6. But the results are values such as 0.28978038 … 0.43146992.

    can you check more about it?

    • Jason Brownlee October 14, 2017 at 5:43 am #

      What is the problem exactly?

      • Kingsley Udeh June 29, 2018 at 10:31 am #

        Hi Dr. Jason,

        Thanks so much for your tutorials.

        I would like to clarify what Tieliu’s question was:

        He was making a reference to the example cited on this page where you demonstrated making predictions with LSTM as follows:

        from numpy import array
        from keras.models import Sequential, load_model
        from keras.layers import LSTM, Dense

        # return training data
        def get_train():
            seq = [[0.0, 0.1], [0.1, 0.2], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]
            seq = array(seq)
            X, y = seq[:, 0], seq[:, 1]
            # reshape input to [samples, time steps, features]
            X = X.reshape((len(X), 1, 1))
            return X, y

        # define model
        model = Sequential()
        model.add(LSTM(10, input_shape=(1, 1)))
        model.add(Dense(1, activation='linear'))
        # compile model
        model.compile(loss='mse', optimizer='adam')
        # fit model
        X, y = get_train()
        model.fit(X, y, epochs=300, shuffle=False, verbose=0)
        # save model to single file
        model.save('lstm_model.h5')

        # snip…
        # later, perhaps run from another script

        # load model from single file
        model = load_model('lstm_model.h5')
        # make predictions
        yhat = model.predict(X, verbose=0)

        When you run the code, the trained model did not make good prediction of the actual response variable y. It has the following predicted yhat values:


        Rather than the actual y values:


        In other words, if we approximate the predicted values to 2 decimal places, the model predicted 0.2, 0.3, and 0.4, correctly, but failed in predicting 0.1 and 0.5. In this situation, could we say our final model should be discarded and we then decide to train another model with a different set of procedures and configurations?

        Sorry, this question is a bit longer than expected, but I wanted to clarify the initial question as there was no further conversation on this.

        • Jason Brownlee June 29, 2018 at 3:26 pm #

          Sorry, I don’t follow, why would we discard the model?

          • Kingsley Udeh June 29, 2018 at 7:25 pm #

            Because not all the predicted values are equal to the actual values as shown between yhat and y, or am I missing some important concept of your tutorial?

          • Jason Brownlee June 30, 2018 at 6:06 am #

            No model is perfect. If perfection was possible we would not need machine learning.

          • Kingsley Udeh June 30, 2018 at 11:12 am #

            Got it! Thanks so much.

            BTW: How do we show or tell that a model is good? Do we only care about the scores, for example, in a regression problem?

          • Jason Brownlee July 1, 2018 at 6:22 am #


  4. Fawad October 16, 2017 at 9:27 pm #

    Hi, I want to predict for a whole record of shape like (160, 72) for single time step. How would I shape my numpy array of features for test. For more clear understanding, I have trained my model on trainX with shape (235, 1, 72) and trainY with shape (235,). Now I want to predict a single timestep but for 160 rows. How to do that?

  5. joseph February 26, 2018 at 2:59 pm #

    Hi Jason,

    When performing model.predict, I get some inconsistent outputs. Does that mean my model is wrong? As far as I know, inconsistent outputs are possible only if I re-fit the model, not during prediction. Am I missing something? Hope to get some comments from you. Thank you very much.

  6. Maryam March 18, 2018 at 8:36 am #

    Hi Jason,
    I am so grateful for the post you shared with us.
    I want to load 3 finalized models, namely an RNN, a CNN, and an LSTM, in one script at the same time. They have already been saved as finalized models in Keras, and I want to use them in an ensemble to get an averaged prediction. Is it necessary to use a dask data frame to load multiple finalized (saved) models, or is loading multiple finalized models done with the same commands as loading one?
    Thank you in advance for taking the time to reply.

    • Jason Brownlee March 19, 2018 at 6:02 am #

      A Pandas DataFrame is not required. Each model can be saved and loaded to and from separate files and used in an ensemble.

  7. Delaram April 2, 2018 at 7:25 am #

    Hi Jason,
    I really appreciate this helpful tutorial.
    I train and save a finalized CNN model in one script, then load it in a different script using this command: load_model(‘cnn_model.h5’).
    I have a test dataset which does not have any labels, and I want to get the probability of each sample belonging to each class with model_cnn_final.predict_proba, but it gave me this error: AttributeError: ‘Model’ object has no attribute ‘predict_proba’. When I tried yhat = model.predict_classes(X), it gave me this error: ‘Model’ object has no attribute ‘predict_classes’.
    I have used the command yhat = model.predict(X) and it worked fine.

    What is the problem with these commands that causes the errors?
    How can I fix the errors?

    • Jason Brownlee April 2, 2018 at 2:47 pm #

      I believe these methods are only supported on Sequential models, you may be using the functional Model API. In that case, you may be limited to the predict() function alone, which will return probabilities in the case of a softmax activation function in the output layer.

  8. Fredrik Nilsson April 4, 2018 at 6:27 am #


    This is super good Jason!

    Thanks your writings 🙂

  9. Bastien April 19, 2018 at 2:24 am #

    Hi Jason,

    Thank you for this really helpful tutorial.

    I have a question. I think I am missing something to make predictions. I don’t understand what should be the input on model.predict(X) to predict new data. Let’s say I have one year of data (sampled every hour) and I want to predict the following week. What should my X be ?

  10. ata May 2, 2018 at 7:26 am #

    Hello, it is a really nice explanation, but I want to ask what verbose does. Why do we set it to 0?

    • Jason Brownlee May 3, 2018 at 6:27 am #

      Verbose gives output. We can turn off this output by setting it to 0.

  11. Francisco June 6, 2018 at 11:46 pm #

    Hi Jason, thank you for your tutorial. I’m very grateful for what I have learnt from you.

    I have a question. Let’s say I have my LSTM model and it is working properly with the train and test data, so the model is ready to be used in production. The data that I have has the following features: timestamp, price and capitalization. If I want to predict tomorrow’s price, must
    I provide timestamp and capitalization values? or only timestamp?

    • Jason Brownlee June 7, 2018 at 6:29 am #

      To make one prediction, it must be provided with one input sequence, as defined by your model during training.

  12. SM June 12, 2018 at 12:42 am #

    Hi Jason, all your blogposts are super insightful. Cannot wait to read more articles.

    Earlier, I had not preprocessed X and y correctly. Now I have used the “Stacked LSTM for sequence classification” referenced in keras homepage

    This is how my result looks. Unfortunately, I do not have ytest.txt to verify my results. Please have a look and let me know if you have any comments.

    My number of classes to predict is 12, however the predicted output range in classes was 1 to 10. Not sure why and if it incorrect.

    Thanks a lot again.

  13. SM June 12, 2018 at 12:48 am #

    Hi Jason, one question I have : “validation schemes are supposed to be used to estimate quality of the model. When you found the right hyper-parameters and want to get test predictions don’t forget to retrain your model using all training data.”

    Once the model is trained with validation set, should I retrain the model using all training data?

    Thank you.

  14. Hamied June 19, 2018 at 11:17 pm #

    Hi Jason,
    I have a concern related to real-time validation. If you have input test data (60 frames per ms) and you would like to do prediction in real time, how can you ensure the prediction will be done?

    For example, say we are receiving input test data of shape (100 x 162), time stamps x features. We fit all the information into one sample array (1,100,162,1) and would like to make a prediction each time we receive data. The problem is that the streaming data arrives too fast for the model to keep up:
    y_predict = model.predict_classes(test_input)
    I would like to know if you have any suggestions regarding this problem: how can we make predictions on real-time streaming data each time the input changes?

    If you run it at this speed, the prediction will still be running on the old data and can't catch the new samples.

    Thanks in advance

    • Jason Brownlee June 20, 2018 at 6:27 am #

      Making predictions is very fast.

      Only training the model is slow, which is only done once before it is used.

  15. Maryam June 23, 2018 at 6:25 am #

    Hi Jason,
    Thank you for the awesome and practical tutorial, as always.
    I have a question: if I want to use the predict(x_dataset) function, is it necessary to pad “x_dataset” or not?
    I will be grateful if you answer the question.
    Best regards

    • Jason Brownlee June 24, 2018 at 7:25 am #

      To make a prediction, the input data must be prepared in the same way as the training data, including lengths and transforms.

  16. Matteo August 9, 2018 at 10:07 pm #

    I have one question.
    I have a training set with the labels, let’s say that I play cheess and I have the historical matches with label of the winner [0 = player1, 1 = player 2]
    And I want to predict if after 10/15 moves I’ll have more probability to win or lose.
    How can I write the model that predict a number between 0 and 1 ( close to 0 means that I’ll win and close to 1 means i’ll lose )
    Thanks !

    • Jason Brownlee August 10, 2018 at 6:14 am #

      A good approach might be to use rating systems to estimate the skill of each player and feed this into a predictive model.

  17. Noe August 24, 2018 at 7:35 pm #

    Hello Jason, and many thanks for this awesome tutorial.

    My preoccupation is about using the trained and tested model to predict the future. This means values after the test set.

    Thank you.

  18. Alireza September 27, 2018 at 10:06 pm #

    Hi Jason,
    I tried to make predictions for just one row of input data, but I would like to know: should my new data be scaled or not? If yes, I tried with my scaled model but I got “0” for each feature! Which way is correct? Please help me.

    Thank you.

    • Jason Brownlee September 28, 2018 at 6:15 am #

      Your input data must be prepared in the same way as the training data.

      If the training data was scaled, then new data must be scaled using the same coefficients.

  19. bedorlan October 11, 2018 at 1:48 am #

    Finally an easy to understand gist on how to implement an LSTM. Thank you!

  20. Saurabh Swaroop October 11, 2018 at 10:59 am #

    Hello Jason,

    I tried the 6.7 code example from Long Short Term Memory Networks with Python, but it is giving an error.

    from random import randint
    from numpy import array
    from numpy import argmax
    from keras.models import Sequential
    from keras.layers import LSTM
    from keras.layers import Dense

    # generate a sequence of random numbers in [0, n_features)
    def generate_sequence(length, n_features):
    return [randint(0, n_features-1) for _ in range(length)]

    # one hot encode sequence
    def one_hot_encode(sequence, n_features):
    encoding = list()
    for value in sequence:
    vector = [0 for _ in range(n_features)]
    vector[value] = 1
    encoding.append(vector)
    return array(encoding)

    # decode a one hot encoded string
    def one_hot_decode(encoded_seq):
    return [argmax(vector) for vector in encoded_seq]

    # generate one example for an lstm
    def generate_example(length, n_features, out_index):
    # generate sequence
    sequence = generate_sequence(length, n_features)
    # one hot encode
    encoded = one_hot_encode(sequence, n_features)
    print(“Shape of encoded is:”, encoded.shape)
    # reshape sequence to be 3D
    X = encoded.reshape((1, length, n_features))
    # select output
    y = encoded[out_index].reshape(1, n_features)
    return X, y

    # define model
    length = 5
    n_features = 10
    out_index = 2
    model = Sequential()
    model.add(LSTM(25, input_shape=(length, n_features)))
    model.add(Dense(n_features, activation=’softmax’))
    model.compile(loss=’categorical_crossentropy’, optimizer=’adam’, metrics=[‘acc’])

    # fit model
    for i in range(10000):
    X, y = generate_example(length, n_features, out_index)
    model.fit(X, y, epochs=1, verbose=2)

    Shape of encoded is: (1, 10)
    ValueError Traceback (most recent call last)
    in ()
    1 # fit model
    2 for i in range(10000):
    ----> 3 X, y = generate_example(length, n_features, out_index)
    4 model.fit(X, y, epochs=1, verbose=2)

    in generate_example(length, n_features, out_index)
    7 print(“Shape of encoded is:”, encoded.shape)
    8 # reshape sequence to be 3D
    —-> 9 X = encoded.reshape((1, length, n_features))
    10 # select output
    11 y = encoded[out_index].reshape(1, n_features)

    ValueError: cannot reshape array of size 10 into shape (1,5,10)

    • Jason Brownlee October 11, 2018 at 4:13 pm #

      It suggests that the shape of your data does not match the expectation of your model.

      You can change the shape of the data or change the expectation of the model.

  21. Shooter November 1, 2018 at 2:13 pm #

    Hello Jason,
    Thanks for the great tutorial. I just wanted to know how I can calculate the computational complexity of an LSTM.

    Thanks in advance.

    • Jason Brownlee November 1, 2018 at 2:34 pm #

      Sorry, I don’t have material on calculating the computational complexity of neural networks.
