How to Reshape Input Data for Long Short-Term Memory Networks in Keras

It can be difficult to understand how to prepare your sequence data for input to an LSTM model.

Often there is confusion around how to define the input layer for the LSTM model.

There is also confusion about how to convert your sequence data that may be a 1D or 2D matrix of numbers to the required 3D format of the LSTM input layer.

In this tutorial, you will discover how to define the input layer to LSTM models and how to reshape your loaded input data for LSTM models.

After completing this tutorial, you will know:

  • How to define an LSTM input layer.
  • How to reshape one-dimensional sequence data for an LSTM model and define the input layer.
  • How to reshape multiple parallel series data for an LSTM model and define the input layer.

Let’s get started.

Tutorial Overview

This tutorial is divided into 4 parts; they are:

  1. LSTM Input Layer
  2. Example of LSTM with Single Input Sample
  3. Example of LSTM with Multiple Input Features
  4. Tips for LSTM Input

LSTM Input Layer

The LSTM input layer is specified by the “input_shape” argument on the first hidden layer of the network.

This can be confusing for beginners, because the input is described by an argument on another layer rather than by a layer of its own.

For example, consider a network with one hidden LSTM layer and one Dense output layer.

In this example, the LSTM() layer must specify the shape of the input.

The input to every LSTM layer must be three-dimensional.

The three dimensions of this input are:

  • Samples. One sequence is one sample. A batch comprises one or more samples.
  • Time Steps. One time step is one point of observation in the sample.
  • Features. One feature is one observation at a time step.

This means that the input layer expects a 3D array of data when fitting the model and when making predictions, even if specific dimensions of the array contain a single value, e.g. one sample or one feature.
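As a rough illustration of the three dimensions, the sketch below builds an empty NumPy array and inspects each axis. The sizes (4 samples, 50 time steps, 2 features) are assumptions chosen only for this example.

```python
from numpy import zeros

# Assumed sizes for illustration: 4 samples, 50 time steps, 2 features.
X = zeros((4, 50, 2))
print(X.ndim)         # 3
print(X.shape)        # (4, 50, 2)

# Axis 0 selects one sample, i.e. one whole sequence.
print(X[0].shape)     # (50, 2)

# Axis 1 selects one time step within that sample.
print(X[0, 0].shape)  # (2,)

# A single sample with a single feature still keeps all three axes.
single = zeros((1, 50, 1))
print(single.shape)   # (1, 50, 1)
```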

When defining the input layer of your LSTM network, the network assumes you have 1 or more samples and requires that you specify the number of time steps and the number of features. You can do this by specifying a tuple to the “input_shape” argument.

For example, an input_shape of (50, 2) defines an input layer that expects 1 or more samples, 50 time steps, and 2 features.
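A minimal sketch of such a model is shown here, assuming Keras is installed and importable as keras. The 32 LSTM units and the single output value are arbitrary choices for illustration, not requirements.

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# input_shape leaves out the samples dimension: (time steps, features).
# 32 units is an arbitrary choice for this sketch.
model.add(LSTM(32, input_shape=(50, 2)))
model.add(Dense(1))

# The leading None stands for the unspecified number of samples.
print(model.input_shape)  # (None, 50, 2)
```

Recent Keras releases may warn that an explicit Input layer is preferred over passing input_shape to the first layer; the expected 3D input is the same either way.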

Now that we know how to define an LSTM input layer and the expectations of 3D inputs, let’s look at some examples of how we can prepare our data for the LSTM.

Example of LSTM with Single Input Sample

Consider the case where you have one sequence of multiple time steps and one feature.

For example, this could be a sequence of 10 values.

We can define this sequence of numbers as a NumPy array.

We can then use the reshape() function on the NumPy array to reshape this one-dimensional array into a three-dimensional array with 1 sample, 10 time steps, and 1 feature at each time step.

The reshape() function, when called on an array, takes the new shape as its argument, usually given as a tuple. We cannot pass in any tuple of numbers; the new shape must contain exactly the same total number of elements as the original array, for example 1 × 10 × 1 = 10 for our sequence.
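The element-count rule can be checked on a small array. The sketch below uses a made-up 6-value sequence (not the tutorial's data) to show two valid shapes, one invalid shape, and the -1 placeholder that asks NumPy to infer a dimension.

```python
from numpy import array

# Made-up sequence with 6 elements in total.
seq = array([1, 2, 3, 4, 5, 6])

# Valid: 1 * 6 * 1 == 6 (one sample of 6 time steps).
one_sample = seq.reshape((1, 6, 1))
# Also valid: 2 * 3 * 1 == 6 (two samples of 3 time steps each).
two_samples = seq.reshape((2, 3, 1))
print(one_sample.shape, two_samples.shape)

# Invalid: 1 * 4 * 1 == 4, which does not match 6 elements.
try:
    seq.reshape((1, 4, 1))
except ValueError as err:
    print("reshape failed:", err)

# -1 lets NumPy work out the number of time steps for us.
inferred = seq.reshape((1, -1, 1))
print(inferred.shape)  # (1, 6, 1)
```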

Once reshaped, we can print the new shape of the array.

Putting all of this together, the complete example is listed below.
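One way the steps above could look in full is sketched here. The ten values are placeholders assumed for illustration; any sequence of 10 numbers would behave the same way.

```python
from numpy import array

# Placeholder sequence of 10 values (assumed for this sketch).
data = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])

# Reshape to 1 sample, 10 time steps, 1 feature.
data = data.reshape((1, 10, 1))

print(data.shape)  # (1, 10, 1)
```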

Running the example prints the new 3D shape of the single sample.

This data is now ready to be used as input (X) to the LSTM with an input_shape of (10, 1).

Example of LSTM with Multiple Input Features

Consider the case where you have multiple parallel series as input for your model.

For example, this could be two parallel series of 10 values each.

We can define these data as a matrix of 2 columns with 10 rows, one column per series and one row per time step.
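As a sketch of that layout, the two series can be stacked side by side with NumPy's column_stack(). The values match the matrix quoted in a reader comment further down; the names series_a and series_b are made up for this example.

```python
from numpy import array, column_stack

# Two parallel series of 10 values each (names are illustrative).
series_a = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
series_b = array([1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1])

# Each series becomes one column; each row is one time step.
data = column_stack((series_a, series_b))

print(data.shape)  # (10, 2)
```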

This data can be framed as 1 sample with 10 time steps and 2 features.

It can be reshaped as a 3D array by calling reshape() with the new shape (1, 10, 2).

Putting all of this together, the complete example is listed below.
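A sketch of the full example follows, using the same matrix that a reader quotes in the comments. The two slices at the end are an addition for illustration: they show how the last axis maps to features and the middle axis to time steps.

```python
from numpy import array

# 10 rows (time steps) by 2 columns (features).
data = array([
    [0.1, 1.0],
    [0.2, 0.9],
    [0.3, 0.8],
    [0.4, 0.7],
    [0.5, 0.6],
    [0.6, 0.5],
    [0.7, 0.4],
    [0.8, 0.3],
    [0.9, 0.2],
    [1.0, 0.1]])

# Reshape to 1 sample, 10 time steps, 2 features.
data = data.reshape((1, data.shape[0], data.shape[1]))
print(data.shape)  # (1, 10, 2)

# The first feature across all time steps of sample 0.
first_series = data[0, :, 0]
# Both features at the first time step of sample 0.
first_step = data[0, 0, :]
```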

Running the example prints the new 3D shape of the single sample.

This data is now ready to be used as input (X) to the LSTM with an input_shape of (10, 2).

Tips for LSTM Input

This section lists some tips to help you when preparing your input data for LSTMs.

  • The LSTM input layer must be 3D.
  • The 3 input dimensions are: samples, time steps, and features.
  • The LSTM input layer is defined by the input_shape argument on the first hidden layer.
  • The input_shape argument takes a tuple of two values that define the number of time steps and features.
  • The number of samples is assumed to be 1 or more.
  • The reshape() function on NumPy arrays can be used to reshape your 1D or 2D data to be 3D.
  • The reshape() function takes a tuple as an argument that defines the new shape.
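The reshape tips can be wrapped in a small convenience function. The helper below, to_lstm_input(), is hypothetical (it is not part of NumPy or Keras) and assumes the data holds a single sample: a 1D array is treated as one feature over time, and a 2D array as rows of time steps by columns of features.

```python
from numpy import asarray

def to_lstm_input(data):
    """Hypothetical helper: reshape one 1D or 2D series to (1, time steps, features)."""
    arr = asarray(data, dtype=float)
    if arr.ndim == 1:
        # One feature observed over arr.shape[0] time steps.
        return arr.reshape((1, arr.shape[0], 1))
    if arr.ndim == 2:
        # Rows are time steps, columns are features.
        return arr.reshape((1, arr.shape[0], arr.shape[1]))
    raise ValueError(f"expected 1D or 2D data, got {arr.ndim}D")

print(to_lstm_input([0.1, 0.2, 0.3]).shape)           # (1, 3, 1)
print(to_lstm_input([[0.1, 1.0], [0.2, 0.9]]).shape)  # (1, 2, 2)
```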

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Summary

In this tutorial, you discovered how to define the input layer for LSTMs and how to reshape your sequence data for input to LSTMs.

Specifically, you learned:

  • How to define an LSTM input layer.
  • How to reshape one-dimensional sequence data for an LSTM model and define the input layer.
  • How to reshape multiple parallel series data for an LSTM model and define the input layer.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

10 Responses to How to Reshape Input Data for Long Short-Term Memory Networks in Keras

  1. Steven August 31, 2017 at 2:14 am #

    Great explanation of the dimensions! Just wanted to say this explanation also works for LSTM models in TensorFlow.

    • Jason Brownlee August 31, 2017 at 6:20 am #

      Thanks Steven.

      • yuan September 1, 2017 at 6:42 pm #

        Hi Jason,

        Thanks a lot for your explanations.
        I have a question:
        Assuming that we have multiple parallel series as input for our model, the first step is to define these data as a matrix of M columns with N rows. To make it 3D (samples, time steps, and features), does this mean that samples is 1, time steps is the number of rows of the matrix, and features is the number of columns of the matrix? Must it be like this? Looking forward to your reply. Thank you.

        • Jason Brownlee September 2, 2017 at 6:06 am #

          Sorry, I’m not sure I follow your question.

          If you have parallel time series, then each series would need the same number of time steps and be represented as a separate feature (e.g. observation at a time).

          Does that help?

  2. Oliver August 31, 2017 at 9:23 pm #

    Hi Jason,

    thanks a lot for all the explanations you gave!
    I tried to understand the effect of the reshape parameters and the effect in the spyder/variable explorer. But I do not understand the result shown in the data window.
    I used the code from a different tutorial:

    data = array([
    [0.1, 1.0],
    [0.2, 0.9],
    [0.3, 0.8],
    [0.4, 0.7],
    [0.5, 0.6],
    [0.6, 0.5],
    [0.7, 0.4],
    [0.8, 0.3],
    [0.9, 0.2],
    [1.0, 0.1]])
    data_re = data.reshape(1, 10, 2)

    When checking the result in the variable explorer of Spyder I see 3 dimensions of the array but cannot connect them to the parameters sample, timestep, feature.

    On axis 0 of data_re I see the complete dataset
    On axis 1 of the data_re I get 0.1 and 1.0 in column 1
    On axis 2 of the data_re I see the column 1 of axis 0 transposed to row 1

    Would you give me a hint how to interpret it?

    Regards,
    Oliver.

    • Jason Brownlee September 1, 2017 at 6:46 am #

      There are no named parameters, I am referring to the dimensions by those names because that is how the LSTM model uses the data.

      Sorry for the confusion.

  3. Saga September 1, 2017 at 6:46 pm #

    Hi Jason,

    Thanks so much for the article (and the whole series in fact!). The documentation in Keras is not very clear on many things on its own.

    I have been trying to implement a model that receives multiple samples of multivariate timeseries as input. The twist is that the length of the series, i.e. the “time steps” dimension is different for different samples. I have tried to train a model on each sample individually and then merge, (but then each LSTM is going to be extremely prone to overfitting). Another idea was to scale the samples to have the same time steps but this comes with a scaling factor of time steps for each sample which is not ideal either.

    Is there a way to provide the LSTM with samples of dynamic time steps? maybe using a lower-level API?

    Regards,
    Saga

    • Jason Brownlee September 2, 2017 at 6:07 am #

      A way I use often is to pad all sequences to the same length and use a masking layer on the front end to ignore masked time steps.

  4. Shrimanti September 14, 2017 at 2:42 am #

    Hi Jason,

    Thanks very much for your tutorials on LSTM. I am trying to predict one time series from 10 different parallel time series. All of them are different 1D series. So, the shape of my X_train is (50000,10) and Y_train is (50000,1). I couldn’t figure out how to reshape my dataset and the input shape of LSTM if I want to use let’s say 100 time steps or look back as 100.

    Thanks.
