Making Predictions with Sequences

Sequence prediction is different from other types of supervised learning problems.

The sequence imposes an order on the observations that must be preserved when training models and making predictions.

Generally, prediction problems that involve sequence data are referred to as sequence prediction problems, although there is a suite of problems that differ based on the input and output sequences.

In this tutorial, you will discover the different types of sequence prediction problems.

After completing this tutorial, you will know:

  • The 4 types of sequence prediction problems.
  • Definitions for each type of sequence prediction problem by the experts.
  • Real-world examples of each type of sequence prediction problem.

Let’s get started.


Tutorial Overview

This tutorial is divided into 5 parts; they are:

  1. Sequence
  2. Sequence Prediction
  3. Sequence Classification
  4. Sequence Generation
  5. Sequence to Sequence Prediction


Sequence

Often we deal with sets in applied machine learning, such as train or test sets of samples.

Each sample in the set can be thought of as an observation from the domain.

In a set, the order of the observations is not important.

A sequence is different. The sequence imposes an explicit order on the observations.

The order is important. It must be respected in the formulation of prediction problems that use the sequence data as input or output for the model.
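To make the difference concrete, here is a minimal sketch using made-up toy values. The helper name ordered_split is hypothetical and only for illustration. It shows that two collections can be equal as sets but different as sequences, and that a train/test split of sequence data should keep the original order rather than shuffle.

```python
# Hypothetical toy data: the same observations in two different orders.
observations = [3, 1, 2]
reordered = [1, 2, 3]

# As sets, the two collections are identical because order is ignored.
same_as_set = set(observations) == set(reordered)

# As sequences, they differ because position carries information.
same_as_sequence = observations == reordered


def ordered_split(sequence, n_test):
    """Hold out the last n_test observations without shuffling."""
    split_at = len(sequence) - n_test
    return sequence[:split_at], sequence[split_at:]


train, test = ordered_split([10, 20, 30, 40, 50], n_test=2)

print(same_as_set, same_as_sequence)  # True False
print(train, test)  # [10, 20, 30] [40, 50]
```

Holding out the final observations is one simple way to respect the order. It is not the only valid framing.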

Sequence Prediction

Sequence prediction involves predicting the next value for a given input sequence.

For example:

  • Given: 1, 2, 3, 4, 5
  • Predict: 6

Figure: Example of a Sequence Prediction Problem

Sequence prediction attempts to predict elements of a sequence on the basis of the preceding elements.

Sequence Learning: From Recognition and Prediction to Sequential Decision Making, 2001.

A prediction model is trained with a set of training sequences. Once trained, the model is used to perform sequence predictions. A prediction consists in predicting the next items of a sequence. This task has numerous applications such as web page prefetching, consumer product recommendation, weather forecasting and stock market prediction.

CPT+: Decreasing the time/space complexity of the Compact Prediction Tree, 2015

Sequence prediction may also generally be referred to as “sequence learning”.

Learning of sequential data continues to be a fundamental task and a challenge in pattern recognition and machine learning. Applications involving sequential data may require prediction of new events, generation of new sequences, or decision making such as classification of sequences or sub-sequences.

On Prediction Using Variable Order Markov Models, 2004.

Technically, we could refer to all of the following problems in this post as a type of sequence prediction problem. This can make things confusing for beginners.

Some examples of sequence prediction problems include:

  • Weather Forecasting. Given a sequence of observations about the weather over time, predict the expected weather tomorrow.
  • Stock Market Prediction. Given a sequence of movements of a security over time, predict the next movement of the security.
  • Product Recommendation. Given a sequence of past purchases of a customer, predict the next purchase of a customer.
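The "Given: 1, 2, 3, 4, 5, Predict: 6" example from this section can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions. The function names (make_windows, fit_mean_step, predict_next) are hypothetical, and the "model" is a naive average-step rule, not an LSTM or any method from the papers quoted above.

```python
def make_windows(sequence, n_input):
    """Frame a sequence as (input window, next value) training pairs."""
    pairs = []
    for i in range(len(sequence) - n_input):
        pairs.append((sequence[i:i + n_input], sequence[i + n_input]))
    return pairs


def fit_mean_step(pairs):
    """Learn the average step from the last input value to the target."""
    steps = [target - window[-1] for window, target in pairs]
    return sum(steps) / len(steps)


def predict_next(window, mean_step):
    """Predict the next value as the last observation plus the learned step."""
    return window[-1] + mean_step


sequence = [1, 2, 3, 4, 5]
pairs = make_windows(sequence, n_input=2)
mean_step = fit_mean_step(pairs)
prediction = predict_next(sequence[-2:], mean_step)

print(pairs)       # [([1, 2], 3), ([2, 3], 4), ([3, 4], 5)]
print(prediction)  # 6.0
```

The important part is the framing: each training pair keeps the observations in their original order, and the target is always the value that comes next.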

Sequence Classification

Sequence classification involves predicting a class label for a given input sequence.

For example:

  • Given: 1, 2, 3, 4, 5
  • Predict: “good” or “bad”

Figure: Example of a Sequence Classification Problem

The objective of sequence classification is to build a classification model using a labeled dataset D so that the model can be used to predict the class label of an unseen sequence.

— Chapter 14, Data Classification: Algorithms and Applications, 2015

The input sequence may be composed of real values or discrete values. In the latter case, such problems may be referred to as discrete sequence classification.

Some examples of sequence classification problems include:

  • DNA Sequence Classification. Given a DNA sequence of ACGT values, predict whether the sequence comes from a coding or non-coding region.
  • Anomaly Detection. Given a sequence of observations, predict whether the sequence is anomalous or not.
  • Sentiment Analysis. Given a sequence of text such as a review or a tweet, predict whether the sentiment of the text is positive or negative.
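As a rough sketch of the "good" or "bad" example above, the snippet below labels an unseen sequence with the label of its closest training sequence (a 1-nearest-neighbor rule). The labeled dataset and the function names are invented for illustration, and the approach assumes all sequences have the same length. It is not the method described in the quoted chapter.

```python
import math


def euclidean(a, b):
    """Distance between two equal-length numeric sequences, compared position by position."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(sequence, labeled):
    """Return the label of the closest training sequence (1-nearest neighbor)."""
    _, nearest_label = min(labeled, key=lambda item: euclidean(sequence, item[0]))
    return nearest_label


# Hypothetical labeled dataset D: ordered sequences are "good", jumbled ones are "bad".
labeled = [
    ([1, 2, 3, 4, 6], "good"),
    ([2, 3, 4, 5, 6], "good"),
    ([5, 1, 4, 2, 3], "bad"),
    ([3, 3, 1, 5, 2], "bad"),
]

label = classify([1, 2, 3, 4, 5], labeled)
print(label)  # good
```

Because the distance compares values position by position, reordering the input changes the result, which is exactly what separates this from classifying an unordered set of values.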

Sequence Generation

Sequence generation involves generating a new output sequence that has the same general characteristics as other sequences in the corpus.

For example:

  • Given: [1, 3, 5], [7, 9, 11]
  • Predict: [3, 5, 7]

[recurrent neural networks] can be trained for sequence generation by processing real data sequences one step at a time and predicting what comes next. Assuming the predictions are probabilistic, novel sequences can be generated from a trained network by iteratively sampling from the network’s output distribution, then feeding in the sample as input at the next step. In other words by making the network treat its inventions as if they were real, much like a person dreaming.

Generating Sequences With Recurrent Neural Networks, 2013.
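The sample-then-feed-back loop described in the quote can be sketched without a neural network. The snippet below swaps in a much simpler first-order Markov chain as a stand-in for the recurrent network: it counts which value follows which in a corpus, then generates a new sequence one step at a time. The corpus values and the function names are assumptions made for illustration.

```python
import random
from collections import defaultdict


def fit_transitions(corpus):
    """Record which item follows which across all training sequences."""
    transitions = defaultdict(list)
    for sequence in corpus:
        for current, following in zip(sequence, sequence[1:]):
            transitions[current].append(following)
    return transitions


def generate(transitions, start, length, rng):
    """Sample a new sequence step by step, feeding each sample back in as input."""
    output = [start]
    while len(output) < length:
        candidates = transitions.get(output[-1])
        if not candidates:  # no observed successor, so stop early
            break
        # Repeated successors in the list make frequent transitions more likely.
        output.append(rng.choice(candidates))
    return output


# Hypothetical corpus of short sequences sharing the same general character.
corpus = [[1, 3, 5], [7, 9, 11], [3, 5, 7], [5, 9, 11], [3, 7, 9]]
transitions = fit_transitions(corpus)

rng = random.Random(0)  # seeded so that repeated runs give the same sample
generated = generate(transitions, start=1, length=4, rng=rng)
print(generated)
```

Every step in the generated sequence is a transition that was seen in the corpus, but the full sequence does not have to match any single training example.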

Some examples of sequence generation problems include:

  • Text Generation. Given a corpus of text, such as the works of Shakespeare, generate new sentences or paragraphs of text that read like Shakespeare.
  • Handwriting Prediction. Given a corpus of handwriting examples, generate handwriting for new phrases that has the properties of handwriting in the corpus.
  • Music Generation. Given a corpus of examples of music, generate new musical pieces that have the properties of the corpus.

Sequence generation may also refer to the generation of a sequence given a single observation as input.

An example is the automatic textual description of images.

  • Image Caption Generation. Given an image as input, generate a sequence of words that describes the image.

Figure: Example of a Sequence Generation Problem

Being able to automatically describe the content of an image using properly formed English sentences is a very challenging task, but it could have great impact, for instance by helping visually impaired people better understand the content of images on the web. […] Indeed, a description must capture not only the objects contained in an image, but it also must express how these objects relate to each other as well as their attributes and the activities they are involved in. Moreover, the above semantic knowledge has to be expressed in a natural language like English, which means that a language model is needed in addition to visual understanding.

Show and Tell: A Neural Image Caption Generator, 2015

Sequence-to-Sequence Prediction

Sequence-to-sequence prediction involves predicting an output sequence given an input sequence.

For example:

  • Given: 1, 2, 3, 4, 5
  • Predict: 6, 7, 8, 9, 10

Figure: Example of a Sequence-to-Sequence Prediction Problem

Despite their flexibility and power, [deep neural networks] can only be applied to problems whose inputs and targets can be sensibly encoded with vectors of fixed dimensionality. It is a significant limitation, since many important problems are best expressed with sequences whose lengths are not known a-priori. For example, speech recognition and machine translation are sequential problems. Likewise, question answering can also be seen as mapping a sequence of words representing the question to a sequence of words representing the answer.

Sequence to Sequence Learning with Neural Networks, 2014

It is a subtle but challenging extension of sequence prediction where, rather than predicting a single next value in the sequence, a new sequence is predicted that may or may not have the same length or be of the same type as the input sequence.

This type of problem has recently seen a lot of study in the area of automatic text translation (e.g. translating English to French) and may be referred to by the abbreviation seq2seq.

seq2seq learning, at its core, uses recurrent neural networks to map variable-length input sequences to variable-length output sequences. While relatively new, the seq2seq approach has achieved state-of-the-art results in not only its original application – machine translation.

Multi-task Sequence to Sequence Learning, 2016.

If the input and output sequences are a time series, then the problem may be referred to as multi-step time series forecasting.

Some examples of sequence-to-sequence prediction problems include:

  • Multi-Step Time Series Forecasting. Given a time series of observations, predict a sequence of observations for a range of future time steps.
  • Text Summarization. Given a document of text, predict a shorter sequence of text that describes the salient parts of the source document.
  • Program Execution. Given the textual description of a program or mathematical equation, predict the sequence of characters that describes the correct output.
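For the multi-step time series case, the "Given: 1, 2, 3, 4, 5, Predict: 6, 7, 8, 9, 10" example from this section can be sketched as follows. The first function frames a series as input-sequence and output-sequence pairs. The second produces a multi-step forecast with a naive baseline that feeds each prediction back in as input. Both function names are hypothetical, and the baseline is only a placeholder for a real model such as an encoder-decoder network.

```python
def make_seq2seq_pairs(series, n_input, n_output):
    """Frame a series as (input sequence, output sequence) training pairs."""
    pairs = []
    last_start = len(series) - n_input - n_output
    for i in range(last_start + 1):
        source = series[i:i + n_input]
        target = series[i + n_input:i + n_input + n_output]
        pairs.append((source, target))
    return pairs


def recursive_forecast(window, n_steps):
    """Naive baseline: extend the series by its average step, one value at a time.

    Assumes the window holds at least two observations.
    """
    history = list(window)
    forecast = []
    for _ in range(n_steps):
        diffs = [b - a for a, b in zip(history, history[1:])]
        next_value = history[-1] + sum(diffs) / len(diffs)
        forecast.append(next_value)
        history.append(next_value)  # feed the prediction back in as input
    return forecast


series = list(range(1, 11))
pairs = make_seq2seq_pairs(series, n_input=5, n_output=5)
forecast = recursive_forecast([1, 2, 3, 4, 5], n_steps=5)

print(pairs)     # [([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])]
print(forecast)  # [6.0, 7.0, 8.0, 9.0, 10.0]
```

Note that n_input and n_output do not have to match, which reflects the point above that the output sequence may differ in length from the input sequence.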

Further Reading

This section provides more resources on the topic if you are looking to go deeper.


Summary

In this tutorial, you discovered the different types of sequence prediction problems.

Specifically, you learned:

  • The 4 types of sequence prediction problems.
  • Definitions for each type of sequence prediction problem by the experts.
  • Real-world examples of each type of sequence prediction problem.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.


19 Responses to Making Predictions with Sequences

  1. Mike September 24, 2017 at 9:53 pm #

    So I assume it’s fair to say that every time-series is an example of sequence prediction but not vice-versa? Thanks for the interesting post.

  2. Bushra October 8, 2017 at 5:41 am #

    So, can we say that problems like 20-question game require sequence prediction to solve? and we can use recurrent neural network to implement?

    The system asks questions and after each answer, we predict an answer which helps to determine the next question. Right?

    Thanks, that was exactly what I need.

    • Jason Brownlee October 8, 2017 at 8:43 am #

      I expect Q&A is a sequence prediction problem.

      I have not worked on an example so I cannot give you advice about whether RNNs are appropriate. I would recommend a search on google scholar.

  3. Long October 14, 2017 at 9:02 pm #

    Hi Jason.

    Could LSTM do multi-step forecasts? I have two examples below:

    1. input the [1,2,3] sequence to predict the [4,5,6,7,8,9,10,…15] sequence;
    2. input the [1,2,3] sequence to predict the [10,11,12] sequence.

    If LSTM can do, could you give a lesson on this kind of problems?

    Thank you very much.

  4. Oscar Reyes November 23, 2017 at 11:29 pm #

    Hello Jason,

    Thank you for this post, it is very useful and interesting.

    I’m thinking about the following problem…, Given a single input sequence, we want to predict several sequences, that can be of different lengths. For instance, this problem can be encountered in the Alternative Splicing phenomenon, where given a single RNA sequence, we can obtain multiple proteins.

    My questions are:

    1- Has the problem “Input: One sequence -> Output: Several sequences” been studied in the literature?
    2- Can LSTMs solve this type of problem?

    Best and thanks

  5. Bill Coupe December 5, 2017 at 2:26 am #

    Jason –

    I enjoyed this post and I believe it may help me solve a predictive problem I’ve been pondering.

    The data is primarily text based, time series data involving an ‘actor’ object that I receive information on. That information, other than the date/time information is also text. I know that given information sequence ‘A’ that the next informational sequence is most often ‘B’. However there may well be several other sequences that are also highly likely.

    What I’m looking for is a learning method that can identify anomalous information reports so they can be reviewed and subsequently validated as either truly anomalous or potentially a new, yet valid, sequenced item.

    Anything you might be able to point me towards would be greatly appreciated.


    • Jason Brownlee December 5, 2017 at 5:45 am #

      I would recommend investigating the field of time series anomaly detection. Perhaps start on google scholar?

      • Bill Coupe December 6, 2017 at 12:54 am #

        Thanks Jason, I spent a considerable amount of time yesterday looking into what you suggested.

        Just to clarify, the timestamps only serve to order the reports as they arrive, they have little significance beyond that.

        Do any of your publications deal with pointing an unsupervised, or minimally supervised, method at this sort of data? As opposed to say numeric data?

        I’ve done a considerable amount of ‘crunching’ of the data (it’s billions of rows) and have built a reference table of the likely ‘next event’ given the previous event. However that solution is not as robust, nor as flexible as I’d like it to be.

        LSTM and GAN appear to show promise for what I’m trying to do yet most of the examples I’ve seen don’t seem to fit very well with the data I have to work with.

        Again, I will appreciate any insight you could share.


        • Jason Brownlee December 6, 2017 at 9:05 am #

          Sorry, I don’t have material on semi-supervised learning at this stage, I hope to cover it in the future.

          I would recommend testing a suite of methods as well as a suite of different framings of the problem to see what works best.

  6. Bill Coupe December 7, 2017 at 3:01 am #

    Thanks Jason!

  7. Mohit Rajpoot December 21, 2017 at 11:16 pm #

    Hello Jason,
    Thank you for such informative article.
    But I am not able to fit a prediction problem I’ve been working on in any category you have mentioned.

    I have data of a person who visits certain places in a sequence from a sample of places.
    let’s say he wants to visit [‘NY’, ‘LA’, ‘DC’, ‘TX’, ‘FL’] then he’ll visit it in this sequence [‘TX’, ‘LA’, ‘NY’, ‘FL’, ‘DC’].
    I have historical data of his previous visits in sequence.
    [‘TX’, ‘LA’, ‘NY’, ‘FL’, ‘DC’]
    [‘AK’, ‘FL’, ‘NY’] and so on.
    so for a random list of places i need to predict in which sequence he is gonna visit those places.

    I’ll really appreciate if you can point me toward something.

  8. Sharan December 30, 2017 at 12:11 am #

    Hi Jason,

    My interest in ML is application part of it. I am from VLSI field.
    The area of ML is very vast and I don’t know where to start with for my problem.

    Below is a brief description of my problem.

    The system i am testing basically generate events. Sequence of these are of interest to me.
    One can manually look at these event sequence and recognize them to be useful. But manual process is very cumbersome and also there could be millions of events within which one has to look for interesting events.

    The interesting event sequence are known a-priori. The spacing between these events can vary though.

    Do you have any suggestion as to what I should be trying out to begin with?
    I am not looking for solutions actually but only for guidance.
