Using CNN for financial time series prediction

Last Updated on November 20, 2021

Convolutional neural networks have their roots in image processing. An early, well-known example is LeNet, which was used to recognize the MNIST handwritten digits. However, convolutional neural networks are not limited to handling images.

In this tutorial, we are going to look at an example of using a CNN for time series prediction with an application from financial markets. Along the way, we are going to explore some techniques for training models with Keras as well.

After completing this tutorial, you will know:

  • What a typical multidimensional financial data series looks like
  • How a CNN can be applied to time series in a classification problem
  • How to use generators to feed data to train a Keras model
  • How to provide a custom metric for evaluating a Keras model

Let’s get started.

Using CNN for financial time series prediction
Photo by Aron Visuals, some rights reserved.

Tutorial overview

This tutorial is divided into 7 parts; they are:

  1. Background of the idea
  2. Preprocessing of data
  3. Data generator
  4. The model
  5. Training, validation, and test
  6. Extensions
  7. Does it work?

Background of the idea

In this tutorial, we follow the paper titled “CNNpred: CNN-based stock market prediction using a diverse set of variables” by Ehsan Hoseinzade and Saman Haratizadeh. The data file and sample code from the authors are available on GitHub:

The goal of the paper is simple: to predict the next day’s direction of the stock market (i.e., up or down compared to today). Hence it is a binary classification problem. However, it is interesting to see how this problem is formulated and solved.

We have seen examples of using CNN for sequence prediction. If we consider the Dow Jones Industrial Average (DJIA) as an example, we may build a CNN with 1D convolution for prediction. This makes sense because a 1D convolution on a time series is roughly computing its moving average or, in digital signal processing terms, applying a filter to the time series. It should provide some clues about the trend.
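To make the moving-average intuition concrete, here is a minimal sketch using NumPy. The prices are made up for illustration, and the equal-weight kernel is only one possible filter; a trained convolutional layer would learn its own weights.

```python
import numpy as np

# Toy closing prices, made up for illustration
prices = np.array([10.0, 11.0, 12.0, 11.0, 13.0, 14.0])

# A length-3 kernel with equal weights: convolving with it
# computes a 3-day moving average
kernel = np.ones(3) / 3

# mode="valid" keeps only the positions where the kernel
# fully overlaps the series
smoothed = np.convolve(prices, kernel, mode="valid")
print(smoothed)
```

Six prices and a window of three give four averaged values, one for each full window.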

However, when we look at financial time series, it is common sense that some derived signals are useful for prediction too. For example, price and volume together can provide a better clue. Other technical indicators, such as moving averages with different window sizes, are useful as well. If we align all of these together, we will have a table of data in which each time instance has multiple features, and the goal is still to predict the direction of one time series.

In the CNNpred paper, 82 such features are prepared for the DJIA time series:

Excerpt from the CNNpred paper showing the list of features used.

Unlike an LSTM, in which there is an explicit concept of time steps, a CNN model takes the data as a matrix. As shown in the table below, the features across multiple time steps are presented as a 2D array.

Preprocessing of data

In the following, we try to implement the idea of CNNpred from scratch using TensorFlow’s Keras API. While there is a reference implementation from the authors in the GitHub link above, we reimplement it differently to illustrate some Keras techniques.

First, the data are five CSV files, each for a different market index, under the Dataset directory of the GitHub repository above. We can also get a copy here:

The input data has a date column and a name column to identify the ticker symbol for the market index. We can leave the date column as time index and remove the name column. The rest are all numerical.

As we are going to predict the market direction, we first create the classification label. The market direction is defined as the closing index of tomorrow compared to today. If we have read the data into a pandas DataFrame, we can use X["Close"].pct_change() to find the percentage change, where a positive change means the market goes up. So we can shift this back by one time step to use as our label:
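A minimal sketch of this labeling step on a toy DataFrame follows. The prices and dates are made up for illustration; the column names "Close" and "Target" follow the text.

```python
import pandas as pd

# Toy closing prices, made up for illustration
X = pd.DataFrame(
    {"Close": [100.0, 101.0, 99.0, 99.5, 98.0]},
    index=pd.date_range("2021-01-04", periods=5, freq="D"),
)

# pct_change() compares each close with the previous one;
# shift(-1) moves tomorrow's change onto today's row
X["Target"] = (X["Close"].pct_change().shift(-1) > 0).astype(int)
print(X)
```

Note that the last row has no next day to compare with, so its comparison is against NaN and its label falls back to 0.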

The line of code above computes the percentage change of the closing index and aligns it with the previous day. It then converts the result into 1 or 0 according to whether the percentage change is positive.

For the five data files in the directory, we read each of them as a separate pandas DataFrame and keep them in a Python dictionary:

The result of the above code is a DataFrame for each index, in which the classification label is the column “Target” while all other columns are input features. We also normalize the data with a standard scaler.

In time series problems, it is generally reasonable not to split the data into training and test sets randomly, but to set up a cutoff point so that the data before the cutoff is the training set while the data afterwards is the test set. The scaling above is fitted on the training set but applied to the entire dataset.
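The cutoff-based split and the "fit on training, apply to everything" scaling can be sketched as below. This is only an illustration: the data, the feature names, and the cutoff date are made up, and the standardization is written with plain pandas arithmetic instead of a scaler object.

```python
import numpy as np
import pandas as pd

# Toy feature table; values, column names, and the cutoff date are made up
rng = np.random.default_rng(42)
dates = pd.date_range("2021-01-01", periods=10, freq="D")
X = pd.DataFrame(rng.normal(50, 5, size=(10, 2)),
                 index=dates, columns=["Close", "Volume"])

TRAIN_TEST_CUTOFF = pd.Timestamp("2021-01-08")
train_mask = X.index < TRAIN_TEST_CUTOFF

# Compute the scaling statistics from the training rows only
mean = X.loc[train_mask].mean()
std = X.loc[train_mask].std(ddof=0)  # population std, as a standard scaler uses

# Apply the same transform to every row, including the test period
X_scaled = (X - mean) / std
print(X_scaled.loc[train_mask].mean())
```

Fitting the statistics on the training rows alone keeps information from the test period out of the preprocessing.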

Data generator

We are not going to use all time steps at once. Instead, we use a fixed length of N time steps to predict the market direction at step N+1. In this design, the window of N time steps can start from anywhere. We could just create a large number of DataFrames with a large amount of overlap with one another. To save memory, we instead build a data generator for training and validation, as follows:

A generator is a special function in Python that does not return a value but yields in iterations, so that a sequence of data is produced from it. For a generator to be used in Keras training, it is expected to yield a batch of input data and targets. This generator is supposed to run indefinitely. Hence the generator function above is created with an infinite loop that starts with while True.

In each iteration, it randomly picks one DataFrame from the Python dictionary. Then, within the range of time steps of the training set (i.e., the beginning portion), we start from a random point and take N time steps using the pandas iloc[start:end] syntax to create an input under the variable frame. This DataFrame will be a 2D array. The target label is that of the last time step. The input data and the label are then appended to the list batch. Once we have accumulated one batch’s worth, we dispatch it from the generator.

The last four lines of the code snippet above dispatch a batch for training or validation. We collect the list of input data (each a 2D array) and the list of target labels into the variables X and y, then convert them into NumPy arrays so they can work with our Keras model. We need to add one more dimension to the NumPy array X using np.expand_dims() because of the design of the network model, as explained below.
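The generator described above might look roughly like the following sketch. It is a simplified stand-in, not the tutorial's exact code: the cutoff argument, the toy DataFrame, and its column names are assumptions for illustration, and the split between training and validation windows is left out.

```python
import random
import numpy as np
import pandas as pd

def datagen(data, seq_len, batch_size, targetcol, cutoff):
    """Yield (X, y) batches forever, sampling windows that end before the cutoff."""
    batch = []
    while True:
        # Pick one DataFrame at random from the dictionary
        key = random.choice(list(data.keys()))
        df = data[key]
        input_cols = [c for c in df.columns if c != targetcol]
        index = df.index[df.index < cutoff]
        # Pick the last time step of the window at random
        t = random.choice(index)
        n = (df.index == t).argmax()
        if n - seq_len + 1 < 0:
            continue  # not enough history for one full window
        frame = df.iloc[n - seq_len + 1 : n + 1]
        # Input is the 2D window; label is that of the last time step
        batch.append([frame[input_cols].values, df.loc[t, targetcol]])
        if len(batch) == batch_size:
            X, y = zip(*batch)
            # Add a trailing channel axis: (batch, seq_len, n_features, 1)
            X = np.expand_dims(np.array(X), 3)
            yield X, np.array(y)
            batch = []

# Toy data: 30 days, two features, random binary target (all made up)
dates = pd.date_range("2021-01-01", periods=30, freq="D")
toy = pd.DataFrame(np.random.rand(30, 3), index=dates,
                   columns=["feat_a", "feat_b", "Target"])
toy["Target"] = (toy["Target"] > 0.5).astype(int)

gen = datagen({"TOY": toy}, seq_len=5, batch_size=4,
              targetcol="Target", cutoff=pd.Timestamp("2021-01-21"))
X, y = next(gen)
print(X.shape, y.shape)  # (4, 5, 2, 1) (4,)
```

Because windows are sliced on demand, only one batch lives in memory at a time, no matter how many overlapping windows the dataset contains.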

The Model

The 2D CNN model presented in the original paper accepts an input tensor of shape $N\times m \times 1$ for N the number of time steps and m the number of features in each time step. The paper assumes $N=60$ and $m=82$.

The model comprises three convolutional layers, described as follows:

and the model is presented by the following:

The first convolutional layer has 8 units and is applied across all features in each time step. It is followed by a second convolutional layer that considers three consecutive days at once, for it is a common belief that three days can make a trend in the stock market. Its output is passed to a max pooling layer and another convolutional layer before it is flattened into a one-dimensional array and fed to a fully-connected layer with sigmoid activation for binary classification.
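A sketch of a network with this layer order in Keras is shown below. Only the 8 filters of the first layer, the kernel spanning all features, the three-day kernel, and the sigmoid output come from the description above. The filter counts of the later layers, the ReLU activations, and the pooling size are assumptions made for illustration.

```python
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input, MaxPool2D
from tensorflow.keras.models import Sequential

def cnnpred_2d(seq_len=60, n_features=82, n_filters=(8, 8, 8)):
    """Sketch of a 2D-CNNpred-style model; hyperparameters partly assumed."""
    model = Sequential([
        Input(shape=(seq_len, n_features, 1)),
        # Layer 1: each kernel spans all features of a single time step
        Conv2D(n_filters[0], kernel_size=(1, n_features), activation="relu"),
        # Layer 2: look at three consecutive days at once
        Conv2D(n_filters[1], kernel_size=(3, 1), activation="relu"),
        # Pool along the time axis only (assumed pool size)
        MaxPool2D(pool_size=(2, 1)),
        # Layer 3: another convolution along the time axis
        Conv2D(n_filters[2], kernel_size=(3, 1), activation="relu"),
        Flatten(),
        # Single sigmoid unit for the up/down classification
        Dense(1, activation="sigmoid"),
    ])
    return model

model = cnnpred_2d()
model.summary()
```

After the first layer the feature axis collapses to width 1, so the remaining layers operate purely along the time axis.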

Training, validation, and test

That’s it for the model. The paper used MAE as the loss metric and also monitored accuracy and F1 score to determine the quality of the model. We should point out that the F1 score depends on the precision and recall ratios, which both consider only the positive classification. The paper, however, considers the average of the F1 from the positive and negative classifications. Explicitly, it is the F1-macro metric:
$$
F_1 = \frac{1}{2}\left(
\frac{2\cdot \frac{TP}{TP+FP} \cdot \frac{TP}{TP+FN}}{\frac{TP}{TP+FP} + \frac{TP}{TP+FN}}
+
\frac{2\cdot \frac{TN}{TN+FN} \cdot \frac{TN}{TN+FP}}{\frac{TN}{TN+FN} + \frac{TN}{TN+FP}}
\right)
$$
The fraction $\frac{TP}{TP+FP}$ is the precision, with TP and FP the numbers of true positives and false positives. Similarly, $\frac{TP}{TP+FN}$ is the recall. The first term in the big parentheses above is the normal F1 metric, which considers the positive classifications. The second term is the reverse, which considers the negative classifications.
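As a quick worked example of the formula, the sketch below evaluates it from the four confusion-matrix counts. The counts passed in at the end are made up for illustration.

```python
def f1_macro(tp, fp, tn, fn):
    """Average of the F1 score for the positive and the negative class."""
    prec_pos, rec_pos = tp / (tp + fp), tp / (tp + fn)
    prec_neg, rec_neg = tn / (tn + fn), tn / (tn + fp)
    f1_pos = 2 * prec_pos * rec_pos / (prec_pos + rec_pos)
    f1_neg = 2 * prec_neg * rec_neg / (prec_neg + rec_neg)
    return (f1_pos + f1_neg) / 2

# Hypothetical counts: 40 true positives, 10 false positives,
# 30 true negatives, 20 false negatives
score = f1_macro(tp=40, fp=10, tn=30, fn=20)
print(round(score, 4))  # 0.697
```

Here the positive-class F1 is 80/110 and the negative-class F1 is 60/90, and the macro score is their plain average.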

While this metric is available in scikit-learn as sklearn.metrics.f1_score(), there is no equivalent in Keras. Hence we create our own by borrowing code from this stackexchange question:
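One way such a metric function could be written with TensorFlow operations is sketched below. This is not the code from the referenced question; it is an independent illustration that assumes binary labels and sigmoid outputs, and the small eps constant is an assumption added to avoid division by zero.

```python
import tensorflow as tf

def f1macro(y_true, y_pred):
    """F1-macro for binary labels, usable as a Keras metric function."""
    y_true = tf.cast(y_true, tf.float32)
    # Round sigmoid outputs to hard 0/1 predictions
    y_pred = tf.round(tf.cast(y_pred, tf.float32))
    eps = 1e-7  # guards against division by zero

    def f1_for(label_true, label_pred):
        # F1 of one class, treating that class as "positive"
        tp = tf.reduce_sum(label_true * label_pred)
        precision = tp / (tf.reduce_sum(label_pred) + eps)
        recall = tp / (tf.reduce_sum(label_true) + eps)
        return 2 * precision * recall / (precision + recall + eps)

    f1_pos = f1_for(y_true, y_pred)
    # Flip the labels to score the negative class
    f1_neg = f1_for(1 - y_true, 1 - y_pred)
    return (f1_pos + f1_neg) / 2

y_true = tf.constant([1.0, 1.0, 0.0, 0.0])
y_pred = tf.constant([0.9, 0.2, 0.1, 0.8])
print(float(f1macro(y_true, y_pred)))
```

When a function like this is passed in the metrics list, the value reported during training is computed batch by batch, so it only approximates the F1 over the whole dataset.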

The training process can take hours to complete. Hence we want to save the model in the middle of the training so that we may interrupt and resume it. We can make use of checkpoint features in Keras:

We set up a filename template checkpoint_path and ask Keras to fill in the epoch number as well as the validation F1 score into the filename. We save it by monitoring the validation’s F1 metric, and this metric is supposed to increase when the model gets better. Hence we pass mode="max" to it.

It should now be trivial to train our model, as follows:

There are two points to note in the above snippets. We supplied "acc" for the accuracy, as well as the function f1macro defined above, as the metrics parameter to the compile() function. Hence these two metrics will be monitored during training. Because the function is named f1macro, we refer to this metric in the checkpoint’s monitor parameter as val_f1macro.

Separately, in the fit() function, we provided the input data through the datagen() generator as defined above. Calling this function produces a generator, from which batches are fetched one after another during the training loop. Similarly, validation data are also provided by the generator.

Because the nature of a generator is to dispatch data indefinitely, we need to tell the training process how to define an epoch. Recall that in Keras terms, a batch is one iteration of gradient descent update. An epoch is supposed to be one cycle through all data in the dataset. The end of an epoch is the time to run validation. It is also the opportunity to run the checkpoint we defined above. As Keras has no way to infer the size of the dataset from a generator, we need to tell it how many batches it should process in one epoch using the steps_per_epoch parameter. Similarly, the validation_steps parameter tells how many batches are used in each validation step. The validation does not affect the training, but it reports the metrics we are interested in. Below is a screenshot of what we will see in the middle of training, in which the metrics for the training set are updated on each batch but those for the validation set are provided only at the end of an epoch:
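To make the roles of steps_per_epoch and validation_steps concrete, here is a small self-contained sketch. The generator toy_gen(), the tiny model, and all sizes are hypothetical stand-ins chosen so that the example runs in a few seconds; they are not the tutorial's generator or network.

```python
import numpy as np
from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras.models import Sequential

def toy_gen(batch_size=8, seq_len=5, n_features=3):
    """Stand-in generator yielding random (X, y) batches forever."""
    while True:
        X = np.random.rand(batch_size, seq_len, n_features, 1).astype("float32")
        y = np.random.randint(0, 2, size=(batch_size, 1)).astype("float32")
        yield X, y

# Tiny placeholder model, only to have something to fit
model = Sequential([
    Input(shape=(5, 3, 1)),
    Flatten(),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="mae", metrics=["acc"])

history = model.fit(
    toy_gen(),
    validation_data=toy_gen(),
    epochs=2,
    steps_per_epoch=10,   # batches that make up one "epoch"
    validation_steps=3,   # batches drawn for each validation pass
    verbose=0,
)
print(history.history.keys())
```

With these settings each epoch consumes 10 training batches and then 3 validation batches, so the history holds one training value and one validation value per epoch.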

After the model has finished training, we can test it with unseen data, i.e., the test set. Instead of generating the test set randomly, we create it from the dataset in a deterministic way:

The structure of the function testgen() resembles that of datagen() we defined above, except that in datagen() the output data’s first dimension is the number of samples in a batch, whereas in testgen() it is the entire set of test samples.

Using the model for prediction will produce a floating point number between 0 and 1, as we are using the sigmoid activation function. We convert this into 0 or 1 using a threshold of 0.5. Then we use the functions from scikit-learn to compute the accuracy, mean absolute error, and F1 score (here the accuracy is just one minus the MAE, since both predictions and labels are either 0 or 1).
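The thresholding step and the relation between accuracy and MAE can be checked with a few lines of NumPy. The model outputs and labels below are made up, and NumPy is used in place of the scikit-learn functions only to keep the sketch self-contained.

```python
import numpy as np

# Hypothetical sigmoid outputs and true labels, made up for illustration
test_out = np.array([0.62, 0.48, 0.51, 0.07, 0.93, 0.35])
test_target = np.array([1, 1, 0, 0, 1, 0])

# Threshold at 0.5 to get hard 0/1 predictions
test_pred = (test_out > 0.5).astype(int)

accuracy = np.mean(test_pred == test_target)
mae = np.mean(np.abs(test_pred - test_target))
print(accuracy, mae, accuracy + mae)
```

Each wrong prediction contributes an absolute error of exactly 1 and each correct one contributes 0, which is why the two numbers always sum to one.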

Tying all these together, the complete code is as follows:


Extensions

The original paper called the above model “2D-CNNpred”, and there is another version called “3D-CNNpred”. The idea is to consider not only the many features of one stock market index but also to cross-compare with many market indices to help the prediction on one index. Referring to the table of features and time steps above, the data for one market index is presented as a 2D array. If we stack up multiple such arrays from different indices, we construct a 3D array. While the target label is the same, allowing the model to look at different markets may provide some additional information to help prediction.
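The stacking itself can be illustrated with NumPy as below. The market names are placeholders and the values are random; only the idea of putting one 2D window per market along a new first axis is taken from the text.

```python
import numpy as np

seq_len, n_features = 60, 82

# One 2D window (time steps x features) per market;
# names and values are placeholders
markets = ["index_0", "index_1", "index_2", "index_3", "index_4"]
windows = {name: np.random.rand(seq_len, n_features) for name in markets}

# Stack along a new leading axis: (n_markets, seq_len, n_features)
X = np.stack([windows[name] for name in markets], axis=0)
print(X.shape)  # (5, 60, 82)
```

Each slice X[i] is still the familiar 2D panel for one market, so the 2D case is simply the special case with a single market.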

Because the shape of the data changed, the convolutional network is also defined slightly differently, and the data generators need some modification accordingly. Below is the complete code of the 3D version, in which the changes from the previous 2D version should be self-explanatory:

While the model above is for next-step prediction, nothing stops you from making predictions k steps ahead if you replace the target label with a different calculation. This may be an exercise for you.
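As a starting point for that exercise, one possible k-step label is sketched below: it marks whether the close k days ahead is higher than today's close. The value k = 3 and the prices are made up, and other definitions of a k-step target are equally valid.

```python
import pandas as pd

k = 3  # look-ahead horizon, chosen arbitrarily for illustration

# Toy closing prices, made up for illustration
X = pd.DataFrame({"Close": [100.0, 101.0, 99.0, 99.5, 98.0, 102.0]})

# pct_change(periods=k) compares each close with the one k rows earlier;
# shift(-k) aligns that change with the earlier row
X["Target"] = (X["Close"].pct_change(periods=k).shift(-k) > 0).astype(int)
print(X)
```

The last k rows have no price k days ahead, so their labels default to 0 and should be excluded when building training windows.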

Does it work?

As in all prediction projects in the financial market, it is unrealistic to expect a high accuracy. The training parameters in the code above can produce slightly more than 50% accuracy on the test set. While the number of epochs and the batch size are deliberately set smaller to save time, there should not be much room for improvement.

In the original paper, it is reported that 3D-CNNpred performed better than 2D-CNNpred but attained an F1 score of less than 0.6. This is already better than the three baseline models mentioned in the paper. It may be of some use, but it is not magic that can help you make money quickly.

From the perspective of machine learning technique, here we classify a panel of data according to whether the market direction is up or down the next day. Hence, while the data is not an image, it resembles one since both are presented in the form of a 2D array. The technique of convolutional layers can therefore be applied, but we may use a different filter size to match the intuition we usually have for financial time series.

Further readings

The original paper is available at:

If you are new to finance applications and want to build the connection between machine learning techniques and finance, you may find this book useful:

On a similar topic, we have a previous post on using CNN for time series, but with 1D convolutional layers:

You may also find the following documentation helpful to explain some syntax we used above:


In this tutorial, you discovered how a CNN model can be built for prediction in financial time series.

Specifically, you learned:

  • How to create 2D convolutional layers to process the time series
  • How to present the time series data in a multidimensional array so that the convolutional layers can be applied
  • What is a data generator for Keras model training and how to use it
  • How to monitor the performance of model training with a custom metric
  • What to expect when predicting the financial market

32 Responses to Using CNN for financial time series prediction

  1. Paul Kornreich November 19, 2021 at 2:40 am

    This has promise,, especially using multiple parameters, but, in general, CNNs are losing out in accuracy compared with Transformers.

    • Adrian Tam November 19, 2021 at 10:31 am

      Correct. But this shows how simple it is to get something not too far away.

  2. tanunchai November 19, 2021 at 5:56 am

    There is error at a line in def datagen() on the following:

    index = data.index[data.index < TRAIN_TEST_CUTOFF]

    compliler said

    “File “C:\Users\TANUNC~1.J\AppData\Local\Temp/ipykernel_10252/”, line 66
    index = data.index[data.index < TRAIN_TEST_CUTOFF]
    SyntaxError: invalid syntax

    How to solve it ?

    waiting your answer, Thanks in advance

    • Adrian Tam November 19, 2021 at 10:39 am

      The line seems OK but maybe the line before it in your code caused some problem. Can you check if you copied the code correctly?

  3. William Smith November 19, 2021 at 9:44 am

    The code above seems to say

    index = data.index[data.index < TRAIN_TEST_CUTOFF]

    Try changing to

    index = data.index[data.index < TRAIN_TEST_CUTOFF]

  4. William Smith November 19, 2021 at 9:45 am

    Argh. The code actually says [data.index & l t ; TRAIN_TEST_CUTOFF]

    Change to

    index = data.index[data.index < TRAIN_TEST_CUTOFF]

    • Adrian Tam November 19, 2021 at 10:57 am

      Thanks William. There were some hassles with the HTML. I fixed that and the copy should work now.

    • tanunchai November 19, 2021 at 10:09 pm

      Thanks you William Smith

  5. tanunchai November 19, 2021 at 10:17 pm

    How to solve this bug ?

    Now facing new problem, in def datagen() at nest loop while true
    has the problem on the following:

    if n-seq_len+1 &lt; 0:

    while True:
    # Pick one position, then clip a sequence length
    while True:
    t = random.choice(index)
    n = (data.index == t).argmax()

    if n-seq_len+1 &lt; 0: # ****error said invalid syntax
    continue # this sample is not enough for one sequence length
    frame = data.iloc[n-seq_len+1:n+1][input_cols]
    # convert frame with two l

  6. tanunchai November 19, 2021 at 10:20 pm

    Now facing new problem, in def datagen() at nest loop while true
    has the problem on the following:

    if n-seq_len+1 < 0:

    while True:
    # Pick one position, then clip a sequence length
    while True:
    t = random.choice(index)
    n = (data.index == t).argmax()
    #index = data.index[data.index < TRAIN_TEST_CUTOFF] , the line below has the problem
    if n-seq_len+1 < 0:
    continue # this sample is not enough for one sequence length
    frame = data.iloc[n-seq_len+1:n+1][input_cols]
    # convert frame with two level of indices into 3D array
    shape = (len(tickers), len(frame), n_features)
    X = np.full(shape, np.nan)
    for i,ticker in enumerate(tickers):

  7. tanunchai November 19, 2021 at 10:26 pm

    at def datagen() still has problem on the following;

    if n-seq_len+1 < 0: # Error line , say “invalid syntax”

    &lt doing what ? I do not understand and it made error also.


    while True:
    # Pick one position, then clip a sequence length
    while True:
    t = random.choice(index)
    n = (data.index == t).argmax()
    #index = data.index[data.index < TRAIN_TEST_CUTOFF] , the line below has the problem
    if n-seq_len+1 < 0: # Error line , say "invalid syntax"
    continue # this sample is not enough for one sequence length
    frame = data.iloc[n-seq_len+1:n+1][input_cols]
    # convert frame with two level of indices into 3D array
    shape = (len(tickers), len(frame), n_features)

  8. tanunchai November 19, 2021 at 10:34 pm

    ModuleNotFoundError: No module named ‘f1metrics’

    How to solve this bug ?

    ModuleNotFoundError Traceback (most recent call last)
    C:\Users\TANUNC~1.J\AppData\Local\Temp/ipykernel_2580/ in
    12 from sklearn.metrics import accuracy_score, f1_score, mean_absolute_error
    —> 14 from f1metrics import f1macro
    16 DATADIR = “./Dataset”

    ModuleNotFoundError: No module named ‘f1metrics’


    • Adrian Tam November 20, 2021 at 2:33 am

      Sorry for all these hassles. Something wrong with the plugin for rendering the code here caused all the mess. I fixed it for now, so please copy over the code and try again.

  9. Jack November 20, 2021 at 5:39 am

    Interesting! What is the difference between using:

    – conv1d with kernel_size=n_features and input size N x m
    – conv2d with kernel_size=(1, n_features) and input size N x m x 1

    Are these two equivalent?

    • Adrian Tam November 20, 2021 at 1:36 pm

      I would say yes but the best way to test this (in Keras) is to build such layers into a model and run “model.summary()” to observe the output. Are the output shape the same?

  10. Doni November 21, 2021 at 2:52 pm

    how to deploy this code on the web app. ?

    • Adrian Tam November 23, 2021 at 1:06 pm

      That’s a vague question – you need to think about what the web app expects and how to wrap the model into a function to talk to the web app

  11. kaplan November 29, 2021 at 7:36 am

    Is FFDNet (fast and flexible denoising convolutional neural networks) suitable for financial time series?

    • Adrian Tam November 29, 2021 at 8:56 am

      Haven’t tried that.

  12. Priya December 1, 2021 at 9:14 pm

    My question is from the blog
    Sorry for asking this question here.

    My work is to develop a model for multioutput prediction (i.e., predicting five outputs by a single model). When I applied Kbest and recursive feature elimination methods to select the best features, I got an error ‘ bad input shape (x, 5)’ (5 is output vectors here). However, PCA works well as it doesn’t depend upon the output vector.
    Does it mean that we can apply these feature selection algorithms (Kbest and RFE) only for a single output prediction problem?

    • Adrian Tam December 2, 2021 at 2:55 am

      I answered it on the other post.

  13. Pete March 28, 2022 at 4:32 am

    I think there is too much information in the csv file. Instead of looking too much data and try to make a fundamental analysis it can also be tried to find a ‘pattern’. With pattern I mean looking only to the open price, close price, max price and min price information of each candlestick.

    For example:

    Let’s consider the candlestick chart of D1 (1 candlestick = 1 day info). Objective: predict the fifth candlestick (up or down) with the info of the 4 formed before. The less quantity of candlestick the easier a pattern can be found

    I will try modifying the input data with this model and let you know the results

    • James Carmichael March 28, 2022 at 7:05 am

      Thanks for the feedback Pete! Please share your findings.

  14. Bojie May 10, 2022 at 8:28 am

    Many thanks for your tutorial, which is very helpful.

    Can you please let me know how to write a data generator to read all different classes of .txt files (likes the function of “datagen.flow_from_directory” which read all different classes of image files)?

  15. Yu June 15, 2022 at 4:10 am

    Thanks for this useful article!

    I was trying to run your codes a couple of times but there are certain instability issues happened. Sometimes the prediction rate is about 51%, which is good result, but there are also times the prediction rate will go down to 48%. My guess is that this is related to the initialization step. Do you have any insights on this issue?

  16. rk November 26, 2022 at 5:55 am

    Hi, Could you help me i am having an issue with the code

    InvalidArgumentError: Graph execution error

    It is associated with [[, seq_len, batch_size,”Target”,”train”),
    validation_data=datagen(data, seq_len, batch_size, “Target”, “valid”),
    epochs=n_epochs, steps_per_epoch=400, validation_steps=10, verbose=1,
    callbacks=callbacks) ]]

    I was able to trace it back to this line within the code in datagen (2DConv):

    input_cols = [c for c in df.columns if c != targetcol]

    Im using google colab and it says that when i try to single it out and run it

    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

    After trying: input_cols = [c for c in df.columns if c != targetcol.all()]

    The issue persists, i would appreciate your help on it

    • James Carmichael November 26, 2022 at 8:22 am

      Hi rk…did you copy and paste the code or type it in? Also, you may want to try to execute your code in Google Colab.

  17. de santos March 25, 2023 at 8:03 pm

    Hello, Can you tell me what the following lines of code do? They are:

    name = X[“Name”][0]
    del X[“Name”]
    cols = X.columns
    X[“Target”] = (X[“Close”].pct_change().shift(-1) > 0).astype(int)

    Since I get a bug here on google colab that return the error :

    KeyError : “name”

    This only happen when I use my own .csv data files. When I ran the codes using your dataset .csv files, I did not encounter any problem though. My csv files only have 7 columns : Date,Close,Open, HIgh, Low, Vol, Change %.
    From what I understood, you are trying to drop the other columns but the Date and Close columns, right?
    Thanks in advance!

    • James Carmichael March 26, 2023 at 10:33 am

      Hi de santos…You are correct. Can you provide any more detail about the error? Is that the entire error message?

      • de santos June 27, 2023 at 6:57 pm

        Thanks for replying, James! I thought my response was not approved,hence the late reply 4 months later, I’m so sorry.There was another error about Integer data type since originally the Vol column had a word “k” as substitute for “thousand”, that can be easily fixed by Find/Replace function of Excel to uniform the data into integer numbers. Other than that, the “Name” error still persists, here is the full error message:

        KeyError Traceback (most recent call last)

        /usr/local/lib/python3.10/dist-packages/pandas/core/indexes/ in get_loc(self, key, method, tolerance)
        3801 try:
        -> 3802 return self._engine.get_loc(casted_key)
        3803 except KeyError as err:

        4 frames

        pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

        pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

        KeyError: ‘Name’

        The above exception was the direct cause of the following exception:

        KeyError Traceback (most recent call last)

        /usr/local/lib/python3.10/dist-packages/pandas/core/indexes/ in get_loc(self, key, method, tolerance)
        3802 return self._engine.get_loc(casted_key)
        3803 except KeyError as err:
        -> 3804 raise KeyError(key) from err
        3805 except TypeError:
        3806 # If we have a listlike key, _check_indexing_error will raise

        KeyError: ‘Name’
