How to Get Started with Deep Learning for Time Series Forecasting (7-Day Mini-Course)

By Jason Brownlee on August 5, 2019 in Deep Learning for Time Series

Deep Learning for Time Series Forecasting Crash Course.

Bring Deep Learning methods to Your Time Series project in 7 Days.

Time series forecasting is challenging, especially when working with long sequences, noisy data, multi-step forecasts and multiple input and output variables.

Deep learning methods offer a lot of promise for time series forecasting, such as the automatic learning of temporal dependence and the automatic handling of temporal structures like trends and seasonality.

In this crash course, you will discover how you can get started and confidently develop deep learning models for time series forecasting problems using Python in 7 days.

This is a big and important post. You might want to bookmark it.

Kick-start your project with my new book Deep Learning for Time Series Forecasting, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.

How to Get Started with Deep Learning for Time Series Forecasting (7-Day Mini-Course)
Photo by Brian Richardson, some rights reserved.

Who Is This Crash-Course For?

Before we get started, let’s make sure you are in the right place.

The list below provides some general guidelines as to who this course was designed for.

You need to know:

You need to know the basics of time series forecasting.
You need to know your way around basic Python, NumPy and Keras for deep learning.

You do NOT need to know:

You do not need to be a math wiz!
You do not need to be a deep learning expert!
You do not need to be a time series expert!

This crash course will take you from a developer who knows a little machine learning to a developer who can bring deep learning methods to your own time series forecasting project.

Note: This crash course assumes you have a working Python 2 or 3 SciPy environment with at least NumPy and Keras 2 installed. If you need help with your environment, you can follow the step-by-step tutorial here:

How to Setup a Python Environment for Machine Learning and Deep Learning with Anaconda

Crash-Course Overview

This crash course is broken down into 7 lessons.

You could complete one lesson per day (recommended) or complete all of the lessons in one day (hardcore). It really depends on the time you have available and your level of enthusiasm.

Below are 7 lessons that will get you started and productive with deep learning for time series forecasting in Python:

Lesson 01: Promise of Deep Learning
Lesson 02: How to Transform Data for Time Series
Lesson 03: MLP for Time Series Forecasting
Lesson 04: CNN for Time Series Forecasting
Lesson 05: LSTM for Time Series Forecasting
Lesson 06: CNN-LSTM for Time Series Forecasting
Lesson 07: Encoder-Decoder LSTM Multi-step Forecasting

Each lesson could take you 60 seconds or up to 30 minutes. Take your time and complete the lessons at your own pace. Ask questions and even post results in the comments below.

The lessons expect you to go off and find out how to do things. I will give you hints, but part of the point of each lesson is to force you to learn where to look for help on deep learning, time series forecasting and the best-of-breed tools in Python (hint: I have all of the answers directly on this blog; use the search box).

I do provide more help in the form of links to related posts because I want you to build up some confidence and inertia.

Post your results in the comments, I’ll cheer you on!

Hang in there, don’t give up.

Note: This is just a crash course. For a lot more detail and 25 fleshed out tutorials, see my book on the topic titled “Deep Learning for Time Series Forecasting“.

Lesson 01: Promise of Deep Learning

In this lesson, you will discover the promise of deep learning methods for time series forecasting.

Generally, neural networks like Multilayer Perceptrons or MLPs provide capabilities that are offered by few algorithms, such as:

Robust to Noise. Neural networks are robust to noise in input data and in the mapping function and can even support learning and prediction in the presence of missing values.
Nonlinear. Neural networks do not make strong assumptions about the mapping function and readily learn linear and nonlinear relationships.
Multivariate Inputs. An arbitrary number of input features can be specified, providing direct support for multivariate forecasting.
Multi-step Forecasts. An arbitrary number of output values can be specified, providing direct support for multi-step and even multivariate forecasting.

For these capabilities alone, feedforward neural networks may be useful for time series forecasting.

Your Task

For this lesson you must suggest one capability from both Convolutional Neural Networks and Recurrent Neural Networks that may be beneficial in modeling time series forecasting problems.

Post your answer in the comments below. I would love to see what you discover.

More Information

The Promise of Recurrent Neural Networks for Time Series Forecasting

In the next lesson, you will discover how to transform time series data for time series forecasting.

Lesson 02: How to Transform Data for Time Series

In this lesson, you will discover how to transform your time series data into a supervised learning format.

The majority of practical machine learning uses supervised learning.

Supervised learning is where you have input variables (X) and an output variable (y) and you use an algorithm to learn the mapping function from the input to the output. The goal is to approximate the real underlying mapping so well that when you have new input data, you can predict the output variables for that data.

Time series data can be phrased as supervised learning.

Given a sequence of numbers for a time series dataset, we can restructure the data to look like a supervised learning problem. We can do this by using previous time steps as input variables and the next time step as the output variable.

For example, the series:

1, 2, 3, 4, 5, ...

Can be transformed into samples with input and output components that can be used as part of a training set to train a supervised learning model like a deep learning neural network.

X,          y
[1, 2, 3]   4
[2, 3, 4]   5
...

This is called a sliding window transformation as it is just like sliding a window across prior observations that are used as inputs to the model in order to predict the next value in the series. In this case the window width is 3 time steps.
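The sliding window described above can be sketched in a few lines of Python. This is a minimal illustration, not the only way to do it: the function name split_sequence and its n_steps parameter are names chosen here for the example, and the short series is the toy sequence from this lesson rather than real data.

```python
# sketch: sliding window transform of a univariate series
from numpy import array

def split_sequence(sequence, n_steps):
    """Split a 1D sequence into (X, y) samples using a window of n_steps inputs."""
    X, y = [], []
    for i in range(len(sequence) - n_steps):
        # the window of prior observations is the input
        X.append(sequence[i:i + n_steps])
        # the observation that follows the window is the output
        y.append(sequence[i + n_steps])
    return array(X), array(y)

# toy series from the lesson, window width of 3 time steps
series = [1, 2, 3, 4, 5]
X, y = split_sequence(series, n_steps=3)
for inputs, output in zip(X, y):
    print(inputs, output)
```

The same function could be applied to a column of values loaded from a CSV file; a wider window simply means more input columns per sample and fewer samples overall.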

Your Task

For this lesson you must develop Python code to transform the daily female births dataset into a supervised learning format with some number of inputs and one output.

You can download the dataset from here: daily-total-female-births.csv

Post your answer in the comments below. I would love to see what you discover.

More Information

In the next lesson, you will discover how to develop a Multilayer Perceptron deep learning model for forecasting a univariate time series.

Lesson 03: MLP for Time Series Forecasting

In this lesson, you will discover how to develop a Multilayer Perceptron model or MLP for univariate time series forecasting.

We can define a simple univariate problem as a sequence of integers, fit the model on this sequence and have the model predict the next value in the sequence. We will frame the problem to have 3 inputs and 1 output, for example: [10, 20, 30] as input and [40] as output.

First, we can define the model. We will define the number of input time steps as 3 via the input_dim argument on the first hidden layer. The model is fit using the efficient Adam version of stochastic gradient descent and optimizes the mean squared error ('mse') loss function.

Once the model is defined, it can be fit on the training data and the fit model can be used to make a prediction.

The complete example is listed below.

# univariate mlp example
from numpy import array
from keras.models import Sequential
from keras.layers import Dense
# define dataset
X = array([[10, 20, 30], [20, 30, 40], [30, 40, 50], [40, 50, 60]])
y = array([40, 50, 60, 70])
# define model
model = Sequential()
model.add(Dense(100, activation='relu', input_dim=3))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([50, 60, 70])
x_input = x_input.reshape((1, 3))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example will fit the model on the data then predict the next out-of-sample value.

Given [50, 60, 70] as input, the model should predict a value close to 80, the next value in the sequence. The exact output varies slightly between runs because training is stochastic.

Your Task

For this lesson you must download the daily female births dataset, split it into train and test sets and develop a model that can make reasonably accurate predictions on the test set.

You can download the dataset from here: daily-total-female-births.csv
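As a starting point for the split, one option is to hold out the final observations so that the test set follows the training set in time, then score a naive persistence forecast to get a number that any model should beat. The sketch below is only an illustration under stated assumptions: the helper names train_test_split_ordered and rmse are made up for this example, the persistence baseline is a suggestion rather than part of the lesson, and the ten values are made-up placeholders standing in for the loaded births series.

```python
# sketch: ordered train/test split with a persistence baseline
from math import sqrt
from numpy import array, mean

def train_test_split_ordered(data, n_test):
    """Hold out the last n_test observations, preserving time order."""
    return data[:-n_test], data[-n_test:]

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(mean((array(actual) - array(predicted)) ** 2))

# made-up placeholder values, NOT the real dataset
series = array([30, 32, 31, 35, 33, 36, 34, 38, 37, 40])
train, test = train_test_split_ordered(series, n_test=3)

# persistence forecast: predict each test value as the previous observation
history = list(train)
predictions = []
for observation in test:
    predictions.append(history[-1])
    history.append(observation)

print('train: %d, test: %d' % (len(train), len(test)))
print('persistence RMSE: %.3f' % rmse(test, predictions))
```

A model that cannot achieve a lower error than this baseline on the test set is probably not yet making reasonably accurate predictions.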

Post your answer in the comments below. I would love to see what you discover.

More Information

In the next lesson, you will discover how to develop a Convolutional Neural Network model for forecasting a univariate time series.

Lesson 04: CNN for Time Series Forecasting

In this lesson, you will discover how to develop a Convolutional Neural Network model or CNN for univariate time series forecasting.

An important difference from the MLP model is that the CNN model expects three-dimensional input with the shape [samples, timesteps, features]. We will define the data in the form [samples, timesteps] and reshape it accordingly.
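To make the shape change concrete, the short sketch below reshapes the toy dataset from this course and prints the shapes before and after. The variable name X3 is just a label used here for the reshaped copy.

```python
# sketch: reshape [samples, timesteps] into [samples, timesteps, features]
from numpy import array

X = array([[10, 20, 30], [20, 30, 40], [30, 40, 50], [40, 50, 60]])
print(X.shape)   # (4, 3): 4 samples, 3 time steps

# add a trailing axis for the single feature observed at each time step
X3 = X.reshape((X.shape[0], X.shape[1], 1))
print(X3.shape)  # (4, 3, 1): 4 samples, 3 time steps, 1 feature

# the first sample is now a column of 3 time steps with 1 value each
print(X3[0])
```

The values themselves are unchanged; only the way they are indexed differs.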

We will define the number of input time steps as 3 and the number of features as 1 via the input_shape argument on the first hidden layer.

We will use one convolutional hidden layer followed by a max pooling layer. The feature maps are then flattened before being interpreted by a Dense layer and outputting a prediction. The model uses the efficient Adam version of stochastic gradient descent and optimizes the mean squared error ('mse') loss function.
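It can help to work out how many time steps survive each layer. The sketch below does the arithmetic for the settings used in this lesson's example (3 input time steps, 64 filters, a kernel size of 2 and a pool size of 2), assuming the layers use a stride of 1 for the convolution, a stride equal to the pool size for pooling, and no padding. The two helper functions are hypothetical names written for this illustration, not part of Keras.

```python
# sketch: output lengths through a Conv1D -> MaxPooling1D -> Flatten stack

def conv1d_output_steps(n_steps, kernel_size):
    """Output length of a 1D convolution with stride 1 and no padding."""
    return n_steps - kernel_size + 1

def pool1d_output_steps(n_steps, pool_size):
    """Output length of 1D pooling with stride equal to pool_size and no padding."""
    return (n_steps - pool_size) // pool_size + 1

n_filters = 64
after_conv = conv1d_output_steps(3, kernel_size=2)         # 3 - 2 + 1 = 2
after_pool = pool1d_output_steps(after_conv, pool_size=2)  # (2 - 2) // 2 + 1 = 1
flattened = after_pool * n_filters                         # 1 * 64 = 64
print(after_conv, after_pool, flattened)
```

Under these assumptions, the Dense layer that follows the flatten step sees a vector of 64 values per sample.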

Once the model is defined, it can be fit on the training data and the fit model can be used to make a prediction.

The complete example is listed below.

# univariate cnn example
from numpy import array
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Conv1D
from keras.layers import MaxPooling1D
# define dataset
X = array([[10, 20, 30], [20, 30, 40], [30, 40, 50], [40, 50, 60]])
y = array([40, 50, 60, 70])
# reshape from [samples, timesteps] into [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))
# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(3, 1)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=1000, verbose=0)
# demonstrate prediction
x_input = array([50, 60, 70])
x_input = x_input.reshape((1, 3, 1))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example will fit the model on the data then predict the next out-of-sample value.

Given [50, 60, 70] as input, the model should predict a value close to 80, the next value in the sequence. The exact output varies slightly between runs because training is stochastic.

Your Task

For this lesson you must download the daily female births dataset, split it into train and test sets and develop a model that can make reasonably accurate predictions on the test set.

You can download the dataset from here: daily-total-female-births.csv

Post your answer in the comments below. I would love to see what you discover.

More Information

Crash Course in Convolutional Neural Networks for Machine Learning

In the next lesson, you will discover how to develop a Long Short-Term Memory network model for forecasting a univariate time series.

Lesson 05: LSTM for Time Series Forecasting

In this lesson, you will discover how to develop a Long Short-Term Memory Neural Network model or LSTM for univariate time series forecasting.

An important difference from the MLP model, and like the CNN model, is that the LSTM model expects three-dimensional input with the shape [samples, timesteps, features]. We will define the data in the form [samples, timesteps] and reshape it accordingly.

We will define the number of input time steps as 3 and the number of features as 1 via the input_shape argument on the first hidden layer.

We will use one LSTM layer to process each input sub-sequence of 3 time steps, followed by a Dense layer to interpret the summary of the input sequence. The model uses the efficient Adam version of stochastic gradient descent and optimizes the mean squared error ('mse') loss function.

Once the model is defined, it can be fit on the training data and the fit model can be used to make a prediction.

The complete example is listed below.

# univariate lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
# define dataset
X = array([[10, 20, 30], [20, 30, 40], [30, 40, 50], [40, 50, 60]])
y = array([40, 50, 60, 70])
# reshape from [samples, timesteps] into [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(3, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=1000, verbose=0)
# demonstrate prediction
x_input = array([50, 60, 70])
x_input = x_input.reshape((1, 3, 1))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example will fit the model on the data then predict the next out-of-sample value.

Given [50, 60, 70] as input, the model should predict a value close to 80, the next value in the sequence. The exact output varies slightly between runs because training is stochastic.

Your Task

For this lesson you must download the daily female births dataset, split it into train and test sets and develop a model that can make reasonably accurate predictions on the test set.

You can download the dataset from here: daily-total-female-births.csv

Post your answer in the comments below. I would love to see what you discover.

More Information

In the next lesson, you will discover how to develop a hybrid CNN-LSTM model for a univariate time series forecasting problem.

Lesson 06: CNN-LSTM for Time Series Forecasting

In this lesson, you will discover how to develop a hybrid CNN-LSTM model for univariate time series forecasting.

The benefit of this model is that the model can support very long input sequences that can be read as blocks or subsequences by the CNN model, then pieced together by the LSTM model.

We can define a simple univariate problem as a sequence of integers, fit the model on this sequence and have the model predict the next value in the sequence. We will frame the problem to have 4 inputs and 1 output, for example: [10, 20, 30, 40] as input and [50] as output.

When using a hybrid CNN-LSTM model, we will divide each sample into further subsequences. The CNN model will interpret each subsequence and the LSTM will piece together the interpretations from the subsequences. As such, we will split each sample into 2 subsequences of 2 time steps each.
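The split into subsequences is only a reshape of the input array. The sketch below applies it to the toy dataset from this lesson and prints the two subsequences of the first sample; the names n_seq, n_steps, n_features and X4 are labels chosen for this illustration.

```python
# sketch: reshape [samples, timesteps] into [samples, subsequences, timesteps, features]
from numpy import array

X = array([[10, 20, 30, 40], [20, 30, 40, 50], [30, 40, 50, 60], [40, 50, 60, 70]])

# 4 time steps per sample = 2 subsequences x 2 time steps, with 1 feature
n_seq, n_steps, n_features = 2, 2, 1
X4 = X.reshape((X.shape[0], n_seq, n_steps, n_features))
print(X4.shape)

# the first sample [10, 20, 30, 40] becomes [10, 20] and [30, 40]
print(X4[0, 0, :, 0], X4[0, 1, :, 0])
```

The number of subsequences multiplied by the time steps per subsequence must equal the original number of time steps per sample, otherwise the reshape will fail.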

The CNN will be defined to expect 2 time steps per subsequence with one feature. The entire CNN model is then wrapped in TimeDistributed wrapper layers so that it can be applied to each subsequence in the sample. The results are then interpreted by the LSTM layer before the model outputs a prediction.

The model uses the efficient Adam version of stochastic gradient descent and optimizes the mean squared error (‘mse’) loss function.

Once the model is defined, it can be fit on the training data and the fit model can be used to make a prediction.

The complete example is listed below.

# univariate cnn-lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import TimeDistributed
from keras.layers import Conv1D
from keras.layers import MaxPooling1D
# define dataset
X = array([[10, 20, 30, 40], [20, 30, 40, 50], [30, 40, 50, 60], [40, 50, 60, 70]])
y = array([50, 60, 70, 80])
# reshape from [samples, timesteps] into [samples, subsequences, timesteps, features]
X = X.reshape((X.shape[0], 2, 2, 1))
# define model
model = Sequential()
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu'), input_shape=(None, 2, 1)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=500, verbose=0)
# demonstrate prediction
x_input = array([50, 60, 70, 80])
x_input = x_input.reshape((1, 2, 2, 1))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example will fit the model on the data then predict the next out-of-sample value.

Given [50, 60, 70, 80] as input, the model should predict a value close to 90, the next value in the sequence. The exact output varies slightly between runs because training is stochastic.

Your Task

For this lesson you must download the daily female births dataset, split it into train and test sets and develop a model that can make reasonably accurate predictions on the test set.

You can download the dataset from here: daily-total-female-births.csv

Post your answer in the comments below. I would love to see what you discover.

More Information

In the next lesson, you will discover how to develop an Encoder-Decoder LSTM network model for multi-step time series forecasting.

Lesson 07: Encoder-Decoder LSTM Multi-step Forecasting

In this lesson, you will discover how to develop an Encoder-Decoder LSTM Network model for multi-step time series forecasting.

We can define a simple univariate problem as a sequence of integers, fit the model on this sequence and have the model predict the next two values in the sequence. We will frame the problem to have 3 inputs and 2 outputs, for example: [10, 20, 30] as input and [40, 50] as output.

The LSTM model expects three-dimensional input with the shape [samples, timesteps, features]. We will define the data in the form [samples, timesteps] and reshape it accordingly. The output must also be shaped this way when using the Encoder-Decoder model.
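For a longer series, the input and output samples can be built with a multi-step version of the sliding window and then reshaped as described above. This is a minimal sketch: the function name split_sequence_multistep and its parameters are names invented for this example, and the series is the toy sequence of integers used in this lesson.

```python
# sketch: multi-step sliding window with 3D input and output
from numpy import array

def split_sequence_multistep(sequence, n_steps_in, n_steps_out):
    """Split a 1D sequence into samples of n_steps_in inputs and n_steps_out outputs."""
    X, y = [], []
    for i in range(len(sequence) - n_steps_in - n_steps_out + 1):
        end_in = i + n_steps_in
        # input window
        X.append(sequence[i:end_in])
        # the n_steps_out observations that follow the input window
        y.append(sequence[end_in:end_in + n_steps_out])
    return array(X), array(y)

series = [10, 20, 30, 40, 50, 60, 70, 80]
X, y = split_sequence_multistep(series, n_steps_in=3, n_steps_out=2)

# reshape both arrays from [samples, timesteps] into [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))
y = y.reshape((y.shape[0], y.shape[1], 1))
print(X.shape, y.shape)
```

Each output sample keeps its own time axis, which is what lets the model produce one value per output time step.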

We will define the number of input time steps as 3 and the number of features as 1 via the input_shape argument on the first hidden layer.

We will define an LSTM encoder to read and encode the input sequences of 3 time steps. A RepeatVector layer then repeats the encoded vector 2 times, once for each of the two output time steps. The repeated sequence is fed to a decoder LSTM layer, followed by a Dense output layer wrapped in a TimeDistributed layer that produces one output for each step in the output sequence.
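The repeat step can be hard to picture, so here is a NumPy analogue of what it does to the encoder output. This is not Keras code: the four encoded values are made-up numbers standing in for the encoder's output vector (the lesson's encoder has 100 units), and the variable names are chosen for this illustration.

```python
# sketch: NumPy analogue of repeating an encoded vector once per output time step
from numpy import array, repeat

# stand-in for the encoder output: 1 sample, 4 encoded values (made-up numbers)
encoded = array([[0.1, 0.2, 0.3, 0.4]])
n_steps_out = 2

# add a time axis, then copy the vector once per output time step
repeated = repeat(encoded[:, None, :], n_steps_out, axis=1)
print(repeated.shape)  # (1, 2, 4): samples, output time steps, encoded values
print(repeated[0])
```

The decoder therefore receives the same encoded summary at every output time step and must learn to produce a different value for each one.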

The model uses the efficient Adam version of stochastic gradient descent and optimizes the mean squared error ('mse') loss function.

Once the model is defined, it can be fit on the training data and the fit model can be used to make a prediction.

The complete example is listed below.

# multi-step encoder-decoder lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
# define dataset
X = array([[10, 20, 30], [20, 30, 40], [30, 40, 50], [40, 50, 60]])
y = array([[40, 50], [50, 60], [60, 70], [70, 80]])
# reshape from [samples, timesteps] into [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))
y = y.reshape((y.shape[0], y.shape[1], 1))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(3, 1)))
model.add(RepeatVector(2))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=100, verbose=0)
# demonstrate prediction
x_input = array([50, 60, 70])
x_input = x_input.reshape((1, 3, 1))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example will fit the model on the data then predict the next two out-of-sample values.

Given [50, 60, 70] as input, the model should predict values close to [80, 90], the next two values in the sequence. The exact output varies between runs because training is stochastic.

Your Task

For this lesson you must download the daily female births dataset, split it into train and test sets and develop a model that can make reasonably accurate predictions on the test set.

You can download the dataset from here: daily-total-female-births.csv

Post your answer in the comments below. I would love to see what you discover.

More Information

The End!
(Look How Far You Have Come)

You made it. Well done!

Take a moment and look back at how far you have come.

You discovered:

The promise of deep learning neural networks for time series forecasting problems.
How to transform a time series dataset into a supervised learning problem.
How to develop a Multilayer Perceptron model for a univariate time series forecasting problem.
How to develop a Convolutional Neural Network model for a univariate time series forecasting problem.
How to develop a Long Short-Term Memory network model for a univariate time series forecasting problem.
How to develop a Hybrid CNN-LSTM model for a univariate time series forecasting problem.
How to develop an Encoder-Decoder LSTM model for a multi-step time series forecasting problem.

This is just the beginning of your journey with deep learning for time series forecasting. Keep practicing and developing your skills.

Take the next step and check out my book on deep learning for time series.

Summary

How Did You Go With The Mini-Course?
Did you enjoy this crash course?

Do you have any questions? Were there any sticking points?
Let me know. Leave a comment below.

181 Responses to How to Get Started with Deep Learning for Time Series Forecasting (7-Day Mini-Course)

Sam September 4, 2018 at 8:50 pm #

Hi Jason,
Love the tutorials, I’m starting to feel as though I understand how to produce my own model.

I’m currently trying to develop an LSTM that analyses a time series dataset of energy consumption, which has a strong seasonal pattern (though the season interval is quite irregular). It consists of around 8 seasonal cycles with about 45000 data points. I would like to produce a model that I can train on this dataset which is able to simulate data of the shape I already have; without walk forward validation (i.e. I would like to be able to predict the next value with the last value of my dataset as input then use the prediction as the input for the next prediction).

Does this seem like a sensible approach? I looked into SARIMAs as well but could not produce a close pattern. I’m fairly new to data science and machine learning and so far have found your tutorials invaluable.

Thanks!

Reply
- Jason Brownlee September 5, 2018 at 6:35 am #
  
  I recommend this process:
  https://machinelearningmastery.com/how-to-develop-a-skilful-time-series-forecasting-model/
  
  Try many methods and many data prep methods and discover what works best. Don’t start with the algorithm, it might be wrong.
  
  Reply
  - Denus November 2, 2024 at 12:55 am #
    
    Hi Jason,
    Can you please explain this function:
    x_input.reshape((1, 3))?
    Why is it needed?
    Thanks
    Denis
    
    Reply
    - James Carmichael November 2, 2024 at 6:16 am #
      
      Hi Denus…In the context of deep learning for time series forecasting, reshaping your input data into a specific format is crucial, as deep learning models expect inputs in a particular shape.
      
      The function x_input.reshape((1, 3)) is likely preparing the input data for a neural network model that expects a 2-dimensional array with one row and three columns. Here’s what each component represents:
      
      – **1**: The first dimension represents the number of samples or sequences. In this case, 1 means there’s only a single sequence or time step being fed into the model at a time.
      – **3**: The second dimension represents the number of time steps or features in this sequence. Here, 3 indicates there are three values (features) within this sequence.
      
      ### Why this Reshape is Necessary
      
      For many neural networks, especially those that work with time series data (like LSTMs), the input format is often expected as a **3D array**: (samples, time steps, features). By reshaping into (1, 3), we’re creating a 2D input that can be further reshaped if needed for LSTMs or other models. This reshape aligns with the model’s expectation, ensuring each training step receives the data in a consistent and predictable format.
      
      So, x_input.reshape((1, 3)) ensures the input data is compatible with the model’s structure. If you were to work with multiple sequences or time steps, you would adjust these dimensions accordingly.
      
      Reply
- Gaurav January 22, 2020 at 10:01 pm #
  
  import os
  import cv2
  import pandas as pd
  from sklearn.model_selection import train_test_split
  
  DATADIR = "C://Test/daily-total-female-births.csv"
  
  data = pd.read_csv(DATADIR)
  
  print(data.head())
  
  y = data.Births
  X = data.drop('Births', axis=1)
  
  X_train, X_test, y_train, y_test = train_test_split(X, y,test_size=0.2)
  print("\nX_train:\n")
  print(X_train.head())
  print(X_train.shape)
  
  print("\nX_test:\n")
  print(X_test.head())
  print(X_test.shape)
  
  model = Sequential()
  model.add(Dense(100, activation='relu', input_dim=3))
  model.add(Dense(1))
  model.compile(optimizer='adam', loss='mse',metrics=['accuracy'])
  
  model.fit(X_train, y, epochs=20000, verbose=0)
  
  Error :
  —————————————————————————
  ValueError Traceback (most recent call last)
  in
  —-> 1 model.fit(X_train, y, epochs=20000)
  
  c:\python\python37\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
  950 sample_weight=sample_weight,
  951 class_weight=class_weight,
  –> 952 batch_size=batch_size)
  953 # Prepare validation data.
  954 do_validation = False
  
  c:\python\python37\lib\site-packages\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
  749 feed_input_shapes,
  750 check_batch_axis=False, # Don’t enforce the batch size.
  –> 751 exception_prefix=’input’)
  752
  753 if y is not None:
  
  c:\python\python37\lib\site-packages\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
  136 ‘: expected ‘ + names[i] + ‘ to have shape ‘ +
  137 str(shape) + ‘ but got array with shape ‘ +
  –> 138 str(data_shape))
  139 return data
  140
  
  ValueError: Error when checking input: expected dense_35_input to have shape (3,) but got array with shape (1,)
  
  PLEASE HELP ME OUT!!!!!
  
  Reply
  - Jason Brownlee January 23, 2020 at 6:31 am #
    
    Perhaps confirm that you loaded your dataset correctly?
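
One way to check this, as a rough sketch: the error reports rows of shape (1,) where the network expects (3,), which would be consistent with X holding a single column once Births is dropped. The posted code also passes y rather than y_train to fit. The snippet below assumes the series has already been loaded as a plain array (the values are illustrative only) and builds three lag inputs per row so the column count matches input_dim=3.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

births = np.array([35, 32, 30, 31, 44, 29, 45])  # illustrative values only
n_steps = 3  # must match input_dim in the first Dense layer

# Each window holds n_steps inputs followed by the value to predict.
windows = sliding_window_view(births, n_steps + 1)
X, y = windows[:, :n_steps], windows[:, -1]

# X and y must have the same number of rows, and X must have n_steps columns.
print(X.shape, y.shape)  # (4, 3) (4,)
```

Printing the shapes before calling fit usually makes this kind of mismatch obvious.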
    
    Reply
litost September 7, 2018 at 10:58 pm #

Lesson 01: My understanding of CNNs got stuck on a dimension issue. However, I am able to perform digit classification with an ANN. The accuracy is around 92%, as reported in many places.

Looking forward with CNN and +

Reply
- Jason Brownlee September 8, 2018 at 6:07 am #
  
  Thanks.
  
  Reply
Manuel Dias September 10, 2018 at 7:13 pm #

Hi Jason,

I found your post very interesting, since I use alternative algorithms to predict seasonal data in several scenarios. In these scenarios I use auto ARIMA on the R platform and, although some results are satisfying, the development platform and the ARIMA tuning process are not very practical. Thus, if I could find a more practical algorithm and platform, it would be wonderful.

I tested your examples 3 to 5 with some seasonal scenarios (simple sin(x) and linear functions with seasonal characteristics), but found the predicted results very poor. I tried changing several parameters (inputs with 3, 4 and 5 values; normalizing both input and output values; selecting other activation/optimizer/loss options besides relu/adam), but the predicted output was always a linear function and very far from the expected output. For the same scenarios, auto ARIMA provides much better predictions.

I will test the remaining algorithms (06 and 07) to check for better results. However, if you have any suggestions on how to improve them, please advise.

Reply
- Jason Brownlee September 11, 2018 at 6:27 am #
  
  Perhaps try tuning the models to your problem?
  Perhaps try seasonal differencing your data first?
  Perhaps try hybrid models?
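
As an illustration of the seasonal differencing suggestion, here is a minimal sketch. The helper names `seasonal_difference` and `invert_seasonal_difference` are hypothetical, and the series is synthetic (a repeating pattern of length 4 plus a linear trend), not data from the post.

```python
import numpy as np

def seasonal_difference(series, period):
    """Subtract the value one season earlier from each observation."""
    series = np.asarray(series, dtype=float)
    return series[period:] - series[:-period]

def invert_seasonal_difference(history, diff_value, period):
    """Add back the observation from one season before the forecast step."""
    return diff_value + history[-period]

# Synthetic series: seasonal pattern of length 4 plus a trend of +1 per step.
period = 4
t = np.arange(12)
series = np.tile([10.0, 20.0, 30.0, 20.0], 3) + t

diff = seasonal_difference(series, period)
print(diff)  # the seasonal pattern cancels, leaving the trend gained per season

# Map a differenced value back to the original scale.
restored = invert_seasonal_difference(series[:8], diff[4], period)
```

A model would then be fit on the differenced values, and each forecast mapped back with the inverse step.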
  
  Reply
komal_123 September 17, 2018 at 2:50 pm #

Love this blog. This blog gives useful information to me. I like this post and thanks for providing. …!!!!!

Reply
- Jason Brownlee September 18, 2018 at 6:09 am #
  
  Thanks, I’m happy that it helps.
  
  Reply
David September 20, 2018 at 8:41 am #

Jason, you mentioned that (hint, I have all of the answers directly on this blog, use the search box).
How can I access the answers through the search bar? Am I missing something?

Reply
- Jason Brownlee September 20, 2018 at 2:26 pm #
  
  Type in what you need help with, e.g. “LSTM time series”, look through the results, read some of the posts.
  
  Does that help?
  
  Reply

David September 20, 2018 at 9:26 am #

#Lesson 02: How to Transform Data for Time Series

import pandas as pd
data = pd.read_csv("daily-total-female-births.csv")
X_ans=[]
Y_ans=[]
for i in range (len(data["Births"])-2):
    X=list(data["Births"])[i:i+3]
    Y=list(data["Births"])[i+1]
    X_ans.append(X)
    Y_ans.append(Y)
    in_=pd.DataFrame([ str(x) for x in X_ans ],columns=['input'])
    out=pd.DataFrame([ str(x) for x in Y_ans ],columns=['output'])
ans_1=pd.concat([in_,out],axis=1)

Jason Brownlee September 20, 2018 at 2:27 pm #

Nice work!

Reply
JC July 7, 2019 at 5:44 am #

Hi David,

I could be wrong, but should the Y=list(data[“Births”])[i+3]? Of course, with adding some restriction so that it won’t give IndexError
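
A minimal sketch of that correction, using plain lists rather than the DataFrame code above. The helper name `make_pairs` is hypothetical and the values are illustrative only. The loop stops three steps short of the end, which is the restriction that avoids the IndexError.

```python
def make_pairs(values, n_steps=3):
    """Return (inputs, outputs) where each output follows its n_steps inputs."""
    inputs, outputs = [], []
    # Stop n_steps short of the end so values[i + n_steps] always exists.
    for i in range(len(values) - n_steps):
        inputs.append(values[i:i + n_steps])
        outputs.append(values[i + n_steps])
    return inputs, outputs

births = [35, 32, 30, 31, 44, 29]  # illustrative values only
X, y = make_pairs(births)
for row, target in zip(X, y):
    print(row, target)
```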

Reply
- Charlie April 18, 2020 at 7:55 pm #
  
  Correct. FYI, if you want to download the data directly into python you can use this:
  
  import pandas as pd
  df_url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv?__s=t6njmj1ql2orsiddx0lo'
  df = pd.read_csv(df_url)
  
  Reply
  - Jason Brownlee April 19, 2020 at 5:55 am #
    
    Great tip!
    
    Reply

Harry G. September 30, 2018 at 1:10 am #

Great as always, Jason!

I have a question as regards your last example with the Encoder-Decoder LSTM Multi-step Forecasting: Is it possible to turn it into a category-predicting solution?

I mean, let’s suppose we have an image or sound as input and we want to output characters or words that are encoded as integers. For example:
the input would be x=[0.9,0.8,0.3…]
and the output would be y=[0,1…]

How is that possible? So far I’ve tried changing the loss from ‘mse’ to ‘sparse_categorical_crossentropy’ and the number of outputs of the last Dense layer from 1 to 3 (supposing I want to output two integers from 0 to 2). However, the loss never drops below 1.0986 and of course the model isn’t learning anything. I’ve also tried normalizing the input numbers within a range of 0 to 1, but still nothing. Any ideas? Thanks!
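
A side note on the number 1.0986: it equals ln(3), the cross-entropy of a prediction that gives each of three classes a probability of 1/3. A loss stuck at that value suggests the network is outputting a near-uniform distribution. A quick arithmetic check:

```python
import math

n_classes = 3
uniform_prob = 1.0 / n_classes

# Cross-entropy for one sample when the true class is assigned probability 1/3.
loss = -math.log(uniform_prob)
print(round(loss, 4))  # 1.0986
```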

Reply
- Jason Brownlee September 30, 2018 at 6:05 am #
  
  Yes, I explain how here:
  https://machinelearningmastery.com/faq/single-faq/how-can-i-change-a-neural-network-from-regression-to-classification
  
  I also have many examples of sequence classification on the blog for text (sentiment analysis) and activity recognition that may help.
  
  Reply
  - Harry Garrison September 30, 2018 at 8:12 am #
    
    Thanks for the reply!
    I took a look at the link you provided and I slightly changed my code accordingly. It worked better than before and this time the loss really started dropping.
    
    While I was experimenting with a toy dataset I’ve built, two more questions came to mind:
    
    1) Can the time distributed layer be used as some kind of attention mechanism? Keras (to my knowledge) still hasn’t officially implemented an attention mechanism, but I thought that the timedistributed could do the trick. But then again I might be wrong.
    
    2) Is there any substantial difference between using one-hot encoding versus using integers for a multiclass classification problem? I am having trouble implementing the one-hot encoding and I opted for using simple integers to represent classes. Should I force it with one-hot encoding?
    
    Thanks once more for your precious time and help and keep up the good work!
    
    Reply
    - Jason Brownlee October 1, 2018 at 6:22 am #
      
      Well done!
      
      No, time distributed allows you to use a sub-model (automatically) within a broader model.
      
      Yes, no official attention yet, which I think is complete madness. If they don’t get their act together soon the pytorch project is going to overtake (and kill) them.
      
      Yes, I remember classical papers on the topic talking about the onehot/softmax giving the model more flexibility – hence it is a best practice when number of classes is >2. Perhaps try both for your problem and go with what works.
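
For anyone having trouble with the one-hot step, here is a minimal NumPy sketch, assuming the classes are already coded as integers 0 to n_classes-1. The labels are illustrative only.

```python
import numpy as np

labels = np.array([0, 2, 1, 2])  # illustrative integer class labels
n_classes = 3

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(n_classes, dtype=int)[labels]
print(one_hot)

# argmax maps one-hot rows (or softmax outputs) back to integer labels.
recovered = one_hot.argmax(axis=1)
```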
      
      Reply
JG October 8, 2018 at 8:57 pm #

Thks Jason for the tutorial: I think it is a great act of generosity from you !

I am starting with time series for the first time and I have two main ideas (flavors of the time series approach) that I would like to check with you, as opposed to the “classical” regression/classification that does not care about the time ordering of the data:

1) In a univariate time series (1 feature), the SEQUENCE (the number of inputs within each sample) corresponds directly to the features of the classical regression/classification approach, like this:

number of features in regression/classification == number of inputs selected within the sequence in time series.

Therefore the term “features” in time series (here only one, because of the univariate approach in this tutorial) has a different meaning from the term “feature” in the equivalent regression/classification approach (here 3, because of a sequence of 3 input values).

2) If I change the number of outputs in the output sequence (I think you call it “multi-step”), this is totally equivalent to multi-class classification (for example, using the same number of output neurons in the output layer, 1 for each category).

Do you agree with this first “manual” equivalence between Time Series vs Regression/Classification approach?

Reply
- Jason Brownlee October 9, 2018 at 8:43 am #
  
  Not sure I follow.
  
  Generally, most models are sequence-unaware, like MLPs, in which case lag observations are features.
  
  Some models are sequence-aware, like RNNs and CNNs, in which case lag observations are handled directly and parallel time series are features.
  
  Multi-step is different from multi-class classification. Same idea though, change the output layer to have n nodes, one for each class, but use a softmax activation.
  
  Does that help?
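
The difference can be seen in the array shapes. This is a minimal sketch with made-up values: the same lag observations are passed as a 2D array to an MLP and reshaped to 3D for a sequence-aware model.

```python
import numpy as np

# Four samples, each with three lag observations (illustrative values).
X = np.array([[10, 20, 30],
              [20, 30, 40],
              [30, 40, 50],
              [40, 50, 60]])

# MLP: lag observations are treated as features -> [samples, features].
X_mlp = X

# CNN/LSTM: lags are time steps of one series -> [samples, timesteps, features].
X_seq = X.reshape((X.shape[0], X.shape[1], 1))

print(X_mlp.shape, X_seq.shape)  # (4, 3) (4, 3, 1)
```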
  
  Reply
JG October 9, 2018 at 8:19 pm #

Tks for your suggestions!

I am trying to summarize, in terms of tensors in and out, my previous knowledge of MLP and CNN models versus what I am now learning about LSTMs (RNN models) for TS (time series), as applied to different problems.

I.e. we used MLPs/CNNs for classification problems such as image processing (e.g. CIFAR-10 for multi-class classification), multi-class classification (e.g. 3 types of iris flowers) or binary classification (e.g. Pima diabetes yes/no), and for linear regression (e.g. continuous Boston house prices).

I mean, I want to represent the input/output of the ML/DL model in terms of geometric “tensors” (or better, 3D matrices), to get the whole idea of what the model is processing.
That is to say, in CIFAR-10 I have input images (for training/validation/test) as a 3D matrix [samples in rows, features X in columns, channels], so the meaning of features is clear (the multivariate independent variables X, i.e. the pixels of the image), and the output matrix is clear too [for each image sample in rows, the dependent variable Y, the class, in columns].

Briefly, the input tensor in the Pima case is [768 patient samples in rows, 8 features or independent variables X in columns] and the output tensor is [for each patient sample in rows, the yes/no diabetes class].
For iris, the input is [flower samples in rows, the 4 flower features X in columns] and the output is [for each flower sample in rows, the Y for the 3 types of iris in columns]. For Boston prices, the input “tensor” is [506 house samples in rows, 13 features X in columns] and the output “tensor” is [for each house sample in rows, the continuous house value Y].

So, when you talk about TS (time series; I do not know if anyone else calls it “TS”), take for example the “daily births” case under an MLP model with e.g. 7 days of the week as the input time steps (in my case) and 1 output day to be predicted. I can think of it geometrically as an input tensor of [samples in rows, time steps (e.g. 7 days) in columns; in this case they act like X features, but really they are not!] and an output tensor of [for each sample in rows, the predicted day value Y].
But now, if I change to a CNN model for “TS” applications, the input “tensor” must be 3D, so [samples in rows (e.g. 52 weeks in my case), time steps in columns (e.g. 7 days), and 1 feature] and the output tensor is [for each sample in rows, the predicted day Y in columns]. It is easy to extrapolate that when we have multiple steps at the output, under this “vision” we would have the corresponding number of time steps in columns.

Why do I talk so much? Here is my answer. Because we beginners easily get lost in matrix dimensions and shapes as data moves through the model and, consequently, we make mistakes because the matrix shapes do not match. And even worse, we lose the whole idea of what the model is doing, because we are not able to see the problem “geometrically” in terms of input “tensors” and output “tensors”, so we get lost and do not follow the next ideas that teachers like you introduce in the tutorial.

In conclusion, here is my recommendation for teaching these cases and machine learning concepts: try to “visualize” the problem from the beginning, clearly in the problem introduction, in terms of the shape and meaning of the “input tensors” and “output tensors”, so that the next ideas and subtleties that the teacher introduces in the post or tutorial become much easier to follow. At least this is my own experience.

To always see these tensors going in and coming out of the black box of ML/deep learning ..:-))

regards,
JG

Reply
- Jason Brownlee October 10, 2018 at 6:10 am #
  
  Thanks for sharing.
  
  Reply

Steph October 25, 2018 at 7:22 pm #

Hi Jason,

Thank you for this tutorial, it looks really helpful!! 🙂

For lesson 02, this is the function I wrote:

import numpy as np

def window_transform(dataset, window=2):
    # Slide a window over the series: `window` inputs, then the next value as output.
    dataX, dataY = [], []
    for i in range(len(dataset)-window):
        print(dataset[i:(i+window)], dataset[(i+window)])
        dataX.append(dataset[i:(i+window)])
        dataY.append(dataset[(i+window)])
    return np.array(dataX), np.array(dataY)

(heavily inspired by https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/)

Cheers,
Steph

Jason Brownlee October 26, 2018 at 5:34 am #

Nice work!

Reply

Fredrik Kant November 5, 2018 at 7:19 pm #

(My answer to) Lesson 01:
I am currently re-reading Simply Complexity (Neil Johnson), where one of the required criteria for a complex system is that there exists some kind of feedback, some kind of memory.
The stock market is a complex system where different actors interact and the stock price is a reflection of that.
So when it comes to RNNs, I see the capability to create some kind of memory, using previous data as feedback, as beneficial.
When it comes to CNNs, I see the “filters” as beneficial. For example, in technical analysis there are a lot of patterns (= filters) that are used to determine resistance levels etc. One could perhaps also use different “ok” patterns to detect anomalies in the data, which in real-time applications (I have been working on a trading desk for many years) could be crucial to avoid mistakes.

Reply
- Jason Brownlee November 6, 2018 at 6:29 am #
  
  Interesting, thanks for sharing.
  
  Reply
Volka November 21, 2018 at 12:25 am #

Thanks a lot for the great tutorial. Just wondering why I get the following error when running lesson 5, 6, and 7. Can you please tell me how to fix it?

Using TensorFlow backend.
Traceback (most recent call last):
File "time_series.py", line 16, in
model.add(LSTM(100, activation='relu', input_shape=(3, 1)))
File "C:\Users\Volka\Miniconda2\envs\Tensorflow\lib\site-packages\keras\engine\sequential.py", line 165, in add
layer(x)
File "C:\Users\Volka\Miniconda2\envs\Tensorflow\lib\site-packages\keras\layers\recurrent.py", line 532, in __call__
return super(RNN, self).__call__(inputs, **kwargs)
File "C:\Users\Volka\Miniconda2\envs\Tensorflow\lib\site-packages\keras\engine\base_layer.py", line 457, in __call__
output = self.call(inputs, **kwargs)
File "C:\Users\Volka\Miniconda2\envs\Tensorflow\lib\site-packages\keras\layers\recurrent.py", line 2194, in call
initial_state=initial_state)
File "C:\Users\Volka\Miniconda2\envs\Tensorflow\lib\site-packages\keras\layers\recurrent.py", line 649, in call
input_length=timesteps)
File "C:\Users\Volka\Miniconda2\envs\Tensorflow\lib\site-packages\keras\backend\tensorflow_backend.py", line 3011, in rnn
maximum_iterations=input_length)
TypeError: while_loop() got an unexpected keyword argument 'maximum_iterations'

Reply
- Jason Brownlee November 21, 2018 at 7:52 am #
  
  Are you able to confirm that your TensorFlow and Keras versions are up to date?
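
One way to check the installed versions without importing the libraries themselves is sketched below. It assumes Python 3.8 or later (for `importlib.metadata`), and the helper name `installed_version` is hypothetical.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

for name in ("tensorflow", "keras"):
    print(name, installed_version(name))
```

The printed versions can then be compared against the releases each library documents as compatible.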
  
  Reply
  - Volka November 21, 2018 at 4:14 pm #
    
    Thanks a lot. I updated tensorflow and it worked 🙂
    
    Reply
    - Jason Brownlee November 22, 2018 at 6:20 am #
      
      Nice work!
      
      Reply
    - tgb123 November 23, 2018 at 8:53 pm #
      
      update how much
      
      Reply
Guna December 14, 2018 at 2:51 pm #

Greetings!
Amazing guide for beginners, very useful…

Quick clarification on the LSTM method:
Why do I get a predicted value so far from the expected value, i.e. 80, when anyone could predict it easily?
Am I missing something?

# fit model
model.fit(X, y, epochs=100, verbose=0)
# demonstrate prediction
x_input = array([50, 60, 70])
x_input = x_input.reshape((1, 3, 1))
yhat = model.predict(x_input, verbose=0)
print(yhat)  # 87.488

When I increase the epochs, I see the predicted result improve to some extent:
epochs=1000: 82.359
epochs=8000: 81.217
epochs=10000: 80.747
epochs=15000: 82.087

But even with epochs=10000 the predicted result is still not as close as expected.
Please guide.

Thank You.
Guna

Reply
- Jason Brownlee December 15, 2018 at 6:08 am #
  
  Generally, LSTMs are not great at time series forecasting and require a lot of tuning.
  
  Perhaps try tuning your model? I have some ideas here:
  https://machinelearningmastery.com/improve-deep-learning-performance/
  
  Reply
Markus December 19, 2018 at 6:31 am #

Using CNN-LSTM model can you please explain where do you get the required input shape from? I don’t get why you do

X = X.reshape((X.shape[0], 2, 2, 1))

Where do you know that from?

Reply
- Jason Brownlee December 19, 2018 at 6:44 am #
  
  Good question, the LSTM and CNN input shape is 3D, I explain more here:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-prepare-my-data-for-an-lstm
  
  Reply
Markus December 19, 2018 at 8:01 am #

Thanks for your reply. You’re saying the LSTM and CNN input shape is 3D. But looking at line:

X = X.reshape((X.shape[0], 2, 2, 1))

It seems to be 4D to me. I’m confused…

Reply
- Jason Brownlee December 19, 2018 at 2:27 pm #
  
  That is for a CNN-LSTM model, not a CNN model.
  
  The CNN-LSTM does have 4D input because it is reading a series of sub-sequences in each sample. More here:
  https://machinelearningmastery.com/cnn-long-short-term-memory-networks/
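
To make the 4D shape concrete, here is a minimal sketch with made-up values: each sample of four time steps is split into two sub-sequences of two steps each, giving [samples, subsequences, timesteps, features].

```python
import numpy as np

# Three samples, each with four time steps (illustrative values).
X = np.array([[10, 20, 30, 40],
              [20, 30, 40, 50],
              [30, 40, 50, 60]])

# [samples, subsequences, timesteps per subsequence, features]
X4 = X.reshape((X.shape[0], 2, 2, 1))
print(X4.shape)  # (3, 2, 2, 1)

# The first sample is now two sub-sequences: 10, 20 and 30, 40.
first_sub = X4[0, 0].ravel()
second_sub = X4[0, 1].ravel()
```

In this layout the CNN part reads each two-step sub-sequence and the LSTM then reads the sequence of sub-sequence summaries.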
  
  Reply
Saffin December 20, 2018 at 5:30 pm #

Thanks Jason for the great tutorial !
It gives me the basic idea of how to perform time-series predicting using deep learning.

About the benefits of using CNN / RNN to predict time-series, here is my little thought:

When a convolution operation is applied to 1D data, such as a time series,
it behaves like calculating a (weighted) moving average, and
therefore a CNN can capture the series trend with smoother features, just like MA methods (without the linearity constraint).
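
The moving-average analogy can be checked with a small NumPy sketch. This is an illustration with a fixed, equal-weight kernel. A trained convolutional layer would instead learn its weights and typically adds a bias and a nonlinearity.

```python
import numpy as np

series = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])  # illustrative values
kernel = np.ones(3) / 3.0  # equal weights, i.e. a 3-point moving average

# mode='valid' keeps only positions where the kernel fully overlaps the series.
smoothed = np.convolve(series, kernel, mode='valid')

# The same result computed directly as a moving average.
moving_avg = np.array([series[i:i + 3].mean() for i in range(len(series) - 2)])

print(smoothed)
print(moving_avg)
```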

RNNs are naturally a sequential data model and can predict the current
output based on the current input and previous inputs.
An LSTM can keep meaningful features generated long ago and still use this memory
to predict the current output for a time series.

But I think that for time series data, recent points generally have much more effect on future points than points from long ago.
Therefore the capability of the LSTM may not fit this domain as well as it fits other sequential data (e.g. NLP).
I guess using only an LSTM to predict a time series may be affected more easily by the random noise
in the data, since it may not capture the underlying trend by smoothing.

Reply
- Jason Brownlee December 21, 2018 at 5:27 am #
  
  At the end of the day, I recommend testing a suite of methods and using what gives the best skill.
  
  Reply
Markus December 23, 2018 at 8:56 pm #

Hi

Thanks for this mini-course.

For sure I’m missing a point, but what is the reason that this mini-course doesn’t say anything about RNNs (especially LSTM and GRU), which are apparently used in sequence data modelling and forecasting?

Reply
- Markus December 23, 2018 at 9:02 pm #
  
  What I mean is about this course: https://machinelearningmastery.com/time-series-forecasting-python-mini-course/
  
  Reply
  - Jason Brownlee December 24, 2018 at 5:27 am #
    
    That is an introductory course (e.g. linear methods), not a deep learning for time series course.
    
    Further, LSTMs perform very poorly for univariate data.
    
    Reply
- Jason Brownlee December 24, 2018 at 5:27 am #
  
  It does, 3 of the days focus on the LSTMs (days 5, 6 and 7).
  
  Perhaps re-read the post?
  
  Reply
Dimitre Oliveira December 27, 2018 at 1:09 pm #

Great article Jason,

I’m really liking the content on time series data. I took this course and applied it to a slightly more complex problem, a store/item sales dataset from a Kaggle competition. I made a few modifications to the code and wrote a kernel. If anyone wants to take a look and leave feedback, check out https://www.kaggle.com/dimitreoliveira/deep-learning-for-time-series-forecasting

Reply
- Jason Brownlee December 28, 2018 at 5:49 am #
  
  Well done!
  
  Reply

Venkataramanan February 5, 2019 at 9:05 pm #

def create_dataset(dataset,n_in=1, n_out=1, dropnan=True):
    
    n_vars = 1 if type(dataset) is list else dataset.shape[1]
    col_data, names = list(), list()
    for i in range(n_in, 0, -1):
        col_data.append(dataset.shift(i))
        if i>1:
            names += [('X%d' % (j+1)) for j in range(n_vars)]
        else:
            names +=['X']
        
    for i in range(0, n_out):
        col_data.append(dataset.shift(-i))
        if i == 0:
            names += ['y']
        else:
            names += [('y%d(t+%d)' % (j+1, i)) for j in range(n_vars)]
    
    print(names)
    agg = pd.concat(col_data, axis=1)
    agg.columns = names
    
    if dropnan:
        agg.dropna(inplace=True)
        
    return agg

Jason Brownlee February 6, 2019 at 7:41 am #

Nice work.

Reply

Lindsay Moir February 26, 2019 at 1:03 am #

#!/usr/bin/env python
# coding: utf-8

# In[167]:


from keras.models import Sequential
from keras.layers import Dense
from matplotlib import pyplot
import seaborn as sns
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error as mse

sns.set_style('whitegrid')
sns.set_palette("bright", 10)


# In[168]:


# Load births dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv'
df = pd.read_csv(url)
# Verify
print(len(df))
df.head()


# In[169]:


df.hist();


# You do not need the date. This is sequential, each and every day. Numbers will be integer.

# In[170]:


births = df['Births'].tolist()
# Verify
print(births[:5])


# In[171]:


# Transform to windowed array
# Each row holds window_size inputs PLUS the dependent variable, for a total of window_size+1 elements.
# This is a list of lists that will be turned into a numpy matrix. It is a sliding window.
window_size = 3
array = [births[index:index+window_size+1] for index in range(len(births)-window_size)]
array = np.array(array)
# Verify
print(array)
print('array length =', array.shape)
print(df.head())
df.tail()


# In[172]:


# We can not use model_selection.train_test_split with its defaults BECAUSE we need to keep the date order intact.
# By default it shuffles the rows before splitting, which gets rid of the date order.
# Just split up the sets simply.
train_fraction = .7  # fraction of rows used for training
length = array.shape[0]
nof_rows = int(train_fraction * length)
X_train = array[:nof_rows, :window_size]
X_validation = array[nof_rows:, :window_size]
Y_train = array[:nof_rows, -1]
Y_validation = array[nof_rows:, -1]


# In[173]:


# Verify
print('X_train set first 5 rows are \n', X_train[:5])
print('X_train shape is', X_train.shape)
print('Y_train set first 5 items are \n', Y_train[:5])
print('Y_train shape is', Y_train.shape)
print('X_validation set first 5 rows are \n', X_validation[:5])
print('X_validation shape is', X_validation.shape)
print('Y_validation set first 5 rows are \n', Y_validation[:5])
print('Y_validation shape is', Y_validation.shape)


# In[174]:


# Create model
model = Sequential()
model.add(Dense(100, input_dim=window_size, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')


# In[175]:


# Fit model
model.fit(X_train, Y_train, epochs=500, verbose=0);


# In[176]:


# Prediction on training set vs actual training dependent variable
rows = X_train.shape[0]
x_input = X_train.reshape(rows, window_size)
training_predictions = model.predict(x_input, verbose=0)
# Verify
print(training_predictions[:3])
print(Y_train[:3])


# In[177]:


# Create dataframe for plotting and analysis
dftc = pd.DataFrame(training_predictions, Y_train).reset_index()
dftc.head()


# In[178]:


# It called the training_actual_births 'index' and the training_predictions '0'. Fix column names.
dftc.columns = ['training_actual_births', 'training_predictions']
# Find absolute error and percent
dftc['delta_error'] = (dftc['training_predictions'] - dftc['training_actual_births']).abs()
dftc['percent_error'] = (dftc['delta_error'] / dftc['training_actual_births']) * 100
t_mean = dftc['percent_error'].mean()
# Verify
print(len(dftc))
print(t_mean)
dftc.head(5)


# In[179]:


# plot, blue is actual, red is predicted
pyplot.figure(figsize=(16, 4))
pyplot.plot(dftc['training_actual_births'])
pyplot.plot(dftc['training_predictions'], color='red')
pyplot.show()


# In[180]:


dftc['percent_error'].hist();


# In[181]:


# Prediction on validation set vs actual validation dependent variable
rows = X_validation.shape[0]
x_input = X_validation.reshape(rows, window_size)
validation_predictions = model.predict(x_input, verbose=0)
# Verify
print(validation_predictions[:3])
print(Y_validation[:3])


# In[182]:


# Create dataframe for plotting and analysis
dfvc = pd.DataFrame(validation_predictions, Y_validation).reset_index()
dfvc.head()


# In[183]:


# It called the validation_actual_births 'index' and the validation_predictions '0'. Fix column names.
dfvc.columns = ['validation_actual_births', 'validation_predictions']
# Find absolute error and percent
dfvc['delta_error'] = (dfvc['validation_predictions'] - dfvc['validation_actual_births']).abs()
dfvc['percent_error'] = (dfvc['delta_error'] / dfvc['validation_actual_births']) * 100
v_mean = dfvc['percent_error'].mean()
print(v_mean)
print(len(dfvc))
dfvc.head(5)


# In[184]:


# plot, blue is actual, red is predicted
pyplot.figure(figsize=(16, 4))
pyplot.plot(dfvc['validation_actual_births'])
pyplot.plot(dfvc['validation_predictions'], color='red')
pyplot.show()


# In[185]:


dfvc['percent_error'].hist();


# In[186]:


# Print the mean absolute (percentage) error and the root mean square error
root_mean_square_error = np.sqrt(mse(dfvc['validation_actual_births'], dfvc['validation_predictions']))
print('There was a', round(t_mean, 2), '% mean absolute percentage error for the training set.')
print('There was a', round(v_mean, 2), '% mean absolute percentage error for the validation set.')
print('There was a', round(dfvc['delta_error'].mean(), 2), 'mean absolute error for the validation set.')
print('There was a', round(root_mean_square_error, 2), 'root mean square error for the validation set.')


# In[ ]:

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

#!/usr/bin/env python

# coding: utf-8

# In[167]:

from keras.models import Sequential

from keras.layers import Dense

import matplotlib as plt

from matplotlib import pyplot

import seaborn as sns

sns.set_style('whitegrid')

sns.set_palette("bright", 10)

from numpy import array

import numpy as np

import pandas as pd

from sklearn.metrics import mean_squared_error as mse

# In[168]:

# Load births dataset

url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv'

df = pd.read_csv(url)

# Verify

print(len(df))

df.head()

# In[169]:

df.hist();

# You do not need the date. This is sequential, each and every day. Numbers will be integer.

# In[170]:

births = df['Births'].tolist()

# Verify

print(births[:5])

# In[171]:

# Transform to windowed array

# Will return n elements PLUS the dependent variable for a total of n+1 elements.

# This a list of lists that will be turned into a numpy matrix. It is a sliding window.

window_size = 3

array = []

[array.append(births[index:index+window_size+1]) for index in range(len(births)-(window_size))]

array = np.array(array)

# Verify

print(array)

print('array length =', array.shape)

print(df.head())

df.tail()

# In[172]:

# We can not use model_selection.train_test_split BECAUSE we need to keep the date order intact.

# Since this model uses a random seed it gets rid of the date order.

# Just split up the sets simply.

test_size = .7

length = array.shape[0]

nof_rows = int(test_size * length)

X_train = array[:nof_rows, :window_size]

X_validation = array[nof_rows:, :window_size]

Y_train = array[:nof_rows, -1]

Y_validation = array[nof_rows:, -1]

# In[173]:

# Verify

print('X_train set first 5 rows are \n', X_train[:5])

print('X_train shape is', X_train.shape)

print('Y_train set first 5 items are \n', Y_train[:5])

print('Y_train shape is', Y_train.shape)

print('X_validation set first 5 rows are \n', X_validation[:5])

print('X_validation shape is', X_validation.shape)

print('Y_validation set first 5 rows are \n', Y_validation[:5])

print('Y_validation shape is', Y_validation.shape)

# In[174]:

# Create model

model = Sequential()

model.add(Dense(100, input_dim=window_size, activation='relu'))

model.add(Dense(1))

model.compile(optimizer='adam', loss='mse')

# In[175]:

# Fit model

model.fit(X_train, Y_train, epochs=500, verbose=0);

# In[176]:

# Prediction on training set vs actual training dependent variable

rows = X_train.shape[0]

x_input = X_train.reshape(rows, window_size)

training_predictions = model.predict(x_input, verbose=0)

# Verify

print(training_predictions[:3])

print(Y_train[:3])

# In[177]:

# Create dataframe for plotting and analysis

dftc = pd.DataFrame(training_predictions, Y_train).reset_index()

dftc.head()

# In[178]:

# It called the training_actual_births 'index' and the training_predictions '0'. Fix column names.

dftc.columns = ['training_actual_births', 'training_predictions']

# Find absolute error and percent

dftc['delta_error'] = (dftc['training_predictions'] - dftc['training_actual_births']).abs()

dftc['percent_error'] = (dftc['delta_error'] / dftc['training_actual_births']) * 100

t_mean = dftc['percent_error'].mean()

# Verify

print(len(dftc))

print(t_mean)

dftc.head(5)

# In[179]:

# plot, blue is actual, red is predicted

pyplot.figure(figsize=(16, 4))

pyplot.plot(dftc['training_actual_births'])

pyplot.plot(dftc['training_predictions'], color='red')

pyplot.show()

# In[180]:

dftc['percent_error'].hist();

# In[181]:

# Prediction on validaton set vs actual validation dependent variable

rows = X_validation.shape[0]

x_input = X_validation.reshape(rows, window_size)

validation_predictions = model.predict(x_input, verbose=0)

# Verify

print(validation_predictions[:3])

print(Y_validation[:3])

# In[182]:

# Create dataframe for plotting and analysis

dfvc = pd.DataFrame(validation_predictions, Y_validation).reset_index()

dfvc.head()

# In[183]:

# It called the validation_actual_births 'index' and the validation_predictions '0'. Fix column names.

dfvc.columns = ['validation_actual_births', 'validation_predictions']

# Find absolute error and percent

dfvc['delta_error'] = (dfvc['validation_predictions'] - dfvc['validation_actual_births']).abs()

dfvc['percent_error'] = (dfvc['delta_error'] / dfvc['validation_actual_births']) * 100

v_mean = dfvc['percent_error'].mean()

print(v_mean)

print(len(dfvc))

dfvc.head(5)

# In[184]:

# plot, blue is actual, red is predicted

pyplot.figure(figsize=(16, 4))

pyplot.plot(dfvc['validation_actual_births'])

pyplot.plot(dfvc['validation_predictions'], color='red')

pyplot.show()

# In[185]:

dfvc['percent_error'].hist();

# In[186]:

# Print the mean absolute percentage error, the mean absolute error and the root mean square error

root_mean_square_error = np.sqrt(mse(dfvc['validation_actual_births'], dfvc['validation_predictions']))

print('There was a', round(t_mean, 2), '% mean absolute percentage error for the training set.')

print('There was a', round(v_mean, 2), '% mean absolute percentage error for the validation set.')

print('There was a', round(dfvc['delta_error'].mean(), 2), 'mean absolute error for the validation set.')

print('There was a', round(root_mean_square_error, 2), 'root mean square error for the validation set.')

# In[ ]:
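For readers following the notebook above, the three error measures it reports can also be computed directly with NumPy. This is only a minimal sketch on made-up numbers: the function name `forecast_errors` and the sample values are illustrative assumptions and are not part of the original notebook.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Return MAE, MAPE (in percent) and RMSE for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    abs_err = np.abs(predicted - actual)
    mae = abs_err.mean()
    # MAPE divides by the actual values, so it is undefined if any actual value is zero
    mape = (abs_err / actual).mean() * 100
    rmse = np.sqrt(((predicted - actual) ** 2).mean())
    return mae, mape, rmse

# Illustrative values only (not taken from the births dataset)
mae, mape, rmse = forecast_errors([40, 50, 40], [44, 45, 40])
print(round(mae, 2), round(mape, 2), round(rmse, 2))
```

Reporting MAE and RMSE in the original units (births per day) alongside the percentage error makes it easier to compare against other answers in this thread.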

Jason Brownlee February 26, 2019 at 6:24 am #

Well done.

Reply

Gizo April 8, 2019 at 11:42 pm #

Thanks Jason for the tutorial.
I am using MATLAB to fit an MLP to a time series dataset.
Can you please help me with how to go about it?

Reply
- Jason Brownlee April 9, 2019 at 6:26 am #
  
  Sorry, I don’t have any tutorials on matlab.
  
  Reply
Tayyaba Fatima May 22, 2019 at 4:16 am #

informative article thanks for sharing.

Reply
- Jason Brownlee May 22, 2019 at 8:12 am #
  
  Thanks, I’m glad it helped.
  
  Reply
Drew July 9, 2019 at 2:44 pm #

Really happy to see this post as I’ve been looking for material on neural networks for time series, and wasn’t sure about the one I built not fully knowing the basics, so it’s great to have the step by step. Also nice to suggest putting the answers in the comments as they are helpful. Read a few other posts from you as well while learning the subject, thank you much for the contributions.

Reply
- Jason Brownlee July 10, 2019 at 7:57 am #
  
  Thanks Drew, I hope it helps and you have some fun!
  
  Reply
Ryan July 28, 2019 at 8:15 am #

Have you tried the birth data exercises yourself Jason?

They don’t seem like a great example. Almost all of the variance in the data is random. A best case R square would be something like .10, regardless of the model you use.

Reply
- Jason Brownlee July 29, 2019 at 5:56 am #
  
  Yes, I have a few tutorials on the blog that use this data.
  
  Reply
jack collin August 6, 2019 at 5:45 pm #

Hi, thanks for the mini-course.
I have a question: can a CNN-LSTM be used for multi-step time series forecasting? If the answer is yes, is it explained in your book?

Reply
- Jason Brownlee August 7, 2019 at 7:42 am #
  
  Yes, I give a few examples in the book, and there is also an example in this post:
  https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/
  
  Reply
Livia Parente August 21, 2019 at 11:03 pm #

Thank you for the course, it’s very didactic!

Here is my answer to lesson 3:

import pandas as pd
import numpy as np
data = pd.read_csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv")
columns = 30
rows = data.shape[0] - columns
y = np.zeros((rows,1))
x = np.zeros((rows, columns))
for i in range(rows):
    x[i,:] = data['Births'].iloc[i:i+columns]
    y[i] = data['Births'].iloc[i+columns]

Reply
- Jason Brownlee August 22, 2019 at 6:27 am #
  
  Nice work!
  
  Reply
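The loop-based windowing in the answer above can also be written without an explicit loop. The sketch below is one possible alternative, assuming NumPy 1.20 or later (which provides `sliding_window_view`); the helper name `make_windows` and the sample values are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(series, window):
    """Split a 1-D series into (X, y): each row of X holds `window` lags, y the next value."""
    series = np.asarray(series, dtype=float)
    # every run of window+1 consecutive values; the last column becomes the target
    frames = sliding_window_view(series, window + 1)
    return frames[:, :-1], frames[:, -1]

# Illustrative values only
X, y = make_windows([35, 32, 30, 31, 44, 29, 45], window=3)
print(X.shape, y.shape)
```

For a series of n observations and a window of w lags this yields n - w samples, which is the same count the loop version produces.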
Atul September 8, 2019 at 2:28 pm #

Hi Jason,
Thanks for this simplified version of TSF methods, it is really helpful. Can you please suggest an approach for a scenario I'm working on? I have a TSF problem where I need to predict hourly values for the next 3 to 4 months, and I have hourly data for the past 6 months. Going by the examples above, the size of the Y (output) vector will be huge. How can this be simplified?

Reply
- Jason Brownlee September 9, 2019 at 5:12 am #
  
  I recommend following this framework:
  https://machinelearningmastery.com/how-to-develop-a-skilful-time-series-forecasting-model/
  
  Reply
Julian September 17, 2019 at 7:20 pm #

Hi Jason,

thanks for your work!

I calculated the MSE for the testing data and the variance, and got a lower variance for the LSTM and CNN-LSTM, so a constant linear function from the mean of test_y would describe the data better than our NNs, right?

Do you have the same experience?

Reply
- Jason Brownlee September 18, 2019 at 5:59 am #
  
  Sorry, I don’t follow. What results did you get exactly?
  
  Reply
Alejandro September 25, 2019 at 5:19 am #

Hi Jason!!
Thank you for the course, it was very easy to follow and helpful. My model tries to predict the births on a specific day from the values of the previous days. To measure prediction accuracy I used RMSE, but I get a similar error for all the models (a value of approximately 6). I tried modifying the hyperparameters without any improvement. Is this because of the randomness of the dataset?

Reply
- Jason Brownlee September 25, 2019 at 6:06 am #
  
  If you cannot get better than persistence with a range of models, it might be the case that there is no pattern to learn?
  
  Perhaps explore additional models?
  Perhaps explore additional data transforms?
  Perhaps explore additional framings of the problem?
  
  Reply
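A persistence forecast, as mentioned in the reply above, simply predicts that the next value equals the previous one. The sketch below shows how such a baseline RMSE could be computed; the function name `persistence_rmse` and the numbers are illustrative assumptions, not results from the births dataset.

```python
import numpy as np

def persistence_rmse(series):
    """RMSE of predicting each value with the value one step before it."""
    series = np.asarray(series, dtype=float)
    predictions = series[:-1]   # the value at t-1 ...
    actual = series[1:]         # ... is used as the forecast for t
    return float(np.sqrt(np.mean((actual - predictions) ** 2)))

# Illustrative values only
baseline = persistence_rmse([35, 32, 30, 31, 44])
print(round(baseline, 3))
```

If a neural network cannot beat this number on the same test period, that is a hint the model has not learned anything useful beyond the last observation.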
Vivaka Nand October 9, 2019 at 6:32 pm #

Hi Jason Brownlee, thank you for informative articles.

I have a question: have you written an article on a CNN-LSTM classifier for univariate time series forecasting? If you have, please share the link.

Reply
- Jason Brownlee October 10, 2019 at 6:54 am #
  
  Yes, I have many.
  
  Perhaps start with the simple examples here:
  https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/
  
  Reply
AlexNage November 1, 2019 at 5:34 am #

hello Jason, Thank you for all the effort you do and for the informative tutorials.

I am wondering if you have a complete tutorial for multivariate time series classification (not regression). If you have, can you share the link please?

I am new to time series forecasting and I have a dataset measured over 3 months for around 50 variables. The output is binary, either 1 or 0 (failure or no failure).

I want to predict if there will be a failure in the next 7 days.
I am struggling with the data pre-processing and the sliding window. Can you help me please?

Many Thanks,

Reply
- Jason Brownlee November 1, 2019 at 5:43 am #
  
  Yes, see the human activity recognition tutorials here:
  https://machinelearningmastery.com/start-here/#deep_learning_time_series
  
  Reply
  - AlexNage November 2, 2019 at 12:31 am #
    
    hello Jason, Thanks for the reply.
    
    So I looked at the tutorial, but my concern is that in the tutorial the data is already pre-processed.
    
    In my case I want to predict the failure of a machine in the next 7 days. I have data recorded per day for thousands of different machines, with a separate file for each day. Each file contains thousands of observations for thousands of unique machines, so in a single file you won’t find more than one observation for the same machine. I also have a label for whether the machine failed on that day or not.
    
    So my question is: should I concatenate or bind all the files together so that I have a data frame of measurements for a period of, for example, 1 month, and then transform it?
    
    And what will be the size of my window? Is it something that I should shape the data into, or is it only a parameter that I pass to the model (like n_timesteps and n_features)?
    I am really confused by the concept of window size and how to choose a suitable one. Can you advise me please?
    
    Many Thanks,
    
    Reply
    - Jason Brownlee November 2, 2019 at 6:46 am #
      
      Your problem sounds like a time series classification task.
      
      Yes, the data will have to be prepared for your preferred framing. You can choose how you want to prepare it, and write custom code to load it – e.g. a data generator.
      
      I recommend testing different sized windows, learning per machine or across machines, etc. Try many different framings of the problem in order to discover what works best for your specific dataset.
      
      Reply
      - AlexNage November 2, 2019 at 7:59 am #
        
        I am sorry but I am not sure what you mean by a data generator. Can you explain more?
        
        Thanks a lot
      - Jason Brownlee November 3, 2019 at 5:41 am #
        
        Sure, more on generators in general here:
        https://wiki.python.org/moin/Generators
        
        I have a number of examples on the blog, perhaps see the example under “progressive loading” here:
        https://machinelearningmastery.com/develop-a-deep-learning-caption-generation-model-in-python/
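To make the idea of a data generator from this exchange more concrete: a generator is a function that yields one batch at a time instead of returning the whole dataset. The sketch below builds sliding windows lazily from a single in-memory series. The name `window_batches`, its parameters and the sample data are all hypothetical; in the multi-file scenario described above, the loading of each daily file would happen inside the loop, which is left out here.

```python
import numpy as np

def window_batches(series, n_timesteps, batch_size):
    """Yield (X, y) batches of sliding windows without building them all at once."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_timesteps
    for start in range(0, n_samples, batch_size):
        stop = min(start + batch_size, n_samples)
        # each row holds n_timesteps consecutive values
        X = np.array([series[i:i + n_timesteps] for i in range(start, stop)])
        # the target for each row is the value that follows its window
        y = series[start + n_timesteps:stop + n_timesteps]
        yield X, y

# Illustrative values only
batches = list(window_batches(range(10), n_timesteps=3, batch_size=4))
print([X.shape for X, _ in batches])
```

Here `n_timesteps` plays the role of the window size: it is both the shape the data is prepared in and the value later passed to the model's input shape.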
Thomas Fitzpatrick February 1, 2020 at 2:39 am #

In the case of CNN vs RNN in time series work, it looks as if RNN may have the edge. RNN is fed data then determines an outcome, and continues to do this. Then, it can be improved with LSTM remembering things that have happened in the past and finding patterns across time to make its next guesses better.

Reply
- Jason Brownlee February 1, 2020 at 5:58 am #
  
  The model that works well/best varies from problem to problem.
  
  Reply
Somesh Kumar Yadav April 2, 2020 at 11:41 pm #

1. I think that the CNN way of using different filters for learning different factors can be used in time series forecasting to learn different aspects of Time series sequence like seasonality, patterns at different time stages.

2. I think an RNN seems to be the best model for a time series sequence, as it takes the previous state as input, which reflects the idea that the current state of a time series depends on previous states (that may be true). It also takes input at each step, which represents the various factors that affect the sequence. I know it has some problems, though, such as exploding gradients.

Reply
- Jason Brownlee April 3, 2020 at 6:54 am #
  
  Nice work.
  
  Reply
Lean April 17, 2020 at 3:00 am #

Lesson 02: How to Transform Data for Time Series

import numpy
import pandas

# build (X, y) pairs: each X row holds `ventana` consecutive values, y is the value that follows
def create_dataset(dataset, ventana):
    dataX, dataY = [], []
    for i in range(len(dataset)-ventana-1):
        a = dataset[i:(i+ventana), 0]
        dataX.append(a)
        dataY.append(dataset[i + ventana, 0])
    return numpy.array(dataX), numpy.array(dataY)

# load the dataset
dataframe = pandas.read_csv('https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv', usecols=[1], engine='python')
dataset = dataframe.values
dataset = dataset.astype('float32')
c_dataset = create_dataset(dataset, 5)
print(c_dataset)

Reply
- Jason Brownlee April 17, 2020 at 6:26 am #
  
  Well done!
  
  Reply
Daria April 27, 2020 at 5:52 pm #

Comment for lesson 1: Convolutional neural networks automatically extract features from the input provided, and if we think of an analogy between multivariate inputs and 2D or 3D images, they have the capability to handle multi-dimensional inputs. RNNs have the capability to handle noise and missing values and to learn non-linear dependencies, and they can also handle multivariate inputs.

Reply
- Jason Brownlee April 28, 2020 at 6:43 am #
  
  Well done!
  
  Reply
Daria April 27, 2020 at 6:30 pm #

Lesson 2 input preparation func:

def prepare_ts_input(ts_data, window):
    ts_df = pd.DataFrame(columns = ['x', 'y'])
    for i in range(0, ts_data.shape[0]-window, window):
        row = [ts_data[i:i+window].values, ts_data[i+window]]
        ts_df.loc[len(ts_df)] = row
    return ts_df

Reply
- Jason Brownlee April 28, 2020 at 6:44 am #
  
  Well done!
  
  Reply
Daria April 28, 2020 at 12:26 am #

For the lesson2 – prepare the input data set

def prepare_ts_input(ts_data, window):
    ts_df = pd.DataFrame(columns = ['x', 'y'])
    for i in range(0, ts_data.shape[0]-window):
        row = [ts_data[i:i+window].to_list(), ts_data[i+window]]
        ts_df.loc[len(ts_df)] = row
    return ts_df

Reply
- Jason Brownlee April 28, 2020 at 6:48 am #
  
  Great work!
  
  Reply
Syed Nazir May 18, 2020 at 6:15 am #

I have used sliding window size = 5. My lesson 2 answer is: 42.53852

Reply
- Jason Brownlee May 18, 2020 at 6:24 am #
  
  Well done!
  
  Reply
giovanna May 26, 2020 at 5:29 am #

import os
import csv
import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv")

date=df["Date"]

births=df["Births"]

x= births[0:200]

y= births[200]

Reply
- Jason Brownlee May 26, 2020 at 6:31 am #
  
  Well done!
  
  Reply
BN June 14, 2020 at 1:16 am #

Hi Jason,
Lesson 2:
I’m lazy and I prefer numpy.
I took your super routine:

def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
# from https://machinelearningmastery.com/convert-time-series-supervised-learning-problem-python/

and so with .tolist() my code

from numpy import genfromtxt
my_data = genfromtxt('femtotal.csv', delimiter=',', skip_header=1)
# drop the date column
my_data = my_data[:,1]
print(my_data)
print(my_data.shape)
erg = series_to_supervised(my_data.tolist())
print(erg.head())

Reply
- Jason Brownlee June 14, 2020 at 6:35 am #
  
  Nice work!
  
  Reply
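The body of `series_to_supervised` is not shown in the comment above, so the following is not that function. It is only a simplified sketch of the same general idea (building lag columns with `DataFrame.shift` and dropping incomplete rows), under the assumption of a univariate series. The name `frame_with_shift` and the sample values are made up for illustration.

```python
import pandas as pd

def frame_with_shift(values, n_in=1):
    """Build lag columns with shift() and drop the rows that contain NaN."""
    df = pd.DataFrame({'t': values})
    # columns ordered oldest lag first: t-n_in, ..., t-1, then the target t
    cols = {f't-{lag}': df['t'].shift(lag) for lag in range(n_in, 0, -1)}
    cols['t'] = df['t']
    return pd.DataFrame(cols).dropna().reset_index(drop=True)

# Illustrative values only
framed = frame_with_shift([35, 32, 30, 31, 44], n_in=2)
print(framed)
```

The lag columns form the input X and the final column `t` is the output y, so `framed.values[:, :-1]` and `framed.values[:, -1]` give the two arrays a Keras model expects.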
BN June 17, 2020 at 7:36 pm #

Hi Jason,

Lesson 3:
daily-total-female-birth.csv

Input/Lay1/Lay2/Lay3   epochs   train   test   MSE_train   MSE_test
3/3/12/1               400      2/3     1/3    50.70       49.98
3/3/8/1                400      2/3     1/3    55.70       49.53
3/12/12/1              400      2/3     1/3    50.29       46.68
3/12/8/1               300      2/3     1/3    55.51       51.72
3/12/8/1               400      2/3     1/3    50.53       50.64
3/12/8/1               500      2/3     1/3    53.04       51.32

I have a fundamental problem with the reproducibility of the MSEs.
Does Keras use random numbers (without a fixed seed)?

Thanks
Béla

Reply
- Jason Brownlee June 18, 2020 at 6:22 am #
  
  It is to be expected, see this:
  https://machinelearningmastery.com/faq/single-faq/why-do-i-get-different-results-each-time-i-run-the-code
  
  Reply
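Because the weights of a neural network are initialised randomly, each training run gives a slightly different MSE. One common way to deal with this is to repeat the evaluation several times and report the mean and standard deviation. The sketch below shows the bookkeeping only: `fake_fit_and_score` is a made-up stand-in that simulates noise, and in practice it would be replaced by code that fits the Keras model and returns the test MSE.

```python
import random
import statistics

def summarize_repeats(evaluate_once, n_repeats=10):
    """Run a stochastic evaluation several times and return (mean, stdev, scores)."""
    scores = [evaluate_once() for _ in range(n_repeats)]
    return statistics.mean(scores), statistics.stdev(scores), scores

# Stand-in for "fit the network and return the test MSE"; the noise is simulated
rng = random.Random(1)

def fake_fit_and_score():
    return 50.0 + rng.gauss(0, 2)

mean_mse, std_mse, scores = summarize_repeats(fake_fit_and_score, n_repeats=5)
print('MSE: %.2f (+/- %.2f)' % (mean_mse, std_mse))
```

Comparing configurations on the mean of repeated runs, rather than on a single run, makes differences such as those in the table above easier to judge.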
Massimiliano Porzio August 17, 2020 at 7:27 pm #

Lesson 1:
Both RNNs and CNNs can learn long-term patterns because of their ability to use a “memory” state (RNN) and to see patterns through filters (CNN), from short- to long-term sequences.

Reply
- Jason Brownlee August 18, 2020 at 6:02 am #
  
  Nice work.
  
  Reply
AZI September 10, 2020 at 7:47 am #

Hello,
Can you please share your email?
I have some important questions about LSTM.

thank you

Reply
- Jason Brownlee September 10, 2020 at 1:34 pm #
  
  You can contact me any time right here:
  https://machinelearningmastery.com/contact/
  
  Reply
Amir October 6, 2020 at 3:56 am #

Hi Jason,
Many thanks for this great tutorial
Do you have any post on time series prediction with transformers and attention mechanism?
Thank you

Reply
- Jason Brownlee October 6, 2020 at 7:00 am #
  
  Not at this stage.
  
  Reply
daniele baranzini December 5, 2020 at 4:32 am #

Lesson 1 comment:

A property of RNNs and CNNs for time series is their neutral stance towards the i.i.d. assumption.

Apparently they can treat correlated or independent observations without derailing the model skill.

(Interestingly, examples of ML applications violating the i.i.d. assumption are still common, though.)

Reply
- Jason Brownlee December 5, 2020 at 8:11 am #
  
  Well done.
  
  Reply
Ramesh Ravula January 23, 2021 at 9:50 pm #

Here are the answers to lesson 1:
(1) Recurrent NNs are beneficial for time series forecasting problems because they allow to make reliable predictions on time series data even though the data is sequential. RNNs exhibit similar behavior to how human brains function. They are robust to noise.
(2) CNNs – the ability of CNNs to learn and automatically extract features from raw input data can be applied to time series forecasting problems.

Reply
- Jason Brownlee January 24, 2021 at 5:59 am #
  
  Well done!
  
  Reply
Ramesh Ravula January 25, 2021 at 9:18 pm #

Here is the answer to lesson 2:

How does one convert to a time series data? Do you have the right code and can I look at it?

# load and summarize the dataset
from pandas import read_csv
from sklearn.model_selection import train_test_split
# load the dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv'
df = read_csv(url, header=None)
# retrieve the array
data = df.values
# split into input and output elements
X, y = data[:, :1], data[:, 1]
# summarize the shape of the dataset
print(X)
print(y)
print(X.shape, y.shape)

Reply
- Jason Brownlee January 26, 2021 at 5:53 am #
  
  Well done.
  
  Reply
Ramesh Ravula January 27, 2021 at 7:54 pm #

Here is the answer to lesson 3:

# univariate mlp example
from numpy import array
from pandas import read_csv
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
# define dataset

url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv'
df = read_csv(url, header=None)
# retrieve the array
data = df.values
# split into input and output elements
X, y = data[:, :1], data[:, 1]
# summarize the shape of the dataset
print(X)
print(y)
print(X.shape, y.shape)

# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# summarize the shape of the train and test sets
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

X = array([[10, 20, 30], [20, 30, 40], [30, 40, 50], [40, 50, 60]])
y = array([40, 50, 60, 70])
# define model
model = Sequential()
model.add(Dense(100, activation='relu', input_dim=3))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([50, 60, 70])
x_input = x_input.reshape((1, 3))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Reply
- Jason Brownlee January 28, 2021 at 5:55 am #
  
  Well done!
  
  Reply
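One thing to keep in mind with the split used in the answer above: scikit-learn's `train_test_split` shuffles rows by default, which mixes past and future observations. For time series it is common to keep the order so that every test observation comes after the training data, either by passing `shuffle=False` or by slicing. The helper below is a minimal sketch of the slicing approach; the name `chronological_split` and the values are illustrative assumptions.

```python
def chronological_split(values, test_fraction=0.33):
    """Split a sequence into train and test parts without shuffling."""
    n_test = int(len(values) * test_fraction)
    split_at = len(values) - n_test
    # everything before split_at is training data, everything after is test data
    return values[:split_at], values[split_at:]

# Illustrative values only
train, test = chronological_split([35, 32, 30, 31, 44, 29, 45, 43, 38])
print(train, test)
```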
Ramesh Ravula January 28, 2021 at 7:49 pm #

Lesson 4:
I am getting the following attribute error: ‘list’ object has no attribute ‘reshape’ when trying to predict ‘yhat’. I didn’t want to include the program since it did not run successfully. Trying to fix the bug.

Reply
- Jason Brownlee January 29, 2021 at 6:01 am #
  
  Sorry to hear that, perhaps these tips will help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
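The failing code is not shown in the comment above, so the cause can only be guessed. An error of the form "'list' object has no attribute 'reshape'" usually means a plain Python list was used where a NumPy array was expected. A minimal sketch of the usual fix, with made-up values:

```python
import numpy as np

window = [44, 29, 45]           # a plain Python list has no .reshape method
x_input = np.array(window)      # convert to a NumPy array first
# [timesteps] -> [samples, timesteps, features], the 3-D layout Conv1D and LSTM layers expect
x_input = x_input.reshape((1, len(window), 1))
print(x_input.shape)
```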
Ramesh Ravula January 28, 2021 at 8:33 pm #

I was able to fix the bug. Here is Lesson 4:

# univariate cnn example
from numpy import array
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Conv1D
from keras.layers import MaxPooling1D
from sklearn.model_selection import train_test_split
# load the dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv'
df = read_csv(url, header=None)
# retrieve the array
data = df.values
# split into input and output elements
X, y = data[:, :1], data[:, 1]
# summarize the shape of the dataset
print(X.shape, y.shape)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# summarize the shape of the train and test sets
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

# define dataset
X = array([[35,32,30], [32,30,31], [30,31,44], [31,44,29]])
y = array([[31,44,29,45]])

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [35, 32, 30, 31, 44, 29, 45, 43, 38]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
#print(X)
#print(y)

# reshape from [samples, timesteps] into [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))

# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(3, 1)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=1000, verbose=0)
# demonstrate prediction
x_input = array([44, 29, 45])
#print(x_input)
x_input = x_input.reshape((1, 3, 1))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Reply
Ramesh Ravula January 28, 2021 at 8:35 pm #

Here is lesson 5:

# univariate lstm example
from numpy import array
from keras.models import Sequential
from pandas import read_csv
from sklearn.model_selection import train_test_split
from keras.layers import LSTM
from keras.layers import Dense

# load the dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv'
df = read_csv(url, header=None)
# retrieve the array
data = df.values
# split into input and output elements
X, y = data[:, :1], data[:, 1]

# summarize the shape of the dataset
print(X.shape, y.shape)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# summarize the shape of the train and test sets
#print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

# define dataset
X = array([X_train])
y = array([y_train])
#X = array([[35,32,30], [32,30,31], [30,31,44], [31,44,29]])
#y = array([[31,44,29,45]])

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [35, 32, 30, 31, 44, 29, 45, 43, 38]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
#print(X)
#print(y)

# reshape from [samples, timesteps] into [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))
print(X)

# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(3, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=1000, verbose=0)
# demonstrate prediction
x_input = array([44, 29, 45])
#x_input = array([50, 60, 70])
x_input = x_input.reshape((1, 3, 1))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Reply
Joao Souza Neto February 1, 2021 at 8:08 am #

Dear Jason,
I’ve bought many books from you, and I am quite satisfied with them.
Currently, I am involved with Deep Learning, and I am looking for a book that shows how to implement DL with WEKA.
Do you have any material on that?
Best
Souza

Reply
- Jason Brownlee February 1, 2021 at 9:06 am #
  
  Thanks!
  
  Sorry, I don’t have material on DL in Weka.
  
  Reply
Naveen April 8, 2021 at 3:13 am #

For lesson 1: A capability of both CNNs and RNNs that may be beneficial for time series forecasting is learning a mapping function from inputs over time to the output.

Reply
- Jason Brownlee April 8, 2021 at 5:11 am #
  
  Nice work.
  
  Reply
Hussain April 22, 2021 at 12:56 pm #

from pandas import read_csv

Lesson 2, the easiest way to split data in X and y.

df = read_csv('daily-total-female-births-CA.csv', header=None)

data = df.values

# input output data

X, y = data[:, :1], data[:, 1]

X.shape

Reply
- Jason Brownlee April 23, 2021 at 4:57 am #
  
  Nice work.
  
  Reply
llinet April 28, 2021 at 1:34 am #

I am trying to answer the first lesson's homework. I found this paper with an interesting perspective on time series, and I have quoted the part that I think is the answer.

learn multiple discriminative features:
“an intuition behind applying convolution several filters on an input time series would be to learn multiple discriminative features useful for the classification task”

Ismail Fawaz, H., Forestier, G., Weber, J. et al. Deep learning for time series classification: a review. Data Min Knowl Disc 33, 917–963 (2019). https://doi.org/10.1007/s10618-019-00619-1

please forgive my english

Reply
- Jason Brownlee April 28, 2021 at 6:03 am #
  
  Well done!
  
  Reply
llinet April 30, 2021 at 6:13 am #

import pandas as pd
import numpy as np

# read file
df = pd.read_csv("daily-total-female-births.csv")

# get the series
serie = df["Births"].values

# the lag is the variable "step"
def create_seq(serielista, step):
    lista = []
    for i in range(0, len(serielista)-step):
        lista.append(serielista[i:i+step])
    return lista

#create the seq
create_seq(serie,4)

Reply
- Jason Brownlee May 1, 2021 at 5:58 am #
  
  Well done.
  
  Reply
llinet April 30, 2021 at 6:57 am #

lesson 3: I get an error of 39.07 using 80% of the data for training and 20% for testing

from keras.models import Sequential
from keras.layers import Dense
# define dataset
X = df.loc[:289,0:2]
y = df.loc[:289,3]
X_test = df.loc[289:,0:2]
y_test = df.loc[289:,3]
# define model
model = Sequential()
model.add(Dense(100, activation='relu', input_dim=3))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction

yhat = model.predict(X_test, verbose=0)
print(yhat)

Reply
- Jason Brownlee May 1, 2021 at 5:58 am #
  
  Nice work.
  
  Reply
Prashant Katiyar May 6, 2021 at 5:46 pm #

Hi Jason, I am not sure whether this is an appropriate forum for this question, but I would like your suggestions on an issue. I am working on time series data for a travel and lifestyle call centre. COVID affected this industry very badly, and although it is now getting back on track, 2020 is effectively an outlier. What are some ways to treat this data so that the accuracy of the model remains good?

Reply
- Jason Brownlee May 7, 2021 at 6:25 am #
  
  This might give you some ideas:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-model-anomaly-detection
  
  Reply
Robert July 19, 2021 at 8:21 pm #

Dear Jason

I was inspired by your materials and those of Kaggle. Very good lessons for beginners.

Lesson 2 First we need to download the file from the URL:

import requests

path = 'daily-total-female-births.csv'
link = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv?__s=ot1xapc4zxmlwkxq2v86&utm_source=drip&utm_medium=email&utm_campaign=DLFTSF+Mini-Course&utm_content=Day+2%3A+How+to+Transform+Data+for+Time+Series'

try:
    data = requests.get(link).text
    with open(path, 'w') as output:
        output.write(data)

    print('File downloaded :)')
# network and HTTP problems raised by requests derive from RequestException
except requests.exceptions.RequestException:
    print('Could not download file !!')

Now we can continue the task from lesson 2:

import pandas
from numpy import array

path = 'daily-total-female-births.csv'

# loads only column with index 1 from daily-total-female-births.csv file
dataframe = pandas.read_csv(path, usecols=['Births'], engine='python')

print(dataframe.head())

dataset = dataframe.values

# split a sequence into samples
def window_creator_1(dataset, shift):
    X, y = list(), list()
    for i in range(len(dataset)):
        if (i + shift) < (len(dataset) - 1):
            end_ix = i + shift
            # gather input and output parts of the pattern
            seq_x, seq_y = dataset[i:end_ix, 0], dataset[end_ix, 0]
            X.append(seq_x)
            y.append(seq_y)

    return array(X), array(y)

print(window_creator_1(dataset, 4))

Reply
- Jason Brownlee July 20, 2021 at 5:34 am #
  
  Well done!
  
  Reply
Robert July 20, 2021 at 6:40 pm #

lesson 3

I was inspired by your articles 🙂

import pandas
import numpy
import keras
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from sklearn.preprocessing import MinMaxScaler

print(keras.__version__)

path = 'daily-total-female-births.csv'

# loads only column with index 1 from daily-total-female-births.csv file
dataframe = pandas.read_csv(path, usecols=['Births'], engine='python')
print(dataframe.head())

# takes values from the dataset and returns a numpy array
dataset = dataframe.values
dataset = dataset.astype('float32')

# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

# split into train and test sets
train_size = int(len(dataset) * 0.8)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size], dataset[train_size:len(dataset)]

# split a sequence into samples
def window_creator(dataset, shift):
    X, y = list(), list()
    for i in range(len(dataset)):
        if (i + shift) < (len(dataset) - 1):
            end_ix = i + shift
            seq_x, seq_y = dataset[i:end_ix, 0], dataset[end_ix, 0]
            X.append(seq_x)
            y.append(seq_y)

    return numpy.array(X), numpy.array(y)

# reshape into X=t and Y=t+1
shift = 8
trainX, trainY = window_creator(train, shift)
testX, testY = window_creator(test, shift)

print('trainX.shape ',trainX.shape)
print('trainY.shape ',trainY.shape)
print('testX.shape ',testX.shape)
print('testY.shape ',testY.shape)

# define model
model = Sequential()

model.add(Dense(64,activation = 'relu', input_shape=(trainX.shape[1],)))
model.add(Dropout(0.2))
model.add(Dense(64,activation = 'relu'))
model.add(Dropout(0.2))
model.add(Dense(64,activation = 'relu'))
model.add(Dropout(0.2))
model.add(Dense(64,activation = 'relu'))
model.add(Dropout(0.2))
model.add(Dense(1))

model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])

model.summary()

model.fit(trainX, trainY, epochs=300, batch_size=16, verbose=0)

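# evaluate() returns the loss (MSE) and the compiled metric (MAE), not an accuracy
test_loss, test_mae = model.evaluate(testX, testY)

# both values are computed on the scaled (0-1) data, so they are not percentages
print("Test loss (MSE, scaled): %.4f" % test_loss)
print("Test MAE (scaled): %.4f" % test_mae)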

# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)

trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])

# shift test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict)+(shift*2)+1:len(dataset)-1, :] = testPredict

# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(testPredictPlot)
plt.show()

# compares predicted values with actual values
import colored
import math

j = 0
test = scaler.inverse_transform(dataset)
test = test.astype(int)

print('=' * 85)
for i in range(len(testPredictPlot)):
    predict = testPredictPlot[i,0]
    x = float(str(predict))
    is_nan = math.isnan(x)

    if is_nan is False:
        if test[i, 0] != predict.astype(int):
            print(colored.fg("black") + str(i+1) + '|',test[i, 0],'|', predict.astype(int),'|')
            print('-' * 85)
        else:
            print(colored.fg("red") + str(i+1) + '|',test[i, 0], '|', predict.astype(int),'|')
            print(colored.fg("black") + '-' * 85)
            j+=1

print(colored.fg("green") + '100% accurate predictions: ',j)

Reply
- Jason Brownlee July 21, 2021 at 5:44 am #
  
  Nice work!
  
  Reply
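The answer above fits `MinMaxScaler` on the whole dataset before splitting it. A common alternative is to fit the scaling on the training portion only, so that no information from the test period leaks into training; with scikit-learn that means calling `fit` on the training data and `transform` on both parts. The sketch below shows the same idea with plain NumPy. The helper names and the values are made up for illustration.

```python
import numpy as np

def fit_minmax(train):
    """Return the min and max of the training data only."""
    train = np.asarray(train, dtype=float)
    return train.min(), train.max()

def apply_minmax(values, lo, hi):
    """Scale values with previously fitted bounds; test values may fall outside [0, 1]."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

# Illustrative values only
series = [30, 40, 50, 35, 60]
train, test = series[:4], series[4:]
lo, hi = fit_minmax(train)
train_scaled = apply_minmax(train, lo, hi)
test_scaled = apply_minmax(test, lo, hi)
print(train_scaled, test_scaled)
```

Scaled test values outside the 0 to 1 range are expected with this approach whenever the test period contains values the training period never saw.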
Robert July 21, 2021 at 8:26 pm #

lesson 4

import pandas
import numpy
import keras
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers import Conv1D
from keras.layers import MaxPooling1D
from sklearn.preprocessing import MinMaxScaler

print(keras.__version__)

path = 'daily-total-female-births.csv'

# loads only column with index 1 from daily-total-female-births.csv file
dataframe = pandas.read_csv(path, usecols=['Births'], engine='python')
print(dataframe.head())

# takes values from the dataset and returns a numpy array
dataset = dataframe.values
dataset = dataset.astype('float32')

# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

# split into train and test sets
train_size = int(len(dataset) * 0.8)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size], dataset[train_size:len(dataset)]

# split a sequence into samples
def window_creator(dataset, shift):
    X, y = list(), list()
    for i in range(len(dataset)):
        if (i + shift) < (len(dataset) - 1):
            end_ix = i + shift
            seq_x, seq_y = dataset[i:end_ix, 0], dataset[end_ix, 0]
            X.append(seq_x)
            y.append(seq_y)

    return numpy.array(X), numpy.array(y)

# reshape into X=t and Y=t+1
shift = 8
trainX, trainY = window_creator(train, shift)
testX, testY = window_creator(test, shift)

print('trainX.shape ',trainX.shape)
print('trainY.shape ',trainY.shape)
print('testX.shape ',testX.shape)
print('testY.shape ',testY.shape)

# reshape from [samples, timesteps] into [samples, timesteps, features]
trainX = trainX.reshape(trainX.shape[0], trainX.shape[1], 1)
testX = testX.reshape(testX.shape[0], testX.shape[1], 1)

print('trainX.shape ',trainX.shape)
print('testX.shape ',testX.shape)

# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=7, activation='relu', padding = 'same', input_shape=(trainX.shape[1], trainX.shape[2])))
model.add(Conv1D(filters=64, kernel_size=3, padding = 'same', activation='relu'))
model.add(Conv1D(filters=64, kernel_size=2, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())

model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1))

model.compile(optimizer='adam', loss='mse', metrics=['mae'])

model.summary()

history = model.fit(trainX, trainY, epochs=300, batch_size=4, verbose=1)

test_loss, test_mae = model.evaluate(testX, testY)

print("Test loss: %.2f%%" % (test_loss*100))
print("Test mae: %.2f%%" % (test_mae*100))

# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])

# shift test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict)+(shift*2)+1:len(dataset)-1, :] = testPredict

# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(testPredictPlot)
plt.show()

# compares predicted values with actual values
import colored
import math

j = 0
test = scaler.inverse_transform(dataset)
test = test.astype(int)

print('=' * 85)
for i in range(len(testPredictPlot)):
    predict = testPredictPlot[i,0]
    x = float(str(predict))
    is_nan = math.isnan(x)

    if is_nan is False:
        if test[i, 0] != predict.astype(int):
            print(colored.fg("black") + str(i+1) + '|',test[i, 0],'|', predict.astype(int),'|')
            print('-' * 85)
        else:
            print(colored.fg("red") + str(i+1) + '|',test[i, 0], '|', predict.astype(int),'|')
            print(colored.fg("black") + '-' * 85)
            j+=1

print(colored.fg("green") + '100% accurate predictions: ',j)
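
A note on the plotting offset `len(trainPredict)+(shift*2)+1` used above: the sketch below checks the index arithmetic with NumPy only. It assumes a series of 365 values and uses dummy data in which every value equals its own index; `window_creator` is copied from the comment, and everything else is illustrative.

```python
import numpy as np

def window_creator(dataset, shift):
    # Same logic as in the comment above: the last possible window is skipped.
    X, y = [], []
    for i in range(len(dataset)):
        if (i + shift) < (len(dataset) - 1):
            X.append(dataset[i:i + shift, 0])
            y.append(dataset[i + shift, 0])
    return np.array(X), np.array(y)

n, shift = 365, 8   # assumed series length; shift as in the lesson 4 code
dataset = np.arange(n, dtype="float32").reshape(-1, 1)   # dummy data: value == index
train_size = int(n * 0.8)
train, test = dataset[:train_size], dataset[train_size:]

trainX, trainY = window_creator(train, shift)
testX, testY = window_creator(test, shift)

# The first test target sits `shift` steps after the start of the test split.
start = train_size + shift
# The plotting code reaches the same index from the number of train predictions,
# because len(trainY) == train_size - shift - 1.
start_from_plot_code = len(trainY) + (shift * 2) + 1
end = n - 1

print(start, start_from_plot_code, end - start, len(testY))
```

With these assumptions both start indices agree, and the slice `start:end` has exactly one slot per test prediction.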

Reply
- Jason Brownlee July 22, 2021 at 5:36 am #
  
  Nice work!
  
  Reply
Anitha July 24, 2021 at 3:09 am #

Hi,
Thanks for the mini course. I think the capability that CNN and RNN both have in time series forecasting is “auto detecting features”

Reply
- Jason Brownlee July 24, 2021 at 5:16 am #
  
  Yes
  
  Reply
Robert July 25, 2021 at 11:57 pm #

lesson 5

This time I used TimeseriesGenerator from the keras package 🙂

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import keras

from keras.preprocessing.sequence import TimeseriesGenerator
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM

from sklearn.preprocessing import MinMaxScaler

print(keras.__version__)

path = 'daily-total-female-births.csv'

# loads only column with index 1 from daily-total-female-births.csv file
dataframe = pd.read_csv(path, usecols=['Births'], engine='python')
print(dataframe.head())

# takes values from the dataset and returns a numpy array
dataset = dataframe.values
dataset = dataset.astype('float32')

# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

# split into train and test sets
train_size = int(len(dataset) * 0.8)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size], dataset[train_size:len(dataset)]

shift = 12
n_features = 1

train_generator = TimeseriesGenerator(train, train, length=shift, batch_size=1)
# number of samples
print('Samples train: %d' % len(train_generator))

test_generator = TimeseriesGenerator(test, test, length=shift, batch_size=1)
# number of samples
print('Samples test: %d' % len(test_generator))

model=Sequential()
model.add(LSTM(128,activation='relu',input_shape=(shift,n_features),return_sequences=True))
model.add(LSTM(64,activation='relu',return_sequences=True))
model.add(LSTM(64,activation='relu',return_sequences=True))
model.add(LSTM(16,activation='relu'))
model.add(Dense(1))

model.compile(optimizer='adam', loss='mse', metrics=['mae'])

model.summary()

history = model.fit(train_generator, epochs = 200, batch_size=4, verbose=1)

test_loss, test_mae = model.evaluate(test_generator)

print("Test loss: %.2f%%" % (test_loss*100))
print("Test mae: %.2f%%" % (test_mae*100))

# make predictions
trainPredict = model.predict(train_generator)
print(trainPredict.shape)
testPredict = model.predict(test_generator)
print(testPredict.shape)

# invert predictions.
trainPredict = scaler.inverse_transform(trainPredict)
testPredict = scaler.inverse_transform(testPredict)

# shift test predictions for plotting
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+(shift*2):len(dataset), :] = testPredict

# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(testPredictPlot)
plt.show()

# compares predicted values with actual values
import colored
import math

j = 0
test = scaler.inverse_transform(dataset)
test = test.astype(int)

print('=' * 85)
for i in range(len(testPredictPlot)):
    predict = testPredictPlot[i,0]
    x = float(str(predict))
    is_nan = math.isnan(x)

    if is_nan is False:
        if test[i, 0] != predict.astype(int):
            print(colored.fg("black") + str(i+1) + '|',test[i, 0],'|', predict.astype(int),'|')
            print('-' * 85)
        else:
            print(colored.fg("red") + str(i+1) + '|',test[i, 0], '|', predict.astype(int),'|')
            print(colored.fg("black") + '-' * 85)
            j+=1

print(colored.fg("green") + '100% accurate predictions: ',j)
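
For readers wondering what `TimeseriesGenerator` produces with its default settings, the sketch below rebuilds the same pairing in plain NumPy: sample `i` takes `length` consecutive rows as input and the row that follows as the target. The helper name `make_windows` and the dummy series are assumptions for illustration, not part of Keras.

```python
import numpy as np

def make_windows(data, targets, length):
    # Hypothetical helper: sample i pairs data[i : i + length] with targets[i + length].
    n_samples = len(data) - length
    X = np.array([data[i:i + length] for i in range(n_samples)])
    y = np.array([targets[i + length] for i in range(n_samples)])
    return X, y

series = np.arange(20, dtype="float32").reshape(-1, 1)   # dummy (20, 1) series
X, y = make_windows(series, series, length=12)

print(X.shape, y.shape)   # (8, 12, 1) (8, 1)
print(X[0, :, 0], y[0])   # first window holds values 0..11, its target is 12
```

The number of samples is `len(data) - length`, which is why the lesson 5 code gets one prediction fewer per split than lesson 4's `window_creator` would lose.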

Reply
- Jason Brownlee July 26, 2021 at 5:30 am #
  
  Well done!
  
  Reply
Robert July 30, 2021 at 7:56 pm #

lesson 6

import pandas
import numpy
import keras
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers import TimeDistributed
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

print(keras.__version__)

path = 'daily-total-female-births.csv'

# loads the full daily-total-female-births.csv file (Date and Births columns)
dataframe = pandas.read_csv(path, engine='python')
print(dataframe.head())

# plot dataframe
ax = plt.gca()
dataframe.plot(kind='line',x='Date',y='Births',ax=ax)
# plt.xticks(rotation='vertical')
plt.setp(ax.get_xticklabels(), rotation=45)
plt.grid()
plt.show()

# takes values from the dataset and returns a numpy array
dataset = dataframe['Births'].values
dataset = dataset.reshape(-1,1)
dataset = dataset.astype('float32')

print(dataset.shape)

# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

# split into train and test sets
train_size = int(len(dataset) * 0.8)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size], dataset[train_size:len(dataset)]

# split a sequence into samples
def window_creator(dataset, shift):
    X, y = list(), list()
    for i in range(len(dataset)):
        if (i + shift) < (len(dataset) - 1):
            end_ix = i + shift
            seq_x, seq_y = dataset[i:end_ix, 0], dataset[end_ix, 0]
            X.append(seq_x)
            y.append(seq_y)

    return numpy.array(X), numpy.array(y)

# reshape into X=t and Y=t+1
shift = 4
trainX, trainY = window_creator(train, shift)
testX, testY = window_creator(test, shift)

print('trainX.shape ',trainX.shape)
print('trainY.shape ',trainY.shape)
print('testX.shape ',testX.shape)
print('testY.shape ',testY.shape)

# reshape from [samples, timesteps] into [samples, subsequences, timesteps, features]
sub_seq = shift//2
time_steps = shift//2

trainX = trainX.reshape(trainX.shape[0], sub_seq, time_steps, 1)
testX = testX.reshape(testX.shape[0], sub_seq, time_steps, 1)

print('trainX.shape ',trainX.shape)
print('testX.shape ',testX.shape)

# define model
model = Sequential()
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu'), input_shape=(None, trainX.shape[2], trainX.shape[3])))
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(64, activation='relu', return_sequences=True))
model.add(LSTM(64, activation='relu'))
model.add(Dense(1))

model.compile(optimizer='adam', loss='mse', metrics=['mae'])

model.summary()

# fit model
model.fit(trainX, trainY, epochs=1000, batch_size=16, verbose=0)

test_loss, test_mae = model.evaluate(testX, testY)

print("Test loss: %.2f%%" % (test_loss*100))
print("Test mae: %.2f%%" % (test_mae*100))

# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
# print(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])

# shift train predictions for plotting
trainPredictPlot = numpy.empty_like(dataset)
trainPredictPlot[:, :] = numpy.nan
trainPredictPlot[shift:len(trainPredict)+shift, :] = trainPredict

# shift test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict)+(shift*2)+1:len(dataset)-1, :] = testPredict

# plot baseline and predictions
plt.figure(figsize=(15, 6))
plt.plot(scaler.inverse_transform(dataset),alpha = 0.5, color = 'green', label = 'Dataset')
plt.plot(trainPredictPlot, label = 'Train predictions', color = 'k', linestyle = '--')
plt.plot(testPredictPlot, label = 'Test predictions', color = 'r', linestyle = '--')
plt.legend()
plt.grid()
plt.show()

import math

# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))

# compares predicted values with actual values
import colored

j = 0
test = scaler.inverse_transform(dataset)
test = test.astype(int)

print('=' * 85)
for i in range(len(testPredictPlot)):
    predict = testPredictPlot[i,0]
    x = float(str(predict))
    is_nan = math.isnan(x)

    if is_nan is False:
        if test[i, 0] != predict.astype(int):
            print(colored.fg("black") + str(i+1) + '|',test[i, 0],'|', predict.astype(int),'|')
            print('-' * 85)
        else:
            print(colored.fg("red") + str(i+1) + '|',test[i, 0], '|', predict.astype(int),'|')
            print(colored.fg("black") + '-' * 85)
            j+=1

print(colored.fg("green") + '100% accurate predictions: ',j)
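
The reshape into [samples, subsequences, timesteps, features] is the step that most often trips people up in the CNN-LSTM, so here is a small sketch of what it does to the data. The three dummy windows are made up; `shift`, `sub_seq` and `time_steps` follow the lesson 6 code above.

```python
import numpy as np

shift = 4
sub_seq, time_steps = shift // 2, shift // 2

# Three dummy windows of four consecutive values each: [samples, timesteps]
X = np.array([[1, 2, 3, 4],
              [2, 3, 4, 5],
              [3, 4, 5, 6]], dtype="float32")

# Each window of 4 steps becomes 2 subsequences of 2 steps with 1 feature
X4 = X.reshape(X.shape[0], sub_seq, time_steps, 1)

print(X4.shape)                 # (3, 2, 2, 1)
print(X4[0, 0, :, 0].tolist())  # [1.0, 2.0]  first subsequence of the first window
print(X4[0, 1, :, 0].tolist())  # [3.0, 4.0]  second subsequence of the first window
```

The TimeDistributed CNN then reads each subsequence separately, and the LSTM reads the resulting sequence of subsequence features.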

Reply
- Jason Brownlee July 31, 2021 at 5:35 am #
  
  Well done!
  
  Reply
Anitha August 3, 2021 at 2:32 am #

Hi Jason,

Thank you very much for the very clean, precise and easy-to-understand explanations of very complex algorithms (MLP, CNN, LSTM, etc.). I would like to know about the “Adam version of stochastic gradient descent”. If you could explain it, I would be very thankful.

Reply
- Jason Brownlee August 3, 2021 at 4:53 am #
  
  See this:
  https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
  
  And this:
  https://machinelearningmastery.com/adam-optimization-from-scratch/
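
For a quick feel of what the linked posts describe, here is a minimal sketch of the Adam update rule applied to a toy one-dimensional problem. The function name `adam_minimize`, the toy objective and the step count are assumptions chosen for illustration; the hyperparameter defaults other than the learning rate are the commonly quoted ones.

```python
import numpy as np

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, n_steps=1000):
    # Hypothetical helper: minimise a 1-D function given its gradient.
    x = float(x0)
    m, v = 0.0, 0.0   # running averages of the gradient and the squared gradient
    for t in range(1, n_steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)   # bias correction for the zero initialisation
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3); the minimum is at x = 3
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

In Keras, passing `optimizer='adam'` to `model.compile` applies the same idea to every weight in the network.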
  
  Reply
Victor August 8, 2021 at 1:38 am #

hi Jason

lesson 2

IMPORT LIBS AND READ DATA

import pandas as pd
import numpy as np
import datetime as dt
import plotly.express as px

csv_to_read = './Data/daily-total-female-births.csv'

df = pd.read_csv(csv_to_read, engine='python')

EXPLORE DATA
df.head(5)
df.describe()
px.line(df, x='Date', y='Births', title='Number of daily births')
# Births values vary a lot (+/-16% on average from day to day), so linear methods will hardly give good accuracy…

### monthly trends
df['month'] = pd.to_datetime(df.Date).dt.month
month_gr = df.groupby(['month']).Births.sum()
px.line(month_gr)
# there is a monthly trend with a peak in Sep, so we should include months in our model to account for long-term changes

### variations during a week
df['dayofweek'] = pd.to_datetime(df.Date).dt.dayofweek
week_gr = df.groupby(['dayofweek']).Births.sum()
px.line(week_gr)
# we can see that there are 10-15% fewer births on weekends compared to working days
# this might be due to the limited availability of some obstetricians on weekends
# so we should include weekdays in our model to account for slight changes within a week

INPUT COMPONENTS

# time lags
# using shift and moving average
# i want to make inputs a bit flexible for future experiments with the number of features, time windows etc.

def shift(series, i):
    return series.shift(i)

def rolling(series, i):
    return series.rolling(i).mean().shift(1)

def add_lags(df, cols, n_lags, func, ranged=True):
    # df: dataframe with timeseries
    # cols: one or more columns to make lags
    # n_lags: number of days to the past
    # func: lag function (shift or some kind of averaging)
    # ranged: if True, add multiple lags from t-1 to t-n_lags. If False, just add one t-n_lags

    if type(cols) == str:
        cols = [cols]
    elif type(cols) != list:
        cols = [str(cols)]

    for col in cols:
        col_prefix = col+'_'+ func.__name__ + '_'

        if ranged:
            for i in range(1, n_lags+1):
                col_name = col_prefix + str(i)
                df[col_name] = func(df[col], i)
        else:
            col_name = col_prefix + str(n_lags)
            df[col_name] = func(df[col], n_lags)

    return df

df = add_lags(df, 'Births', 4, shift, ranged=True)
df = add_lags(df, 'Births', 4, rolling, ranged=False)
df.dropna(inplace=True)

# after all transformations df looks as follows
df.head(4)
   Date        Births  month  dayofweek  Births_shift_1  Births_shift_2  Births_shift_3  Births_shift_4  Births_rolling_4
4  1959-01-05  44      1      0          31.0            30.0            32.0            35.0            32.00
5  1959-01-06  29      1      1          44.0            31.0            30.0            32.0            34.25
6  1959-01-07  45      1      2          29.0            44.0            31.0            30.0            33.50
7  1959-01-08  43      1      3          45.0            29.0            44.0            31.0            37.25

LEARNING FORMAT TRANSFORMATION
df.set_index('Date', inplace=True)
y = df.Births.values
X = df[df.columns[1:]].values

X
array([[ 1. , 0. , 31. , …, 32. , 35. , 32. ],
[ 1. , 1. , 44. , …, 30. , 32. , 34.25],
[ 1. , 2. , 29. , …, 31. , 30. , 33.5 ],
…,
[12. , 1. , 52. , …, 34. , 44. , 41.75],
[12. , 2. , 48. , …, 37. , 34. , 42.75],
[12. , 3. , 55. , …, 52. , 37. , 48. ]])
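
As a cross-check of the lag and rolling-mean columns shown in the table above, the sketch below recomputes them with NumPy only. The eight birth counts are read off that table; the loop is an illustrative stand-in for the pandas `shift` and `rolling` calls, not the commenter's code.

```python
import numpy as np

# First eight daily birth counts, read off the table above
births = np.array([35, 32, 30, 31, 44, 29, 45, 43], dtype=float)
n_lags = 4

rows = []
for t in range(n_lags, len(births)):
    lags = [births[t - k] for k in range(1, n_lags + 1)]   # values at t-1 ... t-4
    rolling = births[t - n_lags:t].mean()                  # mean of the 4 previous days
    rows.append(lags + [rolling])

features = np.array(rows)
print(features[0].tolist())   # [31.0, 30.0, 32.0, 35.0, 32.0]
print(features[:, -1].tolist())   # [32.0, 34.25, 33.5, 37.25]
```

The values match the Births_shift_1 to Births_shift_4 and Births_rolling_4 columns for rows 4 to 7.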

Reply
- Jason Brownlee August 8, 2021 at 5:10 am #
  
  Well done!
  
  Reply
Asheesh Mathur October 18, 2021 at 11:32 pm #

Jason,
Thanks for the wonderful tutorial (lesson 2). I tried reading the CSV file (Date, Births). For supervised learning, maybe we can drop the date.

Now for viewing seasonality we should have month at least. Here in India, month makes difference as most marriages are scheduled in Nov-Dec-Jan qtr.

Also I just jumped in this bandwagon few days back.
Had heard about Transformers based models instead of RNN. Also Pytorch Forecasting is making strides.

Please advise.

Reply
- Adrian Tam October 20, 2021 at 9:48 am #
  
  If you are interested in PyTorch forecasting, check out Facebook’s Prophet library.
  
  Reply
ilaijasam November 16, 2021 at 10:16 pm #

I read your article. Thanks for sharing your ideas.

Reply
- Adrian Tam November 17, 2021 at 6:49 am #
  
  Thanks. Glad to know you like it.
  
  Reply
Luis November 21, 2021 at 6:02 am #

#Lesson2:
import pandas as pd
data = pd.read_csv("daily-total-female-births.csv")
X=[]
Y=[]
result=[]
for i in range(len(data)-3):
    X.append(list(data["Births"][i:i+3]))
    Y.append(int(data["Births"][i+3]))
result=pd.concat([pd.Series(X,name='X'),pd.Series(Y,name='Y')],axis=1)

Reply
Luis Gutierrez November 25, 2021 at 6:34 am #

Hi Jason,

Many thanks for the tutorial. I’m still at lesson 03, but what I can see is that the variance of the predictions is much lower than the variance of the test set. Is that normal? What would be the explanation?

Test set:

253 34
254 40
255 56
256 44
257 53
.. ..
357 37
358 52
359 48
360 55
361 50

ytest.var()
Out[100]:
0 52.808189
dtype: float64

Predictions:

253 44.808697
254 40.148918
255 39.204170
256 41.366394
257 44.755791
.. …
357 42.322884
358 40.063240
359 42.569584
360 44.440350
361 43.073093

yhat.var()
Out[101]:
0 3.443268
dtype: float32

Regards

Reply
- Adrian Tam November 25, 2021 at 2:35 pm #
  
  Yes, that is normal. Out-of-sample prediction is hard and is expected to be less accurate.
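
One way to see why the predictions vary less than the test data: a model trained on squared error is pulled toward the average of what it cannot predict, so on a noisy series its outputs tend to stay close to the mean. The sketch below illustrates this on synthetic data, with ordinary least squares standing in for the neural network; the series, the noise level and the split are all assumptions, not the births data.

```python
import numpy as np

rng = np.random.default_rng(0)
series = 42 + rng.normal(0, 7, size=365)   # synthetic: constant level plus noise

# Three lagged values as inputs, the next value as the target
X = np.array([series[i:i + 3] for i in range(len(series) - 3)])
y = series[3:]

split = int(len(X) * 0.7)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Ordinary least squares with an intercept, standing in for the network
A_train = np.column_stack([np.ones(len(X_train)), X_train])
A_test = np.column_stack([np.ones(len(X_test)), X_test])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
y_hat = A_test @ coef

print("variance of test targets:     %.2f" % y_test.var())
print("variance of test predictions: %.2f" % y_hat.var())
```

Under these assumptions the predictions hover around the series mean and their variance is a small fraction of the variance of the targets.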
  
  Reply
  - Luis Gutierrez December 2, 2021 at 6:53 am #
    
    Thanks for your answer. I can see it happens with both the MLP and the CNN.
    
    MLP:
    https://ibb.co/r3n4qbC
    
    CNN:
    https://ibb.co/pL3WPHZ
    
    Regards
    
    Reply
Luis Gutierrez November 25, 2021 at 6:36 am #

BTW, how can I add plots? How can I insert the code in such a beautiful way?

Reply
- Adrian Tam November 25, 2021 at 2:34 pm #
  
  matplotlib?
  
  Reply
Luis Gutierrez December 1, 2021 at 3:38 am #

I mean, how can I add plots and code (not as plain text) here in the comments?

Reply
- Adrian Tam December 2, 2021 at 2:07 am #
  
  Code can be posted using the “pre” HTML tag. For plots, I think you need to upload the image somewhere else and use the “img” HTML tag.
  
  Reply
  - Luis Gutierrez December 2, 2021 at 6:46 am #
    
    Thanks again Adrian!
    
    Reply

Luis Gutierrez December 1, 2021 at 7:14 am #

Hi!

Regarding lesson 03, I still don’t see why we should reshape X. I tried both ways and I get quite similar results.

Regards

Adrian Tam December 2, 2021 at 2:18 am #

I don’t think so. I see Keras complain about a shape mismatch if the reshape is skipped.
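
For reference, the reshape under discussion turns a 2D array of windows into the 3D layout that a `Conv1D` layer declared with `input_shape=(3, 1)` describes. A minimal NumPy sketch follows; the three sample windows are illustrative values only.

```python
import numpy as np

# Three windows of three time steps each: [samples, timesteps]
X = np.array([[35, 32, 30],
              [32, 30, 31],
              [30, 31, 44]], dtype="float32")
print(X.shape)    # (3, 3)

# Add a trailing axis of size 1: [samples, timesteps, features]
X3 = X.reshape((X.shape[0], X.shape[1], 1))
print(X3.shape)   # (3, 3, 1)
```

The values are unchanged; only the trailing features axis is added, so every time step carries exactly one feature.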

Luis Gutierrez December 2, 2021 at 6:45 am #

Thanks Adrian, but check this out:

import pandas as pd
from numpy import array
data = pd.read_csv("C:/Users/luisg/Documents/_DOCTORADO/Ejercicios/Brownlee/daily-total-female-births.csv")
X=[]
Y=[]
result=[]
for i in range(len(data)-3):
    X.append(data["Births"][i:i+3])
    Y.append(int(data["Births"][i+3]))

x=array(X)
y=array(Y)
splitrate=0.7

# define training and test dataset
xtrain=pd.DataFrame(x[:round(len(x)*splitrate)])
xtest=pd.DataFrame(x[round(len(x)*splitrate):],index=range(len(xtrain),len(X)))
ytrain=pd.DataFrame(y[:round(len(x)*splitrate)])
ytest=pd.DataFrame(y[round(len(x)*splitrate):],index=range(len(xtrain),len(X)))

from numpy import array
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D

# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(3, 1)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(xtrain, ytrain, epochs=1000, verbose=0)

# demonstrate prediction
yhat = model.predict(xtest, verbose=0)
yhat = pd.DataFrame(model.predict(xtest, verbose=0),index=range(len(xtrain),len(X)))
print(yhat)
print(yhat.aggregate(['mean','var','mad','quantile']))

import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(15, 5))
plt.plot(ytrain,label='Training Set')
plt.plot(ytest,label='Validation Set')
plt.plot(yhat,label='Predicitons')
ax.legend()
plt.show()


If I run the code above 10 times, I get this:

[[80.625084]]
[[80.59208]]
[[80.59208]]
[[80.59208]]
[[80.59208]]
[[80.59208]]
[[80.59208]]
[[80.59208]]
[[80.59208]]
[[80.59208]]

Then, I make these modifications:

xtrain=x[:round(len(x)*splitrate)]
xtest=x[round(len(x)*splitrate):]
ytrain=pd.DataFrame(y[:round(len(x)*splitrate)])
ytest=pd.DataFrame(y[round(len(x)*splitrate):],index=range(len(xtrain),len(X)))

xtrain = xtrain.reshape((xtrain.shape[0], xtrain.shape[1], 1))
xtest = xtest.reshape((xtest.shape[0], xtest.shape[1], 1))

If I run the code above 10 times, including those modifications, I get this:

[[80.8389]]
[[80.83764]]
[[80.83765]]
[[80.83765]]
[[80.83764]]
[[80.83764]]
[[80.83764]]
[[80.83764]]
[[80.83764]]
[[80.83764]]

Regards

Adrian Tam December 8, 2021 at 5:56 am #

The difference is not big. I think some random factor in the code is causing it.

Reply

Evan March 28, 2022 at 4:20 pm #

Hi,

In practice, when should we use deep learning in lieu of traditional time series methods when doing forecasting? It seems to me that deep learning would be the method of choice only with “big data”. Is it when (1) the sample size T is large, (2) the number of features (or explanatory variables, X) is high, or either? Thanks!!
Correct me if I am wrong, but I cannot envision a big gain from using deep learning over traditional time series models for a sample size of, say, T = 100.

Reply
Ye April 29, 2022 at 1:29 pm #

Hi Jason,

Thank you for your ML courses, they are very helpful.

I tried to download the Deep Learning for Time Series 7-day crash course, but after clicking the button and entering my email address, it did not send successfully (the sending spinner kept spinning and no email was received).

“Deep Learning for Time Series?
Take my free 7-day email crash course now (with sample code).”

Can you fix the send link?

Thank you

Reply
- James Carmichael May 2, 2022 at 9:29 am #
  
  Hi Ye…Did you enable pop ups in your browser for the following site?
  
  https://machinelearningmastery.com/how-to-get-started-with-deep-learning-for-time-series-forecasting-7-day-mini-course/
  
  You may also want to try another browser.
  
  Reply
  - Ye May 4, 2022 at 12:59 am #
    
    Thank you James. I had tried another browser and it was the same. Now I clicked on the link you provided in the reply, and it asked me to input an email and then mentioned I will receive an email in a few minutes. So the link you provided worked. Thank you again.
    
    Reply
    - Ye May 4, 2022 at 1:07 am #
      
      Sorry, it’s me again. The crash course I received in the email is the crash course for computer vision instead of time series. Can you please fix the link to send the “crash course for the time series” or email me the “crash course for the time series”? Thank you.
      
      Reply
Praveen Kumar May 30, 2022 at 9:08 pm #

Do we have any function like forecast() in sklearn for deep learning?

Reply
- James Carmichael May 31, 2022 at 9:48 am #
  
  Hi Praveen…The following may be of interest:
  
  https://machinelearningmastery.com/make-predictions-scikit-learn/
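
Keras models expose `predict()` rather than a dedicated forecast method, so a multi-step forecast is usually built by calling the one-step model repeatedly and feeding each prediction back in. The sketch below shows that loop; the helper name `forecast`, its parameters and the stand-in model are hypothetical, and with a real Keras model `predict_one_step` would wrap `model.predict` after reshaping the window.

```python
import numpy as np

def forecast(predict_one_step, history, n_steps, window):
    """Hypothetical helper: roll a one-step model forward n_steps times."""
    values = list(history)
    out = []
    for _ in range(n_steps):
        x = np.array(values[-window:], dtype=float)
        yhat = float(predict_one_step(x))
        out.append(yhat)
        values.append(yhat)   # the prediction becomes the newest observation
    return out

# Stand-in for a trained network: predicts the mean of the window
def mean_model(x):
    return x.mean()

result = forecast(mean_model, [35, 32, 30], n_steps=2, window=3)
print(result)
```

Errors compound with this recursive strategy, since later steps are predicted from earlier predictions rather than from observations.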
  
  Reply
Dudu February 8, 2023 at 12:56 am #

Your post is very helpful. Thank you very much for sharing.

Reply
- James Carmichael February 8, 2023 at 8:12 am #
  
  Thank you for your feedback and support Dudu! We greatly appreciate it!
  
  Reply
Muhammad Taimoor July 23, 2023 at 12:49 pm #

Lesson 01:
This is regarding how RNNs and CNNs can help someone in time series forecasting.

RNN:
An RNN is effective when we usually need past data to make a future prediction. The nature of RNNs by which they consider past values to make a decision for the future makes them a good choice for time series forecasting.

CNN:
CNNs have many variations, one of which is the 1 Dimensional CNN (1D CNN). In this CNN, the kernel moves along only one direction, and uses the convolution operation to extract meaningful features from time series data. The advantage of such networks is that they are computationally less expensive, and can easily learn features from raw data, thus eliminating the need of manual feature engineering.
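
To make the idea of a kernel moving along one direction concrete, here is a small NumPy sketch of a single 1D filter sliding over a short series. The series values and the hand-picked kernel are illustrative assumptions; in a trained network the kernel weights are learned, and, as in most deep learning libraries, the kernel is not flipped.

```python
import numpy as np

series = np.array([35, 32, 30, 31, 44, 29, 45, 43], dtype=float)
kernel = np.array([-1.0, 1.0])   # hand-picked filter: difference between neighbours

# Slide the kernel along the single time axis (no padding, stride 1)
feature_map = np.array([
    np.dot(series[i:i + len(kernel)], kernel)
    for i in range(len(series) - len(kernel) + 1)
])
print(feature_map.tolist())   # [-3.0, -2.0, 1.0, 13.0, -15.0, 16.0, -2.0]
```

Large positive or negative entries mark sudden jumps in the series, which is the kind of local pattern a learned filter can pick up.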

Reply
- James Carmichael July 23, 2023 at 2:54 pm #
  
  Thank you Muhammad for your feedback! Keep up the great work!
  
  Reply
Vassil Dimitrov August 29, 2023 at 10:21 pm #

Lesson 1

CNNs are excellent for image recognition as they are able to identify specific features. This property can also be applied to time series data where a filter is slid across time steps in the 1D CNN layer to capture local patterns. The data can also be reshaped for a 2D CNN with time steps as rows and feature(s) as column(s). In this context, CNNs can be applied to multivariate time series data. Similar to image-based data, the feature map can be subjected to pooling to focus on the most important features for accurate prediction.
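
The pooling step mentioned above can be sketched in a few lines of NumPy. The feature map values and the pool size of 2 are assumptions chosen for illustration.

```python
import numpy as np

# A made-up feature map, as it might come out of a 1D convolution
feature_map = np.array([-3.0, -2.0, 1.0, 13.0, -15.0, 16.0])
pool_size = 2

# Non-overlapping max pooling: keep the largest value in each pair
pooled = feature_map.reshape(-1, pool_size).max(axis=1)
print(pooled.tolist())   # [-2.0, 13.0, 16.0]
```

The output is half as long and keeps only the strongest response in each neighbourhood.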

RNNs and LSTM nets naturally lend themselves to use in time series as they are designed for sequential data. They are particularly effective in situations where long-term dependencies are crucial.

I am looking forward to learning how a combination of the 2 architectures can be applied for accurate (as much as possible) predictions.

Many thanks!

Reply
Vassil Dimitrov August 30, 2023 at 2:30 am #

Lesson 2:

# Load modules
import numpy as np
import pandas as pd

# Read table
df = pd.read_csv('birth_rates.csv')

# Function:
def prep_4_time_series(df, element_history=3):

    # Define list to hold feature vectors (time window)
    X = list()
    # Define empty list to hold response variable (next after window)
    y = list()

    # Extract list of sequential birth numbers
    df = df.sort_values('Date')
    births = list(df['Births'])

    # Initial values for loop
    min_el = element_history

    # Extract sliding window and response variable
    while min_el >= element_history:
        min_el = np.min([element_history, len(births)])
        if len(births) >= (element_history+1):
            y.append(births[element_history])
            a = births[:min_el]
            X.append(a)
            births = births[1:]
        else:
            break

    # Return X and y
    return X, y

# Call function
X, y = prep_4_time_series(df)

We get a moving window of 3 elements as the features and the 4th element as the target such that we cycle through all subsequent days.
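
The same moving window can also be written without an explicit loop. This is a sketch using NumPy's `sliding_window_view` (available from NumPy 1.20); the eight sample values are illustrative and stand in for the Births column.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

births = np.array([35, 32, 30, 31, 44, 29, 45, 43])
window = 3

# Each row holds 3 inputs followed by the value to predict
windows = sliding_window_view(births, window + 1)
X, y = windows[:, :window], windows[:, window]

print(X[0].tolist(), int(y[0]))   # [35, 32, 30] 31
print(X.shape, y.shape)           # (5, 3) (5,)
```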

Thank you!

Reply
Yonas Befirdu November 6, 2023 at 12:05 am #

CNNs capture features and local patterns, which may be essential for shorter-term behaviour and trends, for instance fluctuations or sudden changes in the data.
RNNs, in contrast, capture sequential dependencies and long-term patterns, which allows them to retain memories of past observations. Examples include stock price prediction and natural language applications (where the position of words conveys sentiment).

Reply
- James Carmichael November 6, 2023 at 9:29 am #
  
  Hi Yonas…CNNs are also excellent for time series classification and forecasting!
  
  The following resource is a great introduction:
  
  https://keras.io/examples/timeseries/timeseries_classification_from_scratch/
  
  Reply

Navigation

How to Get Started with Deep Learning for Time Series Forecasting (7-Day Mini-Course)

Deep Learning for Time Series Forecasting Crash Course.

Bring Deep Learning methods to Your Time Series project in 7 Days.

Who Is This Crash-Course For?

Crash-Course Overview

Need help with Deep Learning for Time Series?

Lesson 01: Promise of Deep Learning

Your Task

More Information

Lesson 02: How to Transform Data for Time Series

Your Task

More Information

Lesson 03: MLP for Time Series Forecasting

Your Task

More Information

Lesson 04: CNN for Time Series Forecasting

Your Task

More Information

Lesson 05: LSTM for Time Series Forecasting

Your Task

More Information

Lesson 06: CNN-LSTM for Time Series Forecasting

Your Task

More Information

Lesson 07: Encoder-Decoder LSTM Multi-step Forecasting

Your Task

More Information

The End!
(Look How Far You Have Come)

Summary

Develop Deep Learning models for Time Series Today!

Develop Your Own Forecasting models in Minutes

Finally Bring Deep Learning to your Time Series Forecasting Projects

More On This Topic

181 Responses to How to Get Started with Deep Learning for Time Series Forecasting (7-Day Mini-Course)

Leave a Reply Click here to cancel reply.
