Deep Learning Models for Multi-Output Regression

By Jason Brownlee on August 28, 2020 in Deep Learning 170

Multi-output regression involves predicting two or more numerical variables.

Unlike normal regression where a single value is predicted for each sample, multi-output regression requires specialized machine learning algorithms that support outputting multiple variables for each prediction.

Deep learning neural networks are an example of an algorithm that natively supports multi-output regression problems. Neural network models for multi-output regression tasks can be easily defined and evaluated using the Keras deep learning library.

In this tutorial, you will discover how to develop deep learning models for multi-output regression.

After completing this tutorial, you will know:

Multi-output regression is a predictive modeling task that involves two or more numerical output variables.
Neural network models can be configured for multi-output regression tasks.
How to evaluate a neural network for multi-output regression and make a prediction for new data.

Let’s get started.

Deep Learning Models for Multi-Output Regression
Photo by Christian Collins, some rights reserved.

Tutorial Overview

This tutorial is divided into three parts; they are:

Multi-Output Regression
Neural Networks for Multi-Outputs
Neural Network for Multi-Output Regression

Multi-Output Regression

Regression is a predictive modeling task that involves predicting a numerical output given some input.

It is different from classification tasks that involve predicting a class label.

Typically, a regression task involves predicting a single numeric value. Although, some tasks require predicting more than one numeric value. These tasks are referred to as multiple-output regression, or multi-output regression for short.

In multi-output regression, two or more outputs are required for each input sample, and the outputs are required simultaneously. The assumption is that the outputs are a function of the inputs.

We can create a synthetic multi-output regression dataset using the make_regression() function in the scikit-learn library.

Our dataset will have 1,000 samples with 10 input features, five of which will be relevant to the output and five of which will be redundant. The dataset will have three numeric outputs for each sample.

The complete example of creating and summarizing the synthetic multi-output regression dataset is listed below.

# example of a multi-output regression problem
from sklearn.datasets import make_regression
# create dataset
X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, n_targets=3, random_state=2)
# summarize shape
print(X.shape, y.shape)

# example of a multi-output regression problem

from sklearn.datasets import make_regression

# create dataset

X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, n_targets=3, random_state=2)

# summarize shape

print(X.shape, y.shape)

Running the example creates the dataset and summarizes the shape of the input and output elements.

We can see that, as expected, there are 1,000 samples, each with 10 input features and three output features.

(1000, 10) (1000, 3)

1	(1000, 10) (1000, 3)

Next, let’s look at how we can develop neural network models for multiple-output regression tasks.

Neural Networks for Multi-Outputs

Many machine learning algorithms support multi-output regression natively.

Popular examples are decision trees and ensembles of decision trees. A limitation of decision trees for multi-output regression is that the relationships between inputs and outputs can be blocky or highly structured based on the training data.

Neural network models also support multi-output regression and have the benefit of learning a continuous function that can model a more graceful relationship between changes in input and output.

Multi-output regression can be supported directly by neural networks simply by specifying the number of target variables there are in the problem as the number of nodes in the output layer. For example, a task that has three output variables will require a neural network output layer with three nodes in the output layer, each with the linear (default) activation function.

We can demonstrate this using the Keras deep learning library.

We will define a multilayer perceptron (MLP) model for the multi-output regression task defined in the previous section.

Each sample has 10 inputs and three outputs, therefore, the network requires an input layer that expects 10 inputs specified via the “input_dim” argument in the first hidden layer and three nodes in the output layer.

We will use the popular ReLU activation function in the hidden layer. The hidden layer has 20 nodes, which were chosen after some trial and error. We will fit the model using mean absolute error (MAE) loss and the Adam version of stochastic gradient descent.

The definition of the network for the multi-output regression task is listed below.

...
# define the model
model = Sequential()
model.add(Dense(20, input_dim=10, kernel_initializer='he_uniform', activation='relu'))
model.add(Dense(3))
model.compile(loss='mae', optimizer='adam')

...

# define the model

model = Sequential()

model.add(Dense(20, input_dim=10, kernel_initializer='he_uniform', activation='relu'))

model.add(Dense(3))

model.compile(loss='mae', optimizer='adam')

You may want to adapt this model for your own multi-output regression task, therefore, we can create a function to define and return the model where the number of input and number of output variables are provided as arguments.

# get the model
def get_model(n_inputs, n_outputs):
	model = Sequential()
	model.add(Dense(20, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))
	model.add(Dense(n_outputs))
	model.compile(loss='mae', optimizer='adam')
	return model

# get the model

def get_model(n_inputs, n_outputs):

model = Sequential()

model.add(Dense(20, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))

model.add(Dense(n_outputs))

model.compile(loss='mae', optimizer='adam')

return model

Now that we are familiar with how to define an MLP for multi-output regression, let’s explore how this model can be evaluated.

Neural Network for Multi-Output Regression

If the dataset is small, it is good practice to evaluate neural network models repeatedly on the same dataset and report the mean performance across the repeats.

This is because of the stochastic nature of the learning algorithm.

Additionally, it is good practice to use k-fold cross-validation instead of train/test splits of a dataset to get an unbiased estimate of model performance when making predictions on new data. Again, only if there is not too much data and the process can be completed in a reasonable time.

Taking this into account, we will evaluate the MLP model on the multi-output regression task using repeated k-fold cross-validation with 10 folds and three repeats.

Each fold the model is defined, fit, and evaluated. The scores are collected and can be summarized by reporting the mean and standard deviation.

The evaluate_model() function below takes the dataset, evaluates the model, and returns a list of evaluation scores, in this case, MAE scores.

# evaluate a model using repeated k-fold cross-validation
def evaluate_model(X, y):
	results = list()
	n_inputs, n_outputs = X.shape[1], y.shape[1]
	# define evaluation procedure
	cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
	# enumerate folds
	for train_ix, test_ix in cv.split(X):
		# prepare data
		X_train, X_test = X[train_ix], X[test_ix]
		y_train, y_test = y[train_ix], y[test_ix]
		# define model
		model = get_model(n_inputs, n_outputs)
		# fit model
		model.fit(X_train, y_train, verbose=0, epochs=100)
		# evaluate model on test set
		mae = model.evaluate(X_test, y_test, verbose=0)
		# store result
		print('>%.3f' % mae)
		results.append(mae)
	return results

# evaluate a model using repeated k-fold cross-validation

def evaluate_model(X, y):

results = list()

n_inputs, n_outputs = X.shape[1], y.shape[1]

# define evaluation procedure

cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)

# enumerate folds

for train_ix, test_ix in cv.split(X):

# prepare data

X_train, X_test = X[train_ix], X[test_ix]

y_train, y_test = y[train_ix], y[test_ix]

# define model

model = get_model(n_inputs, n_outputs)

# fit model

model.fit(X_train, y_train, verbose=0, epochs=100)

# evaluate model on test set

mae = model.evaluate(X_test, y_test, verbose=0)

# store result

print('>%.3f' % mae)

results.append(mae)

return results

We can then load our dataset and evaluate the model and report the mean performance.

Tying this together, the complete example is listed below.

# mlp for multi-output regression
from numpy import mean
from numpy import std
from sklearn.datasets import make_regression
from sklearn.model_selection import RepeatedKFold
from keras.models import Sequential
from keras.layers import Dense

# get the dataset
def get_dataset():
	X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, n_targets=3, random_state=2)
	return X, y

# get the model
def get_model(n_inputs, n_outputs):
	model = Sequential()
	model.add(Dense(20, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))
	model.add(Dense(n_outputs))
	model.compile(loss='mae', optimizer='adam')
	return model

# evaluate a model using repeated k-fold cross-validation
def evaluate_model(X, y):
	results = list()
	n_inputs, n_outputs = X.shape[1], y.shape[1]
	# define evaluation procedure
	cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
	# enumerate folds
	for train_ix, test_ix in cv.split(X):
		# prepare data
		X_train, X_test = X[train_ix], X[test_ix]
		y_train, y_test = y[train_ix], y[test_ix]
		# define model
		model = get_model(n_inputs, n_outputs)
		# fit model
		model.fit(X_train, y_train, verbose=0, epochs=100)
		# evaluate model on test set
		mae = model.evaluate(X_test, y_test, verbose=0)
		# store result
		print('>%.3f' % mae)
		results.append(mae)
	return results

# load dataset
X, y = get_dataset()
# evaluate model
results = evaluate_model(X, y)
# summarize performance
print('MAE: %.3f (%.3f)' % (mean(results), std(results)))

# mlp for multi-output regression

from numpy import mean

from numpy import std

from sklearn.datasets import make_regression

from sklearn.model_selection import RepeatedKFold

from keras.models import Sequential

from keras.layers import Dense

# get the dataset

def get_dataset():

X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, n_targets=3, random_state=2)

return X, y

# get the model

def get_model(n_inputs, n_outputs):

model = Sequential()

model.add(Dense(20, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))

model.add(Dense(n_outputs))

model.compile(loss='mae', optimizer='adam')

return model

# evaluate a model using repeated k-fold cross-validation

def evaluate_model(X, y):

results = list()

n_inputs, n_outputs = X.shape[1], y.shape[1]

# define evaluation procedure

cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)

# enumerate folds

for train_ix, test_ix in cv.split(X):

# prepare data

X_train, X_test = X[train_ix], X[test_ix]

y_train, y_test = y[train_ix], y[test_ix]

# define model

model = get_model(n_inputs, n_outputs)

# fit model

model.fit(X_train, y_train, verbose=0, epochs=100)

# evaluate model on test set

mae = model.evaluate(X_test, y_test, verbose=0)

# store result

print('>%.3f' % mae)

results.append(mae)

return results

# load dataset

X, y = get_dataset()

# evaluate model

results = evaluate_model(X, y)

# summarize performance

print('MAE: %.3f (%.3f)' % (mean(results), std(results)))

Running the example reports the MAE for each fold and each repeat, to give an idea of the evaluation progress.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

At the end, the mean and standard deviation MAE is reported. In this case, the model is shown to achieve a MAE of about 8.184.

You can use this code as a template for evaluating MLP models on your own multi-output regression tasks. The number of nodes and layers in the model can easily be adapted and tailored to the complexity of your dataset.

...
>8.054
>7.562
>9.026
>8.541
>6.744
MAE: 8.184 (1.032)

...

>8.054

>7.562

>9.026

>8.541

>6.744

MAE: 8.184 (1.032)

Once a model configuration is chosen, we can use it to fit a final model on all available data and make a prediction for new data.

The example below demonstrates this by first fitting the MLP model on the entire multi-output regression dataset, then calling the predict() function on the saved model in order to make a prediction for a new row of data.

# use mlp for prediction on multi-output regression
from numpy import asarray
from sklearn.datasets import make_regression
from keras.models import Sequential
from keras.layers import Dense

# get the dataset
def get_dataset():
	X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, n_targets=3, random_state=2)
	return X, y

# get the model
def get_model(n_inputs, n_outputs):
	model = Sequential()
	model.add(Dense(20, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))
	model.add(Dense(n_outputs, kernel_initializer='he_uniform'))
	model.compile(loss='mae', optimizer='adam')
	return model

# load dataset
X, y = get_dataset()
n_inputs, n_outputs = X.shape[1], y.shape[1]
# get model
model = get_model(n_inputs, n_outputs)
# fit the model on all data
model.fit(X, y, verbose=0, epochs=100)
# make a prediction for new data
row = [-0.99859353,2.19284309,-0.42632569,-0.21043258,-1.13655612,-0.55671602,-0.63169045,-0.87625098,-0.99445578,-0.3677487]
newX = asarray([row])
yhat = model.predict(newX)
print('Predicted: %s' % yhat[0])

# use mlp for prediction on multi-output regression

from numpy import asarray

from sklearn.datasets import make_regression

from keras.models import Sequential

from keras.layers import Dense

# get the dataset

def get_dataset():

X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, n_targets=3, random_state=2)

return X, y

# get the model

def get_model(n_inputs, n_outputs):

model = Sequential()

model.add(Dense(20, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))

model.add(Dense(n_outputs, kernel_initializer='he_uniform'))

model.compile(loss='mae', optimizer='adam')

return model

# load dataset

X, y = get_dataset()

n_inputs, n_outputs = X.shape[1], y.shape[1]

# get model

model = get_model(n_inputs, n_outputs)

# fit the model on all data

model.fit(X, y, verbose=0, epochs=100)

# make a prediction for new data

row = [-0.99859353,2.19284309,-0.42632569,-0.21043258,-1.13655612,-0.55671602,-0.63169045,-0.87625098,-0.99445578,-0.3677487]

newX = asarray([row])

yhat = model.predict(newX)

print('Predicted: %s' % yhat[0])

Running the example fits the model and makes a prediction for a new row.

As expected, the prediction contains three output variables required for the multi-output regression task.

Predicted: [-152.22713 -78.04891 -91.97194]

1	Predicted: [-152.22713 -78.04891 -91.97194]

Summary

In this tutorial, you discovered how to develop deep learning models for multi-output regression.

Specifically, you learned:

Multi-output regression is a predictive modeling task that involves two or more numerical output variables.
Neural network models can be configured for multi-output regression tasks.
How to evaluate a neural network for multi-output regression and make a prediction for new data.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

170 Responses to Deep Learning Models for Multi-Output Regression

Doru August 28, 2020 at 7:30 pm #

Lors de l’exécution est générée l’erreur suivante:

File “C:\ProgramData\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py”, line 703, in is_tensor
return isinstance(x, tf_ops._TensorLike) or tf_ops.is_dense_tensor_like(x)

AttributeError: module ‘tensorflow.python.framework.ops’ has no attribute ‘_TensorLike’

Comment résoudre?

Reply
- Jason Brownlee August 29, 2020 at 7:59 am #
  
  Sorry to hear that, perhaps confirm that your version of Keras and TensorFlow are up to date:
  https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
  
  The tips here may also help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
Doru August 28, 2020 at 8:50 pm #

Hy
Running this example I get the following error:

AttributeError: module ‘tensorflow.python.framework.ops’ has no attribute ‘_TensorLike’

A solution?

Reply
- Jason Brownlee August 29, 2020 at 8:00 am #
  
  What version of Keras and TensorFlow are you using?
  
  You must use the follow versions or higher:
  
  tensorflow: 2.3.0 keras: 2.4.3
  
  1
  2
  
  tensorflow: 2.3.0
  keras: 2.4.3
  
  Reply
fabou August 29, 2020 at 6:44 am #

Hi Jason,

could you tell me why a model is compiled for each cross validation fold?

Reply
- Jason Brownlee August 29, 2020 at 8:08 am #
  
  Each model requires training a new model from scratch in order to establish an unbiased estimate of model performance when making predictions on out of sample instances.
  
  If you are new to k-fold cross-validation, you can get started here:
  https://machinelearningmastery.com/k-fold-cross-validation/
  
  Reply
  - fabou August 30, 2020 at 1:14 am #
    
    I thought a model had to be instantiated once and then was passed to the cross validation loop. I missed something I guess…
    
    In the example below taken from one of your excellent article ( Add Binary Flags for Missing Values ) I understand that the model is instantiated and that this instance of the model is evaluated on each cross validation training folds :
    
    [ step 1 ] model = RandomForest()
    
    [ step 2 ] cv = RepeatedStratifiedKFold( n_splits = 10 , n_repeats = 3 )
    
    [ step3 ] scores = cross_val_score( model , X , y , scoring = ‘accuracy’ , cv = cv , n_jobs = -1 )
    
    Here it seems to me that for each training fold a model is instanciated and then evaluated on the corresponding testing folds.
    
    Both logic seems different… or I missed something.
    
    Reply
    - fabou August 30, 2020 at 1:36 am #
      
      I tested passing the model to the evaluation function :
      
      def evaluate_model( model , X , y ) :
      
      results = list()
      
      n_inputs = X.shape[ 1 ]
      
      n_outputs = y.shape[ 1 ]
      
      cv = RepeatedKFold( n_splits = 10 , n_repeats = 1 , random_state = 999 )
      
      for train_ix, test_ix in cv.split( X ) :
      
      X_train, X_test = X[train_ix], X[test_ix]
      
      y_train, y_test = y[train_ix], y[test_ix]
      
      model.fit( X_train , y_train , verbose = 0 , epochs = 100 )
      
      mae = model.evaluate( X_test , y_test , verbose = 0 )
      
      results.append( mae )
      
      print( f’mae : {mae:.3f}’ )
      
      return results
      
      mae keeps decreasing… I do not understand why…
      
      Reply
      - Jason Brownlee August 30, 2020 at 6:42 am #
        
        It is invalid as you continue to train the same model each loop.
        
        The model must be re-defined and re-fit each cross-validation loop otherwise the evaluation is optimistic.
    - Jason Brownlee August 30, 2020 at 6:41 am #
      
      It does the same thing.
      
      In that case, the model is created and fit a new for each cross-validation fold. You just don’t see it as it happens internally to the function.
      
      Reply
      - fabou August 30, 2020 at 4:58 pm #
        
        OK. I thought the model was created once and then fitted on each cross validation fold.
        
        I have been doing a little experiment.
        
        cv = RepeatedStratifiedKFold( n_splits = 10 , n_repeats = 1 , random_state = 777 )
        
        and then [ A ] :
        
        > scores_A = []
        
        > for train_ix , test_ix in cv.split( X , y ) :
        
        > X_train , X_test = X[ train_ix , : ] , X[ test_ix , : ]
        
        > y_train , y_test = y[ train_ix ] , y[ test_ix ]
        
        > model = AdaBoostClassifier()
        
        > model.fit( X_train , y_train )
        
        > y_test_pred = model.predict( X_test )
        
        > score = roc_auc_score( y_test , y_test_pred )
        
        > print( f’score : {score}’ )
        
        > scores_A.append( score )
        
        After that [ B ] :
        
        > scores_B = []
        
        > model = AdaBoostClassifier()
        
        > for train_ix , test_ix in cv.split( X , y ) :
        
        > X_train , X_test = X[ train_ix , : ] , X[ test_ix , : ]
        
        > y_train , y_test = y[ train_ix ] , y[ test_ix ]
        
        > #model = AdaBoostClassifier()
        
        > model.fit( X_train , y_train )
        
        > y_test_pred = model.predict( X_test )
        
        > score = roc_auc_score( y_test , y_test_pred )
        
        > print( f’score : {score}’ )
        
        > scores_B.append( score )
        
        The difference between [ A ] and [ B ] is that with [ B ] the model is instantiated outside the loop once whereas with [ A ] it is created anew for each fold.
        
        scores_A and _B are the same ( even when I change the random state parameter for RepeatedStratifiedKFold and/or the model used : RandomForest…).
        
        This experiment makes me think the model has to be created once and be fitted on each cross validation fold.
        
        For me, the sole difference between model_A = AdaBoostClassifier() an model_B = AdaBoostClassifier() is there memory location this is why I do not understand why the get_model function is called in the evaluate_model function…
        
        PS :
        
        I was not able to replicate the scores given by
        
        > cross_val_score( model , X , y , scoring = ‘roc_auc’ , cv = cv , n_jobs = -1 )
      - Jason Brownlee August 31, 2020 at 6:09 am #
        
        The cross-validation procedure requires the model be re-fit each evaluation.
        
        Internally cross_val_score will clone the model and refit from scratch each iteration. This is functionality equilivient to re-defining and re-fitting each iteration.
Sam August 29, 2020 at 7:42 am #

HI Jason,

Just out of curiosity, why would you build a multi-output model instead of multiple models of single outputs? When would it be better/worse?

Reply
- Jason Brownlee August 29, 2020 at 8:11 am #
  
  Try both and see what works best for your specific dataset.
  
  Reply
  - Vania June 26, 2021 at 2:08 pm #
    
    Hi Jason. How can I apply those codes into specific datasets? Since you were using random values.
    
    Reply
    - Jason Brownlee June 27, 2021 at 4:36 am #
      
      This will help you to load your dataset:
      https://machinelearningmastery.com/load-machine-learning-data-python/
      
      Reply
Konstantinos September 4, 2020 at 8:54 am #

Hi Jason,

First of all, congratulations on your website and for your detailed explanations and examples. I really appreciate it!

I have a question for you, as I’ve already spent a considerable amount of time searching online, without significant success. How can we handle a situation where we have partial ground truth for our targets? (e.g. output vectors with some NaN values)

So far, I understand that if you provide a y_train with NaNs, then the loss function won’t behave properly. If we provided a loss function for each output variable (I think keras allows that with loss=[‘mse’, ‘mse’, …]), then in theory, we could dismiss a group of NaNs within a batch of an output variable, by filtering them out (practically making both y_true and y_pred = 0).

The problem is that if we create a custom loss function and try to replace NaNs with 0s in both vectors, then keras throws an error (I think it relates to the use of non-tensorflow functions to filter nans, thus making keras unable to compute the derivative of the loss function)

Could you think of an elegant solution on this problem? Or maybe a completely different approach? (i assume that this way is sensible from the optimization perspective)

Thanks in advance for your time!

Reply
- Jason Brownlee September 4, 2020 at 1:36 pm #
  
  Thanks!
  
  It is common to replace missing inputs with an imputed value using statistics or a model.
  
  For some models, you can mark missing values with a special value and allow the model to treat missing as just another value.
  
  Also, you can mark missing values for some models and configure them to ignore them, e.g. a masking layer in neural nets.
  
  I hope that gives you some ideas.
  
  Reply
  - Konstantinos September 4, 2020 at 7:21 pm #
    
    Hi Jason,
    
    Thanks for your reply.
    
    In this case, it’s not missing inputs, and I can’t really use any statistics to learn the missing targets.
    
    Regarding the special values, I haven’t seen anything related in Keras. Are you aware of such symbol?
    
    Also, I checked the masking layers, but that’s again only for the inputs. This wouldn’t affect the y_true in the loss function.
    
    In general, it seems like a trivial problem which doesn’t seem to have a trivial solution…
    Maybe I need to check how people handle missing labels on multi-label classification.
    
    Hopefully I’ll figure it out soon!
    
    Reply
    - Jason Brownlee September 5, 2020 at 6:45 am #
      
      You can use zero padding in the putput (yhat) then manually ignore zero padded values when using the output or evaluating predoctions.
      
      This is very common in seq2seq problems in NLP.
      
      Reply
  - Ali April 15, 2023 at 9:40 am #
    
    Hi Jason
    
    I have data set with 4200 input values. I need two outputs ( regression) .
    I have tried single output separately and I get r2 score of .999 on test set and prediction
    
    But using the functional API, I get r2 score of .71 at best.
    
    I want to know what is the limitation.? How is this happening
    
    Reply
    - James Carmichael April 16, 2023 at 9:28 am #
      
      Hi Ali…You may be working on a regression problem and achieve zero prediction errors.
      
      Alternately, you may be working on a classification problem and achieve 100% accuracy.
      
      This is unusual and there are many possible reasons for this, including:
      
      You are evaluating model performance on the training set by accident.
      Your hold out dataset (train or validation) is too small or unrepresentative.
      You have introduced a bug into your code and it is doing something different from what you expect.
      Your prediction problem is easy or trivial and may not require machine learning.
      The most common reason is that your hold out dataset is too small or not representative of the broader problem.
      
      This can be addressed by:
      
      Using k-fold cross-validation to estimate model performance instead of a train/test split.
      Gather more data.
      Use a different split of data for train and test, such as 50/50.
      
      Reply
robgonz September 5, 2020 at 9:03 am #

This is a great nn-model for regression. I tried using mean sq log error for the loss, so I can interpret the reslt a bit better.

Now, when you evaluate or use the evaluation, is there a difference if you setup the repeat times as desired? Or just keep calling the evaluation function with just 2 or 3 repeats?

Just wandering with this algorithm as I never use tried RepeatedKFolds before.

Thanks for sharing J.

Reply
- Jason Brownlee September 6, 2020 at 6:01 am #
  
  Perhaps you can keep the loss as mse, and use sq log error as a metric?
  
  Repeated k-fold cross-validation is a good practice for stochastic models, if they are not too expensive to fit. 3 repeats is conservative, 10 or 30 is better.
  
  More here:
  https://machinelearningmastery.com/repeated-k-fold-cross-validation-with-python/
  
  Reply
Pratyay Mukherjee September 6, 2020 at 3:10 pm #

can you please explain how the mean absolute error in Keras loss function is calculated on multi-output vectors? Is it the average of corresponding values in vectors averaged over dimensions of the output and samples?

Reply
- Jason Brownlee September 7, 2020 at 8:25 am #
  
  I believe it is error averaged across variables and samples.
  
  Reply
zeinab September 14, 2020 at 10:35 pm #

Hello
I have a question
My data is not time dependent
Not image and video
Which Deep Learning model is more suitable for predicting my data?
Thanks for guiding me

Reply
- Jason Brownlee September 15, 2020 at 5:24 am #
  
  An MLP.
  
  Also, this will help:
  https://machinelearningmastery.com/when-to-use-mlp-cnn-and-rnn-neural-networks/
  
  Reply
j.sanjay September 29, 2020 at 11:30 am #

Hi Jason,
Can you be my teacher

Reply
- Jason Brownlee September 29, 2020 at 12:52 pm #
  
  I’m happy to answer questions about machine learning.
  
  Reply
Noodle October 7, 2020 at 11:16 pm #

Hi, thanks for the tutorial.

If the goal is simply prediction, what are the benefits of fitting a multi-output NN instead of multiple single-output NN? What are the gains in this case? Is there a paper or reference you can recommend?

Reply
- Jason Brownlee October 8, 2020 at 8:31 am #
  
  The problem will require a single or multiple outputs. You must use the model that achieves the goals of the project.
  
  Reply
Davide October 9, 2020 at 4:44 am #

Hi, thank you for this article!

I didn’t get once you go through the Cross validation step how you choose one configuration with respect to the others in order to make prediction on new data. I mean, once all the loop is finished what has to be done?

Reply
- Jason Brownlee October 9, 2020 at 6:49 am #
  
  See this:
  https://machinelearningmastery.com/make-predictions-scikit-learn/
  
  Reply
  - Davide Mori October 9, 2020 at 8:05 pm #
    
    Ok I think I’ve understood.
    Based on what you have written in this article, I evaluate the model using k-fold-cross validation and based on what the score is I change the Layers and their parameters in my model accordingly. Then I make my network learning on all the data already used during validation and finally I make the predictions.
    To conclude, Validation is used just to measure the performance of the model I have built, is that right?
    
    Reply
    - Jason Brownlee October 10, 2020 at 7:05 am #
      
      The train set is used to fit the model.
      The test set is used to evaluate the model.
      The validation set can be used to tune the model or stop training the model at the right time.
      
      More here:
      https://machinelearningmastery.com/difference-test-validation-datasets/
      
      Reply
Urs October 10, 2020 at 12:42 am #

Wondering about this: With “model.compile(loss=’mae’, optimizer=’adam’)”, I basically instruct keras to minimize the combined loss of both output values together. This averages out the two errors stemming from the two separate target values, which could lead to a large positive error on target 1 which is compensated by a large negative error on target 2.

I imagine to avoid this, I would have to create 2 separate submodels that optimize each loss individually – probably using the functional API?

Reply
- Jason Brownlee October 10, 2020 at 7:07 am #
  
  Yes.
  
  Another approach, a model with two output and two losses (so-called multi-output model):
  https://machinelearningmastery.com/keras-functional-api-deep-learning/
  
  Reply
Julm October 14, 2020 at 2:00 am #

Hi, I really appreciate your content!

I am starting with deep learning models and I have a project on mind.

What I want to do, giving certain input values (for example: 10 features), I want to predict a curve. I mean, I want to obtain a 2D curve (where Y axis will be force and X axis will be time). Can “Multi-output regression” be a solution? I will predict time steps (t0=5, t1=16, t2=26…).

What other solution can I use for this case?

thanks in advance

Reply
- Jason Brownlee October 14, 2020 at 6:22 am #
  
  If you want a curve, then perhaps use curve fitting directly:
  https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
  
  Reply
pratyush October 22, 2020 at 4:41 am #

I tested this on my data set and found that it worked if iloc was used. ex:

X_train, X_test = X.iloc[train_ix], X.iloc[test_ix]
y_train, y_test = y.iloc[train_ix], y.iloc[test_ix]

Reply
- Jason Brownlee October 22, 2020 at 6:51 am #
  
  Nice work!
  
  Reply
Igor November 6, 2020 at 12:09 pm #

Can I do this to forecast t+1, t+2 and so on?

Reply
- Jason Brownlee November 6, 2020 at 1:15 pm #
  
  Yes.
  
  Reply
Luz N November 7, 2020 at 1:11 pm #

Hi Jason, how would it be for multi-output binary classification?
For example, forecast if t+1 is 0 or 1 and t+2 is 0 or 1 and t+3 is 0 or 1.
1. Output layer: model.add(Dense(n_outputs, activation=’sigmoid’))?
2. Loss function: ‘binary_crossentropy’?

Thanks in advance.
Great tutorial!!!

Reply
- Jason Brownlee November 8, 2020 at 6:37 am #
  
  Yes, I believe so. Try it and see.
  
  Reply
Yuchen November 20, 2020 at 8:45 am #

Hi Jason, first of all, thanks for your great content! I am also using MLP for doing some multi-output regression, but I found that when I tested the model, the output would always be the same regardless of the input (this also happened at training stage). I have normalized my input data and the dimension of my input and output are 450 and 120, respectively, also I used tanh activation function to bound my output within range [-1,1]. Do you have any suggestions on this? Thanks in advance 🙂

Reply
- Jason Brownlee November 20, 2020 at 1:04 pm #
  
  It may suggestion your model require further tuning for your dataset.
  
  The suggestions here will help:
  https://machinelearningmastery.com/start-here/#better
  
  Reply
Chiedozie November 24, 2020 at 3:23 am #

Hey Jason, I really love your works here. I’m working on a dataset_1 where the output classes of this dataset_1 has some other features dataset, say dataset_2, that the model could also learn from. The challenge I’m facing is that dataset_1 and dataset_2 are totally different, so there’s no way I could merge them on some common features.
I’d like to know if there’s a way to train a model that would be able to learn on dataset_1 then subsequently learn on dataset_2?

I am considering using dimensionality reduction to reduce the features of the dataset_2 to a single value and then use this single value as an output to dataset_1 in a multi-output model. Do you think this is a good approach?

Reply
- Jason Brownlee November 24, 2020 at 6:21 am #
  
  Thanks!
  
  Try it and compare to other approaches.
  
  Perhaps try to ensemble their predictions?
  
  Reply
Goutam December 1, 2020 at 5:55 pm #

Thanks a lot for this amazing article. I love your works. I actually need some advice on a multioutput regression problem. In the problem, the outputs are percentage. and all the outputs need to sum up to 1.
For example, i have to predict 3 outputs A, B, C. each of them can be any value between 0 and 1. let’s say,
A = 0.6,B = 0.3,C = 0.1 or A = 1.0, B = 0.0, C = 0.0
so, A+B+C is 1. how do i approach this problem.
Thanks in advance.

Reply
- Jason Brownlee December 2, 2020 at 7:38 am #
  
  Thanks!
  
  Use a softmax activation:
  https://machinelearningmastery.com/softmax-activation-function-with-python/
  
  Reply
Dinani December 12, 2020 at 8:10 pm #

Hi
I appreciate your post. I have a question, how to determine the number of hidden layers and their dense? Is there any relation between them and the shape of input/output?

Reply
- Jason Brownlee December 13, 2020 at 6:00 am #
  
  Thanks.
  
  Good question, I answer it here:
  https://machinelearningmastery.com/faq/single-faq/how-many-layers-and-nodes-do-i-need-in-my-neural-network
  
  Reply
Joshua December 19, 2020 at 9:55 am #

Hey Jason,

Absolutely enjoy reading these concise tutorials. What are the available metrics for multi-output regression models when using grid search? When compiling the model, any other metric besides [“Accuracy”] raises an error. Is accuracy the MAE or R^2 value in this case?

Reply
- Jason Brownlee December 19, 2020 at 1:27 pm #
  
  Thanks.
  
  You can use the same old regression metrics either for overall error or per output variable error.
  
  Accuracy is inappropriate for regression, you can use MSE, RMSE or MAE as common error metrics.
  
  Reply
Maha December 30, 2020 at 6:20 am #

Hi
If I want to build deep model , as output will be vector of 20 elements and each element will take a value from on two possible values …..what kind of system i can use??,, any hep ..to solve this problem

Reply
- Jason Brownlee December 30, 2020 at 6:46 am #
  
  I recommend testing a suite of different models in order to discover what works well or best for your dataset.
  
  Perhaps start with a simple MLP.
  
  Reply
  - Maha December 30, 2020 at 7:15 am #
    
    thanks for reply
    if any link to MLP example to fellow??
    
    Reply
    - Jason Brownlee December 30, 2020 at 7:49 am #
      
      Yes, the above tutorial is exactly this, an MLP (multilayer perceptron) for multi-output regression.
      
      Reply
Katherine January 5, 2021 at 6:10 am #

Hi there, first of all thank you for the tutorial! This website has been immensely helpful for me in recent months as I’ve started to incorporate machine learning into my research. My main question for this post is why you chose he_uniform for your kernel initialization on the output layer. I know that it’s the best choice for relu activations, but I can’t find anything anywhere saying that it should also be used for linear activation. Maybe you have a good source/reason for this choice here? To be honest, I can’t find much about what is the best canned initializer for linear activation so I’m really curious to hear your thoughts.

Reply
- Jason Brownlee January 5, 2021 at 6:30 am #
  
  You’re welcome!
  
  It probably should not be used for linear activation. Probably won’t make a difference though.
  
  Reply
  - Ramesh March 26, 2023 at 9:06 am #
    
    Hi, how can i get multiple text output after training model for multiple text input features,
    
    Reply
    - James Carmichael March 26, 2023 at 10:23 am #
      
      Hi Ramesh…You may find the following resource of interest:
      
      https://blog.paperspace.com/combining-multiple-features-outputs-keras/
      
      Reply
Maha January 14, 2021 at 8:33 am #

Hi
If there is example based on dividing data set into training and testing Multi-Output Regression

Reply
- Jason Brownlee January 14, 2021 at 8:46 am #
  
  It would be the same for single output regression:
  https://machinelearningmastery.com/train-test-split-for-evaluating-machine-learning-algorithms/
  
  Unless, you have a time series, then this will help:
  https://machinelearningmastery.com/time-series-forecasting-supervised-learning/
  
  Reply
Maha January 15, 2021 at 6:18 am #

Hi,
you mentioned that if the data set is small , can use k folds for evaluation …what about if the data set is large?? , for example ..instead of 1000 , maybe 5000 or 10000

Reply
- Jason Brownlee January 15, 2021 at 8:44 am #
  
  Those are still small datasets, and k-fold cross validation.
  
  Nevertheless, for very large data (millions/billions of samples) or for data where small samples are stable/stationary, you can use a train/test split.
  
  Reply
  - Maha January 16, 2021 at 10:45 am #
    
    okay..Thanks alots Jason
    
    Reply
Ladu Tandel January 20, 2021 at 8:29 pm #

Hi ,

May I know how to import x y data which i have as excel instead of auto generated x y data as per above code.

I want to predict multiple outputs and above concept is nice to start with, but since i am new to coding would like to get help on how to import my excel data of multiple xs and multiple ys

Thank you
Your help will be much appreciated

Reply
- Jason Brownlee January 21, 2021 at 6:48 am #
  
  Save your excel to CVS, then load your CSV as a numpy array:
  https://machinelearningmastery.com/load-machine-learning-data-python/
  
  Reply
Sada Hussain January 27, 2021 at 5:21 am #

which deep learning will be used for prediction of crop production ,humidity ,temperature and pH and also rain fall ?

Reply
Sean Murphy February 26, 2021 at 9:53 am #

Hi, just wondering, how can this neural network use multiple inputs and outputs if it is using Keras’s sequential API? Everything I’ve read online says that a Functional API is required for multiple inputs and outputs? thanks!

Reply
- Jason Brownlee February 26, 2021 at 1:27 pm #
  
  Yes, the functional API is required if you want to use separate input models or output models.
  
  To have a vector input or vector output the sequential model can be used directly.
  
  Reply
  - Sean Murphy February 27, 2021 at 3:34 am #
    
    Thank you for your reply, I really appreciate it. Would it be possible to explain the difference between using “seperate input models” and simply using “vector input or vector output” if it wouldn’t be too much trouble? For example, I have data with 5 numerical inputs and 9 numerical outputs. The inputs represent the performance characteristics of a op-amp, e.g gain and slew rate, the outputs are the widths of transistors which give those performance characteristics. Would these be vector inputs and outputs or separate models? Sorry if this is a foolish question, this is all quite new to me. Thanks again for your help and this fantastic article.
    
    Reply
    - Jason Brownlee February 27, 2021 at 6:09 am #
      
      Separate input models would be multiple disjoint vectors as input to the model for example.
      
      You can see models with separate input models here:
      https://machinelearningmastery.com/keras-functional-api-deep-learning/
      
      Yes, you’re describing a single model that takes a vector in and predicts a vector out. You could also model it with a separate “model” for output of each element in the output vector if you want – this might make sense if the elements were unrelated/uncorrelated.
      
      Reply
Peter March 10, 2021 at 4:47 am #

Hi Jason,

I have a question about activation functions in multi-output regression neural networks. I notice you choose the default linear function for the output and ReLU for the hidden layer.

In my case, my outputs can be very small numbers, and often the predictions return negative values which are physically meaningless. How can I restrict the outputs to be positive only? Is a linear output required for regression models, or can I change the output transfer function to tansig, logsig, etc.?

Thanks!

Reply
- Jason Brownlee March 10, 2021 at 6:26 am #
  
  Perhaps you can use a sigmoid activation then scale the output to the desired range.
  
  Reply
Syed Khurram Mahmud March 17, 2021 at 6:42 pm #

Hi! Nice work
Please help me here:

yhat = model.predict(newX)

model is not defined. Rest works fine I think

Why this error

Reply
- Jason Brownlee March 18, 2021 at 5:17 am #
  
  Perhaps you have skipped some lines of code in the example, this will help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
ali March 23, 2021 at 1:43 pm #

Hi Jason,

Great!! thank you so much for your nice tutorial. However, can I know how to develop an explainer for developing an interpretable model such as Permutation Feature Importance in a multi-output regression?

Thank you

Reply
- Jason Brownlee March 24, 2021 at 5:48 am #
  
  Sorry, I cannot help you with model interpretation – I don’t have tutorials on this topic.
  
  Reply
Sepideh April 21, 2021 at 8:20 am #

Hi Jason,

Would you please help me that how I can develop multi output classification neural network?

Thanks

Reply
- Jason Brownlee April 22, 2021 at 5:35 am #
  
  See this example:
  https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
  
  Reply
Ayan Mitra April 29, 2021 at 6:37 pm #

Hi Jason,
Thanks for this tutorial, helped me on a real project.
I have a related question please.
I have a set of image matrix data set (lets say red, green, blue, white etc.) and the output colour is decided by a set of three input (x, y, z) (numerical values). Therefore the data looks like this (everything numerical, nothing categorical):

Input
X1,Y1,Z1 –> Blue Matrix (flattened 1-D arrays)
X2,Y2,Z2 –> Red Matrix
‘
‘
Xn,Yn,Zn –> White Matrix

Now I want build a NN regression model, like you showed here, where if [X,Y,Z] are input, it predicts the entire colour matrix (/ flattened 1-D array)

now, I understand this output layer will have a shape = shape(1-D colour matrix) (similar to 2 in your example). I also understand that the output layer’s shape is >> input layer’s size.

Can you please give some comments ? is this possible (technically I dont see why not)? Which algorithm you suggest is ideal.

thanks

Reply
- Jason Brownlee April 30, 2021 at 6:03 am #
  
  Perhaps the model can predict a class label and you can write code to then create the matrix – if it is the same for each class label.
  
  Reply
Ayoub AIT IDIR May 17, 2021 at 12:08 am #

Hi,

First of all, thank you for this amazing explanation, it was really helpful.
My question is, can we use RNN or LSTM to predict many output variables because I’ve been searching and found LSTM may be the best option but didn’t found any implementation to start with.

Thank you.

Reply
- Jason Brownlee May 17, 2021 at 5:39 am #
  
  You’re welcome.
  
  Yes, see this:
  https://machinelearningmastery.com/faq/single-faq/how-do-you-use-lstms-for-multi-step-time-series-forecasting
  
  Reply
Maha May 25, 2021 at 11:13 am #

Hello
I want to know how MAE for multi regression output is calculated numerically , if you can mention the mathematical equation for better understanding?
Thanks in advance

Reply
- Jason Brownlee May 26, 2021 at 5:45 am #
  
  Mean of the absolute differences, more here:
  https://scikit-learn.org/stable/modules/model_evaluation.html#mean-absolute-error
  
  Reply
Samaneh Manavi May 26, 2021 at 9:46 pm #

Dear Jason,
Thank you for this nice tutorial.
I have a complex multi-output regression task, in which, the input elements and output elements are not independent. I have designed a network with multi-layers of 1D convolutions by trial and error. I was wondering if you could recommend a well-known network architecture for multi-output regression problems.
I guess it makes more sense to optimize the hyperparameters of a well-known model to my problem rather than designing it from the scratch.

Thank you in advance,
Samaneh

Reply
- Jason Brownlee May 27, 2021 at 5:38 am #
  
  We cannot know what architecture will work well or best for a dataset, I recommend testing a suite of models and model configurations in order to discover what works best for your specific dataset.
  
  Reply
sepideh June 17, 2021 at 3:48 am #

Hi,
I run this code: I have two features(inputs) and 13 output, but I have this error:
—————————————————————————
TypeError Traceback (most recent call last)
in
16 mae = model.evaluate(x_test, y_test, verbose=0)
17 # store result
—> 18 print(‘>%.3f’ % mae)
19 results.append(mae)
20 return results

TypeError: must be real number, not list

######################################
def evaluate_model(x, y):
results = list()
n_inputs, n_outputs = x.shape[1], y.shape[1]
# define evaluation procedure
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
# enumerate folds
for train_ix, test_ix in cv.split(x):
# prepare data
x_train, x_test = x[train_ix], x[test_ix]
y_train, y_test = y[train_ix], y[test_ix]
# define model
model = build_regression(n_inputs, n_outputs)
# fit model
model.fit(x_train, y_train, verbose=0, epochs=100)
# evaluate model on test set
mae = model.evaluate(x_test, y_test, verbose=0)
# store result
print(‘>%.3f’ % mae)
results.append(mae)
return results

Reply
- Jason Brownlee June 17, 2021 at 6:19 am #
  
  Sorry to hear that, these tips may help:
  https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  Reply
Jean-Rassaire June 23, 2021 at 1:09 am #

Hi Jason Brownlee,
Thank you for this pedagogically explained model.
I would like to know how I can save this code in a file. Do you have a code to save it?

Reply
- Jason Brownlee June 23, 2021 at 5:39 am #
  
  You’re welcome.
  
  This will help you to save the code:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-copy-code-from-a-tutorial
  
  Reply
Jean-Rassaire June 23, 2021 at 10:30 pm #

Thanks Jason Brownlee, but sorry that was a stupid question.

Reply
- Jason Brownlee June 24, 2021 at 6:02 am #
  
  No problem.
  
  Reply
Gireesh B July 15, 2021 at 4:38 pm #

Hi Jason, Just I want to Train a model with a set of inputs in numbers and set of outputs in numbers without knowing relationship between them(input and output).

Reply
- Jason Brownlee July 16, 2021 at 5:21 am #
  
  Yes, that is the definition of regression problems.
  
  Reply
Amn September 15, 2021 at 3:06 am #

Hi, first of all, thank you for the tutorial! my question is: I have students’ id as a sample and I want to predict 3 grades in the future to evaluate their performance but the problem is that the student id is repeated because they have many homework and many quizzes and so on. how can I predict the grades for each student separately? Then I want to save these predicted grades in a new file with a unique student-id to decide the student performance. Note: the repeated for student-id are different from student to other. any suggestion? Thanks in advance

Reply
- Adrian Tam September 15, 2021 at 11:53 pm #
  
  Student id is a nominal data. This should not be input to the model. I would suggest you to create a vector for each student id, holding the grades of homework and quiz, and predict the final grade. No student id should be in this vector.
  
  Reply
wb September 15, 2021 at 9:34 pm #

Hi, I’ve bought some of your books and they are great. BTW, for multi outputs regression, are there different weights and bias for each output? Or all outputs share the same w and b? In that case, how are the performance of multi outputs regression compared to separate single output regression?

Reply
- Adrian Tam September 16, 2021 at 12:53 am #
  
  Likely different because there is nothing in the model to enforce that they should be the same (but you can construct a model that way if you insist). For performance comparison, I doubt if that is possible because it sounds like you are comparing apples to oranges.
  
  Reply
Amn September 21, 2021 at 2:08 am #

Hi, there is a shared metrics between the classification and regression? I want to compare the performance of regression NN with classification NN?

Reply
- Adrian Tam September 21, 2021 at 9:32 am #
  
  You’re comparing apples to oranges. Note, regression is about predicting continuous values that the numbers have its literal meaning. Classification, however, are predicting nominal values that the numbers at the output are just names. This vast difference causes one metric can make no sense in the wrong problem.
  
  Reply
Paul November 10, 2021 at 2:21 am #

First of all, congratulations on your website and for your detailed explanations and examples. I really enjoy it!

Multiple output regression involves predicting two or more numerical variables.
In the end (# evaluate a model using repeated k-fold cross-validation), on the part of loading dataset, this error apears

ValueError: too many values to unpack (expected 2)

Can you tell me what can be?

Appreciated.

Reply
- Adrian Tam November 14, 2021 at 1:17 pm #
  
  Can’t tell what exactly it is but this is how to trigger this error:
  
  X, y = a, b, c
  
  1
  
  X, y = a, b, c
  
  In python you can do multiple assignments in one line using tuple notation as above. But if the number of elements on either side of the assignment do not match, you have this “too many values to unpack” error.
  
  Reply
Ashiq January 26, 2022 at 9:01 pm #

Can we develop a multi-output regression model for RNN and CNN as well? If yes, do you have any tutorials for that?

Reply
Alice Mani February 7, 2022 at 1:58 am #

Hi Jason, thank you for your useful explanation. I have a problem that is Multi-Output Regression, I think. I want to reconstruct images with MLP. I have 12000 data and create a fully connected layer NN. My loss (MSE) in training and validation is low but the prediction is awful. When I normalized my input data performance became better but the prediction is not good again and not on the scale of ground truth. Could give me some advice, please?

Reply
- James Carmichael February 7, 2022 at 12:12 pm #
  
  Hi Alice…It is possible to overfit the training data.
  
  This means that the model is learning the specific random variations in the training dataset at the cost of poor generalization of the model to new data.
  
  This can be seen by improved skill of the model at making predictions on the training dataset and worse skill of the model at making predictions on the test dataset.
  
  A good general approach to reducing the likelihood of overfitting the training dataset is to use k-fold cross-validation to estimate the skill of the model when making predictions on new data.
  
  A second good general approach in addition to using k-fold cross-validation is to have a hold-out test set that is only used once at the end of your project to help choose between finalized models.
  
  Reply
  - Alice Mani February 8, 2022 at 1:09 am #
    
    Thanks for the reply but unfortunately prediction on training data is not good as well.
    
    Reply
Alice Mani February 8, 2022 at 1:07 am #

Thanks for the reply but prediction on the training dataset is not good as well

Reply
Ashiq February 25, 2022 at 5:11 pm #

How can we develop a multi-output regression model for RNN and CNN? If yes, do you have any tutorials for that?

Reply
Shan March 13, 2022 at 7:24 pm #

How add another hidden layer?

Reply
Shan March 13, 2022 at 7:25 pm #

How to add two more hidden layers?

Reply
- James Carmichael March 14, 2022 at 11:58 am #
  
  Hi Shan…the following resource explains how to add additional layers in multi-layer perceptron models:
  
  https://machinelearningmastery.com/build-multi-layer-perceptron-neural-network-models-keras/
  
  Reply
David March 16, 2022 at 1:28 am #

Hi there!

Great tutorial! Quick question, currently we are getting a single MAE value for all 3 of the predicted values. How can I get 3 MAE values, one for each of the predicted values?

Thanks in advance!

Reply
- James Carmichael March 16, 2022 at 10:44 am #
  
  Hi David…You could capture the MAE each time in a variable before “appending” it
  
  mae = model.evaluate(X_test, y_test, verbose=0) # store result
  print(‘>%.3f’ % mae)
  
  Reply
Marianna April 5, 2022 at 10:47 am #

Hi. I would like to ask you the following.

If we have a regression problem with 2 targets, should we flatten the vector before computing L1Loss (or MAE in keras)?

Reply
- James Carmichael April 6, 2022 at 8:47 am #
  
  Hi Marianna…You may find the following beneficial:
  
  https://moviecultists.com/why-flattening-is-used
  
  Reply
  - Marianna April 8, 2022 at 11:00 pm #
    
    Thank you for answering.
    I found it quite beneficial, but in my 2 target predictions, I want to know i.e. that the prediction p1 is a pair with the prediction p2, in a y.shape [1,2]. So, I think that flattening is not the way to go.
    
    Another question that I would like to ask you though is for this particular example and considering that the normalized targets p1, p2 follow different distributions (normal, power-law), after training my model, I found out that p1 has good predictions, but p2 hasn’t. It seems that these different targets have different losses, and p2 has a bigger loss.
    
    What my next actions should be in order to improve the predictions for the p2 target with the larger loss?
    
    Thank you for your time and consideration
    
    Reply
Putra April 6, 2022 at 12:22 pm #

Hi. i wanna ask question.

i want to predict some variable with the 3 input factors that effect to this variable, how to make the input data from 3 factors?

Reply
Indra April 23, 2022 at 6:55 pm #

Hi Jason,

I want to ask something related to the multi outputs model, so I have build a model with multi output and after the training when I compare single and multi output model, the loss produce from the multi output is very large. What should I do to make the model with multi output to have less loss? is it possible to run the optimizer and loss individually on multi output? because it seems like the loss is decreasing very slow compare with single output.

Reply
- James Carmichael April 24, 2022 at 3:19 am #
  
  Hi Indra,
  
  The following may help clarify:
  
  https://machinelearningmastery.com/feature-selection-to-improve-accuracy-and-decrease-training-time/
  
  https://machinelearningmastery.com/classification-accuracy-is-not-enough-more-performance-measures-you-can-use/
  
  Reply
KayD April 28, 2022 at 6:45 am #

Hi Jason,

Great article, thank you for detailed explanation on the process.

i am looking to build a multi- output regression model for 4 target variables using tabular data as input. What kind of neural network model do you recommend for such a problem?

Additionally, apart from NNs, what other models do you recommend for such as scenario? i wanted to use XGB but according to my understanding it doesn’t support multi target regression models.

Thanks again!

Your comment is awaiting moderation.

Reply
- James Carmichael May 2, 2022 at 9:31 am #
  
  Hello KayD…the following may be of interest to you:
  
  https://machinelearningmastery.com/deep-learning-models-for-multi-output-regression/
  
  https://machinelearningmastery.com/multi-output-regression-models-with-python/
  
  Reply
Shirsendu Mitra May 10, 2022 at 11:01 pm #

Hi Dr. Jason,

Is it possible to develop a machine learning model where there are multiple outputs (say 10 to 20) and those outputs are mutually dependent. I mean to say that for a specific set of inputs one one combination of outputs are expected.

I have training data set where inputs are images and each image corresponds to a specific output vector. Now if I try to use such 1000 images for training set to predict output vector for similar and unknown images.

can you please guide me if there is any way to tackle this kind of problems.

Reply
- James Carmichael May 13, 2022 at 1:23 am #
  
  Hi Shirsendu…The following may be of interest to you:
  
  https://machinelearningmastery.com/multi-output-regression-models-with-python/
  
  Reply
Elias Siskos May 20, 2022 at 4:10 am #

Hello I am trying to develop a N Network that predicts handwriting letters. I have created my dataset and now I want to fit a model.
My perspective is that I have to use multi output regression but not with numerical values. Instead I must use sigmoid activation function in the output layer. Am I correct?
How can I do this ? Can you please help me?

PS I have to congratulate you for this great work you have dove some far. You have probably the best site I’ve ever seen (and I have seen too many).

Reply
- James Carmichael May 20, 2022 at 11:04 pm #
  
  Hi Elias…You may wish to consider CNNs for this task:
  
  https://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/
  
  Reply
Adrian August 9, 2022 at 6:23 pm #

Hello,
I tried to plot two learning curves for the k-fold cross-validation. These are intended to represent the test data and the validation data. Unfortunately I didn’t succeed for def evaluate_model(X, y). Is there a solution for this?
Thank you in advance!

Reply
- James Carmichael August 10, 2022 at 5:38 am #
  
  Hi Adrian…Please clarify what issues you are encountering so that we may better assist you.
  
  Reply
Oscar August 30, 2022 at 6:21 am #

Hi Dr. Jason Brownlee,

I am trying to develop a multi-output regression model (4 inputs, 4 outputs). I have been successful so far with the Neural Network algorithm (4-5-5-4 architecture); it is feedforward. I am using time series data I collected from a physical experiment for the input and the resulting output.

I am now trying to basically cover my tracks and compare results between models developed with other machine learning algorithms that may be applicable and can produce multiple outputs.

Do you have any suggestions for algorithms I could explore?

I see you mention in this post “decision trees and ensembles of decision trees” (random forest?). Are there any others?

Best regards,

OLB

Reply
- James Carmichael August 30, 2022 at 6:34 am #
  
  Hi Oscar…You may wish to investigate sequence to sequence prediction models:
  
  https://machinelearningmastery.com/develop-encoder-decoder-model-sequence-sequence-prediction-keras/
  
  Reply
Cuong September 17, 2022 at 12:36 pm #

Hi Dr. Jason Brownlee,

Thank you for the helpful lessons.

When I run your code, it occurred the error: “IndentationError: expected an indented block”, as below:

def evaluate_model(X, y):
results = list()
n_inputs, n_outputs = X.shape[1], y.shape[1]

def evaluate_model(X, y):
Input In [31]
def evaluate_model(X, y):
^
IndentationError: expected an indented block

Please tell me why and how to fix it.

Thank you.

Reply
WDDouble October 5, 2022 at 1:48 pm #

Hi, I have a training dataset whose training data is a curve (x-axis is the frequency) which includes 50,000 points on it and for each curve i have more than 100 conresponding parameters/labels. I would like to train a model so that for curve it can regress those hundreds of parameters.

The first thing I thought of was to use mlp…. Do you have any suggestions?

Thank you

Reply
- James Carmichael October 6, 2022 at 7:12 am #
  
  Hi WDDouble…An MLP would be a great starting point for regression as you described. The following resource may help clarify when to use various model types:
  
  https://machinelearningmastery.com/when-to-use-mlp-cnn-and-rnn-neural-networks/
  
  Reply
Aref October 26, 2022 at 4:53 am #

Dear Jason,

Thanks for sharing such useful information with a clear description. It is very helpful to me.

I have an important question that is related to my project. I have to define 10 weights for every 10 input values so that each input feature has a specified weight, and the loss function should minimize the cost by considering these features’ weights. It means features with the highest weight have more priority for minimizing the error.

Let’s describe more:

Consider we define wights range from 1 to 5 for 10 input features.

input feature weight
1 5
2 3
3 4
4 4
5 3
6 1
7 5
8 2
9 1
10 2

How can I define a custom weighted loss function for weighing each element?

Reply
- James Carmichael October 26, 2022 at 7:06 am #
  
  Hi Aref…The following resource may be of interest:
  
  https://stackoverflow.com/questions/62059060/how-to-set-custom-weights-in-layers
  
  Reply
Aref October 26, 2022 at 5:25 am #

Dear Jason,

I would be thankful if you put code for defining a custom weighted loss function here.

Reply
- James Carmichael October 26, 2022 at 7:04 am #
  
  Thank you for your recommendation Aref!
  
  Reply
Aref October 26, 2022 at 8:24 am #

guys,

I don’t understand why my question disappeared after a few minutes?! It is so annoying.

Reply
- Aref October 26, 2022 at 8:26 am #
  
  Sorry, I could not find my question, but I found it now. It was a bit wired.
  
  Reply
Nasim December 5, 2022 at 1:03 pm #

Hi there. I was wondering how we can formulate the mean squared error for a multi-output regression over a training data set. The mean squared error for a single output regression would be 1/2m(sum(y_p-y)^2) and the sum is over the training data set. Now, how would it be for multi-output regression?

Reply
- James Carmichael December 6, 2022 at 10:03 am #
  
  Hi Nasim…The following resource may be of interest to you:
  
  https://stackoverflow.com/questions/66707594/rmse-for-multidimensional-data
  
  Reply
Med February 5, 2023 at 4:25 am #

Hi Jason,

Can you tell me, once modeled, how can we find the values of input variables to optimize (minimize one, maximize another etc.) multiple output variables?

Cheers,
Med

Reply
- James Carmichael February 5, 2023 at 1:21 pm #
  
  Hi Med…The following resource may be of interest to you:
  
  https://machinelearningmastery.com/tour-of-optimization-algorithms/
  
  Reply
JJ Dela Cruz February 11, 2023 at 8:47 pm #

Hello! I’m currently working on a dataset that involves 2 dependent variables however those dependent variables are also dependent on one another. Will the algorithm still work in this case? Thanks!

Reply
- James Carmichael February 12, 2023 at 9:32 am #
  
  Hi JJ Dela Cruz…I would recommend that you try the model on your scenrio. Please let us know what you find.
  
  Reply
Lally April 29, 2023 at 2:47 am #

hello ! can you help me to build a ResNet 34 model for multioutpout !

Reply
- James Carmichael April 29, 2023 at 10:57 am #
  
  Hi Lally…The following is a great starting point to leverage transfer learning:
  
  https://machinelearningmastery.com/transfer-learning-for-deep-learning/
  
  Reply
Gamuchirai Ndawana June 25, 2023 at 6:31 am #

Hi Jason, do you know that you have a gift of making information more accessible? Do you know that you’re a blessing and a source of inspiration and guidance in a world filled with problems?

I wish you a long life, the value of information you provide for free is unbelievable ❤️

Reply
- James Carmichael June 25, 2023 at 9:07 am #
  
  Thank you for your support and kind words! We are always here to help answer any questions regarding machine learning! We wish you the best on your machine learning journey!
  
  Reply
Carlos July 26, 2023 at 2:30 pm #

Hi. This is an excellent website for machine learning. Congratulations. In the function “def evaluate_model(X, y):” I tried to obtain the training and test datasets by adding these arrays at the Return line but it yields error. I have also printed them in the for loop and observed that they changed with each iteration. Could you please elaborate more on that? Could you please add some plots at the end of the routine ar guide us on how to retrieve information for comparison. i.e. trained vs calculated, test vs calculated, mse, etc. Thank you

Reply
- James Carmichael July 27, 2023 at 9:21 am #
  
  Hi Carlos…Please specify the exact error so that may better assist you.
  
  Reply
  - Carlos July 30, 2023 at 4:38 pm #
    
    Hi James. Thank you for your reply In the line 42 of the above code “return results”. of “def evaluate_model(X, y)” function, I modified it to “return results, X_train, y_train, X_test, y_test”
    
    Reply
Carlos July 30, 2023 at 4:37 pm #

Hi James. Thank you for your reply In the line 42 of the above code “return results”. of “def evaluate_model(X, y)” function, I modified it to “return results, X_train, y_train, X_test, y_test”

Reply
Srivatsa Datta October 21, 2023 at 1:49 am #

Hi James,

I wanted to share my sincere appreciation for a wonderful website. In particular, as an educator, what I found most appealing about your website are the details you provide that explain the “why” behind the choices you provide.
In addition, you also identify that sometimes, we need to use “trial-and-error” as stated in this tutorial to hit upon the “a reasonably correct” specification.
I have been evaluating various websites, to provide additional references for my students to use, to supplement my class notes. I’m happy to say that I have found my go-to-website to refer to my students. I also found that I can refresh my own knowledge and understanding as I go through these wonderful examples you have created.
Thank you for a truly wonderful website of learning you have provided for the benefit of countless students of ML and DL. I also found it truly rewarding to see you take the time to answer the questions from various users here.
Hailing from a developing country, my students cannot afford to purchase the books you have authored. However, I intend to support your endeavor by making a purchase. I’m also, as an avid R user, encouraged and thrilled that you also have a dedicated section for R.

Sincerely,

Vatsa

Reply
- James Carmichael October 21, 2023 at 9:22 am #
  
  Hi Srivatsa…We greatly appreciate your support and feedback! The following location provides a full list of ebooks with detailed descriptions of content within each. Let us know if we can answer any questions as you review our tutorials and ebooks!
  
  https://machinelearningmastery.com/products/
  
  Reply
Srivatsa Datta October 25, 2023 at 2:06 am #

Thank you so much for the link Jason. I will examine the catalog.

Reply
Javad November 21, 2023 at 8:14 am #

Hello,

Thank you for the great article.

I have 1 categorical and 4 continuous numerical outputs. Could you please tell me how I can deal with the categorical target? Should I convert it to numerical output using one-hot encoding or some other method, then use multi-output regression, and then round the numbers for the categorical outputs?
Does it make sense? or is there any other way to train my dataset?

Reply
- James Carmichael November 21, 2023 at 10:35 am #
  
  Hi Javad…You are very welcome! The following resource may be of interest to you.
  
  https://stackoverflow.com/questions/61202580/correct-way-of-one-hot-encoding-class-labels-for-multi-class-problem
  
  Reply
Jim December 8, 2023 at 4:31 am #

On line 29 there’s a for loop but nothing below it is indented. What’s supposed to be indented?

Reply
- James Carmichael December 8, 2023 at 11:11 am #
  
  Hi Jim…Did you type the code in or copy and paste it? Also, what environment are you using (Anaconda, Google Colab?)
  
  Reply
  - Jim December 12, 2023 at 5:39 am #
    
    Hey James,
    
    it fixed itself when I copy and pasted instead of typing.
    
    Reply
    - James Carmichael December 12, 2023 at 11:14 am #
      
      Thank you for the update Jim! Keep up the great work!
      
      Reply
Jim December 12, 2023 at 5:17 am #

How do you place a lower bound on your answers? I’m wanting to put a lower bound of 0.

Reply
- James Carmichael December 12, 2023 at 11:19 am #
  
  Hi Jim…The following discussion may be of interest:
  
  https://stats.stackexchange.com/questions/485865/how-to-bound-a-regressor-function
  
  Reply
Jim December 13, 2023 at 8:13 am #

Hey James,

How would you export the output from this to a dataframe? Specifically, I’m trying to create a csv with the output data.

Thank you so much for your help
Jim

Reply

Navigation

Deep Learning Models for Multi-Output Regression

Tutorial Overview

Multi-Output Regression

Neural Networks for Multi-Outputs

Neural Network for Multi-Output Regression

Further Reading

Summary

More On This Topic

170 Responses to Deep Learning Models for Multi-Output Regression

Leave a Reply Click here to cancel reply.