The chain rule is an important derivative rule that allows us to work with composite functions. It is essential in understanding the workings of the backpropagation algorithm, which applies the chain rule extensively in order to calculate the error gradient of the loss function with respect to each weight of a neural network. We will […]

# Tag Archives | multivariate

## The Chain Rule of Calculus for Univariate and Multivariate Functions

The chain rule allows us to find the derivative of composite functions. It is computed extensively by the backpropagation algorithm, in order to train feedforward neural networks. By applying the chain rule in an efficient manner while following a specific order of operations, the backpropagation algorithm calculates the error gradient of the loss function with […]

## A Gentle Introduction to the Jacobian

In the literature, the term Jacobian is often interchangeably used to refer to both the Jacobian matrix or its determinant. Both the matrix and the determinant have useful and important applications: in machine learning, the Jacobian matrix aggregates the partial derivatives that are necessary for backpropagation; the determinant is useful in the process of changing […]

## Higher-Order Derivatives

Higher-order derivatives can capture information about a function that first-order derivatives on their own cannot capture. First-order derivatives can capture important information, such as the rate of change, but on their own they cannot distinguish between local minima or maxima, where the rate of change is zero for both. Several optimization algorithms address this limitation […]