Arrays with different sizes cannot be added, subtracted, or generally used in arithmetic.

A way to overcome this is to duplicate the smaller array so that it has the same dimensionality and size as the larger array. This is called array broadcasting. It is available in NumPy when performing array arithmetic and can greatly reduce and simplify your code.

In this tutorial, you will discover the concept of array broadcasting and how to implement it in NumPy.

After completing this tutorial, you will know:

- The problem of arithmetic with arrays with different sizes.
- The solution of broadcasting and common examples in one and two dimensions.
- The rule of array broadcasting and when broadcasting fails.

Let’s get started.

## Tutorial Overview

This tutorial is divided into 4 parts; they are:

- Limitation with Array Arithmetic
- Array Broadcasting
- Broadcasting in NumPy
- Limitations of Broadcasting

## Limitation with Array Arithmetic

You can perform arithmetic directly on NumPy arrays, such as addition and subtraction.

For example, two arrays can be added together to create a new array where the values at each index are added together.

Suppose array a is defined as [1, 2, 3] and array b is also defined as [1, 2, 3]. Adding them together results in a new array with the values [2, 4, 6].

    a = [1, 2, 3]
    b = [1, 2, 3]
    c = a + b
    c = [1 + 1, 2 + 2, 3 + 3]
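The notation above is shorthand rather than literal Python: with plain Python lists, `+` concatenates instead of adding element-wise. The following is a minimal sketch, assuming the values from the text are wrapped in NumPy arrays. The array `d` and the flag `mismatch_failed` are illustrative names that do not appear in the tutorial.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])

# element-wise addition of two arrays with the same shape
c = a + b
print(c)

# plain Python lists concatenate instead of adding element-wise
print([1, 2, 3] + [1, 2, 3])

# hypothetical fourth array: its length differs from a and is not 1
d = np.array([1, 2, 3, 4])
try:
    a + d
    mismatch_failed = False
except ValueError as err:
    mismatch_failed = True
    print("ValueError:", err)
```

The last step is expected to raise a `ValueError`, which is the limitation this section describes.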

Strictly, arithmetic may only be performed on arrays that have the same number of dimensions, with each dimension having the same size.

This means that a one-dimensional array with the length of 10 can only perform arithmetic with another one-dimensional array with the length 10.

This restriction on array arithmetic is quite limiting. Thankfully, NumPy provides a built-in workaround to allow arithmetic between arrays with differing sizes.

## Array Broadcasting

Broadcasting is the name given to the method that NumPy uses to allow array arithmetic between arrays with a different shape or size.

Although the technique was developed for NumPy, it has also been adopted more broadly in other numerical computational libraries, such as Theano, TensorFlow, and Octave.

Broadcasting solves the problem of arithmetic between arrays of differing shapes by in effect replicating the smaller array along the last mismatched dimension.

The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes.

— Broadcasting, SciPy.org

NumPy does not actually duplicate the smaller array; instead, it makes memory and computationally efficient use of existing structures in memory that in effect achieve the same result.
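One way to see this for yourself is with `numpy.broadcast_to`, which returns a read-only view with the broadcast shape. This is a small sketch rather than a description of NumPy's internals; the `dtype` is fixed to `int64` only so that the strides are the same on every platform.

```python
import numpy as np

b = np.array([1, 2, 3], dtype=np.int64)

# broadcast_to returns a read-only view shaped like the larger array
view = np.broadcast_to(b, (2, 3))

print(view.shape)                  # shape of the broadcast view
print(view.strides)                # a stride of 0 along axis 0 means the same row is reused
print(np.shares_memory(view, b))   # the view points at b's original buffer
```

A stride of 0 along the first axis means both rows of the view read the same three values in memory, so no copy of `b` is made.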

The concept has also permeated linear algebra notation to simplify the explanation of simple operations.

In the context of deep learning, we also use some less conventional notation. We allow the addition of matrix and a vector, yielding another matrix: $C = A + b$, where $C_{i,j} = A_{i,j} + b_j$. In other words, the vector b is added to each row of the matrix. This shorthand eliminates the need to define a matrix with b copied into each row before doing the addition. This implicit copying of b to many locations is called broadcasting.

— Page 34, Deep Learning, 2016.

## Broadcasting in NumPy

We can make broadcasting concrete by looking at three examples in NumPy.

The examples in this section are not exhaustive, but they are typical of the broadcasting you are likely to see or implement.

### Scalar and One-Dimensional Array

A single value or scalar can be used in arithmetic with a one-dimensional array.

For example, we can imagine a one-dimensional array “a” with three values [a1, a2, a3] added to a scalar “b”.

    a = [a1, a2, a3]
    b

The scalar will need to be broadcast across the one-dimensional array by duplicating its value 2 more times.

    b = [b1, b2, b3]

The two one-dimensional arrays can then be added directly.

    c = a + b
    c = [a1 + b1, a2 + b2, a3 + b3]

The example below demonstrates this in NumPy.

    # scalar and one-dimensional
    from numpy import array
    a = array([1, 2, 3])
    print(a)
    b = 2
    print(b)
    c = a + b
    print(c)

Running the example first prints the defined one-dimensional array, then the scalar, followed by the result where the scalar is added to each value in the array.

    [1 2 3]
    2
    [3 4 5]
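To connect the result back to the duplication step described above, the sketch below builds the duplicated array explicitly with `numpy.full` and compares the two sums. The names `b_expanded`, `explicit`, and `broadcast` are illustrative and not part of the original example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = 2

# explicitly build the array that broadcasting only implies
b_expanded = np.full(a.shape, b)
print(b_expanded)

# the explicit sum matches the broadcast sum
explicit = a + b_expanded
broadcast = a + b
print(np.array_equal(explicit, broadcast))
```

In practice you would write `a + b` directly; the explicit version is only here to show what the shortcut stands for.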

### Scalar and Two-Dimensional Array

A scalar value can be used in arithmetic with a two-dimensional array.

For example, we can imagine a two-dimensional array “A” with 2 rows and 3 columns added to the scalar “b”.

         a11, a12, a13
    A = (a21, a22, a23)

    b

The scalar will need to be broadcast across each row of the two-dimensional array by duplicating it 5 more times.

         b11, b12, b13
    B = (b21, b22, b23)

The two two-dimensional arrays can then be added directly.

    C = A + B

         a11 + b11, a12 + b12, a13 + b13
    C = (a21 + b21, a22 + b22, a23 + b23)

The example below demonstrates this in NumPy.

    # scalar and two-dimensional
    from numpy import array
    A = array([[1, 2, 3], [1, 2, 3]])
    print(A)
    b = 2
    print(b)
    C = A + b
    print(C)

Running the example first prints the defined two-dimensional array, then the scalar, then the result of the addition with the value “2” added to each value in the array.

    [[1 2 3]
     [1 2 3]]
    2
    [[3 4 5]
     [3 4 5]]

### One-Dimensional and Two-Dimensional Arrays

A one-dimensional array can be used in arithmetic with a two-dimensional array.

For example, we can imagine a two-dimensional array “A” with 2 rows and 3 columns added to a one-dimensional array “b” with 3 values.

         a11, a12, a13
    A = (a21, a22, a23)

    b = (b1, b2, b3)

The one-dimensional array is broadcast across each row of the two-dimensional array by creating a second copy to result in a new two-dimensional array “B”.

         b11, b12, b13
    B = (b21, b22, b23)

The two two-dimensional arrays can then be added directly.

    C = A + B

         a11 + b11, a12 + b12, a13 + b13
    C = (a21 + b21, a22 + b22, a23 + b23)

Below is a worked example in NumPy.

    # one-dimensional and two-dimensional
    from numpy import array
    A = array([[1, 2, 3], [1, 2, 3]])
    print(A)
    b = array([1, 2, 3])
    print(b)
    C = A + b
    print(C)

Running the example first prints the defined two-dimensional array, then the defined one-dimensional array, followed by the result C. Because b has the same values as each row of A, each value in the two-dimensional array is in effect doubled.

    [[1 2 3]
     [1 2 3]]
    [1 2 3]
    [[2 4 6]
     [2 4 6]]
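The intermediate array "B" from the walkthrough above can be built by hand to check that broadcasting gives the same answer. This sketch uses `numpy.tile` for the copying step, which is one of several ways to do it; `C_explicit` and `C_broadcast` are illustrative names.

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 2, 3]])
b = np.array([1, 2, 3])

# stack one copy of b per row of A to form the matrix "B" from the text
B = np.tile(b, (A.shape[0], 1))
print(B)

# adding the explicit copy gives the same result as broadcasting
C_explicit = A + B
C_broadcast = A + b
print(np.array_equal(C_explicit, C_broadcast))
```

Unlike broadcasting, `np.tile` really does allocate the larger array, so the broadcast form is usually the better choice for large inputs.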

## Limitations of Broadcasting

Broadcasting is a handy shortcut that proves very useful in practice when working with NumPy arrays.

That being said, it does not work for all cases, and in fact imposes a strict rule that must be satisfied for broadcasting to be performed.

Arithmetic, including broadcasting, can only be performed when, for each dimension, the sizes in the two arrays are equal or one of them is 1. The dimensions are compared in reverse order, starting with the trailing dimension; for example, columns are compared before rows in the two-dimensional case.

This makes more sense when we consider that NumPy will in effect pad missing dimensions with a size of “1” when comparing arrays.

Consider the comparison between a two-dimensional array “A” with 2 rows and 3 columns and a vector “b” with 3 elements:

    A.shape = (2 x 3)
    b.shape = (3)

In effect, this becomes a comparison between:

    A.shape = (2 x 3)
    b.shape = (1 x 3)

The same notion applies to a scalar, which is treated as an array with the required number of dimensions:

    A.shape = (2 x 3)
    b.shape = (1)

This becomes a comparison between:

    A.shape = (2 x 3)
    b.shape = (1 x 1)

Both of the comparisons above succeed. When a comparison fails, the broadcast cannot be performed, and an error is raised.
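The padding-and-comparison rule can be written out as a short function. This is an illustrative sketch of the rule as described here, not NumPy's actual implementation; `broadcast_shape` is a hypothetical helper name. Note that NumPy reports the shape of a true scalar as the empty tuple `()`, which pads to the same result as a one-element array.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError (illustrative helper)."""
    ndim = max(len(shape_a), len(shape_b))
    # pad the shorter shape on the left with 1s so both have ndim entries
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for size_a, size_b in zip(padded_a, padded_b):
        # each pair must be equal, or one of the two must be 1
        if size_a == size_b or size_a == 1 or size_b == 1:
            result.append(size_b if size_a == 1 else size_a)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} are not compatible")
    return tuple(result)

print(broadcast_shape((2, 3), (3,)))   # vector with 3 elements
print(broadcast_shape((2, 3), (1,)))   # one-element array
print(broadcast_shape((2, 3), ()))     # true scalar

# cross-check one case against NumPy itself
check = np.broadcast(np.empty((2, 3)), np.empty((3,))).shape
print(check)
```

All three calls succeed with a result shape of 2 x 3, matching the comparisons listed above.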

The example below attempts to broadcast a two-element array to a 2 x 3 array. This comparison is in effect:

    A.shape = (2 x 3)
    b.shape = (1 x 2)

We can see that the last dimensions (columns) do not match and we would expect the broadcast to fail.

The example below demonstrates this in NumPy.

    # broadcasting error
    from numpy import array
    A = array([[1, 2, 3], [1, 2, 3]])
    print(A.shape)
    b = array([1, 2])
    print(b.shape)
    C = A + b
    print(C)

Running the example first prints the shapes of the arrays then raises an error when attempting to broadcast, as we expected.

    (2, 3)
    (2,)
    ValueError: operands could not be broadcast together with shapes (2,3) (2,)
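If the intent was to add one value of b to each row of A, the rule suggests a fix: give b a trailing dimension of size 1 so that the shapes compare as (2 x 3) and (2 x 1). The sketch below assumes that intent, which the original example does not state; `b_col` is an illustrative name.

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 2, 3]])
b = np.array([1, 2])

# the original attempt fails because the trailing sizes 3 and 2 differ
try:
    C = A + b
except ValueError as err:
    print("ValueError:", err)

# assumption: b holds one value per row, so reshape it into a 2 x 1 column
b_col = b.reshape(2, 1)
print(b_col.shape)

# (2, 3) and (2, 1) are compatible: b_col is stretched across the columns
C = A + b_col
print(C)
```

Here the first row of A has 1 added to it and the second row has 2 added to it.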

## Extensions

This section lists some ideas for extending the tutorial that you may wish to explore.

- Create three new and different examples of broadcasting with NumPy arrays.
- Implement your own broadcasting function for manually broadcasting in one and two-dimensional cases.
- Benchmark NumPy broadcasting and your own custom broadcasting functions with one and two dimensional cases with very large arrays.

If you explore any of these extensions, I’d love to know.

## Further Reading

This section provides more resources on the topic if you are looking to go deeper.

### Books

- Chapter 2, Deep Learning, 2016.

### Articles

- Broadcasting, NumPy API, SciPy.org
- Broadcasting semantics in TensorFlow
- Array Broadcasting in numpy, EricsBroadcastingDoc
- Broadcasting, Theano
- Broadcasting arrays in Numpy, 2015.
- Broadcasting in Octave

## Summary

In this tutorial, you discovered the concept of array broadcasting and how to implement it in NumPy.

Specifically, you learned:

- The problem of arithmetic with arrays with different sizes.
- The solution of broadcasting and common examples in one and two dimensions.
- The rule of array broadcasting and when broadcasting fails.

Do you have any questions?

Ask your questions in the comments below and I will do my best to answer.

thanks a lot, i get it

You’re welcome.

Thank you Jason, your articles are very helpful.

I’m happy to hear that Simon.

Thank you for the nice tutorial!

I got confused by the case that generates an error:

A.shape = (2 x 3)

b.shape = (1)

because earlier you had shown the addition of a 2 x 3 matrix and a scalar which worked, and it seems to me the (1)-array could be treated like a scalar for broadcasting purposes.

Is there a clear reason why this case is disallowed?

Just for the fun of it, here’s a way to do broadcasting, different from the NumPy way you describe, that would work more uniformly for both scalars and arrays:

Given two arrays A and B of shapes (r_1,…,r_n) and (s_1, …, s_m) respectively, where n is the number of dimensions of A and m is the number of dimensions of B. We can assume m ≤ n, that is, B has no more dimensions than A, because if this is not true, then we could simply switch them and consider A and B to be B and A instead.

Note that the above applies even to scalars because scalars have 0 dimensions. So, if A is a scalar, then n is 0 and (r_1,…r_n) is just the empty tuple (), and if B is a scalar then m is 0 and (s_1,…,s_m) is the empty tuple ().

First, we imagine (because it is not really created in memory) a new array B' with n dimensions (1, …, 1, s_1, …, s_m), that is, the dimensions of B are prepended with 1's enough to reach n dimensions.

Second, we check if broadcasting works. For it to work, we compare each pair of corresponding dimensions of A and B'. This is where things become different from what you described, because now we only require that each pair is either identical or at least one of them is 1.

Let us see if this works in the cases I mentioned above. For the case (2 x 3) + (1), B' has dimensions (1 x 1) (prepended one "1" in order to fill to two dimensions like (2 x 3)). Then the first dimensions (2 for A and 1 for B') satisfy the condition, and the second dimensions (3 for A and 1 for B') also satisfy the condition.

Let's check for (2 x 3) and scalar. Because a scalar is 0-dimensional, B' will have dimensions (1 x 1). This also satisfies the condition because each pair of corresponding dimensions has at least one 1.

Now, we actually compute the result. We perform the operation element-by-element. The result C will have n dimensions and, given indices i_1,…, i_n, the element C_(i_1,…,i_n) is defined by:

C_(i_1,…,i_n) = A_(i_1,…,i_n) + B_(i_(n – m +1),…,i_n)

which is to say, we use the same elements of B for all values of i_1, …, i_(n – m) (they are broadcast).

This method would produce the same result for both (2 x 3) + (1) and (2 x 3) + scalar, which seems to me to make more sense.

Not sure why they didn't define it in this more general (and, in my opinion, simpler) way. I am interested in understanding why. And, even if there is no particular reason, I think it helps me understanding NumPy's way of doing it if I understand how it could be done more generally in principle and reminding myself that they just treat the scalar case different.

I think you may have confused the cases.

The final error in the post is for two vectors with differing dimensions, not a vector and a scalar.

I was not referring to the final error in the post, I was referring to the following one:

This same notion applies to the comparison between a scalar that is treated as an array with the required number of dimensions:

A.shape = (2 x 3)

b.shape = (1)

This becomes a comparison between:

A.shape = (2 x 3)

b.shape = (1 x 1)

When the comparison fails, the broadcast cannot be performed, and an error is raised.

I see, thanks.

The rationale may simply be consistency with the API, e.g. arrays must have matching dimensionality.

See also numpy’s outer product for a related notion.

Thanks Paul, can you elaborate?