Introduction to Matrices and Matrix Arithmetic for Machine Learning

By Jason Brownlee on October 17, 2021 in Linear Algebra

Matrices are a foundational element of linear algebra.

Matrices are used throughout the field of machine learning in the description of algorithms and processes such as the input data variable (X) when training an algorithm.

In this tutorial, you will discover matrices in linear algebra and how to manipulate them in Python.

After completing this tutorial, you will know:

What a matrix is and how to define one in Python with NumPy.
How to perform element-wise operations such as addition, subtraction, and the Hadamard product.
How to multiply matrices together and the intuition behind the operation.

Kick-start your project with my new book Linear Algebra for Machine Learning, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.

Update Jun/2019: Fixed a typo in the Matrix-Vector Multiplication section (thanks M. Vincent).
Update Jun/2019: Fixed typo in description of matrix in Python (thanks Ari).
Update Oct/2021: Added example of using @-operator for matrix-vector multiplication.

A Gentle Introduction to Matrices for Machine Learning
Photo by Maximiliano Kolus, some rights reserved.

Tutorial Overview

This tutorial is divided into 6 parts; they are:

What is a Matrix?
Defining a Matrix
Matrix Arithmetic
Matrix-Matrix Multiplication (Dot Product)
Matrix-Vector Multiplication
Matrix-Scalar Multiplication

What is a Matrix?

A matrix is a two-dimensional array of scalars with one or more columns and one or more rows.

A matrix is a two-dimensional array (a table) of numbers.

— Page 115, No Bullshit Guide To Linear Algebra, 2017

The notation for a matrix is often an uppercase letter, such as A, and entries are referred to by their two-dimensional subscript of row (i) and column (j), such as aij. For example:

A = ((a11, a12), (a21, a22), (a31, a32))

It is more common to see matrices defined using a horizontal notation.

     a11, a12
A = (a21, a22)
     a31, a32

A likely first place you may encounter a matrix in machine learning is in model training data comprised of many rows and columns and often represented using the capital letter “X”.

The geometric analogy used to help understand vectors and some of their operations does not hold with matrices. Further, a vector itself may be considered a matrix with one column and multiple rows.

Often the dimensions of the matrix are denoted as m and n for the number of rows and the number of columns.

Now that we know what a matrix is, let’s look at defining one in Python.

Defining a Matrix

We can represent a matrix in Python using a two-dimensional NumPy array.

A NumPy array can be constructed given a list of lists. For example, below is a 2 row, 3 column matrix.

# create matrix
from numpy import array
A = array([[1, 2, 3], [4, 5, 6]])
print(A)

Running the example prints the created matrix showing the expected structure.

[[1 2 3]
 [4 5 6]]
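
As a small sketch of how the row and column notation maps onto NumPy, the example below inspects the shape of the same matrix and reads individual entries. Note that NumPy indexes from 0, so the entry written a11 in the notation above corresponds to A[0, 0]; the variable names m and n are used here only to mirror the text.

```python
# inspect the shape of a matrix and access individual entries
from numpy import array

A = array([[1, 2, 3], [4, 5, 6]])

# shape is reported as (rows, columns), i.e. (m, n)
m, n = A.shape
print(m, n)

# NumPy indices start at 0, so a11 is A[0, 0] and a23 is A[1, 2]
print(A[0, 0])
print(A[1, 2])

# a whole row or column can be selected with a slice
print(A[0, :])
print(A[:, 1])
```
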

Matrix Arithmetic

In this section, we will demonstrate simple matrix-matrix arithmetic, where all operations are performed element-wise between two matrices of equal size to produce a new matrix of the same size.

Matrix Addition

Two matrices with the same dimensions can be added together to create a new third matrix.

C = A + B

The scalar elements in the resulting matrix are calculated as the addition of the elements in each of the matrices being added.

         a11 + b11, a12 + b12
A + B = (a21 + b21, a22 + b22)
         a31 + b31, a32 + b32

Or, put another way:

C[0,0] = A[0,0] + B[0,0]
C[1,0] = A[1,0] + B[1,0]
C[2,0] = A[2,0] + B[2,0]
C[0,1] = A[0,1] + B[0,1]
C[1,1] = A[1,1] + B[1,1]
C[2,1] = A[2,1] + B[2,1]

We can implement this in Python using the plus operator directly on the two NumPy arrays.

# add matrices
from numpy import array
A = array([[1, 2, 3], [4, 5, 6]])
print(A)
B = array([[1, 2, 3], [4, 5, 6]])
print(B)
C = A + B
print(C)

The example first defines two 2×3 matrices and then adds them together.

Running the example first prints the two parent matrices and then the result of adding them together.

[[1 2 3]
 [4 5 6]]

[[1 2 3]
 [4 5 6]]

[[ 2  4  6]
 [ 8 10 12]]
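
To connect the array notation to the plus operator, the following sketch fills in C one entry at a time with explicit loops and checks that it matches A + B. It is for illustration only; in practice the plus operator is the simpler and faster choice.

```python
# element-wise addition written out with explicit loops
from numpy import array, zeros, array_equal

A = array([[1, 2, 3], [4, 5, 6]])
B = array([[1, 2, 3], [4, 5, 6]])

# result has the same shape as the inputs
C = zeros(A.shape, dtype=A.dtype)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        # C[i,j] = A[i,j] + B[i,j]
        C[i, j] = A[i, j] + B[i, j]
print(C)

# confirm the loop agrees with the plus operator
print(array_equal(C, A + B))
```
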

Matrix Subtraction

Similarly, one matrix can be subtracted from another matrix with the same dimensions.

C = A - B

The scalar elements in the resulting matrix are calculated as the subtraction of the elements in each of the matrices.

         a11 - b11, a12 - b12
A - B = (a21 - b21, a22 - b22)
         a31 - b31, a32 - b32

Or, put another way:

C[0,0] = A[0,0] - B[0,0]
C[1,0] = A[1,0] - B[1,0]
C[2,0] = A[2,0] - B[2,0]
C[0,1] = A[0,1] - B[0,1]
C[1,1] = A[1,1] - B[1,1]
C[2,1] = A[2,1] - B[2,1]

We can implement this in Python using the minus operator directly on the two NumPy arrays.

# subtract matrices
from numpy import array
A = array([[1, 2, 3], [4, 5, 6]])
print(A)
B = array([[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]])
print(B)
C = A - B
print(C)

The example first defines two 2×3 matrices and then subtracts one from the other.

Running the example first prints the two parent matrices and then the result of subtracting the second matrix from the first.

[[1 2 3]
 [4 5 6]]

[[ 0.5  0.5  0.5]
 [ 0.5  0.5  0.5]]

[[ 0.5  1.5  2.5]
 [ 3.5  4.5  5.5]]
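
The requirement that both matrices have the same dimensions can be seen by trying to subtract a 3×2 matrix from a 2×3 matrix, as in the sketch below. The flag name failed is only for illustration. One caveat: NumPy's broadcasting does accept some unequal shapes (for example, a 2×3 matrix and a 3-element vector), which goes beyond the equal-size rule described in this section.

```python
# element-wise operations need matching dimensions
from numpy import array

A = array([[1, 2, 3], [4, 5, 6]])    # 2 rows, 3 columns
B = array([[1, 2], [3, 4], [5, 6]])  # 3 rows, 2 columns

failed = False
try:
    C = A - B
except ValueError as e:
    # NumPy reports that the two shapes cannot be combined
    failed = True
    print('cannot subtract:', e)
print(failed)
```
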

Matrix Multiplication (Hadamard Product)

Two matrices with the same size can be multiplied together, and this is often called element-wise matrix multiplication or the Hadamard product.

It is not the typical operation meant when referring to matrix multiplication, therefore a different operator is often used, such as a circle “o”.

C = A o B

As with element-wise subtraction and addition, element-wise multiplication involves the multiplication of elements from each parent matrix to calculate the values in the new matrix.

         a11 * b11, a12 * b12
A o B = (a21 * b21, a22 * b22)
         a31 * b31, a32 * b32

Or, put another way:

C[0,0] = A[0,0] * B[0,0]
C[1,0] = A[1,0] * B[1,0]
C[2,0] = A[2,0] * B[2,0]
C[0,1] = A[0,1] * B[0,1]
C[1,1] = A[1,1] * B[1,1]
C[2,1] = A[2,1] * B[2,1]

We can implement this in Python using the star operator directly on the two NumPy arrays.

# element-wise multiply matrices
from numpy import array
A = array([[1, 2, 3], [4, 5, 6]])
print(A)
B = array([[1, 2, 3], [4, 5, 6]])
print(B)
C = A * B
print(C)

The example first defines two 2×3 matrices and then multiplies them together.

Running the example first prints the two parent matrices and then the result of multiplying them together with a Hadamard Product.

[[1 2 3]
 [4 5 6]]

[[1 2 3]
 [4 5 6]]

[[ 1  4  9]
 [16 25 36]]
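
Because the star operator is easy to confuse with matrix multiplication, the sketch below compares the two on a pair of small square matrices (chosen here only so that both operations are defined). The star operator and NumPy's multiply() function give the Hadamard product, while the @ operator gives the matrix product, and the results differ.

```python
# Hadamard product versus matrix product on the same inputs
from numpy import array, multiply, array_equal

A = array([[1, 2], [3, 4]])
B = array([[5, 6], [7, 8]])

# element-wise (Hadamard) product
hadamard = A * B
print(hadamard)

# multiply() is the function form of the star operator
print(array_equal(hadamard, multiply(A, B)))

# matrix product, a different operation
product = A @ B
print(product)
```
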

Matrix Division

One matrix can be divided by another matrix with the same dimensions.

C = A / B

The scalar elements in the resulting matrix are calculated as the division of the elements in each of the matrices.

         a11 / b11, a12 / b12
A / B = (a21 / b21, a22 / b22)
         a31 / b31, a32 / b32

Or, put another way:

C[0,0] = A[0,0] / B[0,0]
C[1,0] = A[1,0] / B[1,0]
C[2,0] = A[2,0] / B[2,0]
C[0,1] = A[0,1] / B[0,1]
C[1,1] = A[1,1] / B[1,1]
C[2,1] = A[2,1] / B[2,1]

We can implement this in Python using the division operator directly on the two NumPy arrays.

# divide matrices
from numpy import array
A = array([[1, 2, 3], [4, 5, 6]])
print(A)
B = array([[1, 2, 3], [4, 5, 6]])
print(B)
C = A / B
print(C)

The example first defines two 2×3 matrices and then divides the first matrix by the second.

Running the example first prints the two parent matrices and then divides the first matrix by the second.

[[1 2 3]
 [4 5 6]]

[[1 2 3]
 [4 5 6]]

[[ 1.  1.  1.]
 [ 1.  1.  1.]]
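
Element-wise division raises a practical question the example above does not touch: what happens when the second matrix contains zeros. One possible approach, sketched below, is NumPy's divide() function with its where and out arguments, so that entries with a zero divisor are skipped and left at a default value. Using 0.0 as that default is an assumption made for this illustration, not a general recommendation.

```python
# element-wise division that skips zero divisors
from numpy import array, divide, zeros_like

A = array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
B = array([[1.0, 0.0, 3.0], [2.0, 0.0, 6.0]])

# divide only where B is non-zero; other entries keep the value from 'out'
C = divide(A, B, out=zeros_like(A), where=(B != 0))
print(C)
```
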

Matrix-Matrix Multiplication (Dot Product)

Matrix multiplication, also called the matrix dot product, is more complicated than the previous operations and involves a rule, as not all matrices can be multiplied together.

C = A * B

C = AB

The rule for matrix multiplication is as follows:

The number of columns (n) in the first matrix (A) must equal the number of rows (n) in the second matrix (B).

For example, matrix A has m rows and n columns, and matrix B has n rows and k columns. The n columns in A and the n rows in B are equal. The result is a new matrix with m rows and k columns.

C(m,k) = A(m,n) * B(n,k)

This rule applies for a chain of matrix multiplications where the number of columns in one matrix in the chain must match the number of rows in the following matrix in the chain.

One of the most important operations involving matrices is multiplication of two matrices. The matrix product of matrices A and B is a third matrix C. In order for this product to be defined, A must have the same number of columns as B has rows. If A is of shape m × n and B is of shape n × p, then C is of shape m × p.

— Page 34, Deep Learning, 2016.
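
The shape rule can be checked quickly with placeholder matrices of ones, as in the sketch below. The sizes 3×2, 2×4 and 4×5 are arbitrary choices for illustration, and the flag name mismatch is hypothetical.

```python
# check the shape rule for matrix multiplication
from numpy import ones

A = ones((3, 2))  # m=3 rows, n=2 columns
B = ones((2, 4))  # n=2 rows, k=4 columns

# columns of A match rows of B, so the product is defined with shape (m, k)
C = A.dot(B)
print(C.shape)

# a chain works when each pair of neighbours satisfies the rule
D = ones((4, 5))
E = A.dot(B).dot(D)
print(E.shape)

# reversing the order breaks the rule: B has 4 columns but A has 3 rows
mismatch = False
try:
    B.dot(A)
except ValueError as e:
    mismatch = True
    print('shape mismatch:', e)
```
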

The intuition for matrix multiplication is that we are calculating the dot product between each row in matrix A and each column in matrix B. For example, we can step down the rows of matrix A and multiply each with column 1 in B to give the scalar values in column 1 of C.

This is made clear with the following image.

Depiction of matrix multiplication, taken from Wikipedia, some rights reserved.

Below describes the matrix multiplication using matrix notation.

     a11, a12
A = (a21, a22)
     a31, a32

     b11, b12
B = (b21, b22)

     a11 * b11 + a12 * b21, a11 * b12 + a12 * b22
C = (a21 * b11 + a22 * b21, a21 * b12 + a22 * b22)
     a31 * b11 + a32 * b21, a31 * b12 + a32 * b22

This can be simplified by removing the multiplication signs as:

     a11b11 + a12b21, a11b12 + a12b22
C = (a21b11 + a22b21, a21b12 + a22b22)
     a31b11 + a32b21, a31b12 + a32b22

We can describe the matrix multiplication operation using array notation.

C[0,0] = A[0,0] * B[0,0] + A[0,1] * B[1,0]
C[1,0] = A[1,0] * B[0,0] + A[1,1] * B[1,0]
C[2,0] = A[2,0] * B[0,0] + A[2,1] * B[1,0]
C[0,1] = A[0,0] * B[0,1] + A[0,1] * B[1,1]
C[1,1] = A[1,0] * B[0,1] + A[1,1] * B[1,1]
C[2,1] = A[2,0] * B[0,1] + A[2,1] * B[1,1]

The matrix multiplication operation can be implemented in NumPy using the dot() function.

# matrix dot product
from numpy import array
A = array([[1, 2], [3, 4], [5, 6]])
print(A)
B = array([[1, 2], [3, 4]])
print(B)
C = A.dot(B)
print(C)

The example first defines a 3×2 matrix and a 2×2 matrix and then calculates their dot product.

Running the example first prints the two parent matrices and then the result of the dot product.

[[1 2]
 [3 4]
 [5 6]]

[[1 2]
 [3 4]]

[[ 7 10]
 [15 22]
 [23 34]]
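
The array notation given earlier can be turned into code directly. The sketch below computes each entry of C as the dot product of a row of A and a column of B using three nested loops, then checks the result against dot() and the @ operator. The loop index p is a name introduced here; this is meant to show the mechanics, not to replace the NumPy functions.

```python
# matrix multiplication written out with explicit loops
from numpy import array, zeros, array_equal

A = array([[1, 2], [3, 4], [5, 6]])
B = array([[1, 2], [3, 4]])

m, n = A.shape   # A is m x n
k = B.shape[1]   # B is n x k

C = zeros((m, k), dtype=A.dtype)
for i in range(m):          # each row of A
    for j in range(k):      # each column of B
        for p in range(n):  # sum of products along the shared dimension
            C[i, j] += A[i, p] * B[p, j]
print(C)

# both NumPy forms give the same result
print(array_equal(C, A.dot(B)))
print(array_equal(C, A @ B))
```
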

Matrix-Vector Multiplication

A matrix and a vector can be multiplied together as long as the rule of matrix multiplication is observed.

Specifically, the number of columns in the matrix must equal the number of items in the vector. As with matrix multiplication, the operation can be written using the dot notation. Because the vector only has one column, the result is always a vector.

c = A . v

Or without the dot in a compact form.

c = Av

The result is a vector with the same number of rows as the parent matrix.

     a11, a12
A = (a21, a22)
     a31, a32

     v1
v = (v2)

     a11 * v1 + a12 * v2
c = (a21 * v1 + a22 * v2)
     a31 * v1 + a32 * v2

Or, more compactly.

     a11v1 + a12v2
c = (a21v1 + a22v2)
     a31v1 + a32v2

The matrix-vector multiplication can be implemented in NumPy using the dot() function or the @ operator (since Python version 3.5).

# matrix-vector multiplication
from numpy import array
A = array([[1, 2], [3, 4], [5, 6]])
print(A)
B = array([0.5, 0.5])
print(B)
C = A.dot(B)
print(C)
D = A @ B
print(D)

The example first defines a 3×2 matrix and a 2 element vector and then multiplies them together.

Running the example first prints the parent matrix and vector and then the result of multiplying them together.

[[1 2]
 [3 4]
 [5 6]]

[ 0.5  0.5]

[ 1.5  3.5  5.5]

[ 1.5  3.5  5.5]
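
A rough way to see why the result has one entry per row of the matrix is to compute each entry separately as the dot product of that row with the vector, as sketched below. The second part reshapes the vector into an explicit 2×1 column (the name v_col is introduced here), in which case the result is a 3×1 matrix rather than a one-dimensional array.

```python
# matrix-vector multiplication, one row at a time
from numpy import array

A = array([[1, 2], [3, 4], [5, 6]])
v = array([0.5, 0.5])

# each element of the result is the dot product of one row of A with v
c = array([row.dot(v) for row in A])
print(c)
print(c.shape)

# with an explicit 2x1 column vector the result keeps two dimensions
v_col = v.reshape(2, 1)
c_col = A @ v_col
print(c_col.shape)
```
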

Matrix-Scalar Multiplication

A matrix can be multiplied by a scalar.

This can be represented using the dot notation between the matrix and the scalar.

C = A . b

Or without the dot notation.

C = Ab

The result is a matrix with the same size as the parent matrix where each element of the matrix is multiplied by the scalar value.

     a11, a12
A = (a21, a22)
     a31, a32

b

     a11 * b, a12 * b
C = (a21 * b, a22 * b)
     a31 * b, a32 * b

     a11b, a12b
C = (a21b, a22b)
     a31b, a32b

We can also represent this with array notation.

C[0,0] = A[0,0] * b
C[1,0] = A[1,0] * b
C[2,0] = A[2,0] * b
C[0,1] = A[0,1] * b
C[1,1] = A[1,1] * b
C[2,1] = A[2,1] * b

This can be implemented directly in NumPy with the multiplication operator.

# matrix-scalar multiplication
from numpy import array
A = array([[1, 2], [3, 4], [5, 6]])
print(A)
b = 0.5
print(b)
C = A * b
print(C)

The example first defines a 3×2 matrix and a scalar and then multiplies them together.

Running the example first prints the parent matrix and scalar and then the result of multiplying them together.

[[1 2]
 [3 4]
 [5 6]]

0.5

[[ 0.5  1. ]
 [ 1.5  2. ]
 [ 2.5  3. ]]
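
As a brief check on this operation, the sketch below confirms two properties that follow from applying the scalar to every element: the order of the matrix and the scalar does not matter, and multiplying by 0.5 is the same as dividing every element by 2.

```python
# properties of matrix-scalar multiplication
from numpy import array, array_equal

A = array([[1, 2], [3, 4], [5, 6]])
b = 0.5

# the scalar can be written on either side of the matrix
print(array_equal(A * b, b * A))

# multiplying by 0.5 matches dividing each element by 2
print(array_equal(A * b, A / 2))
```
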

Extensions

This section lists some ideas for extending the tutorial that you may wish to explore.

Create 5 examples using each operation using your own data.
Implement each matrix operation manually for matrices defined as lists of lists.
Search machine learning papers and find 1 example of each operation being used.

If you explore any of these extensions, I’d love to know.

Summary

In this tutorial, you discovered matrices in linear algebra and how to manipulate them in Python.

Specifically, you learned:

What a matrix is and how to define one in Python with NumPy.
How to perform element-wise operations such as addition, subtraction, and the Hadamard product.
How to multiply matrices together and the intuition behind the operation.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

13 Responses to Introduction to Matrices and Matrix Arithmetic for Machine Learning

Erick February 7, 2018 at 8:16 pm #

If someone wants to *understand* what matrices actually do instead of applying them in Python, I truly recommend watching the YouTube channel “Linear Algebra” by 3Blue1Brown. Very insightful, requires very minimal knowledge of vectors and matrices.

- Jason Brownlee February 8, 2018 at 8:24 am #
  
  Thanks Erick.
  
- Russell February 17, 2018 at 10:07 am #
  
  Thanks Erick. Great resource!!
  
- samir khan July 19, 2023 at 10:11 pm #
  
  thanks Erick
  
Russell Bigley February 16, 2018 at 12:26 pm #

I think there is an error in the code. The Matrix vector Multiplication

     a11 * v1 + a12 + v2
c = (a21 * v1 + a22 + v2)
     a31 * v1 + a32 + v2

should be

     a11 * v1 + a12 * v2
c = (a21 * v1 + a22 * v2)
     a31 * v1 + a32 * v2

- Jason Brownlee February 16, 2018 at 2:59 pm #
  
  Right on! Fixed, thanks Russell.
  
meee June 7, 2018 at 9:16 am #

Hi sir can you please guide me. i want to learn absolute algebra. I will be waiting for your reply. Thanks a lot.

- Jason Brownlee June 8, 2018 at 6:03 am #
  
  You can get started here:
  https://machinelearningmastery.com/start-here/#linear_algebra
  
M. Vincent June 11, 2019 at 8:46 pm #

In the Matrix-Vector multiplication section you state:

“The example first defines a 2×3 matrix and a 2 element vector and then multiplies them together.”

Your example has a 3×2 matrix, and a 2 element row vector.

- Jason Brownlee June 12, 2019 at 7:58 am #
  
  Fixed, thanks!
  
Ari June 26, 2019 at 11:34 am #

In your very first example you state:

A NumPy array can be constructed given a list of lists. For example, below is a 3 row, 2 column matrix.

# create matrix
from numpy import array
A = array([[1, 2, 3], [4, 5, 6]])
print(A)
Running the example prints the created matrix showing the expected structure.

[[1 2 3]
[4 5 6]]

This a 2 row, 3 column matrix — correct?

- Jason Brownlee June 26, 2019 at 2:37 pm #
  
  Correct, I have updated the example. Thanks.
  
  - Ari June 28, 2019 at 11:51 am #
    
    Cheers!
