[New Book] Click to get The Beginner's Guide to Data Science!
Use the offer code 20offearlybird to get 20% off. Hurry, sale ends soon!

Using Singular Value Decomposition to Build a Recommender System

Singular value decomposition is a very popular linear algebra technique to break down a matrix into the product of a few smaller matrices. In fact, it is a technique that has many uses. One example is that we can use SVD to discover relationship between items. A recommender system can be build easily from this.

In this tutorial, we will see how a recommender system can be build just using linear algebra techniques.

After completing this tutorial, you will know:

  • What has singular value decomposition done to a matrix
  • How to interpret the result of singular value decomposition
  • What data a single recommender system require, and how we can make use of SVD to analyze it
  • How we can make use of the result from SVD to make recommendations

Let’s get started.

Using Singular Value Decomposition to Build a Recommender System

Using Singular Value Decomposition to Build a Recommender System
Photo by Roberto Arias, some rights reserved.

Tutorial overview

This tutorial is divided into 3 parts; they are:

  • Review of Singular Value Decomposition
  • The Meaning of Singular Value Decomposition in Recommender System
  • Implementing a Recommender System

Review of Singular Value Decomposition

Just like a number such as 24 can be decomposed as factors 24=2×3×4, a matrix can also be expressed as multiplication of some other matrices. Because matrices are arrays of numbers, they have their own rules of multiplication. Consequently, they have different ways of factorization, or known as decomposition. QR decomposition or LU decomposition are common examples. Another example is singular value decomposition, which has no restriction to the shape or properties of the matrix to be decomposed.

Singular value decomposition assumes a matrix $M$ (for example, a $m\times n$ matrix) is decomposed as
$$
M = U\cdot \Sigma \cdot V^T
$$
where $U$ is a $m\times m$ matrix, $\Sigma$ is a diagonal matrix of $m\times n$, and $V^T$ is a $n\times n$ matrix. The diagonal matrix $\Sigma$ is an interesting one, which it can be non-square but only the entries on the diagonal could be non-zero. The matrices $U$ and $V^T$ are orthonormal matrices. Meaning the columns of $U$ or rows of $V$ are (1) orthogonal to each other and are (2) unit vectors. Vectors are orthogonal to each other if any two vectors’ dot product is zero. A vector is unit vector if its L2-norm is 1. Orthonormal matrix has the property that its transpose is its inverse. In other words, since $U$ is an orthonormal matrix, $U^T = U^{-1}$ or $U\cdot U^T=U^T\cdot U=I$, where $I$ is the identity matrix.

Singular value decomposition gets its name from the diagonal entries on $\Sigma$, which are called the singular values of matrix $M$. They are in fact, the square root of the eigenvalues of matrix $M\cdot M^T$. Just like a number factorized into primes, the singular value decomposition of a matrix reveals a lot about the structure of that matrix.

But actually what described above is called the full SVD. There is another version called reduced SVD or compact SVD. We still have to write $M = U\cdot\Sigma\cdot V^T$ but we have $\Sigma$ a $r\times r$ square diagonal matrix with $r$ the rank of matrix $M$, which is usually less than or equal to the smaller of $m$ and $n$. The matrix $U$ is than a $m\times r$ matrix and $V^T$ is a $r\times n$ matrix. Because matrices $U$ and $V^T$ are non-square, they are called semi-orthonormal, meaning $U^T\cdot U=I$ and $V^T\cdot V=I$, with $I$ in both case a $r\times r$ identity matrix.

The Meaning of Singular Value Decomposition in Recommender System

If the matrix $M$ is rank $r$, than we can prove that the matrices $M\cdot M^T$ and $M^T\cdot M$ are both rank $r$. In singular value decomposition (the reduced SVD), the columns of matrix $U$ are eigenvectors of $M\cdot M^T$ and the rows of matrix $V^T$ are eigenvectors of $M^T\cdot M$. What’s interesting is that $M\cdot M^T$ and $M^T\cdot M$ are potentially in different size (because matrix $M$ can be non-square shape), but they have the same set of eigenvalues, which are the square of values on the diagonal of $\Sigma$.

This is why the result of singular value decomposition can reveal a lot about the matrix $M$.

Imagine we collected some book reviews such that books are columns and people are rows, and the entries are the ratings that a person gave to a book. In that case, $M\cdot M^T$ would be a table of person-to-person which the entries would mean the sum of the ratings one person gave match with another one. Similarly $M^T\cdot M$ would be a table of book-to-book which entries are the sum of the ratings received match with that received by another book. What can be the hidden connection between people and books? That could be the genre, or the author, or something of similar nature.

Implementing a Recommender System

Let’s see how we can make use of the result from SVD to build a recommender system. Firstly, let’s download the dataset from this link (caution: it is 600MB big)

This dataset is the “Social Recommendation Data” from “Recommender Systems and Personalization Datasets“. It contains the reviews given by users on books on Librarything. What we are interested are the number of “stars” a user given to a book.

If we open up this tar file we will see a large file named “reviews.json”. We can extract it, or read the included file on the fly. First three lines of reviews.json are shown below:

The above will print:

Each line in reviews.json is a record. We are going to extract the “user”, “work”, and “stars” field of each record as long as there are no missing data among these three. Despite the name, the records are not well-formed JSON strings (most notably it uses single quote rather than double quote). Therefore, we cannot use json package from Python but to use ast to decode such string:

Now we should make a matrix of how different users rate each book. We make use of the pandas library to help convert the data we collected into a table:

As an example, we try not to use all data in order to save time and memory. Here we consider only those users who reviewed more than 50 books and also those books who are reviewed by more than 50 users. This way, we trimmed our dataset to less than 15% of its original size:

Then we can make use of “pivot table” function in pandas to convert this into a matrix:

The result is a matrix of 5593 rows and 2898 columns


Here we represented 5593 users and 2898 books in a matrix. Then we apply the SVD (this will take a while):

By default, the svd() returns a full singular value decomposition. We choose a reduced version so we can use smaller matrices to save memory. The columns of vh correspond to the books. We can based on vector space model to find which book are most similar to the one we are looking at:

And in the above example, we try to find the book that is best match to to first column. The result is:

In a recommendation system, when a user picked a book, we may show her a few other books that are similar to the one she picked based on the cosine distance as calculated above.

Depends on the dataset, we may use truncated SVD to reduce the dimension of matrix vh. In essence, this means we are removing several rows on vh that the corresponding singular values in s are small, before we use it to compute the similarity. This would likely make the prediction more accurate as those less important features of a book are removed from consideration.

Note that, in the decomposition $M=U\cdot\Sigma\cdot V^T$ we know the rows of $U$ are the users and columns of $V^T$ are books, we cannot identify what are the meanings of the columns of $U$ or rows of $V^T$ (an equivalently, that of $\Sigma$). We know they could be genres, for example, that provide some underlying connections between the users and the books but we cannot be sure what exactly are they. However, this does not stop us from using them as features in our recommendation system.

Tying all together, the following is the complete code:

Further reading

This section provides more resources on the topic if you are looking to go deeper.

Books

APIs

Articles

Summary

In this tutorial, you discovered how to build a recommender system using singular value decomposition.

Specifically, you learned:

  • What a singular value decomposition mean to a matrix
  • How to interpret the result of a singular value decomposition
  • Find similarity from the columns of matrix $V^T$ obtained from singular value decomposition, and make recommendations based on the similarity

Get a Handle on Linear Algebra for Machine Learning!

Linear Algebra for Machine Learning

Develop a working understand of linear algebra

...by writing lines of code in python

Discover how in my new Ebook:
Linear Algebra for Machine Learning

It provides self-study tutorials on topics like:
Vector Norms, Matrix Multiplication, Tensors, Eigendecomposition, SVD, PCA and much more...

Finally Understand the Mathematics of Data

Skip the Academics. Just Results.

See What's Inside

13 Responses to Using Singular Value Decomposition to Build a Recommender System

  1. Avatar
    Jack October 26, 2021 at 1:50 pm #

    import tarfile

    # Read downloaded file from:
    # http://deepyeti.ucsd.edu/jmcauley/datasets/librarything/lthing_data.tar.gz
    with tarfile.open(“lthing_data.tar.gz”) as tar:
    print(“Files in tar archive:”)
    tar.list()

    with tar.extractfile(“lthing_data/reviews.json”) as file:
    count = 0
    for line in file:
    print(line)
    count += 1
    if count > 3:
    break

    Dear Dr. Adrian Tam
    those code cannot run successfully
    can i ask how to fix it?

    • Avatar
      Adrian Tam October 26, 2021 at 2:44 pm #

      What error you have?

      • Avatar
        Jack October 26, 2021 at 2:46 pm #

        File “”, line 18
        if count &gt 3:
        ^
        SyntaxError: invalid syntax

        • Avatar
          Adrian Tam October 26, 2021 at 11:23 pm #

          Thanks, there’s some HTML formatting issue where the &gt should be >. It is fixed now.

    • Avatar
      Yasser January 29, 2024 at 8:00 am #

      Thank you for this post. Would you kindly read it again for clarity? For example, “a table of person-to-person which the entries would mean the sum of the ratings one person gave match with another one” is just not clear.

      Did you mean “a table of person-to-person [vectors for] which the entries [] mean the sum of the ratings one person gave [that] match [] another [] person’s ratings”

      Clarity is really crucial here, at least for me. Otherwise, I cannot follow.

      Thanks again.

  2. Avatar
    Ahmed Ibrahim November 6, 2021 at 4:34 am #

    The post looks awesome ????. A better approach tho is to make a subset of the dataset set yourself and host it on GitHub so others can follow easily.

  3. Avatar
    Fred April 1, 2022 at 10:46 am #

    Why didn’t you post how to evaluate the performance of your system?

    • Avatar
      James Carmichael April 2, 2022 at 12:24 pm #

      Great recommendation!

  4. Avatar
    Arnold November 18, 2022 at 7:16 am #

    Hi Jason – I had to modify some of your script to get this to work. First, when I downloaded the tar file from ucsd.edu the json file was not present but a text file – ‘reviews.txt’.

    Also at least one review had the word ‘stars’ inside but did not have ” ‘stars ‘:” so I modified the statement

    if any(x not in record for x in [‘user’, ‘work’, ‘stars’]):

    to

    if any(x not in record for x in [“‘user’:”, “‘work’:”, “‘stars’:”]):

    and it all worked fine

    Thanks

  5. Avatar
    Antoine Banderas November 24, 2022 at 12:37 pm #

    Hi Jason,
    Using the SVD over a matrix where there are lots of NANs (that is many unread books by a specific person), is a real issue, replacing all these NANs with 0.0 will mean that the users actually read the book and gave them a 0.0 rating… can you further explain that?
    Thanks a lot Jason

  6. Avatar
    Qian July 13, 2023 at 7:10 pm #

    You said “The columns of vh correspond to the books.” This conclusion is a bit fast, why, could you elaborate more?

    • Avatar
      James Carmichael July 14, 2023 at 9:36 am #

      Hi Qian…This can be seen from execution of the code to establish the matrices. Have you been able to execute the sample code?

Leave a Reply