Recommendation Systems
- writedsaistories
- Dec 18, 2023
- 4 min read
An introduction to
User-based Collaborative Filtering,
Item-based Collaborative Filtering and
Latent Factor Approach--Matrix Factorization
(And a place to consolidate my thoughts and document my learning.)
The code implementation of these three methods on the Book-Crossing dataset is here: https://github.com/solarspaceclouds/Recommendation_System_Book_Crossings_Dataset
Recommendation systems are needed because (1) we are spoilt for choice and need to narrow down our options from large catalogues, and (2) they benefit both end users and businesses.

For the end users,
Good recommender systems are like friends who know you better than you know yourself (:
They help users find items of interest, introduce new items to users [Novelty] and present unexpected items to users [Serendipity]. What an exciting adventure recommender systems can take us on!
For businesses, good recommendation systems (1) help item providers deliver their products to the intended audience and (2) improve customer satisfaction by (a) saving time spent on browsing/searching and (b) catering to different users' preferences.
Recommendation systems operate in a virtuous cycle driven by user interactions (i.e. they rely on the Network Effect).
More interactions lead to better awareness of user preferences, which in turn leads to better recommendations from the system. And what do better recommendations mean? More user interactions! And the virtuous cycle continues :)
The input to a recommendation system is user feedback, which is either explicit or implicit.
Explicit user feedback typically takes the form of reviews or ratings for items, while implicit user feedback is information collected in the background, such as what a user clicked, watched, viewed or purchased.
With explicit feedback, rating prediction can be carried out: the rating each user would give to each item is estimated.
Given a set of observed entries, the missing entries are computed (recovered).
Evaluation: compare predicted ratings against actual ratings (when they occur) using pointwise metrics such as MSE and RMSE.
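As a small illustration of these pointwise metrics, here is a minimal sketch in plain Python. The function names and the toy ratings are made up for the example and are not taken from the linked repository.

```python
import math

def mse(predicted, actual):
    """Mean squared error over paired predicted/actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)

def rmse(predicted, actual):
    """Root mean squared error: same units as the ratings themselves."""
    return math.sqrt(mse(predicted, actual))

# Hypothetical ratings on a 1-5 scale.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]

print(mse(predicted, actual))   # 1.5 (squared errors 1, 0, 1, 4 averaged)
print(rmse(predicted, actual))  # square root of the value above
```

RMSE is often preferred for reporting because it is on the same scale as the ratings.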
With implicit feedback, top-k recommendations can be provided. However, a noteworthy caveat is that observations, or the lack thereof, do not necessarily reflect preferences.
Using a set of observed entries, the likelihood of each missing entry is predicted.
Evaluation: compare actual observations against the list of top-k recommended items. Classification metrics such as precision and recall are used if order doesn't matter.
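A minimal sketch of precision and recall for a top-k list, assuming we know which items the user actually interacted with. The function name, the book ids and the choice to divide precision by k (rather than by the length of the returned list) are assumptions made for this example.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Order-insensitive precision@k and recall@k.

    recommended: ranked list of item ids produced by the recommender.
    relevant: items the user actually interacted with.
    """
    top_k = recommended[:k]
    relevant = set(relevant)
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical data: only "book_b" appears in both the top 3 and the relevant set.
recommended = ["book_a", "book_b", "book_c", "book_d", "book_e"]
relevant = {"book_b", "book_e", "book_z"}

p, r = precision_recall_at_k(recommended, relevant, k=3)
print(p, r)  # one hit out of 3 recommended, one hit out of 3 relevant
```

If the order of the list does matter, ranking-aware metrics would be needed instead; this sketch ignores position within the top k.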
Collaborative Filtering
Key Idea: Filtering (recommending a subset) is performed with the help of other similar users (in a collaborative manner).
Collaborative filtering is a popular technique in recommender systems that makes automatic predictions (filtering) about the preferences of a user by collecting preferences from many users (collaborating). The underlying idea is that users who have agreed in the past tend to agree again in the future. Collaborative filtering methods do not require explicit knowledge about the items themselves but rather focus on the user-item interaction patterns.
In other words, it relies on the 'wisdom of the crowd'.
There are two main types of collaborative filtering: user-based collaborative filtering and item-based collaborative filtering.
Pros and Cons of Collaborative Filtering
Pro:
It works for any kind of item; no feature selection is needed.
Cons:
CF is susceptible to the Cold Start Problem: a sufficient number of users is necessary for good matches (and hence recommendations) to be generated.
CF faces the Sparsity problem: the user-ratings matrix is usually sparse, so it is hard to find users who have rated the same items. (We can't expect every user to have read every single book listed on the platform, though I found a user who has read 3000+ books. Wow, what an amazing feat!)
CF faces the First Rater issue: items which have not been previously rated (i.e. new/esoteric items) cannot be recommended.
Popularity bias: it may be difficult to generate good recommendations for someone with a unique taste; there is a tendency to recommend popular items.
A simple comparison between User-based CF and Item-based CF:

1. User-based Collaborative Filtering
User-based CF considers user X, then finds a set N of other users whose ratings are 'similar' to X's ratings. X's ratings for items which X has not yet rated are then estimated from the ratings of the users in N.
Hence, the key is finding 'similar' users. And how do you do that? There are several possible methods. Some commonly used ones are: (1) Jaccard similarity, (2) cosine similarity and (3) the Pearson correlation coefficient.
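To make the three measures concrete, here is a minimal sketch in plain Python. It assumes each user's ratings are stored as a dict of item to rating; the function names and the two toy users are invented for the example. Note the design choices, which are assumptions rather than the only options: cosine treats unrated items as 0, and Pearson is computed only over items both users have rated.

```python
import math

def jaccard(ratings_x, ratings_y):
    """Overlap of the sets of rated items; the rating values are ignored."""
    items_x, items_y = set(ratings_x), set(ratings_y)
    union = items_x | items_y
    return len(items_x & items_y) / len(union) if union else 0.0

def cosine(ratings_x, ratings_y):
    """Cosine of the two rating vectors, treating unrated items as 0."""
    common = set(ratings_x) & set(ratings_y)
    dot = sum(ratings_x[i] * ratings_y[i] for i in common)
    norm_x = math.sqrt(sum(v * v for v in ratings_x.values()))
    norm_y = math.sqrt(sum(v * v for v in ratings_y.values()))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

def pearson(ratings_x, ratings_y):
    """Pearson correlation over co-rated items, centred on each user's mean there."""
    common = sorted(set(ratings_x) & set(ratings_y))
    if len(common) < 2:
        return 0.0  # not enough overlap to measure a correlation
    mean_x = sum(ratings_x[i] for i in common) / len(common)
    mean_y = sum(ratings_y[i] for i in common) / len(common)
    dx = [ratings_x[i] - mean_x for i in common]
    dy = [ratings_y[i] - mean_y for i in common]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    return num / den if den else 0.0

# Two hypothetical users who share books "a" and "b".
user_x = {"a": 5, "b": 3, "c": 1}
user_y = {"a": 4, "b": 2, "d": 5}

print(jaccard(user_x, user_y))  # 0.5: 2 shared items out of 4 distinct items
print(cosine(user_x, user_y))
print(pearson(user_x, user_y))
```

Jaccard ignores how much a user liked an item, while Pearson corrects for users who rate everything high or low; which one works best depends on the data.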
The formula for estimating user X's rating on item i via User-based CF is as follows:
(Note: sim(x,y) denotes the similarity between User X and User Y)
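One standard way to write this estimate is as a similarity-weighted average. The LaTeX below is a sketch under assumed notation: N is the set of users most similar to X who have rated item i, r_{yi} is user y's rating of item i, and \hat{r}_{xi} is the estimated rating.

```latex
\hat{r}_{xi} = \frac{\sum_{y \in N} \mathrm{sim}(x, y)\, r_{yi}}{\sum_{y \in N} \mathrm{sim}(x, y)}
```

Users who are more similar to X therefore pull the estimate more strongly towards their own rating.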

2. Item-based Collaborative Filtering
Item-based CF aims to find similar items to an (unrated) item i, and estimates the rating for item i based on the ratings for those similar items.
The same similarity metrics and prediction functions as in user-based CF can be used.
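One way to see this reuse is that a single prediction function can serve both methods: feed it users as rows for user-based CF, or items as rows for item-based CF. The sketch below is only an illustration under stated assumptions: cosine similarity with unrated entries treated as 0, a neighbourhood of the 2 most similar rows, and made-up users, books and ratings. It is not the implementation in the linked repository.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse rating dicts (missing entries count as 0)."""
    common = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict(rows, target_row, target_col, n_neighbours=2):
    """Similarity-weighted average of target_col over the rows most similar to target_row.

    rows maps row id -> {column id: rating}. With users as rows this is
    user-based CF; with items as rows it is item-based CF.
    """
    candidates = [
        (cosine(rows[target_row], ratings), ratings[target_col])
        for row_id, ratings in rows.items()
        if row_id != target_row and target_col in ratings
    ]
    # Keep the n most similar rows that have a positive similarity.
    top = sorted((c for c in candidates if c[0] > 0), reverse=True)[:n_neighbours]
    total_sim = sum(sim for sim, _ in top)
    if total_sim == 0:
        return None  # no usable neighbours, so no estimate
    return sum(sim * rating for sim, rating in top) / total_sim

def transpose(rows):
    """Turn {row: {col: rating}} into {col: {row: rating}}."""
    cols = {}
    for row_id, ratings in rows.items():
        for col_id, rating in ratings.items():
            cols.setdefault(col_id, {})[row_id] = rating
    return cols

# Hypothetical ratings; bob has not rated "ulysses".
by_user = {
    "ann": {"dune": 5, "emma": 1, "ulysses": 4},
    "bob": {"dune": 5, "emma": 3},
    "cat": {"dune": 1, "emma": 5, "ulysses": 2},
}
by_item = transpose(by_user)

user_based = predict(by_user, "bob", "ulysses")  # neighbours are other users
item_based = predict(by_item, "ulysses", "bob")  # neighbours are other books
print(user_based, item_based)
```

The two estimates generally differ, because one averages over similar users' ratings of the target item and the other averages over the same user's ratings of similar items.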
The formula for estimating user X's rating on item i via Item-based CF is as follows:
(Note: sim(i,j) denotes the similarity between Item i and item j)
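A common way to write this estimate mirrors the user-based case, with the roles of users and items swapped. The notation here is assumed for the sketch: N(i; x) is the set of items most similar to item i that user X has already rated, and r_{xj} is X's rating of item j.

```latex
\hat{r}_{xi} = \frac{\sum_{j \in N(i;\,x)} \mathrm{sim}(i, j)\, r_{xj}}{\sum_{j \in N(i;\,x)} \mathrm{sim}(i, j)}
```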

3. Latent Factor Approach - Matrix Factorization
The ratings matrix, R, has missing entries. The goal is to approximate R by the product of two matrices, Q and P (transposed), with small reconstruction error, so that the empty cells are filled in with predicted ratings. Q has dimensions (number of items) x (number of factors), and P (transposed) has dimensions (number of factors) x (number of users).

The computation for (predicted) ratings matrix R is as follows:
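In matrix form, and using the dimensions given above, this can be sketched as follows. The symbols m (number of items), n (number of users) and k (number of factors) are introduced here for illustration.

```latex
\hat{R} = Q \cdot P^{\top}, \qquad Q \in \mathbb{R}^{m \times k}, \quad P^{\top} \in \mathbb{R}^{k \times n}
```

The product has m rows and n columns, so every (item, user) cell receives a predicted rating, including the cells that were empty in R.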

The computation for each rating r(x,i) in the ratings matrix R is as follows:
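Entry by entry, the same product is a dot product between one item's factors and one user's factors. In the sketch below, q_i is assumed to denote row i of Q and p_x column x of P (transposed), with k factors.

```latex
\hat{r}_{xi} = q_i \cdot p_x = \sum_{f=1}^{k} q_{if}\, p_{xf}
```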

To keep the reconstruction error small, we want to minimise the Sum of Squared Errors (SSE), ideally on unseen test data.
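Since Q and P can only be fitted on ratings we have actually observed, the optimisation is usually written over the training entries, and the test SSE is then measured separately. A sketch, where \mathcal{T} is assumed to denote the set of observed (user, item) pairs:

```latex
\min_{P,\,Q} \sum_{(x,\,i) \in \mathcal{T}} \left( r_{xi} - q_i \cdot p_x \right)^2
```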

In principle, a large k (number of factors) is desirable to capture all signals. However, SSE on test data begins to increase for k > 2, which indicates overfitting. Hence, we need regularization, which allows a rich model where there is sufficient data and shrinks aggressively where data is scarce.
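A common way to add regularization is to penalise the squared length of each factor vector, i.e. to minimise the squared error plus lam * (|q_i|^2 + |p_x|^2) for every observed rating. Below is a minimal sketch that fits this with plain stochastic gradient descent. Everything in it is an assumption made for illustration: the function names, the hyperparameters (k, lam, lr, epochs), the random initialisation and the toy ratings. It is not the implementation in the linked repository.

```python
import random

def factorize(ratings, n_users, n_items, k=2, lam=0.1, lr=0.01, epochs=200, seed=0):
    """Fit user factors P (n_users x k) and item factors Q (n_items x k) by SGD.

    ratings: list of (user, item, rating) triples with integer ids.
    lam: regularization strength; larger values shrink the factors harder.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    data = list(ratings)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        rng.shuffle(data)
        for x, i, r in data:
            err = r - sum(Q[i][f] * P[x][f] for f in range(k))
            for f in range(k):
                q, p = Q[i][f], P[x][f]
                # Gradient step on squared error plus the L2 penalty
                # (the constant factor 2 is absorbed into lr).
                Q[i][f] += lr * (err * p - lam * q)
                P[x][f] += lr * (err * q - lam * p)
    return P, Q

def sse(ratings, P, Q):
    """Sum of squared errors of the predictions q_i . p_x on the given ratings."""
    return sum(
        (r - sum(q * p for q, p in zip(Q[i], P[x]))) ** 2 for x, i, r in ratings
    )

# Hypothetical ratings: (user id, item id, rating on a 1-5 scale).
train = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 4), (2, 0, 1),
    (3, 2, 4), (3, 3, 5), (3, 1, 2),
]

P0, Q0 = factorize(train, n_users=4, n_items=4, epochs=0)  # untrained factors
P, Q = factorize(train, n_users=4, n_items=4)

sse_before = sse(train, P0, Q0)
sse_after = sse(train, P, Q)
print(sse_before, sse_after)
```

In practice, k and lam would be chosen by comparing SSE on held-out ratings rather than on the training ratings used here.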


Further Exploration:
neural collaborative filtering
hybrid models: combination of memory-based & model-based
hybrid models: ensemble model to combine the predictions of individual models for a final prediction
