Recommendation Systems

Image by Black Mirror

The recommendation problem from a bird’s eye view
Terminology briefly defined

Creativity is the power to connect the seemingly unconnected.
- William Plomer

Importantly, knowing how to build machine learning algorithms for recommender systems is a skill often required of advanced data scientists.
Recommendation plays an increasingly important role in our daily lives. LinkedIn, Netflix, and Amazon all use recommender systems to find patterns of activity between different users and patterns of features between different items in order to produce recommendations. And it is not only the giants: almost every online business, from education to healthcare, exploits the magic of recommendation systems to improve the user's search experience.

In this post, you’ll read about:

  1. What is the ‘recommender’ problem?
  2. What are content-based recommendation and collaborative filtering?
  3. The low-rank matrix factorization algorithm
  4. What are some traditional and novel methods for recommender systems?
  5. What are the data scarcity and cold start issues?
  6. How is social network data combined into recommender systems?
  7. What is Privacy-Preserving Collaborative Filtering?

The problem that recommendation algorithms try to solve is to estimate a mathematical function that ranks alternatives according to their utility to an individual. This function may utilize data from:
i) users’ past behavior
ii) item similarity
iii) context
iv) relations to other users
and should predict how interested a user will be in a given item.

Recommendation algorithms apply predictive analysis to a user-item matrix in order to find items that users have not come across but may be interested in. Both users and items have attributes, and the more you know about them, the better the recommendation results you can expect. At first, while you know little about your users’ tastes, you might base recommendations solely on item-attribute calculations (the content-based approach). However, user attributes (collaborative filtering) offer more accurate insights. Unknown attributes (like Elliot’s preference for Hemingway) are called latent because they are inferred from other variables rather than observed directly.

Existing recommender schemes can be divided into three categories:

  1. Content-Based (CB),
    1.1. TF-IDF
    1.2. Classifiers
  2. Collaborative Filtering (CF), and
    2.1. Nearest-Neighbors
    2.2. Matrix Factorization
    2.3. Restricted Boltzmann Machines
    2.4. Clustering and LSH
    2.5. Association Rules
  3. Hybrid approaches.

The content-based approach exploits the properties of items together with the user’s past preferences, while collaborative filtering leverages feedback from users, explicit and/or implicit. For explicit feedback, imagine a user assigning a rating to an item; for implicit feedback, imagine a user clicking on a link.
There are also variations within the CB and CF approaches. For instance, a community-based or social-recommendation system is a variation of CF that follows the epigram “Tell me who your friends are, and I will tell you who you are”: it recommends items based on the preferences of the user’s friends.

Matrix factorization is a CF technique that maps both users and items to a joint latent factor space, such that user-item interactions are modelled as inner products. We will look at it without regularization in order to keep things simple, but in a real-life problem regularization terms are an integral part of the model.

Let’s say we have the following variables:

b: the total number of books
u: the total number of users
R(i,j): a matrix whose cells are 1 if user j gave a rating to book i, and 0 otherwise.
Y(i,j): a matrix filled with the rating (usually in the range 0-5) that user j gave to book i; 0 stands for the absence of a rating.
X(i): a vector of numerical values representing how strongly each feature applies to book i.
θ(j): a vector of numerical values representing the strength of user j’s preference for each book feature.
A feature could be sci-fi, and it could be equal to 0.9 for the book ‘The Man in the High Castle’ (1963), whereas for the book ‘The Rosie Project’ (2013) it could be equal to -0.4.
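To make these definitions concrete, here is a tiny setup in Python/NumPy (the post’s own snippets are in Octave; the matrix sizes and all numbers below are purely illustrative):

```python
import numpy as np

# 2 books (b = 2) rated by 3 users (u = 3); 0 means "no rating"
Y = np.array([[5, 0, 4],    # 'The Man in the High Castle'
              [0, 2, 0]])   # 'The Rosie Project'

# R(i, j) = 1 exactly where user j actually rated book i
R = (Y != 0).astype(int)

# X(i): a single 'sci-fi' feature per book, as in the example above
X = np.array([[0.9],
              [-0.4]])

print(R)
```

Note how R is derived mechanically from Y: it simply marks which cells carry real ratings and which are missing.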

Our goal is to predict the rating that the user Elliot Alderson would give to the book ‘The Man in the High Castle’, according to the formula:


predictions (the utility function) = X * θ'


θ’: the transpose of θ. It turns the columns of θ into rows, and the rows of θ into columns.
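The same prediction step in Python/NumPy is a single matrix product; all numbers below are illustrative:

```python
import numpy as np

# 2 books x 1 feature, 3 users x 1 preference weight (illustrative)
X = np.array([[0.9],
              [-0.4]])
Theta = np.array([[5.0],
                  [1.0],
                  [4.0]])

# predictions(i, j): predicted rating of user j for book i
predictions = X @ Theta.T
print(predictions)
```

With one latent feature, the predicted rating for book i and user j is just X(i) times θ(j), e.g. 0.9 * 5.0 = 4.5 for the first book and first user.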

The problem is formulated as:
‘Find the values of the parameters such that the average difference between my training values and my predicted values is minimized (the mean squared error)’.
A different way to say that is: find the values of the parameters Θ and X that minimize the mean squared error, typically via gradient descent.


error = (predicted - actual) .* R
J = 1/2 * sumsq(error(:))
X_grad = error * Theta
Theta_grad = error' * X

where (:) stands for all the entries of the matrix, .* stands for element-wise multiplication, and sumsq(X) stands for the sum of squares of the elements of matrix X.


The syntax of these equations is that of Matlab and Octave.
The non-zero values of Y are the training data that help us infer its zero values, namely our latent variables.
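For readers who prefer Python, the cost and gradients above translate line-for-line to NumPy. This is a sketch, not the post’s original code, and the sanity-check numbers are made up:

```python
import numpy as np

def cost_and_grads(X, Theta, Y, R):
    """Unregularized collaborative-filtering cost and its gradients."""
    error = (X @ Theta.T - Y) * R      # zero out unrated entries
    J = 0.5 * np.sum(error ** 2)       # squared-error objective
    X_grad = error @ Theta             # gradient w.r.t. book features
    Theta_grad = error.T @ X           # gradient w.r.t. user preferences
    return J, X_grad, Theta_grad

# Tiny sanity check: one book, one user, prediction 1.0 vs. actual 2.0
J, Xg, Tg = cost_and_grads(np.array([[1.0]]), np.array([[1.0]]),
                           np.array([[2.0]]), np.array([[1]]))
print(J)  # 0.5
```

Multiplying the error by R before squaring is what restricts the objective to the ratings we actually observed.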

Current recommender systems face many issues. Data scarcity and the cold start problem are two well-known ones, and they become even more noticeable in the context of big data sources like social networks.
Data scarcity refers to the limited amount of available data, which causes the high sparsity of the rating matrix. Matrix factorization techniques have proven useful in dealing with this.
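Sparsity is easy to quantify: it is simply the fraction of user-item cells that carry no rating. A quick illustration (the matrix is made up):

```python
import numpy as np

# A toy rating matrix: 0 = no rating
Y = np.array([[5, 0, 0, 0],
              [0, 2, 0, 0]])
sparsity = 1 - np.count_nonzero(Y) / Y.size
print(sparsity)  # 0.75: six of the eight cells are empty
```

Real-world rating matrices are far sparser than this toy, often well above 99% empty, which is why techniques that generalize from few observations matter so much.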
The cold start problem pertains to a user’s initial membership, when there is no historical data about their interests. The cold start problem is the worst nightmare of any recommender system researcher. One solution is to elicit a user’s basic background via questions proposed by a decision tree. A more promising approach is to use cross-domain recommender systems and social-network data. The latter lets the system pick up traits of the user’s personality and of the network they belong to.

Novel methods include:

  1. Context-aware Recommendations
    1.1. Tensor Factorization
    1.2. Factorization Machines
  2. Deep Learning
  3. Trust-based Recommendations
  4. Social-based Recommendations
  5. Learning to Rank

Social and trust-based recommendation are methods found in social recommender systems, which are capable of dealing with the problem of data scarcity. Using data from online social networks (OSNs) as additional input permits new forms of rating items (‘Like’) or of trust (‘an experienced friend from my social circle’), and provides user information at both the individual and the social level.
To the social level belong friendship relationships, user influence, and the number of common friends. These can be used to calculate similarities between user profiles in order to identify users with relevant interests.
To the individual level belong data such as user-generated tags and link clicks.
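One simple way to turn a social signal like the number of common friends into a similarity score is Jaccard similarity over friend sets. This is a hypothetical sketch; any particular system may use a different measure:

```python
def friend_similarity(friends_a, friends_b):
    """Jaccard similarity of two users' friend sets (0.0 to 1.0)."""
    union = friends_a | friends_b
    return len(friends_a & friends_b) / len(union) if union else 0.0

# One friend in common out of three distinct friends overall
print(friend_similarity({"darlene", "angela"}, {"angela", "tyrell"}))
```

Scores like this can then be plugged into a CF pipeline as weights on how much each friend’s ratings should influence the recommendation.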

Social recommender systems answer the question “What is recommended for a user in relation to the network they belong to?”. However, the assumption that a user and his/her friends have similar tastes is not a reliable one. A matrix factorization method combined with social and trust data from OSNs is proposed in the paper ‘Recommender systems based on social networks’. The authors cluster similar users and calculate the similarity between users based on the correlation between a user and an item. The purpose of clustering is to identify the most suitable friends for realistic recommendation tasks. In this way, the recommendations are made with regard to two facts: (i) users having the same or similar tastes; and (ii) users having expertise in some field.

However, collecting user interaction data to enhance recommendation accuracy raises many privacy issues. Hence, another limitation of recommender systems is that they are susceptible to privacy attacks and to the violation of users’ sensitive information.

Privacy-Preserving Collaborative Filtering (PPCF) in social recommender systems is an interesting research direction, since privacy is an essential aspect of data, and conventional PPCF techniques based on computation-intensive cryptography or data perturbation are not well suited to real online services. Zhu et al. (2014) proposed a neighborhood-based PPCF algorithm to protect neighbors’ and individuals’ ratings, while Li et al. (2016) presented an item-based PPCF algorithm to protect individual privacy during recommendation.





Thank You for Reading & Remain Creative
