
Collaborative Deep Learning for Recommender Systems

Authors: Hao Wang, Naiyan Wang, Dit-Yan Yeung

ABSTRACT

Categories and Subject Descriptors:
[Information Systems]: Models and Principles - General;
[Computer Applications]: Social and Behavioral Sciences
Keywords:
Recommender systems; Deep learning; Topic model; Text mining

1. INTRODUCTION

Due to the abundance of choice in many online services, recommender systems (RS) now play an increasingly significant role.
Existing methods for RS can roughly be categorized into three classes: content-based methods, collaborative filtering (CF) based methods, and hybrid methods.

However, the prediction accuracy of CF-based methods often drops significantly when ++the ratings are very sparse++. Moreover, ++they cannot be used for recommending new products++ which have yet to receive rating information from users. Consequently, it is inevitable for CF-based methods to exploit auxiliary information, and hence hybrid methods have gained popularity in recent years.

According to whether two-way interaction exists between ++the rating information++ and ++the auxiliary information++, hybrid methods can be divided into two sub-categories: loosely coupled and tightly coupled methods.

With two-way interaction, tightly coupled methods can automatically learn features from the auxiliary information and naturally balance the influence of the rating and auxiliary information.

The current state-of-the-art method, which is also the foundation of the method proposed in this paper, collaborative deep learning (CDL): Collaborative topic regression (CTR) is a probabilistic graphical model that seamlessly integrates a topic model, latent Dirichlet allocation (LDA), and a model-based CF method, probabilistic matrix factorization (PMF).
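For reference, the key coupling in CTR can be written as follows (a hedged sketch in standard CTR notation, where $\theta_j$ is the LDA topic proportion of item $j$ and $C_{ij}$ is a confidence parameter; details may differ slightly from the original papers):

$$
v_j = \theta_j + \epsilon_j, \qquad \epsilon_j \sim \mathcal{N}(0, \lambda_v^{-1} I_K), \qquad
u_i \sim \mathcal{N}(0, \lambda_u^{-1} I_K), \qquad
R_{ij} \sim \mathcal{N}(u_i^T v_j,\; C_{ij}^{-1}).
$$

CDL keeps this structure but effectively replaces $\theta_j$ with a representation learned by a deep model.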

Goal: This calls for integrating deep learning with CF by performing deep learning collaboratively.

Deep learning models for CF (survey of related work):

To address the challenges above, we develop a hierarchical Bayesian model called ==collaborative deep learning (CDL)== as a novel tightly coupled method for RS.

Experiments show that CDL significantly outperforms the state of the art.

(Note: Although we present CDL as using SDAE for its feature learning component, CDL is actually a more general framework which can also admit other deep learning models such as deep Boltzmann machines, recurrent neural networks, and convolutional neural networks.)

==The main contribution:==

2. NOTATION AND PROBLEM FORMULATION

Definition:

Given part of the ratings in R and the content information $X_c$, the problem is to predict the other ratings in R.
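A minimal sketch of the data layout implied by this formulation (the names `R`, `X_c`, and `observed` are illustrative, not from the paper):

```python
import numpy as np

I, J, S = 1000, 5000, 8000         # numbers of users, items, and vocabulary size (illustrative)

R = np.zeros((I, J))               # implicit-feedback rating matrix, R[i, j] in {0, 1}
observed = np.zeros((I, J), bool)  # which entries of R are given for training
X_c = np.zeros((J, S))             # bag-of-words content matrix; row j describes item j

# Task: given R[observed] and X_c, predict the remaining entries of R.
```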

(Note: an L/2-layer SDAE corresponds to an L-layer network.)

3. COLLABORATIVE DEEP LEARNING

3.1 Stacked Denoising Autoencoders

SDAE is a ++feedforward neural network++ that learns representations (encodings) of the input data by learning to predict the clean input itself at the output, as shown in Figure 2.

[Images omitted: SDAE illustration (Figure 2 of the paper)]
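A minimal PyTorch sketch of a 2-layer SDAE with masking noise (layer widths, corruption level, and training-loop details are illustrative, not the paper's settings):

```python
import torch
import torch.nn as nn

class SDAE(nn.Module):
    """2-layer stacked denoising autoencoder: corrupt the input, predict the clean input."""
    def __init__(self, in_dim=8000, hidden=1000, latent=50, corruption=0.3):
        super().__init__()
        self.corruption = corruption
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.Sigmoid(),
                                     nn.Linear(hidden, latent), nn.Sigmoid())
        self.decoder = nn.Sequential(nn.Linear(latent, hidden), nn.Sigmoid(),
                                     nn.Linear(hidden, in_dim), nn.Sigmoid())

    def forward(self, x_clean):
        # Masking noise: randomly zero out a fraction of the input entries.
        mask = (torch.rand_like(x_clean) > self.corruption).float()
        z = self.encoder(x_clean * mask)       # learned representation (middle layer)
        return self.decoder(z), z

model = SDAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(32, 8000)                       # a batch of normalized bag-of-words vectors
recon, z = model(x)
loss = nn.functional.mse_loss(recon, x)        # reconstruction target is the clean input
optimizer.zero_grad()
loss.backward()
optimizer.step()
```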

3.2 Generalized Bayesian SDAE

[Equation image omitted: generative process of the generalized Bayesian SDAE (Equation (1))]

(Note: If λ_s goes to infinity, the Gaussian distribution in Equation (1) becomes a ++Dirac delta distribution++ centered at the mean, and the model degenerates into a ++Bayesian formulation of SDAE++.)

(Note: the first L/2 layers of the network act as an encoder and the last L/2 layers act as a decoder.)
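For reference, the generative process behind Equation (1) can be sketched as follows (a hedged reconstruction; $X_{l,j*}$ is row $j$ of the layer-$l$ output, $X_c$ the clean input, $\sigma(\cdot)$ the activation, and $\lambda_w$, $\lambda_s$, $\lambda_n$ precision hyperparameters; check against the paper for the exact form):

$$
\begin{aligned}
W_l &\sim \mathcal{N}(0, \lambda_w^{-1} I), \qquad b_l \sim \mathcal{N}(0, \lambda_w^{-1} I) \\
X_{l,j*} &\sim \mathcal{N}\bigl(\sigma(X_{l-1,j*} W_l + b_l),\; \lambda_s^{-1} I\bigr) \\
X_{c,j*} &\sim \mathcal{N}(X_{L,j*},\; \lambda_n^{-1} I)
\end{aligned}
$$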

3.3 Collaborative Deep Learning

[Image omitted: the CDL model]

(Note: ++the middle layer++ X_{L/2} serves as a bridge between the ratings and the content information. This middle layer, along with the latent offset ε_j, is the key that ++enables CDL to simultaneously learn an effective feature representation and capture the similarity and (implicit) relationship between items (and users)++.)
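Roughly, CDL attaches PMF-style latent factors to this middle layer (a hedged reconstruction; $f_e(\cdot, W^+)$ denotes the encoder output, i.e. $X_{L/2,j*}$):

$$
\begin{aligned}
\epsilon_j &\sim \mathcal{N}(0, \lambda_v^{-1} I_K), \qquad v_j = f_e(X_{0,j*}, W^+)^T + \epsilon_j \\
u_i &\sim \mathcal{N}(0, \lambda_u^{-1} I_K), \qquad R_{ij} \sim \mathcal{N}(u_i^T v_j,\; C_{ij}^{-1})
\end{aligned}
$$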

The graphical model of CDL when λ_s approaches positive infinity:

[Image omitted: graphical model of CDL]

3.4 Maximum A Posteriori Estimates

An EM-style algorithm for obtaining the MAP estimates:

[Images omitted: MAP objective and update equations]
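The update equations referenced above take roughly the following form in NumPy, with the SDAE parameters held fixed (the variable names and the CTR-style confidence scheme with `a` for observed and `b` for unobserved entries are illustrative; the SDAE gradient step is omitted):

```python
import numpy as np

def update_latent_factors(R, Z, U, V, lambda_u=0.1, lambda_v=10.0, a=1.0, b=0.01):
    """One EM-style block update of the user/item latent matrices.
    R: (I, J) implicit feedback; Z: (J, K) middle-layer output f_e(X_0, W+);
    U: (I, K) user vectors; V: (J, K) item vectors."""
    I, J = R.shape
    K = U.shape[1]
    for i in range(I):
        c_i = np.where(R[i] > 0, a, b)                       # confidence of each item for user i
        A = (V * c_i[:, None]).T @ V + lambda_u * np.eye(K)  # V^T C_i V + lambda_u I
        rhs = (V * (c_i * R[i])[:, None]).sum(axis=0)        # V^T C_i R_i
        U[i] = np.linalg.solve(A, rhs)
    for j in range(J):
        c_j = np.where(R[:, j] > 0, a, b)                    # confidence of each user for item j
        A = (U * c_j[:, None]).T @ U + lambda_v * np.eye(K)  # U^T C_j U + lambda_v I
        rhs = (U * (c_j * R[:, j])[:, None]).sum(axis=0) + lambda_v * Z[j]
        V[j] = np.linalg.solve(A, rhs)
    return U, V
```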

3.5 Prediction

Let D be the observed test data. We use the point estimates of u_i, W^+ and ε_j to calculate the predicted rating:

[Equation image omitted: predicted rating]

We approximate the predicted rating as:

[Equation image omitted: approximate prediction]
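In symbols, the two prediction steps above are approximately (a hedged reconstruction from the definitions above, with point estimates marked by asterisks):

$$
\mathbb{E}[R_{ij} \mid \mathcal{D}] \approx
\mathbb{E}[u_i \mid \mathcal{D}]^T
\bigl( \mathbb{E}[f_e(X_{0,j*}, W^+)^T \mid \mathcal{D}] + \mathbb{E}[\epsilon_j \mid \mathcal{D}] \bigr),
\qquad
R^*_{ij} \approx (u^*_i)^T \bigl( f_e(X_{0,j*}, W^{+})^T + \epsilon^*_j \bigr) = (u^*_i)^T v^*_j .
$$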

4. EXPERIMENTS

4.1 Datasets

(Note: After removing stop words, the top S discriminative words according to the ++tf-idf values++ are chosen to form the vocabulary (S is 8000, 20000, and 20000 for the three datasets).)
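One way to sketch this preprocessing step with scikit-learn (illustrative variable names; the paper's exact tokenization, stop-word list, and ranking criterion may differ):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["title and abstract of item 1", "title and abstract of item 2"]  # one string per item
S = 8000                                   # vocabulary size for the first dataset

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(docs)

# Rank words by their total tf-idf weight and keep the top S as the vocabulary.
scores = np.asarray(X.sum(axis=0)).ravel()
vocab = np.array(tfidf.get_feature_names_out())[np.argsort(scores)[::-1][:S]]
```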

4.2 Evaluation Scheme

We use recall as the performance measure because the rating information is in the form of implicit feedback.

[Equation image omitted: definition of recall@M]
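A minimal helper for recall@M (hypothetical code, not the paper's evaluation script): the score for one user is the fraction of the items that user likes which appear in the top-M recommendations.

```python
def recall_at_m(ranked_items, liked_items, M=300):
    """Fraction of the items the user likes that appear in the top-M recommendations."""
    liked = set(liked_items)
    if not liked:
        return 0.0
    hits = len(set(ranked_items[:M]) & liked)
    return hits / len(liked)

# Example: 2 of the user's 4 liked items appear in the top-5 list -> recall@5 = 0.5
print(recall_at_m(["a", "b", "c", "d", "e"], ["b", "e", "x", "y"], M=5))
```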

Another evaluation metric is the mean average precision (mAP).

4.3 Baselines and Experimental Settings

4.4 Quantitative Comparison

[Images omitted: quantitative comparison results (figures and tables)]

4.5 Qualitative Comparison

With a more effective representation, CDL can capture the key points of articles and the user preferences more accurately. Moreover, it can better model the co-occurrence of and relations between words.

CDL is sensitive to changes in user taste and hence can provide more accurate recommendations.

5. COMPLEXITY ANALYSIS AND IMPLEMENTATION

The total time complexity is O(JSK_1 + K^2J^2 + K^2I^2 + K^3), where I and J are the numbers of users and items, S is the vocabulary size, K_1 is the size of the first SDAE layer, and K is the dimensionality of the latent vectors.

CDL is very scalable.

6. CONCLUSION AND FUTURE WORK

刘丽
2017-10-26
