A Correlative Denoising Autoencoder to Model Social Influence for Top-N Recommender System

Yiteng Pan; Fazhi He; Haiping Yu

arXiv:1703.01760·cs.IR·December 6, 2019


TL;DR

This paper introduces CoDAE, a novel autoencoder-based model that captures social influence and user role correlations to improve Top-N recommendations in sparse social and rating data environments.

Contribution

The paper proposes a correlative denoising autoencoder that models user roles and their correlations, enhancing recommendation accuracy with sparse data.

Findings

01 Outperforms state-of-the-art algorithms on MAP and NDCG metrics

02 Effectively models social influence and user roles in sparse data settings

03 Demonstrates robustness in learning from limited social and rating data


Tables

Table 1: Statistics of Epinions and Ciao

Dataset	Ciao	Epinions
Number of users	7,375	22,166
Number of items	106,797	296,277
Number of ratings	284,086	922,267
Number of social links	111,781	300,548
Rating sparsity	0.036%	0.014%
Social sparsity	0.205%	0.061%

Table 2: Statistics of the processed datasets

Dataset	Ciao	Epinions
Number of users	5,072	18,096
Number of items	8,155	22,386
Number of ratings	101,116	355,132
Number of social links	85,916	252,101
Rating sparsity	0.244%	0.088%
Social sparsity	0.334%	0.077%

Table 3: Comparison results on all users

Dataset	k	Metric	Pop	BPR	GBPR	SBPR	CDAE	TDAE	CoDAE	Improve
Ciao	10	MAP@10	0.0210	0.0199	0.0229	0.0219	0.0289	0.0299	0.0306	2.34%
Ciao	10	NDCG@10	0.0369	0.0357	0.0408	0.0402	0.0504	0.0521	0.0528	1.34%
Ciao	20	MAP@10	0.0210	0.0201	0.0234	0.0240	0.0284	0.0303	0.0311	2.64%
Ciao	20	NDCG@10	0.0369	0.0360	0.0419	0.0435	0.0500	0.0533	0.0541	1.50%
Ciao	100	MAP@10	0.0210	0.0232	0.0247	0.0255	0.0291	0.0320	0.0329	2.81%
Ciao	100	NDCG@10	0.0369	0.0424	0.0437	0.0456	0.0510	0.0549	0.0561	2.19%
Epinions	10	MAP@10	0.0080	0.0100	0.0115	0.0072	0.0143	0.0144	0.0145	0.69%
Epinions	10	NDCG@10	0.0153	0.0194	0.0218	0.0144	0.0265	0.0271	0.0272	0.37%
Epinions	20	MAP@10	0.0080	0.0127	0.0126	0.0090	0.0142	0.0149	0.0151	1.34%
Epinions	20	NDCG@10	0.0153	0.0237	0.0237	0.0173	0.0266	0.0277	0.0280	1.08%
Epinions	100	MAP@10	0.0080	0.0163	0.0082	0.0161	0.0164	0.0168	0.0171	1.79%
Epinions	100	NDCG@10	0.0153	0.0291	0.0158	0.0301	0.0304	0.0312	0.0316	1.28%

Table 4: Impact of related regularization term $\beta$

$\beta$	0	0.001	0.005	0.01	0.05	0.1	0.2
Ciao	0.0461	0.0464	0.0500	0.0514	0.0523	0.0547	0.0530
Epinions	0.0309	0.0312	0.0313	0.0316	0.0299	0.0293	0.0276

Table 5: Impact of dimension $k$

$k$	5	10	20	50	100	200
Ciao	0.0505	0.0523	0.0528	0.0529	0.0547	0.0526
Epinions	0.0235	0.0272	0.0278	0.0294	0.0317	0.0313

Equations

$$P(\tilde{x}_{d}=\delta x_{d})=1-q, \qquad P(\tilde{x}_{d}=0)=q$$

$$\textbf{H}_{u}=g(\textbf{W}^{T}\tilde{\textbf{R}}_{u}+\textbf{V}_{u}+\textbf{b})$$

$$\hat{\textbf{R}}_{u}=f(\textbf{W}^{\prime T}g(\textbf{W}^{T}\tilde{\textbf{R}}_{u}+\textbf{V}_{u}+\textbf{b})+\textbf{b}^{\prime})$$

$$\frac{1}{2}\sum_{u=1}^{n}l(\hat{\textbf{R}}_{u},\textbf{R}_{u})+\frac{\lambda}{2}\left(\|\textbf{W}\|_{F}^{2}+\|\textbf{b}\|_{F}^{2}+\|\textbf{W}^{\prime}\|_{F}^{2}+\|\textbf{b}^{\prime}\|_{F}^{2}+\|\textbf{V}\|_{F}^{2}\right)$$

$$l(y,\hat{y})=\sum_{i\in\mathcal{K}\cup\mathcal{S}(t)}-y_{i}\log(\hat{y}_{i})-(1-y_{i})\log(1-\hat{y}_{i})$$

$\textbf{H}_{r,u}$, $\textbf{H}_{t,u}$, $\textbf{H}_{e,u}$

$\hat{\textbf{R}}_{u}$, $\hat{\textbf{T}}_{u}$, $\hat{\textbf{E}}_{u}$

$$Rel=\frac{\beta}{2}\sum_{u=1}^{n}\left(\|\textbf{H}_{r,u}-\textbf{H}_{t,u}\|_{F}^{2}+\|\textbf{H}_{r,u}-\textbf{H}_{e,u}\|_{F}^{2}+\|\textbf{H}_{t,u}-\textbf{H}_{e,u}\|_{F}^{2}\right)$$

$$L=\cdots+\frac{\beta}{2}\left(\|\textbf{H}_{r}-\textbf{H}_{t}\|_{F}^{2}+\|\textbf{H}_{r}-\textbf{H}_{e}\|_{F}^{2}+\|\textbf{H}_{t}-\textbf{H}_{e}\|_{F}^{2}\right)+\frac{\lambda}{2}\sum_{n\in\{r,t,e\}}\sum_{l\in\{0,1\}}\left(\|\textbf{W}_{n,l}\|_{F}^{2}+\|\textbf{b}_{n,l}\|_{F}^{2}\right)+\frac{\lambda}{2}\sum_{u=1}^{n}\left(\|\textbf{V}_{u}\|_{F}^{2}+\|\textbf{M}_{u}\|_{F}^{2}\right)$$

$$\hat{\textbf{R}}_{u}=f(\textbf{W}_{r,1}^{T}g(\textbf{W}_{r,0}^{T}\textbf{R}_{u}+\textbf{V}_{u}+\textbf{M}_{u}+\textbf{b}_{r,0})+\textbf{b}_{r,1})$$

$$\theta^{t+1}=\theta^{t}-\alpha g_{\theta}^{t}$$

$P@N$, $AP@N$, $NDCG@N$


Taxonomy

Methods: Denoising Autoencoder

Full text


A Correlative Denoising Autoencoder to Model Social Influence for Top-N Recommender System

Yiteng Pan¹, Fazhi He¹ and Haiping Yu¹

¹ School of Computer Science, Wuhan University, Wuhan 430072, China

Abstract

In recent years, numerous works have been proposed that leverage deep learning techniques to improve social-aware recommendation performance. In most cases, a large amount of data is required to train a robust deep learning model, which contains many parameters to fit to the training data. However, both user rating data and social network data suffer from severe sparsity, which makes it difficult to train a robust deep neural network model. To address this problem, we propose a novel Correlative Denoising Autoencoder (CoDAE) method that takes the correlations between users with multiple roles into account to learn robust representations from sparse inputs of ratings and social networks for recommendation. We develop the CoDAE model by utilizing three separate autoencoders to learn user features for the roles of rater, truster and trustee, respectively. In particular, because each input unit of the user vectors for the truster and trustee roles corresponds to a particular user, we propose to utilize shared parameters to learn the common information of the units that correspond to the same users. Moreover, we propose a related regularization term to learn the correlations between the user features learnt by the three subnetworks of the CoDAE model. We further conduct a series of experiments to evaluate the proposed method on two public datasets for the Top-N recommendation task. The experimental results demonstrate that the proposed model outperforms state-of-the-art algorithms on the rank-sensitive metrics MAP and NDCG.

Article type: Research Article

Keywords: Social Network, Recommender System, Denoising Autoencoder, Neural Network

1 Introduction

With the rapid development of social media in recent years, the massive amount of social network data offers researchers a great opportunity to study user behavior patterns and model user preferences for a variety of applications [1, 2]. Social-aware methods have been widely used in many web services, such as Ciao.com, Epinions.com and Facebook.com. To address the sparsity problem of recommender systems, numerous social-aware recommendation algorithms have been developed that model user preferences by integrating the social influence on users.

Numerous works have been proposed to model social influence from the social network and use it to improve recommendation performance [3, 4, 5, 6, 7]. These methods can be roughly classified into two categories: single-role based methods [3, 4, 5] and multi-role based methods [8, 7]. Single-role based methods model each user with a single role for all user feedback, while multi-role methods model each user with different roles in different cases. Intuitively, multi-role methods are closer to real life, since each user plays different roles in different situations. Experimental results in [7] and [8] also show that multi-role based methods are more accurate than single-role based methods [3, 4, 5]. However, social network data suffer from severe sparsity [9], which makes it difficult to extract accurate social information from them.

In recent years, deep learning models have achieved great success [10, 11, 12, 13, 14] alongside classical methods [15, 16, 17, 18, 19, 20, 21, 22]. These non-linear neural networks can automatically learn effective representations from huge amounts of data and significantly improve prediction accuracy. How to utilize deep learning models to improve recommendation performance has therefore become a hot topic [23, 24, 25, 26, 27, 28]. To overcome the data sparsity problem in recommendation, most researchers propose to utilize the Denoising Autoencoder (DAE) [29] to learn compact representations from sparse data for recommender systems. Some DAE-based methods [23, 27] focus on learning user preferences directly from users' rating information with a DAE network. Their experimental results demonstrate great improvement compared to traditional linear models, such as the matrix factorization method [30]. Meanwhile, other works [31, 25, 26, 28] attempt to use the DAE model to learn compact representations from auxiliary information, such as content [31], tags [25] or images [26], to help CF-based methods cope with the data sparsity problem.

To make use of social media, a number of works have been proposed that utilize neural network techniques to improve the performance of social recommendation [28, 32, 33]. In [28], Deng et al. propose to learn user representations from ratings with a Denoising Autoencoder. They then propose a new collaborative filtering method that utilizes the learnt results to initialize the preferences of users and to measure trust similarities. In [32], Guo et al. propose an embedding-based recommendation method to exploit the deep structure in social networks and rating patterns. In [33], Wang et al. propose to learn user representations by deep neural networks with inferred ratings and social networks. They further propose to inject original and inferred social information into the input and hidden layers of the neural network to improve recommendation.

A key problem remains open in utilizing deep learning techniques for social recommendation. Although deep neural networks show great learning power in many pattern recognition tasks, in most cases they require a large amount of training data to avoid overfitting [10]. However, because both user rating data and social network data suffer from severe sparsity, it is not easy to learn robust representations with deep learning models. Therefore, it is necessary to develop a new model that mines more information from social networks with deep learning for recommendation.

In this paper, we develop a novel deep learning model, the Correlative Denoising Autoencoder (CoDAE), to learn user features while taking the correlations among user features with multiple roles into account. First, we utilize three separate autoencoders to learn user features independently from the heterogeneous information of users in the roles of rater, truster and trustee, respectively. Second, we propose to inject a shared parameter matrix into the model to learn the common information of users with multiple roles. Third, we propose a related regularization term to build relations between user features with different roles.

In particular, from the viewpoint of the input layers, each input unit for the truster and trustee roles corresponds to a particular user. Therefore, implicit common features may exist between the training parameters that correspond to the same users in the input layers of the truster and trustee roles. In the CoDAE model, we propose to utilize a shared weight matrix to learn these implicit common features for users. Moreover, from the viewpoint of the middle layers, the output vectors of each user in the truster and trustee roles reflect the user's preferences from different perspectives, so these output vectors for a particular user should be similar to each other. This motivates us to propose the related regularization term to build relations among user vectors. With these two methods of mining correlations for each user with multiple roles, the CoDAE model is formulated to simultaneously learn user features from both user ratings and the social network by neural networks.

The main contributions in this paper are summarized as follows:

• To learn user features from ratings and the social network, which have different distributions, we propose a novel Correlative Denoising Autoencoder (CoDAE) model that utilizes three separate autoencoders to model users in the roles of rater, truster and trustee, respectively.

• To address the sparsity problem of the social network, we propose to utilize a shared weight matrix to learn the implicit common information between input units that correspond to the same users.

• To exchange information between user features with multiple roles during training, we further propose a related regularization term to build relations between the hidden layers of the three separate autoencoders.

• We conduct comprehensive experiments to evaluate the proposed CoDAE method on the Top-N recommendation task. Experimental results on two real-world public datasets demonstrate that the CoDAE model outperforms other state-of-the-art algorithms.

2 Related Work

In this section, we discuss how the proposed CoDAE model relates to existing work from four perspectives: social recommendation, deep learning for recommendation, deep learning for social recommendation, and heterogeneous network based recommendation.

2.1 Social recommendation

With the rapid development of social media, numerous works have been proposed to leverage social influence to improve web applications [4, 34]. This raises the problem of how to integrate social influence into recommender systems, which has attracted a lot of attention in recent years [3, 4, 5, 6, 7]. These social recommendation algorithms have been shown to be effective in addressing the data sparsity problem of conventional CF-based methods.

In particular, Ma et al. [3] propose the SoRec model, which integrates trust relationships by factorizing the social matrix so that it shares common latent user preferences with the CF method. Jamali et al. [4] propose the SocialMF model with trust propagation between users. The basic idea is that a user's taste is close to the average preference of his/her trusted friends. Ma et al. [5] further propose to take trust strength into account, and build the SoReg model by incorporating individual social regularization.

The above algorithms model users in a common shared space for rating data and trust relationships. These single-role based methods do not consider that each user plays different roles in different situations. Therefore, some multi-role methods have been proposed to address this problem. For example, Yang et al. [6] propose the TrustMF method to model users in the roles of truster and trustee, and make predictions by integrating these two kinds of information. In [7], Yao et al. further propose to capture implicit correlations between users who are similar but not socially connected by modeling users with multiple roles.

However, the trust relationships in social networks also suffer from data sparsity in most cases [9], which may limit the performance of social-aware applications. To overcome this problem for recommender systems, we propose to utilize recent deep learning techniques to learn the social influence on each user through a denoising autoencoder network, which is well suited to sparse data [29].

2.2 Deep learning for recommendation

Deep neural networks have already proved to be a powerful learning model for many pattern recognition tasks, such as image recognition [11], object detection and neural machine translation. Because of this learning ability across various domains, utilizing deep learning techniques for recommendation has attracted a lot of attention in recent years.

In particular, the AutoRec [23] model utilizes a Denoising Autoencoder to learn user features directly from rating data. Its results demonstrate that deep learning techniques have great potential to improve recommendation performance. In [27], Wu et al. propose to inject user-specific vectors into the hidden layer of the autoencoder network to learn user preferences more accurately for top-n recommendation. However, these methods are affected by the sparsity of rating data, which has an adverse impact on recommendation performance. In [35], the ECAE model is proposed to learn implicit information from generated labels of users. Its experimental results demonstrate that this is a promising approach to address the sparsity problem for recommendation.

Incorporating auxiliary information with rating data is one of the most effective ways to address the data sparsity problem [31, 26]. In [31], Wang et al. propose the CDL model, which utilizes a Denoising Autoencoder to learn compact representations from content information. These representations are then tightly integrated with matrix factorization to model user preferences for recommendation. The VBPR model [26] makes use of visual features learnt by a Convolutional Neural Network (CNN) to help improve the Bayesian Personalized Ranking (BPR) [36] model for top-n recommendation. Their theoretical and experimental results demonstrate that introducing auxiliary information is valuable for deep learning based recommendation.

However, most existing methods focus on learning compact representations from only one kind of data, i.e., rating data or auxiliary information. As a matter of fact, using neural networks for either kind of data can effectively improve the recommendation performance. It raises an interesting question: how to utilize neural networks to learn representations for multiple kinds of data and fuse them into a unified framework for recommendation?

To address this problem, we propose the CoDAE method to learn user features from rating and social network data by deep neural networks. In particular, to overcome the data sparsity problem of ratings and the social network, we propose to build relations between user features from different perspectives in two aspects: the training parameters of the input layer and the output units of the middle layer. In this way, we obtain a robust social recommendation model based on deep learning techniques.

2.3 Deep learning for social recommendation

Motivated by the success of deep learning, numerous works have been proposed that leverage deep learning techniques to improve social recommendation performance [28, 32, 33]. In particular, Deng et al. propose the DLMF model [28], which employs a Denoising Autoencoder to initialize the latent user features used in a matrix factorization model. In [32], Guo et al. propose an embedding-based method to learn user representations from social networks and rating patterns for recommendation. In [33], Wang et al. propose to learn user features by feeding inferred ratings and social networks into a deep neural network.

However, since social network data are quite sparse, it is not easy to learn robust representations from the social network with deep neural networks, which in most cases require a large amount of data to train a robust model. Therefore, it is necessary to mine more information from the social network to learn robust user features.

In this paper, we propose a novel deep neural network structure to learn user features from ratings and social networks. In particular, we propose to mine correlations between user features through shared parameters and related regularization. Experimental results demonstrate that these two methods are effective in improving recommendation.

2.4 Heterogeneous network based recommendation

In [37], Zhang et al. propose a new Joint Representation Learning (JRL) framework to learn user representations from multiple heterogeneous data sources: reviews, images and ratings. They propose to learn representations from the different sources with separate neural networks. These representations are then jointly integrated to represent users and used to train a top-n recommendation model. To keep the framework flexible, so that it can be easily extended to new information sources without re-training in practice, this model does not take advantage of the power of multi-view machine learning.

However, social network data are very sparse in most cases. This makes it hard to learn robust representations from the social network with a neural network, which requires a large amount of data to prevent overfitting. Therefore, it is necessary to take the correlations among these representations into account to improve accuracy.

In this paper, we propose a novel Correlative Denoising Autoencoder (CoDAE) model to learn robust user features from multiple heterogeneous data sources, namely ratings and the social network. To achieve an unbiased and efficient prediction function, we utilize three autoencoders to learn user features by modeling users in the roles of rater, truster and trustee. We then propose to build relations between these subnetworks in two ways. First, we utilize a shared trainable weight matrix in them to automatically learn implicit common information for users. Second, we propose a related regularization term to bridge the hidden representations of these subnetworks.

3 Proposed method

In this section, we introduce the proposed Correlative Denoising Autoencoder (CoDAE), which learns compact and robust representations from rating and trust data.

3.1 Problem Description

Given a set of users $\mathcal{U}=\{1,...,n\}$ and a set of items $\mathcal{I}=\{1,...,m\}$, the task of top-n recommendation is to recommend a list of N items to each user to meet his/her needs. In this model, we have three matrices as input data: the rating matrix $\textbf{R}\in\mathbb{R}^{n\times m}$, the truster matrix $\textbf{T}\in\mathbb{R}^{n\times n}$ and the trustee matrix $\textbf{E}\in\mathbb{R}^{n\times n}$, which is the transpose of T. Since both rating and trust data are very sparse, most entries of these input matrices are unobserved and treated as zeros. We use $\textbf{R}_{u}$, $\textbf{T}_{u}$ and $\textbf{E}_{u}$ to represent the input vectors of user $u$ in the roles of rater, truster and trustee, respectively. In addition, to model the personal interests of users, we use $\textbf{V}_{u}\in\mathbb{R}^{k}$ to represent the user-specific vector of user $u$.
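As a small illustration of the inputs described above, the following sketch builds the truster matrix T and the trustee matrix E from a list of trust links in plain Python. The function name `build_trust_matrices`, the toy edge list and the direction convention (an edge (u, v) means "u trusts v") are assumptions made for the example and are not taken from the paper.

```python
def build_trust_matrices(n, edges):
    """Return (T, E) as n x n lists of 0/1, where E is the transpose of T.

    Assumption: an edge (u, v) means "user u trusts user v", so row u of T
    lists the users that u trusts and row u of E lists the users who trust u.
    """
    T = [[0] * n for _ in range(n)]
    for u, v in edges:
        T[u][v] = 1
    E = [list(column) for column in zip(*T)]  # transpose of T
    return T, E


# Toy network with three users (hypothetical data).
T, E = build_trust_matrices(3, [(0, 1), (0, 2), (2, 1)])
T_u = T[0]  # truster vector of user 0: the users that user 0 trusts
E_u = E[1]  # trustee vector of user 1: the users who trust user 1
```

Unobserved pairs stay at zero, matching the convention in the text that unobserved entries are treated as zeros.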

3.2 Collaborative Denoising Autoencoder

In the Collaborative Denoising Autoencoder (CDAE) model [27], each entry of the input vector x is randomly overwritten by zero with probability $q$, so that the $d^{th}$ entry $\tilde{x}_{d}$ of the corrupted vector is distributed as follows:

$$P(\tilde{x}_{d}=\delta x_{d})=1-q, \qquad P(\tilde{x}_{d}=0)=q \tag{1}$$

where $\delta=1/(1-q)$ is used to make the corruption unbiased.
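The unbiasedness claim can be checked numerically: each entry survives with probability $1-q$ and is scaled by $\delta=1/(1-q)$, so its expected value is $(1-q)\cdot\delta x_{d}=x_{d}$. The sketch below is a minimal plain-Python illustration; the function name `corrupt`, the seed and the toy vector are assumptions for the example.

```python
import random


def corrupt(x, q, rng):
    """Zero each entry with probability q and scale survivors by 1 / (1 - q)."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must be in [0, 1)")
    delta = 1.0 / (1.0 - q)
    return [0.0 if rng.random() < q else delta * value for value in x]


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
x = [1.0, 0.0, 1.0, 1.0]
trials = 20000

# Average many corrupted copies; the mean should approach the clean vector x.
mean = [0.0] * len(x)
for _ in range(trials):
    for d, value in enumerate(corrupt(x, 0.2, rng)):
        mean[d] += value / trials
```

With `q = 0.2` the surviving entries are scaled to 1.25, and the empirical mean of the corrupted copies stays close to the original 0/1 values.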

As illustrated in Figure 1, the CDAE model with one hidden layer is implemented by first mapping the corrupted user feedback $\tilde{R}_{u}$ into a low-dimensional space through a linear mapping function and then injecting the user-specific vector $\textbf{V}_{u}$ into this hidden layer by:

$$\textbf{H}_{u}=g(\textbf{W}^{T}\tilde{\textbf{R}}_{u}+\textbf{V}_{u}+\textbf{b}) \tag{2}$$

where $\textbf{W}\in\mathbb{R}^{m\times k}$ is a weight matrix, $\textbf{b}\in\mathbb{R}^{k}$ is an offset vector and $g(\cdot)$ is an activation function. Note that the dimension of the hidden layer is much smaller than that of the input layer, i.e., $k\ll m$.

The user preference $\textbf{H}_{u}$ is then mapped back to reconstruct the inputs and predict unobserved entries through another mapping layer by:

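$$\hat{\textbf{R}}_{u}=f(\textbf{W}^{\prime T}\textbf{H}_{u}+\textbf{b}^{\prime})=f(\textbf{W}^{\prime T}g(\textbf{W}^{T}\tilde{\textbf{R}}_{u}+\textbf{V}_{u}+\textbf{b})+\textbf{b}^{\prime}) \tag{3}$$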

where $\textbf{W}^{\prime}\in\mathbb{R}^{k\times m}$ and $\textbf{b}^{\prime}\in\mathbb{R}^{m}$ are the corresponding weight matrix and offset vector used to reconstruct the inputs through the activation function $f(\cdot)$.
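The two mapping steps of CDAE described above (encode the corrupted ratings with the user-specific vector added in the hidden layer, then decode back to item space) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: both $g$ and $f$ are taken to be the sigmoid, which the text does not fix, and the names `cdae_forward`, `W_dec` and `b_dec` (standing for $\textbf{W}^{\prime}$ and $\textbf{b}^{\prime}$) as well as the toy weights are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cdae_forward(r_tilde, W, b, v_u, W_dec, b_dec):
    """One forward pass for a single user.

    Shapes: r_tilde has length m, W is m x k, b and v_u have length k,
    W_dec is k x m and b_dec has length m.
    """
    m, k = len(r_tilde), len(b)
    # Hidden layer: H_u = g(W^T r_tilde + V_u + b)
    h_u = [
        sigmoid(sum(W[i][j] * r_tilde[i] for i in range(m)) + v_u[j] + b[j])
        for j in range(k)
    ]
    # Output layer: R_hat_u = f(W'^T H_u + b')
    r_hat = [
        sigmoid(sum(W_dec[j][i] * h_u[j] for j in range(k)) + b_dec[i])
        for i in range(m)
    ]
    return h_u, r_hat


# Toy example with m = 3 items and k = 2 hidden units (hypothetical weights).
W = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
W_dec = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
h_u, r_hat = cdae_forward(
    [1.0, 0.0, 0.0], W, [0.0, 0.0], [0.0, 0.0], W_dec, [0.0, 0.0, 0.0]
)
```

Note the shapes: the encoder weight is $m\times k$ and the decoder weight is $k\times m$, so both products use the transposed matrix exactly as written in the equations.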

Finally, the objective function of CDAE is formulated by:

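$$\frac{1}{2}\sum_{u=1}^{n}l(\hat{\textbf{R}}_{u},\textbf{R}_{u})+\frac{\lambda}{2}\left(\|\textbf{W}\|_{F}^{2}+\|\textbf{b}\|_{F}^{2}+\|\textbf{W}^{\prime}\|_{F}^{2}+\|\textbf{b}^{\prime}\|_{F}^{2}+\|\textbf{V}\|_{F}^{2}\right) \tag{4}$$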

where $\mathit{l}(\cdot)$ is a loss function that evaluates the reconstruction error for each input vector, $\lambda$ is a hyper-parameter that controls the model complexity to prevent overfitting, and $||\cdot||_{F}$ denotes the Frobenius norm used in the regularization term.

To achieve the best performance, we follow the experimental results in [27] and choose the logistic loss as the loss function for the top-n recommendation task in this paper. It is defined by:

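$$l(y,\hat{y})=\sum_{i\in\mathcal{K}\cup\mathcal{S}(t)}-y_{i}\log(\hat{y}_{i})-(1-y_{i})\log(1-\hat{y}_{i}) \tag{5}$$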

where $\mathcal{K}$ denotes the observed entries of vector y and $\mathcal{S}(t)$ denotes a set of entries sampled from the unobserved entries of y, whose size is $t$ times the number of observed entries.
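To make the role of $\mathcal{K}$ and $\mathcal{S}(t)$ concrete, the following plain-Python sketch evaluates the logistic loss on the observed entries plus a random sample of unobserved entries. It assumes binary inputs, an integer $t$, and uniform sampling without replacement (capped at the number of unobserved entries); the function name `sampled_logistic_loss` and these sampling details are assumptions for illustration, not the authors' implementation.

```python
import math
import random


def sampled_logistic_loss(y, y_hat, t, rng):
    """Logistic loss over observed entries K plus t * |K| sampled unobserved entries."""
    observed = [i for i, value in enumerate(y) if value > 0]
    unobserved = [i for i, value in enumerate(y) if value == 0]
    # Sample S(t): t times as many unobserved entries as observed ones.
    num_negatives = min(len(unobserved), t * len(observed))
    sampled = rng.sample(unobserved, num_negatives)

    eps = 1e-12  # keeps log() away from 0
    loss = 0.0
    for i in observed + sampled:
        p = min(max(y_hat[i], eps), 1.0 - eps)
        loss += -y[i] * math.log(p) - (1 - y[i]) * math.log(1.0 - p)
    return loss


# One observed item, three unobserved items, two sampled negatives (t = 2).
loss = sampled_logistic_loss([1, 0, 0, 0], [0.5, 0.5, 0.5, 0.5], 2, random.Random(0))
```

Only the entries in $\mathcal{K}\cup\mathcal{S}(t)$ contribute to the sum, which is what keeps the cost proportional to the number of observed entries rather than to the full vector length.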

3.3 Correlative Denoising Autoencoder

In recent years, many methods have been proposed that make use of deep learning techniques to improve social recommendation [28, 32, 33]. However, because deep neural networks have many parameters to train, these models overfit easily when fed with sparse inputs of user ratings and social networks. Therefore, it is necessary to mine more useful information from these sparse inputs to train deep learning based recommendation models. To address this problem, we propose a novel deep learning model, the Correlative Denoising Autoencoder (CoDAE), which learns user features and takes the correlations among user features with multiple roles into account for social recommendation.

As illustrated in Figure 2, to ensure that all the information contributes equally to the prediction results, we utilize three separate subnetworks to independently extract compact representations for each user in the roles of rater, truster and trustee. The user nodes for personal biases are then injected into the hidden layer of the rater subnetwork, as is done in CDAE [27]. Moreover, to address the data sparsity problem, we propose to build relations among user features with multiple roles in two aspects: the training parameters of the input layer and the output vectors of the middle layer.

Since each unit of the input vectors $\textbf{T}_{u}$, $\textbf{E}_{u}$ and $\textbf{V}_{u}$ is associated with a particular user, we argue that some common information exists between the training parameters of the input units that are associated with the same users. To learn this implicit common information automatically, we introduce a shared weight matrix $\textbf{M}\in\mathbb{R}^{n\times k}$ to represent the common features for these three inputs. In particular, we use $\textbf{M}_{u}$ to denote the $u^{th}$ vector of M, which corresponds to the shared common preference of user $u$.

With the corrupted inputs $\tilde{\textbf{T}}$, $\tilde{\textbf{E}}$ and $\tilde{\textbf{R}}$, which are generated from T, E and R through the distribution of Equation 1, the hidden representations $\textbf{H}_{r,u}\in\mathbb{R}^{k}$, $\textbf{H}_{t,u}\in\mathbb{R}^{k}$ and $\textbf{H}_{e,u}\in\mathbb{R}^{k}$ of user $u$ with roles of rater, truster and trustee are respectively computed by the following mapping functions:

[TABLE]

where $\alpha$ is the hyper-parameter used to balance the influence of the shared weight matrix M. The parameters $\{\textbf{W}_{r,0}\in\mathbb{R}^{m\times k},\textbf{b}_{r,0}\in\mathbb{R}^{k}\}$, $\{\textbf{W}_{t,0}\in\mathbb{R}^{n\times k},\textbf{b}_{t,0}\in\mathbb{R}^{k}\}$ and $\{\textbf{W}_{e,0}\in\mathbb{R}^{n\times k},\textbf{b}_{e,0}\in\mathbb{R}^{k}\}$ are the weight matrices and offset vectors that map user preferences into a low-dimensional space through the activation function $g(\cdot)$.

The hidden representations $\textbf{H}_{r,u}$ , $\textbf{H}_{t,u}$ and $\textbf{H}_{e,u}$ of user $u$ are then mapped back to predict the clean inputs by:

[TABLE]

where $\{\textbf{W}_{r,1}\in\mathbb{R}^{k\times m},\textbf{b}_{r,1}\in\mathbb{R}^{m}\}$, $\{\textbf{W}_{t,1}\in\mathbb{R}^{k\times n},\textbf{b}_{t,1}\in\mathbb{R}^{n}\}$ and $\{\textbf{W}_{e,1}\in\mathbb{R}^{k\times n},\textbf{b}_{e,1}\in\mathbb{R}^{n}\}$ are the weight matrices and offset vectors of the decoders, which use the activation function $f(\cdot)$, for the three subnetworks in Figure 2.

Although users show different characteristics when playing different roles, the key personality of each user does not change in most cases. In other words, although the representations of a user in multiple roles are different, they have something in common. For example, as illustrated in Figure 3, users $a$ and $b$ are not similar in Figure 3(a), but they are similar as trusters in Figure 3(b). This may indicate that the preferences of $a$ and $b$ are in fact similar to a certain degree, and that they only appear dissimilar as raters because there are too few commonly observed ratings to model their preferences accurately. Therefore, it is necessary to build relations between the representations for multiple roles, so that information can be exchanged across the sparse inputs of user ratings and social networks.

In particular, we propose a novel regularization term that builds relations between user features with multiple roles by minimizing the distances among the output vectors of the middle layers in the CoDAE network as follows:

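$$Rel=\frac{\beta}{2}\sum_{u=1}^{n}\left(\|\textbf{H}_{r,u}-\textbf{H}_{t,u}\|_{F}^{2}+\|\textbf{H}_{r,u}-\textbf{H}_{e,u}\|_{F}^{2}+\|\textbf{H}_{t,u}-\textbf{H}_{e,u}\|_{F}^{2}\right) \tag{8}$$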

where $\beta$ is the hyper-parameter that controls the importance of the related regularization term. A larger value of $\beta$ forces the representations of a user in different roles to share more common features. If $\beta\rightarrow\infty$, the user representations $\textbf{H}_{r,u}$, $\textbf{H}_{t,u}$ and $\textbf{H}_{e,u}$ will be optimized to almost equal values by minimizing this term. If $\beta=0$, the regularization has no impact on the results of the CoDAE model. Because each user shows different but similar preferences when playing different roles in real life, the best value of $\beta$ should be determined experimentally.
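The related regularization term sums, over all users, the squared distances between the three pairs of hidden representations and scales the result by $\beta/2$. A minimal plain-Python sketch follows; the function names and the toy hidden vectors are assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def related_regularizer(H_r, H_t, H_e, beta):
    """beta / 2 times the summed pairwise squared distances over all users.

    H_r, H_t and H_e are lists of per-user hidden vectors for the rater,
    truster and trustee roles, in the same user order.
    """
    total = 0.0
    for h_r, h_t, h_e in zip(H_r, H_t, H_e):
        total += (
            squared_distance(h_r, h_t)
            + squared_distance(h_r, h_e)
            + squared_distance(h_t, h_e)
        )
    return 0.5 * beta * total


# One user with k = 2: pairwise squared distances are 1, 2 and 1.
penalty = related_regularizer([[1.0, 0.0]], [[0.0, 0.0]], [[0.0, 1.0]], beta=0.1)
```

The penalty is zero when the three role representations of every user coincide and also when `beta` is zero, which matches the two limiting cases discussed in the text.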

Formally, we learn the parameters of the CoDAE model by minimizing the following objective function:

[TABLE]

where $\lambda$ is the hyper-parameter that controls the model complexity to prevent overfitting.

Finally, after training the CoDAE model by minimizing the loss function of Equation (9), the unobserved entries of $\textbf{R}_{u}$ are predicted by:

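$$\hat{\textbf{R}}_{u}=f(\textbf{W}_{r,1}^{T}g(\textbf{W}_{r,0}^{T}\textbf{R}_{u}+\textbf{V}_{u}+\textbf{M}_{u}+\textbf{b}_{r,0})+\textbf{b}_{r,1}) \tag{10}$$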

where $\hat{\textbf{R}}_{u}$ is the predicted vector of user $u$. The $i^{th}$ unit of $\hat{\textbf{R}}_{u}$ stands for the predicted score of user $u$ for item $i$. The recommendation list for user $u$ is then generated by selecting the N items with the highest scores among the unobserved entries of ${\textbf{R}}_{u}$.
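The final selection step can be sketched as follows: drop the items the user has already rated, then keep the N highest-scoring remaining items. The function name `top_n`, the tie-breaking rule (lower item index first) and the toy scores are assumptions for illustration.

```python
def top_n(scores, observed, n):
    """Return the indices of the n best-scoring items not in `observed`.

    scores:   predicted scores for one user, one entry per item
    observed: set of item indices the user has already rated
    """
    candidates = [i for i in range(len(scores)) if i not in observed]
    # Sort by descending score; ties are broken by the lower item index.
    candidates.sort(key=lambda i: (-scores[i], i))
    return candidates[:n]


# Item 0 has the highest score but is already observed, so it is skipped.
recommended = top_n([0.9, 0.1, 0.8, 0.7], observed={0}, n=2)
```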

3.4 Model learning

The CoDAE model is implemented with the PyTorch deep learning framework, which computes gradients automatically and supports GPU acceleration. We apply the widely used stochastic gradient descent (SGD) method to learn the parameters of the CoDAE model. In each iteration, the parameters are updated as follows:

$$\theta^{t+1}=\theta^{t}-\alpha\,g_{\theta}^{t}$$

Where $\theta^{t}$ represents the value of the parameters at iteration $t$, $g_{\theta}^{t}$ denotes the corresponding gradients, and $\alpha$ is the learning rate.

3.5 Model complexity analysis

As demonstrated in Figure 2, the CoDAE model consists of three subnetworks that learn user features for the roles of rater, truster and trustee. In each subnetwork, the input is first mapped into a middle layer of dimensionality $k$ and then mapped back to reconstruct the input; the input dimensionality is $m$ for the rater subnetwork and $n$ for the truster and trustee subnetworks. Since the CoDAE model is trained over all $n$ users in each iteration, the overall complexity of CoDAE is $O(nmk+nnk+nnk)=O(kn(m+n))$.

In [27], Wu et al. propose a sampling strategy that reduces the training complexity of autoencoder networks for sparse inputs. Because CoDAE is also built on the autoencoder structure, this sampling strategy can be used for CoDAE as well. Specifically, because user ratings and social networks are sparse, the number of unlabeled entries is much larger than the number of labeled entries. It is unnecessary to compute gradients for all unlabeled entries, since they contain very little useful information. With sampling, the training complexity of CoDAE is $O(kn(\bar{m}+\bar{n}+\bar{n}^{\prime}))$, where $\bar{m}$, $\bar{n}$ and $\bar{n}^{\prime}$ denote the mean numbers of positive labels per user in the roles of rater, truster and trustee, respectively. Because the rating and social network data are very sparse, $\bar{m}$, $\bar{n}$ and $\bar{n}^{\prime}$ are much smaller than $m$, $n$ and $n$, respectively. Therefore, the CoDAE model is also computable for big data applications.
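The idea of computing gradients only on a subset of output units can be sketched as below. This is a simplified illustration under stated assumptions, not the exact procedure of [27]: it assumes that, for each user, all positive entries are kept and unobserved entries are drawn uniformly at random, with `ratio` negatives per positive. The function name `sample_output_units` and all values are hypothetical.

```python
import random


def sample_output_units(positive_items, num_items, ratio, rng):
    """Pick the output units used for one user's gradient step:
    every positive item plus `ratio` random unobserved items per positive."""
    positives = set(positive_items)
    target = min(ratio * len(positives), num_items - len(positives))
    negatives = set()
    # Rejection sampling keeps the expected cost proportional to the
    # number of sampled units rather than to num_items when data are sparse.
    while len(negatives) < target:
        candidate = rng.randrange(num_items)
        if candidate not in positives:
            negatives.add(candidate)
    return sorted(positives), sorted(negatives)


rng = random.Random(0)
pos, neg = sample_output_units([1, 5, 7], num_items=1000, ratio=5, rng=rng)
units_evaluated = len(pos) + len(neg)
```

In this toy case only 18 of the 1000 output units are touched for the user, which illustrates why the per-user cost depends on the number of positive labels rather than on the full input dimensionality.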

4 Experiments and Results

In this section, we present comprehensive experimental results to evaluate the proposed CoDAE model on the top-n recommendation task. Our experiments are designed to answer the following questions:

- How does the CoDAE model compare with other state-of-the-art related methods?

- How does the CoDAE model perform for sparse users?

- How do shared parameters and related regularization impact the performance?

- How does the dropout probability impact the performance?

- How does the CoDAE model perform with different dimensions?

4.1 Datasets

To compare the proposed CoDAE model with state-of-the-art methods, we use two public real-world datasets: Ciao and Epinions [38]. Both datasets contain rating data and trust relationships, crawled from two well-known e-commerce websites, Ciao.com and Epinions.com. On these websites, users can rate each item with an integer from $1$ to $5$ and build trust relationships with other users to help make decisions. The trust relationships in the social network are in binary format, where $1$ denotes trust and $0$ denotes an unobserved relationship. The statistics of these two datasets are shown in Table 1.

The top-n recommendation task is to recommend a personalized list of items to each user to help him/her find what he/she wants. This task is widely adopted in existing works [27]. Since this paper focuses on the top-n recommendation task, we remove ratings of fewer than four stars and set the remaining ratings to one for all datasets [27]. We then iteratively drop users and items with fewer than 5 ratings. The statistics of the processed datasets are listed in Table 2.
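The preprocessing described above can be sketched as follows. This is a minimal illustration, assuming ratings are available as `(user, item, score)` triples; the function name `binarize_and_filter` and the toy data are assumptions, and the authors' actual preprocessing code is not shown in the paper.

```python
from collections import Counter


def binarize_and_filter(ratings, min_score=4, min_count=5):
    """Keep ratings of at least `min_score` as implicit positives, then
    repeatedly drop users and items with fewer than `min_count` positives
    until no more can be dropped."""
    kept = {(u, i) for u, i, s in ratings if s >= min_score}
    while True:
        user_counts = Counter(u for u, _ in kept)
        item_counts = Counter(i for _, i in kept)
        filtered = {
            (u, i)
            for u, i in kept
            if user_counts[u] >= min_count and item_counts[i] >= min_count
        }
        if filtered == kept:
            return filtered
        kept = filtered


# Toy data: a dense 5x5 block plus one sparse user and one low rating.
ratings = [(f"u{u}", f"i{i}", 5) for u in range(5) for i in range(5)]
ratings += [("x", "i0", 4), ("u0", "z", 3)]
positives = binarize_and_filter(ratings)
```

The loop is needed because removing a sparse user can push an item below the threshold, and vice versa, so a single pass is not always enough.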

4.2 Evaluation Metrics

The task of top-n recommendation is to recommend a list of $N$ items to each user that fits his/her potential needs as well as possible. Relevant studies show that this task is closer to real-world scenarios than rating prediction. Consequently, rating error metrics such as Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) cannot accurately reflect the performance of recommender systems [39].

Therefore, we choose two ranking-based metrics to evaluate performance in our experiments: Mean Average Precision (MAP@N) and Normalized Discounted Cumulative Gain (NDCG@N). These two metrics take the positions of items in the predicted list into account and are widely used in many commercial systems [40].

Mean Average Precision. MAP improves on Precision by considering the performance at all positions of the top-n item list. The Precision metric is defined by:

[TABLE]

Where $I_{u}$ represents the items that user $u$ has rated in the test set, and $\hat{I}_{N,u}$ denotes the $N$ items with the highest predicted scores in the unrated item set of user $u$. The Average Precision metric is then defined as follows:

[TABLE]

Where $rel(k)$ indicates whether the item at rank $k$ is adopted or not. The Mean Average Precision (MAP@N) is the mean value of AP@N across all users.
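As an illustration of how these quantities combine, the sketch below computes AP@N and MAP@N for binary relevance. It assumes that AP@N is normalized by $\min(N, |I_{u}|)$, which is a common convention but may differ from the paper's exact definition; the function names and toy rankings are hypothetical.

```python
def average_precision_at_n(ranked_items, relevant_items, n):
    """AP@N: sum of Precision@k over the ranks k that hold a relevant item,
    divided by min(n, number of relevant items) (assumed normalizer)."""
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked_items[:n], start=1):
        if item in relevant_items:  # rel(k) = 1
            hits += 1
            precision_sum += hits / k  # Precision@k at this rank
    denom = min(n, len(relevant_items))
    return precision_sum / denom if denom else 0.0


def mean_average_precision_at_n(rankings, relevants, n):
    """MAP@N: mean of AP@N over all users (assumes at least one user)."""
    users = list(rankings)
    return sum(
        average_precision_at_n(rankings[u], relevants[u], n) for u in users
    ) / len(users)


rankings = {"u1": [3, 1, 2], "u2": [1, 2, 4]}
relevants = {"u1": {3, 2}, "u2": {9}}
ap_u1 = average_precision_at_n(rankings["u1"], relevants["u1"], n=3)
map_at_3 = mean_average_precision_at_n(rankings, relevants, n=3)
```

For user `u1` the relevant items appear at ranks 1 and 3, so the precision values 1/1 and 2/3 are accumulated; user `u2` has no hit and contributes zero to the mean.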

Normalized Discounted Cumulative Gain. This metric considers the performance at all positions of the predicted list by giving higher weights to the items with higher predicted scores. Specifically, NDCG is defined as:

[TABLE]

Where $Z_{n}$ is the normalization term derived from the ideal value iDCG. The mean value of NDCG@N across all users is reported in our experiments.
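A minimal sketch of NDCG@N for binary relevance is given below. It assumes the widely used logarithmic discount $1/\log_{2}(k+1)$ and an ideal ranking that places all relevant items first; the paper's exact gain and discount may differ. The function names and toy data are illustrative assumptions.

```python
import math


def dcg_at_n(ranked_items, relevant_items, n):
    """DCG@N with binary gains and a log2 position discount (assumed form)."""
    return sum(
        1.0 / math.log2(k + 1)
        for k, item in enumerate(ranked_items[:n], start=1)
        if item in relevant_items
    )


def ndcg_at_n(ranked_items, relevant_items, n):
    """NDCG@N: DCG@N divided by the DCG of an ideal ranking (iDCG)."""
    ideal_hits = min(n, len(relevant_items))
    idcg = sum(1.0 / math.log2(k + 1) for k in range(1, ideal_hits + 1))
    return dcg_at_n(ranked_items, relevant_items, n) / idcg if idcg else 0.0


ndcg_u1 = ndcg_at_n([3, 1, 2], {3, 2}, n=3)
```

A ranking that places every relevant item at the top yields a value of 1, and the value decreases as relevant items move further down the list.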

4.3 Comparisons with previous models

Since we focus on the top-n recommendation task, it is unsuitable to compare with methods designed for rating prediction, such as SVD++ [41] and TrustSVD [42]. In this section, we compare the proposed CoDAE model with the following state-of-the-art top-n recommendation algorithms:

• Pop. This is a popular baseline algorithm that makes predictions based on how many people have rated a particular item.

• BPR [36]. This is a ranking-based algorithm that learns the relations between positive and negative items for each user.

• GBPR [43]. This method relaxes the individual and independence assumptions of the BPR model by taking group influence into account. The group size is fixed to 5, as suggested in [43].

• SBPR [44]. This work improves the BPR model by considering social connections, under the assumption that users tend to assign higher ranks to items that their friends like.

• CDAE [27]. The authors utilize the Denoising Autoencoder (DAE) technique to improve top-n recommendation performance. Specifically, they inject user-specific vectors into the hidden layer of the DAE network to further improve recommendation accuracy.

• TDAE [33]. This is a deep learning based social recommendation method. The authors learn user representations from both original and inferred data of user ratings and social networks. They also inject social information into the hidden layer of the neural network to further improve recommendation.

As we focus on learning user preferences in the roles of rater, truster and trustee with Denoising Autoencoders, we mainly compare with CF-based and DAE-based methods. We leave out the DLMF method in [28], since performance differences may result from the community effect, which is considered in [28] but not in the proposed CoDAE model. Moreover, because the TDAE model was developed for the rating prediction task, we adapt it to the item recommendation task by using the logistic loss function and the sampling strategy with a rate of 5 that are used in CDAE.

For all comparison methods, we carefully tune the hyper-parameters for each dataset, either by trial and error in our experiments or according to the corresponding reference, so that each method achieves its best performance for a fair comparison. For each user, we randomly choose $80\%$ of the rating data for training and use the rest for testing. We run each experiment 5 times and report the mean performance. For the CDAE and CoDAE methods, we set the drop probability $q$ to 0.1 and the sample rate $t$ in Equation 5 to 5, according to the experimental results in [27].
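The per-user split described above can be sketched as follows. This is an illustrative example, assuming each user's positives are stored as a list of item identifiers; the function names, the rounding rule for the cut point, and the seed handling are assumptions rather than details from the paper.

```python
import random


def split_user_items(items, train_fraction, rng):
    """Shuffle one user's items and cut them into train and test parts."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def split_dataset(user_items, train_fraction=0.8, seed=0):
    """Apply the per-user split to every user with a reproducible seed."""
    rng = random.Random(seed)
    train, test = {}, {}
    for user in sorted(user_items):
        train[user], test[user] = split_user_items(
            user_items[user], train_fraction, rng
        )
    return train, test


# Hypothetical data: one user with 10 positives, one with 5.
data = {"u1": list(range(10)), "u2": list(range(100, 105))}
train, test = split_dataset(data, train_fraction=0.8, seed=0)
```

Repeating the experiment with different seeds (for example, 5 seeds) and averaging the metrics would mirror the repeated-run protocol described in the text.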

4.3.1 Validations on all users

The comparison results on all users are shown in Table 3. We use two ranking-based metrics to evaluate the recommendation accuracy: MAP@10 and NDCG@10. The table reports results with dimensions of 10, 20 and 100 on two public datasets: Ciao and Epinions.

In Table 3, we can see that the CoDAE model outperforms the other comparison methods by at least $0.69\%$ and $0.37\%$ on all datasets in terms of MAP@10 and NDCG@10, respectively. This shows that considering correlations among user features of multiple roles is effective for improving recommendation. Moreover, the improvements of CoDAE become more significant as the dimensionality $k$ grows from 10 to 100. This may indicate that it is easier to learn correlations among user features of multiple roles with a larger dimensionality. In addition, the improvements of CoDAE on the Ciao dataset are much more significant than those on the Epinions dataset. This may be because the Epinions data are much sparser than the Ciao data, as shown in Table 2, which makes it more difficult for CoDAE to learn correlations among user features.

4.3.2 Validations on sparse users

It is well known that CF-based recommender systems face a critical data sparsity problem, which may degrade recommendation performance. To validate the performance of the CoDAE model on sparse users, we show in Figure 4 the MAP@10 performance with $k=100$ for users with different numbers of ratings. Since the results on MAP@10 are consistent with those on other ranking-based metrics, we omit the other metrics.

As we can see in Figure 4, the performance of all methods improves as the number of ratings increases. This indicates that the amount of rating data has a great impact on recommendation performance and that sparsity is a critical problem for recommender systems. Moreover, the CoDAE model performs better than the other comparison methods for users with different numbers of ratings. This shows that the CoDAE model is effective in addressing the data sparsity problem by taking correlations of user features into consideration. In addition, compared with TDAE, the improvements of CoDAE for users with no more than 10 ratings are quite small on both the Ciao and Epinions datasets. Because it is difficult to learn robust features for users with too few ratings, it is not easy to learn correlations from these unreliable user features.

4.4 Impact of shared parameters

We conduct a series of experiments to study the influence of the shared parameters in Equation (6). Specifically, $\alpha$ controls the influence of the shared parameters M in CoDAE. A larger value of $\alpha$ indicates more influence of the shared parameters.

In Figure 5, we show the experimental results of the CoDAE model with $\alpha\in\{0,0.1,\dots,1\}$. As we can see, the CoDAE model achieves its best performance with $\alpha=0.6$ on both the Ciao and Epinions datasets. This indicates that it is necessary to use shared parameters to mine the common information in the input vectors of each user in the roles of rater, truster and trustee. Moreover, the performance curve of CoDAE in Figure 5 is not quite flat, which may be caused by the stochastic training strategy of the CoDAE model. Therefore, the best value of $\alpha$ should be determined experimentally for each dataset.

4.5 Impact of related regularization

The parameter $\beta$ controls the influence of the related regularization. A larger value of $\beta$ corresponds to stronger correlations between user features of multiple roles. We conduct a group of experiments to evaluate the influence of the related regularization term and report the results in Table 4. We perform these experiments with $k=100$.

In this table, we can see that the performance of CoDAE increases as $\beta$ grows, up to a point. When $\beta$ becomes even larger, the performance decreases due to overfitting. These results indicate that social influence has a great impact on recommendation accuracy and that the value of $\beta$ should be adjusted for different datasets.

4.6 Impact of dropout probability

To study the influence of $q$ in CoDAE, we conduct a series of experiments with different values of $q$. The parameter $q$ controls the dropout probability in Equation (1). A larger value of $q$ means that the entries of the input vectors are more likely to be dropped, that is, replaced with zeros. If $q=0$, dropout has no effect on the entries of the input vectors.
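The corruption step controlled by $q$ can be sketched as below. This is a minimal illustration that only zeroes entries independently with probability $q$, as described in the text; it does not rescale the surviving entries, which some denoising autoencoder variants do, and the function name `corrupt_input` is an assumption.

```python
import random


def corrupt_input(vector, q, rng):
    """Replace each entry with zero independently with probability q."""
    return [0.0 if rng.random() < q else x for x in vector]


rng = random.Random(0)
ratings = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
unchanged = corrupt_input(ratings, q=0.0, rng=rng)  # q = 0 keeps every entry
all_dropped = corrupt_input(ratings, q=1.0, rng=rng)  # q = 1 zeroes every entry
noisy = corrupt_input(ratings, q=0.1, rng=rng)  # setting used in the experiments
```

Because the inputs are already mostly zeros, each dropped positive entry removes a noticeable share of a user's observed information, which is why small values of $q$ are of most interest here.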

As shown in Figure 6, as $q$ increases, the performance of CoDAE rises to a particular point and then drops on both the Ciao and Epinions datasets. Specifically, the CoDAE model achieves its best performance with $q=0.1$. Moreover, the performance of CoDAE with $q=0$ is only slightly worse than that with $q=0.1$ on these two datasets. This is because the inputs of user ratings and social networks already face a critical sparsity problem, so dropout with a large value of $q$ discards too much useful information.

4.7 Impact of dimension $k$

In this subsection, we study the impact of the latent dimension $k$ on the Ciao and Epinions datasets and report the results in Table 5. As shown in Table 5, the performance of CoDAE increases as $k$ gets larger on both datasets. We can also see that the improvement becomes smaller as $k$ grows, and the performance can even get worse due to overfitting, as in the results on the Ciao and Epinions datasets with $k=200$.

5 Conclusions

In this paper, we propose a novel top-n recommendation algorithm, CoDAE, which models users in the roles of rater, truster and trustee. Specifically, we propose a tightly coupled structure to learn compact and high-level influence from rating and trust data for each user. This structure contains three DAE networks whose middle layers are tightly interconnected through a novel related regularization term. The model is very flexible and can easily be extended to other kinds of auxiliary information or other applications. We further conduct comprehensive experiments to compare the CoDAE model with other state-of-the-art algorithms, and we study the influence of the hyper-parameters of the CoDAE model.

In the future, we intend to develop the proposed model in at least three directions. First, to further overcome the sparsity of rating and trust data, it is necessary to introduce more information into recommender systems, such as images or videos. We intend to make use of recent techniques [11] to improve the accuracy of recommender systems. Second, unlike data in the computer vision domain, the rating data used in recommender systems are very sparse. This makes it difficult to make full use of GPU power with most existing deep learning frameworks, such as TensorFlow or PyTorch. This problem calls for more attention to exploiting multi-core CPU or many-core GPU power [45, 46] for recommender systems. Third, the proposed CoDAE is a very flexible model that can be extended to other kinds of applications, such as CAD/CAM [47, 48], social computing [49, 50] and intelligent computing [51, 52].

Acknowledgments

This research has been supported by the National Science Foundation of China (Grant No.61472289) and the National Key Research and Development Project (Grant No.2016YFC0106305).

Bibliography

[1] Leng J W, Jiang P Y. A deep learning approach for relationship extraction from interaction context in social manufacturing paradigm. Knowledge-Based Systems, 2016, 100: 188–199
[2] Wu Y Q, He F Z, Zhang D J, Li X X. Service-oriented feature-based data exchange for cloud-based design and manufacturing. IEEE Transactions on Services Computing, 2018, 11: 341–353
[3] Ma H, Yang H X, Lyu M R, King I. Sorec: social recommendation using probabilistic matrix factorization. In: Proceedings of the 17th ACM Conference on Information and Knowledge Management, New York, 2008. 931–940
[4] Jamali M, Ester M. A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the Fourth ACM Conference on Recommender Systems, New York, 2010. 135–142
[5] Ma H, Zhou T C, Lyu M R, King I. Improving recommender systems by incorporating social contextual information. ACM Transactions on Information Systems (TOIS), 2011, 29: 9
[6] Yang B, Lei Y, Liu D Y, Liu J M. Social collaborative filtering by trust. In: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, Beijing, 2013. 2747–2753
[7] Yao W L, He J, Huang G G, Zhang Y C. Modeling dual role preferences for trust-aware recommendation. In: Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval, New York, 2014. 975–978
[8] Jiang M, Cui P, Liu R, Yang Q, Wang F, Zhu W W, Yang S Q. Social contextual recommendation. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, Maui, 2012. 45–54
