Matching Media Contents with User Profiles by means of the Dempster-Shafer Theory

Luigi Troiano; Irene Díaz; Ciro Gaglione

arXiv:1704.03048·cs.AI·April 12, 2017

TL;DR

This paper proposes a model based on Dempster-Shafer Theory of Evidence to match media content with user profiles, aiming to improve personalized content delivery in the media industry.

Contribution

It introduces a novel reference model utilizing Dempster-Shafer Theory for matching media content with user profiles, with potential applications in personalized media services.

Findings

1. Demonstrated the model with a toy example

2. Outlined potential applications in media personalization

3. Highlighted properties of the proposed model

Abstract

The media industry is increasingly personalizing its content offering in an attempt to better target the audience. This requires analyzing the relationships established between users and the content they enjoy, looking on one side at the content characteristics and on the other at the user profile, in order to find the best match between the two. In this paper we suggest building that relationship using the Dempster-Shafer Theory of Evidence, proposing a reference model and illustrating its properties by means of a toy example. Finally, we suggest possible applications of the model to tasks that are common in the modern media industry.

Tables

Table I: Structure of the dataset assumed by the model

				$P_{1}$	$p_{1, 1}$	$p_{2, 1}$	$p_{3, 1}$	$\dots$	$p_{n, 1}$
				$P_{2}$	$p_{1, 2}$	$p_{2, 2}$	$p_{3, 2}$	$\dots$	$p_{n, 2}$
				$P_{3}$	$p_{1, 3}$	$p_{2, 3}$	$p_{3, 3}$	$\dots$	$p_{n, 3}$
				$⋮$	$⋮$	$⋮$	$⋮$	$⋱$	$⋮$
				$P_{q}$	$p_{1, q}$	$p_{2, q}$	$p_{3, q}$	$\dots$	$p_{n, q}$
$C_{1}$	$C_{2}$	$\dots$	$C_{p}$		$1$	$2$	$3$	$\dots$	$n$
$c_{1, 1}$	$c_{1, 2}$	$\dots$	$c_{1, p}$	1	$✓$	$✓$		$\dots$
$c_{2, 1}$	$c_{2, 2}$	$\dots$	$c_{2, p}$	2	$✓$		$✓$	$\dots$
$c_{3, 1}$	$c_{3, 2}$	$\dots$	$c_{3, p}$	3		$✓$	$✓$	$\dots$	$✓$
$⋮$	$⋮$	$⋱$	$⋮$	$⋮$	$⋮$	$⋮$	$⋮$	$⋱$	$⋮$
$c_{m, 1}$	$c_{m, 2}$	$\dots$	$c_{m, p}$	m	$✓$	$✓$		$\dots$

Table II: The dataset used as an example

				Age	30s	30s	20s	40s
				Gender	M	F	M	M
				Location	IT	IT	SP	IT
				Interests	Movies, Books	Sport	Books	Music, Sport
Director	Year	Stars	Genre		1	2	3	4
Boyle	1996	Ewan McGregor, Ewen Bremner	Drama	0	$✓$	$✓$	$✓$
Levinson	1996	Robert De Niro, Kevin Bacon, Brad Pitt	Crime, Drama, Thriller	1	$✓$			$✓$
Scorsese	2015	Robert De Niro, Leonardo DiCaprio, Brad Pitt	Short, Comedy	2		$✓$
Scorsese	1990	Robert De Niro, Ray Liotta, Joe Pesci	Biography, Crime, Drama	3			$✓$
Boyle	2000	Leonardo DiCaprio	Adventure, Drama, Romance	4			$✓$
Howard	1995	Tom Hanks, Kevin Bacon	Adventure, Drama, History	5		$✓$		$✓$
Zemeckis	1994	Tom Hanks	Comedy, Drama	6	$✓$
Zemeckis	1985	Michael J. Fox, Christopher Lloyd	Adventure, Sci-Fi	7				$✓$
Edwards	2016	Felicity Jones, Diego Luna	Adventure, Sci-Fi	8		$✓$	$✓$
Scott	2015	Matt Damon	Adventure, Drama, Sci-Fi	9		$✓$

Table III: Overall sets of item characteristics

Director	Year	Actors	Genre
Boyle	1996	Ewan McGregor	Crime
Levinson	2015	Ewen Bremner	Drama
Scorsese	1990	Ray Liotta	Thriller
Howard	2000	Robert De Niro	Short
Zemeckis	1995	Kevin Bacon	Comedy
Edwards	1994	Brad Pitt	Biography
Scott	1985	Leonardo DiCaprio	Adventure
	2016	Joe Pesci	Romance
		Tom Hanks	History
		Michael J. Fox	Sci-Fi
		Christopher Lloyd
		Felicity Jones
		Diego Luna
		Matt Damon

Table IV: Overall sets of user profiling features

Age	Gender	Location	Interests
20s	M	IT	Books
30s	F	SP	Movies
40s			Sport
			Music

Equations

$$m(\emptyset)=0 \quad\text{and}\quad \sum_{A\in 2^{\Omega}} m(A)=1$$

$$Bel(A)=\sum_{B\subseteq A} m(B)$$

$$Pl(A)=\sum_{B\cap A\neq\emptyset} m(B)$$

$$Pl(A)=1-Bel(\overline{A})$$

$$m_{1,2}(A)=\frac{1}{1-Z}\sum_{B\cap C=A} m_{1}(B)\cdot m_{2}(C)$$

$$Z=\sum_{B\cap C=\emptyset} m_{1}(B)\cdot m_{2}(C)$$

$$0\leq m_{BC}\leq m_{AD}\leq m_{BC}+m_{AD}\leq m_{C}\leq m_{BC}+m_{C}\leq m_{AD}+m_{C}\leq m_{BC}+m_{AD}+m_{C}=1$$

$$Cr(A)\overset{\text{def}}{=}\{B\in F(\Omega)\mid B\subseteq A\}$$

$$Su(A)\overset{\text{def}}{=}\{B\in F(\Omega)\mid B\cap A\neq\emptyset\}$$

$$E_{Cr}\overset{\text{def}}{=}\{A\subseteq\Omega\mid Cr(A)=Cr\}$$

$$E_{Su}\overset{\text{def}}{=}\{A\subseteq\Omega\mid Su(A)=Su\}$$

$$m(K)=\frac{|L(K)|}{|L|}$$

$$Bel(\text{Adventure},\text{Comedy},\text{Sci-Fi},\text{Drama})=m(\text{Adventure},\text{Drama},\text{Sci-Fi})+m(\text{Adventure},\text{Sci-Fi})+m(\text{Comedy},\text{Drama})+m(\text{Drama})=\frac{1}{15}+\frac{3}{15}+\frac{1}{15}+\frac{3}{15}=\frac{8}{15}$$

$$Pl(\text{Adventure},\text{Comedy},\text{Sci-Fi},\text{Drama})=$$

$$m(K)=\frac{|L(K_{1})\cap L(K_{2})|}{|L|}$$

$$Bel(\text{Zemeckis}\odot\text{Drama})=m(\text{Zemeckis}\odot\text{Drama})=\frac{1}{15}$$

$$Pl(\text{Zemeckis}\odot\text{Drama})=m(\text{Zemeckis}\odot\text{Drama})=\frac{1}{15}$$

$$m(K)=\frac{|L(K_{1})\cup L(K_{2})|}{|L|}$$

$$Bel(\text{Sport}\odot\text{20s})=\frac{7}{15}$$
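
As a quick check of the frequency-based mass $m(K)=|L(K)|/|L|$ and of the Belief value for the genre set worked out in the equations above, the following Python sketch recomputes both from the Genre column of Table II. It assumes that $L$ is the set of check marks (likes) in the choice matrix and that $L(K)$ collects the likes on items whose genre set is exactly $K$. The names `ITEMS`, `mass` and `bel` are illustrative and are not taken from the paper.

```python
from fractions import Fraction

# (genre set, number of likes) for items 0..9, transcribed from Table II.
ITEMS = [
    ({"Drama"}, 3),
    ({"Crime", "Drama", "Thriller"}, 2),
    ({"Short", "Comedy"}, 1),
    ({"Biography", "Crime", "Drama"}, 1),
    ({"Adventure", "Drama", "Romance"}, 1),
    ({"Adventure", "Drama", "History"}, 2),
    ({"Comedy", "Drama"}, 1),
    ({"Adventure", "Sci-Fi"}, 1),
    ({"Adventure", "Sci-Fi"}, 2),
    ({"Adventure", "Drama", "Sci-Fi"}, 1),
]

total_likes = sum(n for _, n in ITEMS)  # |L|

# Assumption: L(K) is the set of likes on items whose genre set is exactly K.
mass = {}
for genres, n in ITEMS:
    k = frozenset(genres)
    mass[k] = mass.get(k, Fraction(0)) + Fraction(n, total_likes)

def bel(a):
    """Belief of a: total mass of the focal elements contained in a."""
    return sum((v for k, v in mass.items() if k <= a), Fraction(0))

query = frozenset({"Adventure", "Comedy", "Sci-Fi", "Drama"})
print(total_likes)  # 15
print(bel(query))   # 8/15
```

Under this reading the four focal elements contained in the query set are the same ones that appear in the worked Belief equation, and their masses add up to the same value.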

Taxonomy

Topics: Multimedia Communication and Technology

Full text

Matching Media Contents with User Profiles

by means of the Dempster-Shafer Theory

Luigi Troiano

University of Sannio

Department of Engineering

Benevento, Italy

Email: [email protected]

Irene Díaz

Oviedo University

Computer Science Department

Gijón, Spain

Email: [email protected]

Ciro Gaglione

Sky Italia

Interactive TV Lab

Milan, Italy

Email: [email protected]

Disclaimer: This article was prepared or accomplished by Ciro Gaglione in his personal capacity. The opinions expressed in this paper are the authors' own and do not reflect the view of Sky Italia.

I Introduction

Digital technologies are radically changing the way business is done in the media industry, opening new possibilities for tailoring the catalog so that everybody has the chance to enjoy the contents that best fit their interests, often on demand, at the time most appropriate for each user. Such a change requires rethinking the way the content offering is built. Data collected from customers regarding their profiles and preferences become central, and so do models able to interpret and reason about those data.

These models aim to discover and exploit the relationship between users and the media contents they enjoy. Here the problem is not to ask users directly about their interests and preferences, but to infer them by looking at the contents they access and at the feedback they provide about them. The ultimate goal is to learn from data a model able to link users to the vast catalog of contents made available by a large media company.

Looking at past interactions helps users discover contents that they would appreciate as a valuable part of the product they paid for. This improves customer retention and fosters upgrades towards more profitable products. The benefits of implementing and using these models go beyond existing contents and customers. They also help to propose new contents to existing customers and, conversely, to support new customers in discovering existing contents. Soon, new contents and new customers become part of the model, enriching the dataset with new entities in a self-growing process. The predictive nature of these models also makes them suitable for supporting the acquisition of new contents and customers.

These models are the core logic of recommender systems (RS), which attracted wide attention once Netflix showed the potential of algorithms in developing and supporting its streaming platform [1]. Recommender systems have found wide application thanks to the spread of e-commerce. They are generally grouped into different types, including Content-based recommenders [2], Collaborative recommenders [3], Demographic recommenders [4], and Hybrid recommenders [5].

The purpose of a recommender system is to provide suggestions regarding the available alternatives by scoring and ranking them according to the user's preferences. To accomplish its task, a recommender system requires information regarding the user's profile and habits with respect to the different alternatives that can be proposed to them. This information can be acquired explicitly, by asking users to rate items, or implicitly, by monitoring users' behavior (e.g., hotels booked or songs listened to). RS can also use other kinds of information, such as demographic features (e.g., age, gender) or social information. Research on RS has focused on movies, music and books [6], with music recommendation being the most studied topic, although it has later been applied to other e-commerce domains [7].

Similarly to RS, we need data about user preferences regarding catalog items such as movies, series and shows. Such information can be gathered explicitly, by asking the user to rate the items, e.g., by using stars or likes, or implicitly, by monitoring customer behavior, e.g., which items were enjoyed fully and which only partially, how often the content description was accessed, etc. In addition, we need demographic information such as age, gender, family members, job, etc. The objective is to relate user profiles to content descriptors. Different techniques have been tried in order to discover and exploit this relationship. Most of them take the form of information fusion.

Following the idea explored in [8], and more concretely the model developed in [9], we aim to build a relationship model based on the Dempster-Shafer Theory of Evidence (D-S theory) [10, 11] and to use it to make inferences regarding the relationship between users and contents. The remainder of this paper is organized as follows: Section II provides some preliminaries on D-S theory; Section III describes the model; Section IV outlines some examples of application; Section V draws conclusions and outlines future directions.

II Preliminaries

The Dempster-Shafer theory, also known as the Theory of Evidence [10, 11], is used as the basis for the preference model presented in [9]. In D-S theory, basic probabilities are allocated to subsets, instead of single elements, according to the following definitions.

Definition 1.

A function $m:2^{\Omega}\longrightarrow[0,1]$ over a set $\Omega$ is called a basic probability assignment if

$$m(\emptyset)=0 \quad\text{and}\quad \sum_{A\in 2^{\Omega}} m(A)=1$$

Definition 2.

Let $\Omega$ be a set, then $A\subseteq\Omega$ is a focal element if $m(A)>0$ . In addition, $F(\Omega)\subset 2^{\Omega}$ represents the set of focal elements induced by $m$ .

Definition 3.

Let $m$ be a basic probability assignment function over a set $\Omega$ . The Belief of $A\subseteq\Omega$ induced by $m$ is defined as follows

$$Bel(A)=\sum_{B\subseteq A} m(B)$$

Definition 4.

Let $m$ be a basic probability assignment function over a set $\Omega$ . The Plausibility of $A\subseteq\Omega$ induced by $m$ is defined as follows

$$Pl(A)=\sum_{B\cap A\neq\emptyset} m(B)$$

The relationship between Plausibility and Belief is given by the following equation:

$$Pl(A)=1-Bel(\overline{A})$$

where $\overline{A}$ is the complement of $A$ to $\Omega$ .
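
The definitions above can be made concrete with a minimal Python sketch. It assumes a frame $\Omega=\{A,B,C,D\}$ and a hypothetical basic probability assignment on three focal elements. The mass values and the names `is_bpa`, `bel`, `pl` and `subsets` are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset("ABCD")

# Hypothetical masses on the focal elements C, BC and AD (values are assumptions).
m = {
    frozenset("C"): Fraction(1, 2),
    frozenset("BC"): Fraction(1, 5),
    frozenset("AD"): Fraction(3, 10),
}

def is_bpa(m):
    """Definition 1: no mass on the empty set and masses summing to 1."""
    return m.get(frozenset(), 0) == 0 and sum(m.values()) == 1

def bel(a, m):
    """Belief: total mass of the focal elements contained in a."""
    return sum((v for b, v in m.items() if b <= a), Fraction(0))

def pl(a, m):
    """Plausibility: total mass of the focal elements intersecting a."""
    return sum((v for b, v in m.items() if b & a), Fraction(0))

def subsets(omega):
    """Enumerate every subset of omega, from the empty set to omega itself."""
    elements = sorted(omega)
    for r in range(len(elements) + 1):
        for combo in combinations(elements, r):
            yield frozenset(combo)

assert is_bpa(m)
# Pl(A) = 1 - Bel(complement of A) holds for every subset of OMEGA.
assert all(pl(a, m) == 1 - bel(OMEGA - a, m) for a in subsets(OMEGA))
print(bel(frozenset("BC"), m), pl(frozenset("AB"), m))
```

Exact fractions are used instead of floats so that the identity between Plausibility and Belief can be checked with equality rather than with a tolerance.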

When basic probability assignments are given by different sources, it is possible to combine them. The first and most common combination method is known as Dempster's rule, which is defined as follows:

Definition 5.

Let $m_{1}$ and $m_{2}$ be two basic probability assignments, the joint basic probability assignment is computed as

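$$m_{1,2}(A)=\frac{1}{1-Z}\sum_{B\cap C=A} m_{1}(B)\cdot m_{2}(C)$$

where

$$Z=\sum_{B\cap C=\emptyset} m_{1}(B)\cdot m_{2}(C)$$

is a measure of the conflict between the two basic probability assignments. In addition, it is assumed that $m_{1,2}(\emptyset)=0$.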
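
Dempster's rule can be sketched in a few lines of Python. The function name `dempster_combine` and the two sources `m1` and `m2` are hypothetical and chosen only to illustrate the normalization by $1-Z$. The sketch also assumes that total conflict ($Z=1$) should be reported as an error, a case the definition above leaves undefined.

```python
from fractions import Fraction

def dempster_combine(m1, m2):
    """Combine two basic probability assignments; return (joint assignment, Z)."""
    joint = {}
    conflict = Fraction(0)  # Z: mass of the pairs with empty intersection
    for b, vb in m1.items():
        for c, vc in m2.items():
            a = b & c
            if a:
                joint[a] = joint.get(a, Fraction(0)) + vb * vc
            else:
                conflict += vb * vc
    if conflict == 1:
        # Assumption: the rule is undefined when the sources fully disagree.
        raise ValueError("total conflict: the sources cannot be combined")
    return {a: v / (1 - conflict) for a, v in joint.items()}, conflict

# Two hypothetical sources over Omega = {A, B, C, D}.
m1 = {frozenset("C"): Fraction(1, 2), frozenset("AD"): Fraction(1, 2)}
m2 = {frozenset("BC"): Fraction(2, 3), frozenset("ABCD"): Fraction(1, 3)}

m12, z = dempster_combine(m1, m2)
print(z)                                       # 1/3
print(m12[frozenset("C")], m12[frozenset("AD")])  # 3/4 1/4
```

Here the only conflicting pair is $AD$ from the first source and $BC$ from the second, so $Z=1/3$ and the remaining mass is rescaled to sum to 1.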

Belief and Plausibility are monotonic functions with respect to inclusion. This means that if we consider the lattice of the subsets of $\Omega$, as shown in Fig. 1, Belief and Plausibility increase from bottom ($Bel(\emptyset)=Pl(\emptyset)=0$) to top ($Bel(\Omega)=Pl(\Omega)=1$). In particular, Belief and Plausibility remain constant as long as we move to nodes that do not have a probability mass assigned to them. As a consequence of this property, we can identify regions of connected nodes, each assuming a specific value of Belief or Plausibility, as illustrated by Fig. 2.

In this example, the focal elements are $C$, $BC$ and $AD$, with the associated basic probability assignments $m_{C}$, $m_{BC}$ and $m_{AD}$ (assuming $m_{C}+m_{BC}+m_{AD}=1$). This leads to identifying 8 groups in the lattice, each with Belief and Plausibility depending on a subset of the focal elements in $F(\Omega)$. Fig. 2 outlines these regions for both Belief and Plausibility. We can observe that all portions of the lattice associated with a given value of Belief or Plausibility are connected.

If we sort the Belief (or Plausibility) values in ascending order, we get a sequence of levels, each grouping the nodes into those that are below the level and over the level. For instance, if we assume

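$$0\leq m_{BC}\leq m_{AD}\leq m_{BC}+m_{AD}\leq m_{C}\leq m_{BC}+m_{C}\leq m_{AD}+m_{C}\leq m_{BC}+m_{AD}+m_{C}=1$$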

we get the situation depicted by Fig. 3 with respect to Plausibility. The following definitions enable the concept of classes of equivalence among the subsets with respect to Belief or Plausibility and to identify those elements that are most representative of the class.

Definition 6 (Core).

Given a subset $A\subseteq\Omega$, the core of $A$ is the set of focal elements included in $A$, defined as

$$Cr(A)\overset{\text{def}}{=}\{B\in F(\Omega)\mid B\subseteq A\}$$

Definition 7 (Support).

Given a subset $A\subseteq\Omega$, the support of $A$ is the set of focal elements that lie at least partially in $A$, defined as

$$Su(A)\overset{\text{def}}{=}\{B\in F(\Omega)\mid B\cap A\neq\emptyset\}$$

For instance, according to the example in Fig. 1, where $F(\Omega)=\{C,BC,AD\}$, we have $Cr(BCD)=\{C,BC\}=Cr(BC)$ and $Su(ABD)=Su(BD)=Su(AB)=\{BC,AD\}$. It follows directly that $Cr(A)\subseteq Su(A)$ for all $A\subseteq\Omega$. The core and the support are the basis for computing, respectively, the Belief and the Plausibility of $A$. They also group the subsets of $\Omega$ into equivalence classes, as stated in Definition 8.
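
The core and support of this example can be reproduced with a short Python sketch. The function names `core` and `support` and the encoding of subsets as `frozenset` objects are assumptions made for illustration.

```python
# Focal elements of the running example: C, BC and AD.
FOCAL = [frozenset("C"), frozenset("BC"), frozenset("AD")]

def core(a, focal=FOCAL):
    """Cr(A): the focal elements entirely contained in a."""
    return frozenset(b for b in focal if b <= a)

def support(a, focal=FOCAL):
    """Su(A): the focal elements sharing at least one element with a."""
    return frozenset(b for b in focal if b & a)

def label(family):
    """Render a family of subsets as sorted strings such as 'BC'."""
    return sorted("".join(sorted(s)) for s in family)

print(label(core(frozenset("BCD"))))     # ['BC', 'C']
print(label(support(frozenset("ABD"))))  # ['AD', 'BC']
```

Since every focal element is non-empty, any focal element contained in $A$ also intersects $A$, which is why `core(a)` is always a subset of `support(a)`.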

Definition 8 ( $Cr-$ and $Su-$ Equivalence).

Two sets $A$ and $B$ are said to be $Cr$-equivalent if and only if $Cr(A)=Cr(B)=Cr$. A $Cr$-equivalence class is defined as the collection

$$E_{Cr}\overset{\text{def}}{=}\{A\subseteq\Omega\mid Cr(A)=Cr\}$$

In addition, $A$ and $B$ are $Su$-equivalent if and only if $Su(A)=Su(B)=Su$. The $Su$-equivalence class obtained from this relation is defined as

$$E_{Su}\overset{\text{def}}{=}\{A\subseteq\Omega\mid Su(A)=Su\}$$

Fig. 4(a) provides an example of $Cr$ -equivalence class assuming as core $Cr=\{BC,C\}$ . Fig. 4(b) shows the $Su$ -equivalence class for the support $Su=\{BC,AD\}$ .

As an immediate consequence, if $A$ and $B$ are $Cr$ -equivalent, then $Bel(A)=Bel(B)$ , while if they are $Su$ -equivalent, $Pl(A)=Pl(B)$ .

$Cr$- and $Su$-equivalence classes each partition $2^{\Omega}$. Thus, each subset $X\subseteq\Omega$ belongs to exactly one $Cr$-equivalence class and to exactly one $Su$-equivalence class. Grouping subsets into $Cr$- and $Su$-equivalence classes makes it possible (i) to explore the lattice by moving across classes, instead of exploring the whole item subset space, and (ii) to choose a representative of each class, so that the list of recommended items is shorter. For instance, we might be interested in using the smallest subset within a $Cr$-equivalence class.

As the representative of a $Cr$-equivalence class we can take the smallest subset, which we call $Cr$-minimal. For instance, for the class $\{BC,ABC,BCD\}$, the core is $\{C,BC\}$ and the $Cr$-minimal is $BC$. It is possible to prove that each $Cr$-equivalence class has exactly one $Cr$-minimal. Conversely, for $Su$-equivalence classes we take as representative the largest subset, which we call $Su$-maximal. As for $Cr$-equivalence classes, it is possible to prove that any $Su$-equivalence class has exactly one $Su$-maximal. For example, the class $\{AB,BD,ABD\}$, whose support is $\{BC,AD\}$, has $ABD$ as $Su$-maximal.
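
A brute-force Python sketch can enumerate the equivalence classes for the focal elements $C$, $BC$ and $AD$ and pick their representatives. It assumes that the $Cr$-minimal is obtained as the intersection of the class members and the $Su$-maximal as their union, which relies on the classes being closed under intersection and union respectively. All function names are illustrative.

```python
from collections import defaultdict
from itertools import combinations

OMEGA = "ABCD"
FOCAL = [frozenset("C"), frozenset("BC"), frozenset("AD")]

def subsets(omega):
    """Enumerate every subset of omega."""
    for r in range(len(omega) + 1):
        for combo in combinations(omega, r):
            yield frozenset(combo)

def core(a):
    return frozenset(b for b in FOCAL if b <= a)

def support(a):
    return frozenset(b for b in FOCAL if b & a)

def classes(key):
    """Group all subsets of OMEGA by the value of key (core or support)."""
    groups = defaultdict(list)
    for a in subsets(OMEGA):
        groups[key(a)].append(a)
    return groups

def cr_minimal(members):
    """Smallest member of a Cr-equivalence class (intersection of members)."""
    return frozenset.intersection(*members)

def su_maximal(members):
    """Largest member of a Su-equivalence class (union of members)."""
    return frozenset.union(*members)

cr_classes = classes(core)
su_classes = classes(support)

cr_key = frozenset({frozenset("C"), frozenset("BC")})
su_key = frozenset({frozenset("BC"), frozenset("AD")})
print("".join(sorted(cr_minimal(cr_classes[cr_key]))))  # BC
print("".join(sorted(su_maximal(su_classes[su_key]))))  # ABD
```

Enumerating all of $2^{\Omega}$ is only viable for toy frames like this one; moving across classes is meant to avoid exactly this exhaustive exploration.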

III Model

In the context of our interest, we take $I=\{I_{1},\ldots,I_{m}\}$ as the set of items belonging to the content catalog and $U=\{U_{1},U_{2},\dots,U_{n}\}$ as the set of users.

The two sets are projected onto two feature spaces, made of $p$ and $q$ dimensions respectively. The first consists of the characteristics describing the items in $I$, $C=\{C_{1},\dots,C_{p}\}$, while the second consists of the user profiling features, $P=\{P_{1},\ldots,P_{q}\}$. Both spaces are discrete, so that each $C_{i}$ and $P_{j}$ can assume a finite number of values.

The relationship between items and users is expressed by a choice matrix, such as the one shown in Tab. I. The choice matrix is placed side by side with the item characteristics matrix (on the left) and the profile matrix (on top).

In general, data points $c_{i,h}$ and $p_{j,k}$ are multi-valued, meaning that they are represented by sets of values. For instance, if $C_{h}$ represents the movie cast, $c_{i,h}$ is the list of actors featuring in the movie $I_{i}$. Similarly, if $P_{k}$ is "interests", $p_{j,k}$ lists what the user $U_{j}$ is interested in. In other cases they are single-valued, as for characteristics such as "director" and "year" or profiling features such as "age" and "location". An example of this matrix is given in Tab. II.
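
One possible way to hold this structure in memory is sketched below in Python, using an excerpt of Table II (items 6 and 7, users 1 and 4). The class names `Item` and `User`, the helper `values` and the sparse encoding of the choice matrix as a set of pairs are assumptions made for illustration and are not prescribed by the model.

```python
from dataclasses import dataclass

def values(*xs):
    """Every data point is a set; single-valued features are singletons."""
    return frozenset(xs)

@dataclass(frozen=True)
class Item:
    director: frozenset
    year: frozenset
    stars: frozenset
    genre: frozenset

@dataclass(frozen=True)
class User:
    age: frozenset
    gender: frozenset
    location: frozenset
    interests: frozenset

# Excerpt of Table II, keyed by item index and user index.
items = {
    6: Item(values("Zemeckis"), values(1994), values("Tom Hanks"),
            values("Comedy", "Drama")),
    7: Item(values("Zemeckis"), values(1985),
            values("Michael J. Fox", "Christopher Lloyd"),
            values("Adventure", "Sci-Fi")),
}
users = {
    1: User(values("30s"), values("M"), values("IT"), values("Movies", "Books")),
    4: User(values("40s"), values("M"), values("IT"), values("Music", "Sport")),
}

# Choice matrix stored sparsely: one (item, user) pair per check mark.
choices = {(6, 1), (7, 4)}

def liked_by(user_id):
    """Indices of the items marked by the given user."""
    return {i for i, u in choices if u == user_id}

print(liked_by(1))  # {6}
```

Representing single-valued and multi-valued data points uniformly as sets keeps later set operations, such as inclusion and intersection tests, free of special cases.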

Bibliography

[1] C. A. Gomez-Uribe and N. Hunt, “The netflix recommender system: Algorithms, business value, and innovation,” ACM Trans. Manage. Inf. Syst., vol. 6, no. 4, pp. 13:1–13:19, Dec. 2015. doi: 10.1145/2843948. [Online]. Available: http://doi.acm.org/10.1145/2843948
[2] J. Salter and N. Antonopoulos, “Cinemascreen recommender agent: Combining collaborative and content-based filtering,” IEEE Intelligent Systems, vol. 21, no. 1, pp. 35–41, Jan. 2006. doi: 10.1109/MIS.2006.4. [Online]. Available: http://dx.doi.org/10.1109/MIS.2006.4
[3] L. Candillier, F. Meyer, and M. Boullé, “Comparing state-of-the-art collaborative filtering systems,” pp. 548–562, 2007.
[4] M. J. Pazzani, “A framework for collaborative, content-based and demographic filtering,” Artif. Intell. Rev., vol. 13, no. 5-6, pp. 393–408, Dec. 1999. doi: 10.1023/A:1006544522159. [Online]. Available: http://dx.doi.org/10.1023/A:1006544522159
[5] R. Burke, “Knowledge-based recommender systems,” in Encyclopedia of Library and Information Science, vol. 69, A. Kent, Ed. Taylor and Francis, 2000, pp. 180–201.
[6] J. Bobadilla, F. Ortega, A. Hernando, and A. Gutiérrez, “Recommender systems survey,” Knowledge-Based Systems, vol. 46, pp. 109–132, 2013. doi: 10.1016/j.knosys.2013.03.012. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0950705113001044
[7] J. J. Castro-Schez, R. Miguel, D. Vallejo, and L. M. López-López, “A highly adaptive recommender system based on fuzzy logic for B2C e-commerce portals,” Expert Systems with Applications, vol. 38, no. 3, pp. 2441–2454, 2011. doi: 10.1016/j.eswa.2010.08.033
[8] K. Zhang and H. Li, “Fusion-based recommender system,” in Information Fusion (FUSION), 2010 13th Conference on, July 2010, pp. 1–7. doi: 10.1109/ICIF.2010.5712091
