# Comparative Analysis of Content-based Personalized Microblog Recommendations [Experiments and Analysis]

**Authors:** Efi Karra Taniskidou, George Papadakis, George Giannakopoulos, Manolis Koubarakis

arXiv: 1901.05497 · 2019-01-18

## TL;DR

This paper systematically evaluates various content-based microblog recommendation methods, analyzing how different text representations, data sources, and user activity types influence performance on a large Twitter dataset.

## Contribution

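The paper evaluates 9 representation models in combination with 13 representation sources and 3 user types, across 223 parameter configurations, on a real Twitter dataset of 60 users. It also introduces a taxonomy of representation models to help interpret the experimental results.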

## Key findings

- Representation models significantly impact recommendation quality.
- Source of microblog posts affects personalization effectiveness.
- User activity type influences model performance.

## Abstract

Microblogging platforms constitute a popular means of real-time communication and information sharing. They involve such a large volume of user-generated content that their users suffer from an information deluge. To address it, numerous recommendation methods have been proposed to organize the posts a user receives according to her interests. The content-based methods typically build a text-based model for every individual user to capture her tastes and then rank the posts in her timeline according to their similarity with that model. Even though content-based methods have attracted lots of interest in the data management community, there is no comprehensive evaluation of the main factors that affect their performance. These are: (i) the representation model that converts an unstructured text into a structured representation that elucidates its characteristics, (ii) the source of the microblog posts that compose the user models, and (iii) the type of user's posting activity. To cover this gap, we systematically examine the performance of 9 state-of-the-art representation models in combination with 13 representation sources and 3 user types over a large, real dataset from Twitter comprising 60 users. We also consider a wide range of 223 plausible configurations for the representation models in order to assess their robustness with respect to their internal parameters. To facilitate the interpretation of our experimental results, we introduce a novel taxonomy of representation models. Our analysis provides novel insights into the performance and functionality of the main factors determining the performance of content-based recommendation in microblogs.
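The content-based pipeline described in the abstract (build a text model per user, then rank timeline posts by similarity to it) can be illustrated with a minimal sketch. It assumes a plain bag-of-words term-frequency model with cosine similarity, which is only one possible representation model and is not claimed to be any specific configuration from the paper. The function names (`tokenize`, `build_user_model`, `cosine`, `rank_timeline`) and the sample posts are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


def build_user_model(posts):
    # Aggregate term frequencies over the posts chosen as the representation source.
    model = Counter()
    for post in posts:
        model.update(tokenize(post))
    return model


def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_timeline(user_model, timeline):
    # Score every incoming post against the user model, most similar first.
    scored = [(cosine(user_model, Counter(tokenize(p))), p) for p in timeline]
    return [p for _, p in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical usage: the user's own posts serve as the representation source.
user_posts = ["graph databases and query engines", "benchmarking a graph query planner"]
timeline = ["football match tonight", "new graph query engine released"]
ranked = rank_timeline(build_user_model(user_posts), timeline)
```

Swapping the representation source only changes which posts are passed to `build_user_model`, and swapping the representation model changes how posts are vectorized and compared. These correspond to two of the factors the study varies.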

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.05497/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1901.05497/full.md

## References

64 references — full list in the complete paper: https://tomesphere.com/paper/1901.05497/full.md

---
Source: https://tomesphere.com/paper/1901.05497