Identifying Users From Their Rating Patterns
José Bento, Nadia Fawaz, Andrea Montanari, Stratis Ioannidis

TL;DR
This study demonstrates that temporal rating patterns are highly effective for identifying individual users within households in a large movie rating dataset, achieving around 96% accuracy.
Contribution
The paper introduces a model that relies on temporal information (the time labels of the ratings) rather than on user preferences to identify users in household-based rating data.
Findings
Temporal information is more useful than ratings for user identification.
Achieved approximately 96% accuracy in user identification within households.
The time-based model outperforms approaches that rely on rating preferences.
Abstract
This paper reports on our analysis of the 2011 CAMRa Challenge dataset (Track 2) for context-aware movie recommendation systems. The training dataset comprises 4,536,891 ratings provided by 171,670 users on 23,974 movies, as well as the household groupings of a subset of the users. The test dataset comprises 5,450 ratings for which the user label is missing, but the household label is provided. The challenge required identifying the user labels for the ratings in the test set. Our main finding is that temporal information (time labels of the ratings) is significantly more useful for achieving this objective than the user preferences (the actual ratings). Using a model that leverages this fact, we are able to identify users within a known household with an accuracy of approximately 96% (i.e., a misclassification rate of around 4%).
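The abstract does not spell out the model, so the following is only an illustrative sketch of how time labels alone could separate users within a known household. It assumes a simple smoothed histogram over (weekday, hour) buckets per user, which is a stand-in technique and not the authors' method. The names `time_bucket`, `fit`, `predict`, and the smoothing parameter `alpha` are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def time_bucket(ts):
    """Map a Unix timestamp to a (weekday, hour) bucket in UTC (assumed bucketing)."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return (dt.weekday(), dt.hour)


def fit(train):
    """Count time buckets per user within each household.

    train: iterable of (household, user, timestamp) tuples.
    Returns a mapping household -> user -> Counter of buckets.
    """
    model = defaultdict(lambda: defaultdict(Counter))
    for household, user, ts in train:
        model[household][user][time_bucket(ts)] += 1
    return model


def predict(model, household, ts, alpha=1.0):
    """Return the household member whose smoothed bucket frequency is highest."""
    users = sorted(model.get(household, {}))  # sorted for deterministic ties
    if not users:
        raise ValueError(f"unknown household: {household!r}")
    bucket = time_bucket(ts)
    n_buckets = 7 * 24  # number of (weekday, hour) buckets

    def score(user):
        counts = model[household][user]
        total = sum(counts.values())
        # Additive (Laplace) smoothing so unseen buckets get nonzero mass.
        return (counts[bucket] + alpha) / (total + alpha * n_buckets)

    return max(users, key=score)
```

Only the rating timestamps and the household label enter the prediction; the rating values are never used, which mirrors the abstract's observation that time labels carry more signal than preferences for this task.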
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Human Mobility and Location-Based Analysis
