Data Leakage via Access Patterns of Sparse Features in Deep Learning-based Recommendation Systems

Hanieh Hashemi; Wenjie Xiong; Liu Ke; Kiwan Maeng; Murali Annavaram; G. Edward Suh; Hsien-Hsin S. Lee

arXiv:2212.06264·cs.CE·December 14, 2022·5 cites

Open Access

TL;DR

This paper investigates how access patterns to sparse feature embeddings in cloud-based recommendation systems can leak private user information, highlighting potential security risks despite existing privacy-preserving methods.

Contribution

It characterizes attack types on sparse feature access patterns in recommendation models and demonstrates how these can lead to user privacy breaches.

Findings

1. Access patterns can reveal user behavior and private data.

2. Attacks can track users over time through embedding table access.

3. Current privacy methods may not fully prevent access pattern leakage.
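
To make the second finding concrete, here is a minimal sketch of how overlapping embedding-table accesses could link two sessions to the same user. It is not the paper's attack: the Jaccard-overlap heuristic, the `link_session` helper, the pseudonyms, and the index values are all assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Overlap between two collections of accessed row indices (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def link_session(new_access, known_profiles):
    """Return the pseudonym whose earlier accesses overlap most with new_access.

    Hypothetical helper: a real attacker would need thresholds, noise
    handling, and far larger index spaces.
    """
    return max(known_profiles, key=lambda name: jaccard(new_access, known_profiles[name]))


# Row indices observed in earlier sessions (made-up values).
known_profiles = {
    "user_A": [2, 5, 7, 11],
    "user_B": [1, 3, 4, 9],
}

# A later, supposedly anonymous session touches mostly the same rows as user_A.
new_session = [5, 7, 11, 12]
best = link_session(new_session, known_profiles)
```

In this toy setup the new session shares three of five distinct indices with `user_A` and none with `user_B`, so the observer links it to `user_A` without ever seeing the embedding values themselves.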

Abstract

Online personalized recommendation services are generally hosted in the cloud, where users query a cloud-based model to receive recommended content such as merchandise of interest or a news feed. State-of-the-art recommendation models rely on sparse and dense features to represent users' profile information and the items they interact with. Although sparse features account for 99% of the total model size, little attention has been paid to the potential information leakage through them. These sparse features are used to track users' behavior (e.g., their click history and object interactions) and can therefore carry each user's private information. Sparse features are represented as learned embedding vectors stored in large tables, and personalized recommendation is performed by using a specific user's sparse features to index into the tables. Even with…
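
The table lookup described in the abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's system: the `LoggedEmbeddingTable` class, its sum pooling, the table size, and the item indices are all hypothetical, and the `access_log` stands in for whatever an observer of memory accesses might see.

```python
import random


class LoggedEmbeddingTable:
    """Toy embedding table that records which rows each query reads (hypothetical)."""

    def __init__(self, num_rows, dim, seed=0):
        rng = random.Random(seed)
        # Stand-in for learned embedding vectors.
        self.rows = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                     for _ in range(num_rows)]
        self.access_log = []  # row indices touched, one tuple per query

    def lookup(self, indices):
        """Sum-pool the embedding rows selected by a user's sparse feature."""
        self.access_log.append(tuple(indices))
        dim = len(self.rows[0])
        pooled = [0.0] * dim
        for i in indices:
            for d in range(dim):
                pooled[d] += self.rows[i][d]
        return pooled


table = LoggedEmbeddingTable(num_rows=8, dim=4)

# Sparse feature: ids of items the user clicked (made-up values).
clicked_items = [2, 5, 5, 7]
pooled = table.lookup(clicked_items)

# The observer never reads the vectors, only which rows were accessed.
observed = table.access_log[-1]
```

Here `observed` equals the user's click history exactly, which is the point: the indices used to read the table are themselves the private data, regardless of how the embedding values are protected.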

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Machine Learning in Healthcare · Advanced Graph Neural Networks