TL;DR
This paper introduces a deep metric learning method that links social media accounts belonging to the same user by embedding samples of their activity into a vector space, achieving high linking accuracy without human-annotated training data.
Contribution
It presents a novel embedding approach that handles variable-sized user activity samples and does not require human-annotated training data.
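One way to see why no human annotation is needed: the account identifier attached to each post already says which samples share an author. The sketch below illustrates that idea only and is not the paper's training procedure. The function names `sample_episode` and `make_training_pairs`, the length bounds, and the 50/50 positive/negative split are all assumptions made for the example.

```python
import random

def sample_episode(posts, min_len=1, max_len=16, rng=random):
    """Draw a contiguous run of posts whose length varies per call.

    `posts` is one account's chronologically ordered activity.
    The length bounds are illustrative assumptions, not values from the paper.
    """
    if not posts:
        raise ValueError("account has no posts")
    length = rng.randint(min_len, min(max_len, len(posts)))
    start = rng.randint(0, len(posts) - length)
    return posts[start:start + length]

def make_training_pairs(streams, n_pairs, rng=random):
    """Build (sample_a, sample_b, same_author) triples from unlabeled streams.

    `streams` maps account_id -> list of posts. The account id is the only
    supervision used: two samples from one account form a positive pair,
    samples from two different accounts form a negative pair.
    """
    accounts = list(streams)
    if len(accounts) < 2:
        raise ValueError("need at least two accounts to form negative pairs")
    pairs = []
    for _ in range(n_pairs):
        a = rng.choice(accounts)
        if rng.random() < 0.5:  # assumed 50/50 mix of positives and negatives
            b = a
        else:
            b = rng.choice([acc for acc in accounts if acc != a])
        pairs.append((sample_episode(streams[a], rng=rng),
                      sample_episode(streams[b], rng=rng),
                      a == b))
    return pairs
```

Pairs built this way could feed any metric learning objective. Which loss and which sampling scheme the authors actually use is not stated in this summary.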
Findings
Outperforms several baseline methods in account linking accuracy
Effective with small samples from accounts not seen during training
Leverages large-scale social media content without annotated data
Abstract
We consider the task of linking social media accounts that belong to the same author in an automated fashion on the basis of the content and metadata of their corresponding document streams. We focus on learning an embedding that maps variable-sized samples of user activity -- ranging from single posts to entire months of activity -- to a vector space, where samples by the same author map to nearby points. The approach does not require human-annotated data for training purposes, which allows us to leverage large amounts of social media content. The proposed model outperforms several competitive baselines under a novel evaluation framework modeled after established recognition benchmarks in other domains. Our method achieves high linking accuracy, even with small samples from accounts not seen at training time, a prerequisite for practical applications of the proposed linking framework.
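The linking step described above can be sketched as a nearest-neighbor search in the embedding space. The example below is a minimal illustration under stated assumptions, not the authors' model. Mean pooling of per-post feature vectors stands in for the learned embedding, cosine similarity is an assumed choice of metric, and `embed_sample`, `cosine`, and `rank_candidates` are hypothetical names.

```python
import math

def embed_sample(post_vectors):
    """Map a variable-sized list of per-post feature vectors to one unit vector.

    Mean pooling is a stand-in for the learned network in the paper; it is
    used here only because it accepts any number of posts.
    """
    if not post_vectors:
        raise ValueError("sample must contain at least one post")
    dim = len(post_vectors[0])
    mean = [sum(v[i] for v in post_vectors) / len(post_vectors) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm > 0 else mean

def cosine(u, v):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(a * b for a, b in zip(u, v))

def rank_candidates(query, candidates):
    """Return candidate account ids ordered from most to least similar.

    `candidates` maps account_id -> embedding produced by `embed_sample`.
    """
    return sorted(candidates, key=lambda k: cosine(query, candidates[k]), reverse=True)
```

A query made from a single post and candidates made from several posts pass through the same `embed_sample` call, which is the property that lets samples of different sizes be compared directly.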
