TL;DR
This paper proposes an unsupervised user embedding method, trained with a user re-identification proxy task, for detecting community membership on social media from text features alone. It targets the setting where only a small number of positive labeled examples defines the community.
Contribution
It introduces user re-identification as an unsupervised proxy task for learning user embeddings, which makes community membership detection more effective when only a few positive labeled examples are available.
Findings
The learned user embeddings are more effective for community membership identification than common unsupervised representations.
The evaluation covers 16 different social media communities.
User re-identification as a proxy task enhances community membership identification.
Abstract
This paper addresses the problem of community membership detection using only text features in a scenario where a small number of positive labeled examples defines the community. The solution introduces an unsupervised proxy task for learning user embeddings: user re-identification. Experiments with 16 different communities show that the resulting embeddings are more effective for community membership identification than common unsupervised representations.
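To make the proxy task concrete, here is a minimal sketch of the setup. It is not the paper's model: the `embed` function is a toy stand-in (normalised word counts) for a learned user embedding, and the user names, posts, and helper functions are all hypothetical. It shows the two ideas the abstract describes: checking whether one half of a user's posts can be matched to the other half (re-identification), and ranking candidate users by similarity to a few positive examples of a community.

```python
import math
from collections import Counter


def embed(posts):
    """Toy stand-in for a learned user embedding: L2-normalised word counts."""
    counts = Counter(word for post in posts for word in post.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def cosine(u, v):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(x * v.get(word, 0.0) for word, x in u.items())


def reidentification_accuracy(users):
    """Split each user's posts in two and check whether the first half
    is most similar to the second half of the same user."""
    first = {name: embed(posts[::2]) for name, posts in users.items()}
    second = {name: embed(posts[1::2]) for name, posts in users.items()}
    hits = 0
    for name, query in first.items():
        best = max(second, key=lambda other: cosine(query, second[other]))
        hits += best == name
    return hits / len(users)


def rank_candidates(seed_users, candidates):
    """Rank candidate users by mean similarity to a few positive seed users."""
    seeds = [embed(posts) for posts in seed_users]

    def score(posts):
        vec = embed(posts)
        return sum(cosine(vec, s) for s in seeds) / len(seeds)

    return sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)


# Hypothetical toy data, for illustration only.
users = {
    "alice": ["sourdough starter hydration", "rye sourdough crumb",
              "bake sourdough loaf", "starter feeding schedule"],
    "bob": ["marathon training plan", "tempo run pace",
            "long run marathon", "recovery run pace"],
}
candidates = {
    "carol": ["my sourdough starter died"],
    "dave": ["new marathon pace record"],
}
accuracy = reidentification_accuracy(users)
ranking = rank_candidates([users["alice"]], candidates)
```

In the paper's setting the embedding would be trained so that re-identification succeeds, and the resulting vectors would then be reused for the community ranking step. The sketch only evaluates a fixed representation and performs no training.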
