Reducing Privacy Risks in Online Self-Disclosures with Language Models
Yao Dou, Isadora Krsek, Tarek Naous, Anubha Kabra, Sauvik Das, Alan Ritter, Wei Xu

TL;DR
This paper develops a language model-based system to detect and abstract online self-disclosures, aiming to reduce privacy risks while maintaining utility, validated through model performance and user studies.
Contribution
It introduces a taxonomy of self-disclosure categories, a large annotated corpus, and a fine-tuned language model for detection and abstraction, advancing privacy-preserving social media tools.
Findings
Detection model achieves over 65% partial span F1
User study shows 82% positive perception of the model
Abstraction models moderately reduce privacy risks while preserving utility
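This page does not define the "partial span F1" metric, so the following is only a minimal sketch of one common way to compute such a score. It assumes a predicted span counts as a match when it has the same category as a gold span and overlaps more than half of the longer of the two spans. The 50% threshold, the span representation, and all function names are illustrative assumptions, not the authors' evaluation code.

```python
from typing import List, Tuple

# Hypothetical span representation: (start, end_exclusive, category).
Span = Tuple[int, int, str]


def overlap(a: Span, b: Span) -> int:
    """Number of positions shared by two spans (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def partial_match(pred: Span, gold: Span, min_frac: float = 0.5) -> bool:
    """Assumed rule: same category and overlap > min_frac of the longer span."""
    if pred[2] != gold[2]:
        return False
    longer = max(pred[1] - pred[0], gold[1] - gold[0])
    return longer > 0 and overlap(pred, gold) / longer > min_frac


def partial_span_f1(preds: List[Span], golds: List[Span],
                    min_frac: float = 0.5) -> float:
    """F1 where precision and recall both use the partial-match rule."""
    if not preds or not golds:
        return 0.0
    # Precision: fraction of predicted spans that match some gold span.
    hit_preds = sum(any(partial_match(p, g, min_frac) for g in golds) for p in preds)
    # Recall: fraction of gold spans matched by some predicted span.
    hit_golds = sum(any(partial_match(p, g, min_frac) for p in preds) for g in golds)
    precision = hit_preds / len(preds)
    recall = hit_golds / len(golds)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    golds = [(0, 10, "Age"), (20, 30, "Health")]
    preds = [(0, 8, "Age"), (22, 40, "Health"), (50, 55, "Location")]
    print(partial_span_f1(preds, golds))
```

In this toy example only the first prediction matches: precision is 1/3 and recall is 1/2. A stricter or looser overlap rule would change which spans count, so reported numbers are only comparable under the same matching definition.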
Abstract
Self-disclosure, while being common and rewarding in social media interaction, also poses privacy risks. In this paper, we take the initiative to protect the user-side privacy associated with online self-disclosure through detection and abstraction. We develop a taxonomy of 19 self-disclosure categories and curate a large corpus consisting of 4.8K annotated disclosure spans. We then fine-tune a language model for detection, achieving over 65% partial span F1. We further conduct an HCI user study, with 82% of participants viewing the model positively, highlighting its real-world applicability. Motivated by the user feedback, we introduce the task of self-disclosure abstraction, which rephrases disclosures into less specific terms while preserving their utility, e.g., "Im 16F" to "I'm a teenage girl". We explore various fine-tuning strategies, and our best model can generate…
Taxonomy
Topics: Privacy, Security, and Data Protection · Privacy-Preserving Technologies in Data · Access Control and Trust
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Attention Dropout · Dense Connections · Cosine Annealing · Adam
