Automated PII Extraction from Social Media for Raising Privacy Awareness: A Deep Transfer Learning Approach
Yizhi Liu, Fang Yu Lin, Mohammadreza Ebrahimi, Weifeng Li, Hsinchun Chen

TL;DR
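This paper introduces DTL-PIIE, a deep transfer learning framework for automatically extracting Personally Identifiable Information (PII) from social media. It addresses the scarcity of PII-labeled posts by transferring knowledge from public PII data, and it handles the varying forms of PII by using graph convolutional networks (GCNs) to incorporate syntactic patterns.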
Contribution
The study proposes a novel transfer learning framework that uses graph convolutional networks (GCNs) for PII extraction, overcoming the shortage of PII-labeled social media data and the reliance on pre-trained word embeddings in social media privacy analysis.
Findings
Outperforms state-of-the-art DL IE models
Effectively transfers knowledge from public PII data
Utilizes GCNs to incorporate syntactic patterns
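To make the GCN finding concrete, here is a minimal sketch of a single graph convolution over a sentence's dependency graph, in plain Python. It is not the authors' DTL-PIIE implementation. The example sentence, the hand-written dependency edges, the one-hot features, and the identity weight matrix are all hypothetical choices for illustration, and the layer uses the commonly used symmetric normalization ReLU(D^{-1/2}(A + I)D^{-1/2} H W), which is an assumption about the general technique rather than a detail taken from the paper.

```python
import math

def normalized_adjacency(num_tokens, edges):
    """Build D^{-1/2} (A + I) D^{-1/2} for an undirected token graph."""
    a = [[0.0] * num_tokens for _ in range(num_tokens)]
    for i in range(num_tokens):
        a[i][i] = 1.0  # self-loop so each token keeps its own features
    for head, dep in edges:
        a[head][dep] = 1.0
        a[dep][head] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_tokens)]
            for i in range(num_tokens)]

def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def gcn_layer(a_hat, features, weights):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in z]

# Hypothetical sentence and hand-written (head, dependent) edges;
# a real pipeline would obtain the edges from a dependency parser.
tokens = ["my", "email", "is", "alice@example.com"]
edges = [(1, 0), (3, 1), (3, 2)]

a_hat = normalized_adjacency(len(tokens), edges)
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# One-hot features and identity weights (both toy assumptions), so the
# output rows show exactly which tokens each token aggregates from.
h1 = gcn_layer(a_hat, identity, identity)
h2 = gcn_layer(a_hat, h1, identity)  # second layer reaches two-hop neighbors
```

In this toy graph, one layer mixes each token only with its direct syntactic neighbors, so "my" does not yet see the address token; after the second layer it does, via "email". This is the sense in which stacked graph convolutions let a tagger use syntactic context around a token whose surface form has no useful pre-trained embedding.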
Abstract
Internet users have been exposing an increasing amount of Personally Identifiable Information (PII) on social media. Such exposed PII can cause severe losses to the users, and informing users of their PII exposure is crucial to raise their privacy awareness and encourage them to take protective measures. To this end, advanced automatic techniques are needed. While Information Extraction (IE) techniques can be used to extract the PII automatically, Deep Learning (DL)-based IE models alleviate the need for feature engineering and further improve the efficiency. However, DL-based IE models often require large-scale labeled data for training, but PII-labeled social media posts are difficult to obtain due to privacy concerns. Also, these models rely heavily on pre-trained word embeddings, while PII in social media often varies in forms and thus has no fixed representations in pre-trained…
Taxonomy
Topics: Topic Modeling · Social Media and Politics · Mental Health via Writing
