Automated PII Extraction from Social Media for Raising Privacy Awareness: A Deep Transfer Learning Approach

Yizhi Liu; Fang Yu Lin; Mohammadreza Ebrahimi; Weifeng Li; Hsinchun Chen

arXiv:2111.09415·cs.SI·November 19, 2021

Open Access

TL;DR

This paper introduces DTL-PIIE, a deep transfer learning framework for automatically extracting Personally Identifiable Information (PII) from social media. It targets two obstacles: the scarcity of PII-labeled posts and the variability of PII forms, which pre-trained word embeddings do not represent consistently.

Contribution

The study proposes a novel deep transfer learning framework that uses graph convolutional networks (GCNs) for PII extraction, addressing the shortage of labeled training data and the limits of pre-trained word embeddings in social media privacy analysis.

Findings

01 Outperforms state-of-the-art DL-based IE models

02 Effectively transfers knowledge from public PII data

03 Utilizes GCNs to incorporate syntactic patterns
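
To make the third finding concrete, the following is a minimal sketch of how a single graph convolution can mix token features along syntactic (dependency) edges. It is not the DTL-PIIE implementation: the symmetric normalization is the common Kipf and Welling style formulation, and the sentence, the dependency edges, the feature vectors, and the weights are all hypothetical values chosen for illustration.

```python
import math


def normalized_adjacency(edges, n):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph with n nodes."""
    # Start from the identity so every token keeps its own features (self-loops).
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for head, dep in edges:
        a[head][dep] = 1.0
        a[dep][head] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def gcn_layer(a_hat, h, w):
    """One graph convolution: ReLU(a_hat @ h @ w)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Hypothetical post and hand-written dependency edges (not parser output).
tokens = ["call", "me", "at", "555-0100"]
edges = [(0, 1), (0, 2), (2, 3)]

# Hypothetical inputs: one-hot token features and a small fixed weight matrix.
h = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
w = [[0.5, -0.2],
     [0.1, 0.3],
     [-0.4, 0.6],
     [0.7, 0.2]]

a_hat = normalized_adjacency(edges, len(tokens))
out = gcn_layer(a_hat, h, w)
for token, vec in zip(tokens, out):
    print(token, [round(v, 3) for v in vec])
```

After one layer, each token's vector blends its own features with those of its syntactic neighbors, so a token such as the phone number is represented partly through the words it attaches to rather than only through its own, possibly unseen, surface form.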

Abstract

Internet users have been exposing an increasing amount of Personally Identifiable Information (PII) on social media. Such exposed PII can cause severe losses to the users, and informing users of their PII exposure is crucial to raise their privacy awareness and encourage them to take protective measures. To this end, advanced automatic techniques are needed. While Information Extraction (IE) techniques can be used to extract the PII automatically, Deep Learning (DL)-based IE models alleviate the need for feature engineering and further improve the efficiency. However, DL-based IE models often require large-scale labeled data for training, but PII-labeled social media posts are difficult to obtain due to privacy concerns. Also, these models rely heavily on pre-trained word embeddings, while PII in social media often varies in forms and thus has no fixed representations in pre-trained…

Taxonomy

Topics: Topic Modeling · Social Media and Politics · Mental Health via Writing