Semi-Supervised Classification of Social Media Posts: Identifying Sex-Industry Posts to Enable Better Support for Those Experiencing Sex-Trafficking
Ellie Simonson

TL;DR
This paper explores semi-supervised machine learning methods to classify social media posts related to the sex industry, aiming to improve support for trafficking victims by leveraging limited labeled data and clustering techniques.
Contribution
It introduces a semi-supervised classification approach using FastText and Doc2Vec embeddings combined with clustering to identify sex industry posts on social media with high accuracy.
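The general idea of pairing a small labeled set with clustering can be sketched as follows. This is a minimal illustration only, not the thesis's actual pipeline: the `embed` function is a hypothetical bag-of-words stand-in for the FastText or Doc2Vec vectors named above, the clustering is a plain k-means whose centroids are seeded from the labeled posts (an assumption chosen to keep the example deterministic), and the posts are placeholder tokens rather than real data. All function and variable names are invented for this example.

```python
import math
from collections import Counter


def embed(post, vocab):
    """Hypothetical stand-in embedding: L2-normalised word counts over vocab.
    A real pipeline would use FastText or Doc2Vec vectors here instead."""
    counts = Counter(post.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def nearest(vec, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(vec, c))
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i]))


def label_by_clusters(labeled, unlabeled, iters=10):
    """Cluster all posts with k-means, one cluster per class, seeding each
    centroid from that class's labeled posts; every unlabeled post then
    takes the class of the cluster it ends up in."""
    posts = [p for p, _ in labeled] + list(unlabeled)
    vocab = sorted({w for p in posts for w in p.lower().split()})
    vectors = [embed(p, vocab) for p in posts]
    classes = sorted({y for _, y in labeled})

    # Seed one centroid per class from the labeled examples of that class.
    centroids = [
        mean([vectors[i] for i, (_, y) in enumerate(labeled) if y == c])
        for c in classes
    ]

    assign = [nearest(v, centroids) for v in vectors]
    for _ in range(iters):
        # Update step: move each centroid to the mean of its members.
        for idx in range(len(classes)):
            members = [v for v, a in zip(vectors, assign) if a == idx]
            if members:
                centroids[idx] = mean(members)
        # Assignment step: re-assign every post to its nearest centroid.
        assign = [nearest(v, centroids) for v in vectors]

    return [classes[a] for a in assign[len(labeled):]]


# Placeholder posts: 1 = in-scope class, 0 = out-of-scope class.
LABELED = [("alpha beta gamma", 1), ("delta epsilon zeta", 0)]
UNLABELED = ["alpha beta alpha", "gamma beta", "delta zeta zeta", "epsilon delta"]

predicted = label_by_clusters(LABELED, UNLABELED)
print(list(zip(UNLABELED, predicted)))
```

In this sketch only two posts carry labels, and the remaining four inherit a label from their cluster, which is the sense in which a small labeled set can be stretched over a larger unlabeled one. Swapping `embed` for learned embeddings would change the vector space but not the labeling logic.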
Findings
FastText CBOW achieved 98.6% accuracy on 12,000 posts
Semi-supervised learning can effectively label large social media datasets
Potential to develop tools for mapping sex industry activity on social media
Abstract
Social media is both helpful and harmful to the work against sex trafficking. On one hand, social workers carefully use social media to support people experiencing sex trafficking. On the other hand, traffickers use social media to groom and recruit people into trafficking situations. There is the opportunity to use social media data to better provide support for people experiencing trafficking. While AI and Machine Learning (ML) have been used in work against sex trafficking, they predominantly focus on detecting Child Sexual Abuse Material. Work using social media data has not been done with the intention to provide community level support to people of all ages experiencing trafficking. Within this context, this thesis explores the use of semi-supervised classification to identify social media posts that are a part of the sex industry. Several methods were explored for ML.…
Taxonomy
Topics: Sex work and related issues · Sexuality, Behavior, and Technology · Gender, Feminism, and Media
Methods: fastText
