Through a Gender Lens: Learning Usage Patterns of Emojis from Large-Scale Android Users
Zhenpeng Chen, Xuan Lu, Wei Ai, Huoran Li, Qiaozhu Mei, and Xuanzhe Liu

TL;DR
This study analyzes gender-specific emoji usage patterns in a large global dataset, showing that emoji usage can accurately predict user gender and that emoji-based models offer privacy advantages over text-based models.
Contribution
It introduces a large-scale analysis of gender differences in emoji usage and shows that emoji-based models can effectively infer gender while preserving user privacy.
Findings
Gender differences in emoji usage are statistically significant.
Emoji-based models can accurately predict user gender.
Emoji-based models offer privacy advantages over text-based models.
Abstract
Based on a large data set of emoji usage behavior collected from smartphone users around the world, this paper investigates gender-specific usage of emojis. We present various interesting findings that evidence a considerable difference in emoji usage by female and male users. Such a difference is significant not just in a statistical sense; it is sufficient for a machine learning algorithm to accurately infer the gender of a user purely based on the emojis used in their messages. In real-world scenarios where gender inference is a necessity, models based on emojis have unique advantages over existing models that are based on textual or contextual information. Emojis not only provide language-independent indicators, but also alleviate the risk of leaking private user information through the analysis of text and metadata.
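To make the idea of inferring a label purely from the emojis in a user's messages concrete, here is a minimal sketch. It is not the paper's method: this summary does not name the algorithm or features the authors used, so a multinomial naive Bayes classifier with add-one smoothing is swapped in purely for illustration. The function names (train, predict), the labels group_a and group_b, and the toy samples are all hypothetical and do not reflect any finding reported in the paper.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """Fit a multinomial naive Bayes model on (emoji_list, label) pairs."""
    label_counts = Counter()             # number of users seen per label
    emoji_counts = defaultdict(Counter)  # emoji frequencies per label
    vocab = set()                        # every emoji seen in training
    for emojis, label in samples:
        label_counts[label] += 1
        emoji_counts[label].update(emojis)
        vocab.update(emojis)
    return {"labels": label_counts, "emojis": emoji_counts, "vocab": vocab}

def predict(model, emojis):
    """Return the label with the highest log posterior for an emoji list."""
    total_users = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_users in model["labels"].items():
        counts = model["emojis"][label]
        denom = sum(counts.values()) + vocab_size  # add-one smoothing
        score = math.log(n_users / total_users)    # log prior
        for e in emojis:
            if e in model["vocab"]:                # skip unseen emojis
                score += math.log((counts[e] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy data, used only to exercise the code (an assumption,
# not data or results from the paper).
toy_samples = [
    (["😀", "😀", "🎉"], "group_a"),
    (["😀", "🎉", "🎉"], "group_a"),
    (["🚀", "🚀", "🔧"], "group_b"),
    (["🔧", "🚀", "🔧"], "group_b"),
]
model = train(toy_samples)
```

Because the only features are emoji counts, the sketch never inspects message text or metadata, which is the property the abstract highlights as a privacy and language-independence advantage.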
Taxonomy
Topics: Digital Communication and Language · Hate Speech and Cyberbullying Detection · Authorship Attribution and Profiling
