Gender Prediction from Tweets: Improving Neural Representations with Hand-Crafted Features
Erhan Sezerer, Ozan Polatbilek, Selma Tekir

TL;DR
This paper introduces an RNN with Attention (RNNwA) model for gender prediction from tweets. Enhancing it with hand-crafted, LSA-reduced n-gram features yields state-of-the-art results on English and competitive results on Spanish and Arabic.
Contribution
It combines word-level and tweet-level neural attention with traditional, LSA-reduced n-gram features to improve gender prediction accuracy on Twitter data.
Findings
State-of-the-art performance on English gender prediction.
Competitive results on Spanish and Arabic datasets.
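These results come from the enhanced model (RNNwA + n-gram), which concatenates LSA-reduced n-gram features with the learned neural representation of a user.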
Abstract
Author profiling is the characterization of an author through some key attributes such as gender, age, and language. In this paper, an RNN model with Attention (RNNwA) is proposed to predict the gender of a Twitter user from their tweets. Both word-level and tweet-level attention are used to learn 'where to look'. This model (https://github.com/Darg-Iztech/gender-prediction-from-tweets) is improved by concatenating LSA-reduced n-gram features with the learned neural representation of a user. Both models are tested on three languages: English, Spanish, and Arabic. The improved version of the proposed model (RNNwA + n-gram) achieves state-of-the-art performance on English and has competitive results on Spanish and Arabic.
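The two-level attention and the feature concatenation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a simplified dot-product attention against a context vector, uses plain lists of floats as stand-ins for RNN hidden states, and takes the LSA-reduced n-gram vector as already computed. The names `attention_pool`, `user_representation`, `word_context`, and `tweet_context` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors: Sequence[Vector], context: Vector) -> Vector:
    """Weighted sum of vectors; weights are softmax(dot(vector, context)).

    Assumption: dot-product scoring stands in for whatever scoring
    function the actual model uses.
    """
    weights = softmax([dot(v, context) for v in vectors])
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def user_representation(
    tweets: Sequence[Sequence[Vector]],
    word_context: Vector,
    tweet_context: Vector,
    ngram_features: Vector,
) -> Vector:
    """Build a user vector from per-word states plus n-gram features.

    tweets[i][j] is a stand-in for the RNN hidden state of word j in tweet i.
    ngram_features is assumed to be an already LSA-reduced n-gram vector.
    """
    # Word-level attention: one vector per tweet.
    tweet_vectors = [attention_pool(words, word_context) for words in tweets]
    # Tweet-level attention: one vector per user.
    user_vector = attention_pool(tweet_vectors, tweet_context)
    # Concatenate the neural representation with the hand-crafted features.
    return user_vector + list(ngram_features)
```

In this sketch, a classifier layer would consume the concatenated vector, so the hand-crafted features and the learned representation contribute side by side. The context vectors would be learned parameters in a trained model; here they are plain arguments.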
Taxonomy
Topics: Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection · Spam and Phishing Detection
