Inference of Personal Attributes from Tweets Using Machine Learning
Take Yo, Kazutoshi Sasahara

TL;DR
This paper explores predicting personal attributes (gender, occupation, and age group) from tweet text using machine learning, and reports 60-70% accuracy for models trained on word2vec-based tweet vectors.
Contribution
It introduces a method combining word2vec and deep learning to predict personal attributes from tweet text, analyzing the impact of vector dimensions and block sizes.
Findings
Prediction accuracy reached 60-70% for the three attributes studied (gender, occupation, and age group).
Word2vec effectively represented tweet content.
Model performance depended on vector dimension and block size.
Abstract
Using machine learning algorithms, including deep learning, we studied the prediction of personal attributes from the text of tweets, such as gender, occupation, and age groups. We applied word2vec to construct word vectors, which were then used to vectorize tweet blocks. The resulting tweet vectors were used as inputs for training models, and the prediction accuracy of those models was examined as a function of the dimension of the tweet vectors and the size of the tweet blocks. The results showed that the machine learning algorithms could predict the three personal attributes of interest with 60-70% accuracy.
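The vectorization step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how word vectors are combined into a tweet-block vector, so the averaging used here is an assumption, not the authors' confirmed method. The names TOY_VECTORS, split_into_blocks, and vectorize_block are hypothetical, and the tiny hand-written vectors stand in for vectors that would normally come from a trained word2vec model.

```python
from typing import Dict, List

# Hypothetical toy word vectors (dimension 3). In a real pipeline these
# would be learned by word2vec; they are hard-coded here for illustration.
TOY_VECTORS: Dict[str, List[float]] = {
    "coffee": [1.0, 0.0, 0.0],
    "meeting": [0.0, 1.0, 0.0],
    "exam": [0.0, 0.0, 1.0],
}


def split_into_blocks(tweets: List[str], block_size: int) -> List[List[str]]:
    """Group one user's tweets into consecutive blocks of block_size tweets."""
    return [tweets[i:i + block_size] for i in range(0, len(tweets), block_size)]


def vectorize_block(block: List[str], vectors: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the vectors of all known words in a block (assumed scheme)."""
    total = [0.0] * dim
    count = 0
    for tweet in block:
        for word in tweet.lower().split():
            vec = vectors.get(word)
            if vec is None:
                continue  # skip words without a vector
            for k in range(dim):
                total[k] += vec[k]
            count += 1
    if count == 0:
        return total  # no known words: return the zero vector
    return [x / count for x in total]


tweets = ["coffee meeting", "exam", "coffee"]
blocks = split_into_blocks(tweets, block_size=2)
block_vectors = [vectorize_block(b, TOY_VECTORS, dim=3) for b in blocks]
```

Each entry of block_vectors would then serve as one input row for a classifier. The two parameters the paper varies correspond here to dim (the vector dimension) and block_size (the number of tweets per block).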
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Mental Health via Writing · Authorship Attribution and Profiling
