TL;DR
This paper introduces Personal-ITY, a new Italian corpus of YouTube comments labeled with MBTI types for personality prediction. Baseline experiments on the corpus show that some personality types are harder to predict than others.
Contribution
The paper presents a novel Italian personality corpus built via Distant Supervision. It covers more authors and a different genre than previously available Italian resources, and it comes with baseline experimental results for future work.
Findings
Some MBTI types are easier to predict than others
Cross-dataset prediction shows potential benefits
The corpus lends itself to a variety of experiments
Abstract
We present a novel corpus for personality prediction in Italian, containing a larger number of authors and a different genre compared to previously available resources. The corpus is built exploiting Distant Supervision, assigning Myers-Briggs Type Indicator (MBTI) labels to YouTube comments, and can lend itself to a variety of experiments. We report on preliminary experiments on Personal-ITY, which can serve as a baseline for future work, showing that some types are easier to predict than others, and discussing the perks of cross-dataset prediction.
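To make the Distant Supervision idea concrete, here is a minimal sketch of one way such labeling could work. The abstract only states that MBTI labels are assigned to YouTube comments via Distant Supervision; the self-declaration heuristic, the regular expression, and the function names `label_authors` and `label_comments` below are illustrative assumptions, not the authors' actual pipeline.

```python
import re
from collections import defaultdict

# Hypothetical pattern: a first-person self-declaration ("sono INTJ",
# "sono una ENFP", "I'm an INFP") followed by a four-letter MBTI code.
SELF_DECLARATION = re.compile(
    r"(?:\bsono\b|\bi am\b|\bi'm\b)\s+(?:un[ao]?\s+|an?\s+)?([EI][NS][FT][JP])\b",
    re.IGNORECASE,
)


def label_authors(comments):
    """Map each author to an MBTI type, using self-declarations as weak labels.

    `comments` is an iterable of (author, text) pairs. Authors who never
    declare a type, or who declare conflicting types, are left unlabeled.
    """
    declared = defaultdict(set)
    for author, text in comments:
        for mbti in SELF_DECLARATION.findall(text):
            declared[author].add(mbti.upper())
    return {a: next(iter(t)) for a, t in declared.items() if len(t) == 1}


def label_comments(comments):
    """Propagate each labeled author's type to all of that author's comments."""
    comments = list(comments)
    labels = label_authors(comments)
    return [(a, text, labels[a]) for a, text in comments if a in labels]


if __name__ == "__main__":
    sample = [
        ("anna", "Io sono INTJ, e voi?"),
        ("anna", "Bel video"),
        ("luca", "Sono una ENFP!"),
        ("luca", "Forse sono INFP"),
        ("marta", "Gli INTJ sono strani"),
    ]
    for row in label_comments(sample):
        print(row)
```

In this sketch the label is attached at the author level and then propagated to every comment by that author, so comments that never mention MBTI still contribute text. Authors with conflicting declarations are dropped rather than guessed, which trades corpus size for label quality.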
