TxPI-u: A Resource for Personality Identification of Undergraduates
Gabriela Ramírez-de-la-Rosa, Esaú Villatoro-Tello, Héctor Jiménez-Salazar

TL;DR
This paper introduces TxPI-u, a new Spanish-language corpus covering 416 Mexican undergraduates, with demographic data, designed to support automatic personality trait identification with NLP models.
Contribution
The paper presents TxPI-u, a novel Spanish corpus for personality identification, along with baseline models for future research in this area.
Findings
Provides a new Spanish-language dataset for personality analysis
Includes demographic information to enhance model development
Establishes baseline results for future comparative studies
Abstract
Resources such as labeled corpora are necessary to train automatic models within the natural language processing (NLP) field. Historically, the many resources covering a broad range of problems have been available mostly in English. One such problem is Personality Identification, where, based on a psychological model (e.g., the Big Five Model), the goal is to find the traits of a subject's personality given, for instance, a text written by that subject. In this paper we introduce a new corpus in Spanish called Texts for Personality Identification (TxPI). This corpus will help to develop models that automatically assign a personality trait to the author of a text document. Our corpus, TxPI-u, contains information on 416 Mexican undergraduate students, along with demographic information such as age, gender, and the academic program in which they are enrolled. Finally, as an…
