Affect, Body, Cognition, Demographics, and Emotion: The ABCDE of Text Features for Computational Affective Science

Jan Philip Wahle; Krishnapriya Vishnubhotla; Bela Gipp; Saif M. Mohammad

arXiv:2512.17752·cs.CL·April 3, 2026

TL;DR

The paper introduces the ABCDE dataset, a large-scale, annotated collection of over 400 million text utterances from diverse sources, designed to support interdisciplinary research in affective and social sciences.

Contribution

It provides a large, pre-annotated text dataset that lowers the barrier to discovering, accessing, and using feature-labeling resources, particularly for practitioners outside computer science, and supports research across multiple disciplines.

Findings

01

Contains over 400 million text utterances drawn from social media, blogs, books, and AI-generated sources

02

Annotated with a wide range of affective and social features

03

Facilitates interdisciplinary research across the affective and social sciences

Abstract

Work in Computational Affective Science and Computational Social Science explores a wide variety of research questions about people, emotions, behavior, and health. Such work often relies on language data that is first labeled with relevant information, such as the use of emotion words or the age of the speaker. Although many resources and algorithms exist to enable this type of labeling, discovering, accessing, and using them remains a substantial impediment, particularly for practitioners outside of computer science. Here, we present the ABCDE dataset (Affect, Body, Cognition, Demographics, and Emotion), a large-scale collection of over 400 million text utterances drawn from social media, blogs, books, and AI-generated sources. The dataset is annotated with a wide range of features relevant to computational affective and social science. ABCDE facilitates interdisciplinary research…
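To make the kind of labeling mentioned in the abstract concrete ("the use of emotion words"), here is a minimal Python sketch of lexicon-based emotion word counting. It is not the ABCDE annotation pipeline: the lexicon `TOY_LEXICON`, the functions `emotion_word_counts` and `label_utterance`, and the output field names are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (word -> emotion label), for illustration only.
# It is NOT one of the resources used to annotate ABCDE.
TOY_LEXICON = {
    "happy": "joy",
    "delighted": "joy",
    "afraid": "fear",
    "scared": "fear",
    "furious": "anger",
}


def tokenize(utterance):
    """Lowercase the text and split it into simple alphabetic tokens."""
    return re.findall(r"[a-z']+", utterance.lower())


def emotion_word_counts(utterance, lexicon=TOY_LEXICON):
    """Count how many tokens in the utterance map to each emotion label."""
    return Counter(lexicon[tok] for tok in tokenize(utterance) if tok in lexicon)


def label_utterance(utterance, lexicon=TOY_LEXICON):
    """Attach token and emotion word counts to a single utterance."""
    return {
        "text": utterance,
        "n_tokens": len(tokenize(utterance)),
        "emotion_counts": dict(emotion_word_counts(utterance, lexicon)),
    }


if __name__ == "__main__":
    print(label_utterance("I was scared, then happy and delighted!"))
```

Real emotion lexicons are far larger and often assign graded scores rather than a single label per word, so this sketch only shows the general shape of a word-level feature.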

Code & Models

Datasets

jpwahle/abcde
dataset· 864 dl
