Prosocial Behavior Detection in Player Game Chat: From Aligning Human-AI Definitions to Efficient Annotation at Scale

Rafal Kocielnik; Min Kim; Penphob (Andrea) Boonyarungsrit; Fereshteh Soltani; Deshawn Sambrano; Animashree Anandkumar; R. Michael Alvarez

arXiv:2508.05938·cs.CL·August 11, 2025

Open Access

TL;DR

This paper introduces a scalable pipeline for detecting prosocial behavior in player game chat. It combines human-AI collaboration, iteratively refined task definitions, and two-stage inference to reach roughly 90% precision while cutting inference costs by around 70%.

Contribution

It presents a novel three-stage pipeline that leverages LLMs and human refinement to efficiently annotate and classify prosocial content at scale.

Findings

01. Achieved approximately 90% precision in prosocial content detection.

02. Reduced inference costs by around 70% using a two-stage classification system.

03. Developed a human-AI refinement process to improve task clarity and label quality.
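
The two-stage classification in finding 02 is not detailed on this page, so the following is only a minimal sketch of one common way such a cascade can lower cost: a cheap scorer handles confident cases and only uncertain messages are escalated to an expensive model. Every name here (`cascade_classify`, `toy_cheap_score`, `toy_expensive_label`), the thresholds, and the per-call costs are hypothetical illustrations, not the authors' implementation or figures.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CascadeResult:
    labels: List[bool]   # True = prosocial
    escalated: int       # messages sent to the expensive stage
    cost: float          # total cost in arbitrary units


def cascade_classify(
    messages: List[str],
    cheap_score: Callable[[str], float],
    expensive_label: Callable[[str], bool],
    low: float = 0.2,
    high: float = 0.8,
    cheap_cost: float = 1.0,       # assumed cost per cheap call
    expensive_cost: float = 10.0,  # assumed cost per expensive call
) -> CascadeResult:
    """Accept confident cheap predictions; escalate the uncertain band."""
    labels: List[bool] = []
    escalated = 0
    cost = 0.0
    for msg in messages:
        cost += cheap_cost
        p = cheap_score(msg)
        if p >= high:
            labels.append(True)
        elif p <= low:
            labels.append(False)
        else:
            # Score falls between the thresholds: pay for the expensive stage.
            escalated += 1
            cost += expensive_cost
            labels.append(expensive_label(msg))
    return CascadeResult(labels, escalated, cost)


# Toy stand-ins for real models (keyword rules, purely illustrative).
def toy_cheap_score(msg: str) -> float:
    text = msg.lower()
    if "thanks" in text or "nice" in text:
        return 0.9
    if "?" in text:
        return 0.5
    return 0.1


def toy_expensive_label(msg: str) -> bool:
    return "help" in msg.lower()


messages = ["nice shot, thanks!", "push mid now", "need help with that quest?", "gg"]
result = cascade_classify(messages, toy_cheap_score, toy_expensive_label)
baseline = len(messages) * 10.0  # cost if every message used the expensive stage
print(result.labels, result.escalated, result.cost, baseline)
```

In this toy run only one of four messages reaches the expensive stage. The savings in any real deployment depend on how often the cheap stage is confident and on the actual cost ratio between the two stages.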

Abstract

Detecting prosociality in text (communication intended to affirm, support, or improve others' behavior) is a novel and increasingly important challenge for trust and safety systems. Unlike toxic content detection, prosociality lacks well-established definitions and labeled data, requiring new approaches to both annotation and deployment. We present a practical, three-stage pipeline that enables scalable, high-precision prosocial content classification while minimizing human labeling effort and inference costs. First, we identify the best LLM-based labeling strategy using a small seed set of human-labeled examples. We then introduce a human-AI refinement loop, where annotators review high-disagreement cases between GPT-4 and humans to iteratively clarify and expand the task definition, a critical step for emerging annotation tasks like prosociality. This process results in improved label…
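
The refinement loop in the abstract depends on surfacing cases where the LLM and human annotators disagree. The abstract does not say how disagreement is measured, so the sketch below assumes a simple score: the fraction of human votes that differ from the LLM label. The function names, the `min_score` cutoff, and the sample messages are all hypothetical.

```python
from typing import List, Tuple

# (message text, human votes, LLM label); True = prosocial
Item = Tuple[str, List[bool], bool]


def disagreement(human_votes: List[bool], llm_label: bool) -> float:
    """Fraction of human annotators whose vote differs from the LLM label."""
    if not human_votes:
        raise ValueError("need at least one human vote")
    return sum(v != llm_label for v in human_votes) / len(human_votes)


def select_for_review(items: List[Item], top_k: int = 2, min_score: float = 0.5) -> List[str]:
    """Return up to top_k messages with the highest human-LLM disagreement."""
    scored = [(disagreement(votes, llm), text) for text, votes, llm in items]
    scored = [pair for pair in scored if pair[0] >= min_score]
    # sorted() is stable, so ties keep their original order.
    scored = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


items: List[Item] = [
    ("gl hf everyone", [True, True, True], True),
    ("you'll get it next time", [True, True, False], False),
    ("stop feeding", [False, False, False], True),
    ("ok", [False, True], False),
]
print(select_for_review(items))
```

Messages returned by `select_for_review` would be the ones handed to annotators, who then decide whether the task definition needs to be clarified or expanded to cover them.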

Taxonomy

Topics: Sentiment Analysis and Opinion Mining