Talking to oneself in CMC: a study of self replies in Wikipedia talk pages
Ludovic Tanguy (CLLE), Céline Poudat, Lydia-Mai Ho-Dac (CLLE)

TL;DR
This paper investigates self-replies in Wikipedia talk pages, analyzing their linguistic features, categorizing them, and comparing human and AI annotation performance to understand their role in online discussions.
Contribution
It introduces a typology of self-replies, applies it to English and French samples, and compares human and AI annotation effectiveness.
Findings
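In more than 10% of threads with two or more messages, the first two messages are written by the same user.
Human annotators reach a reasonable overall efficiency, while instruction-tuned LLMs struggle with several categories.
A seven-category typology is used to annotate two reference samples (English and French) of 100 threads each.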
Abstract
This study proposes a qualitative analysis of self-replies in Wikipedia talk pages, more precisely of cases where the first two messages of a discussion are written by the same user. This specific pattern occurs in more than 10% of threads with two or more messages and can have a number of explanations. After a first examination of the lexical specificities of second messages, we propose a seven-category typology and use it to annotate two reference samples (English and French) of 100 threads each. Finally, we analyse and compare the performance of human annotators (who reach a reasonable overall efficiency) and instruction-tuned LLMs (which encounter considerable difficulties with several categories).
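The pattern studied here (first two messages of a thread written by the same user) can be illustrated with a minimal sketch. It assumes a hypothetical representation in which each thread is simply the ordered list of its messages' author names; this is not the authors' data format or code, and the sample threads are invented for illustration.

```python
from typing import List, Sequence

# Hypothetical representation (an assumption, not the paper's format):
# a thread is the ordered sequence of author names of its messages.
Thread = Sequence[str]


def is_self_reply(thread: Thread) -> bool:
    """True when the first two messages of the thread share the same author."""
    return len(thread) >= 2 and thread[0] == thread[1]


def self_reply_rate(threads: List[Thread]) -> float:
    """Share of self-reply threads among threads with two messages or more."""
    eligible = [t for t in threads if len(t) >= 2]
    if not eligible:
        return 0.0
    return sum(is_self_reply(t) for t in eligible) / len(eligible)


# Invented toy data, for illustration only.
threads = [
    ["Alice", "Alice", "Bob"],  # self-reply: messages 1 and 2 by the same user
    ["Alice", "Bob"],           # ordinary reply
    ["Carol"],                  # single message: excluded from the denominator
    ["Dave", "Erin", "Erin"],   # a later self-reply does not match this pattern
]
print(self_reply_rate(threads))
```

Restricting the denominator to threads with at least two messages mirrors how the abstract states the proportion; single-message threads cannot contain a reply at all.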
Taxonomy
Topics: Wikis in Education and Collaboration · Cancer-related gene regulation · Natural Language Processing Techniques
