Characterizing Language Use in a Collaborative Situated Game
Nicholas Tomlin, Naitian Zhou, Eve Fleisig, Liangyuan Chen, Téa Wright, Lauren Vinh, Laura X. Ma, Seun Eisape, Ellie French, Tingting Du, Tianjiao Zhang, Alexander Koller, Alane Suhr

TL;DR
This paper introduces the Portal Dialogue Corpus, 11.5 hours of spoken dialogue (24.5K utterances) from the co-op mode of Portal 2, and identifies linguistic phenomena that rarely appear in existing chitchat or task-oriented dialogue corpora.
Contribution
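It provides a new, publicly released corpus of in-game spoken dialogue, comprising player videos, audio, transcripts, game state data, and both manual and automatic annotations, to support future research on language use in complex, situated, collaborative problem solving.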
Findings
Identification of complex spatial references in dialogue
Analysis of clarification and repair behaviors
Discovery of ad-hoc convention formation in game communication
Abstract
Cooperative video games, where multiple participants must coordinate by communicating and reasoning under uncertainty in complex environments, yield a rich source of language data. We collect the Portal Dialogue Corpus: a corpus of 11.5 hours of spoken human dialogue in the co-op mode of the popular Portal 2 virtual puzzle game, comprising 24.5K total utterances. We analyze player language and behavior, identifying a number of linguistic phenomena that rarely appear in most existing chitchat or task-oriented dialogue corpora, including complex spatial reference, clarification and repair, and ad-hoc convention formation. To support future analyses of language use in complex, situated, collaborative problem-solving scenarios, we publicly release the corpus, which comprises player videos, audio, transcripts, game state data, and both manual and automatic annotations of language data.
Taxonomy
Topics: Speech and dialogue systems · Multimodal Machine Learning Applications · Language and cultural evolution
