The Teacher-Student Chatroom Corpus
Andrew Caines, Helen Yannakoudakis, Helena Edmondson, Helen Allen, Pascual Pérez-Paredes, Bill Byrne, Paula Buttery

TL;DR
The Teacher-Student Chatroom Corpus (TSCC) is a collection of written chat conversations from one-to-one online English lessons, freely available for research use in linguistics and education.
Contribution
This paper introduces the TSCC, a new corpus of online teacher-student chat conversations, detailing its design, data collection, annotations, and potential research applications.
Findings
Contains more than one hundred lessons between two teachers and eight students, amounting to 13.5K conversational turns and 133K words.
Includes detailed annotations and descriptive analyses.
Facilitates research on informal online language use.
Abstract
The Teacher-Student Chatroom Corpus (TSCC) is a collection of written conversations captured during one-to-one lessons between teachers and learners of English. The lessons took place in an online chatroom and therefore involve more interactive, immediate and informal language than might be found in asynchronous exchanges such as email correspondence. The fact that the lessons were one-to-one means that the teacher was able to focus exclusively on the linguistic abilities and errors of the student, and to offer personalised exercises, scaffolding and correction. The TSCC contains more than one hundred lessons between two teachers and eight students, amounting to 13.5K conversational turns and 133K words: it is freely available for research use. We describe the corpus design, data collection procedure and annotations added to the text. We perform some preliminary descriptive analyses of…
