A Recorded Debating Dataset

Shachar Mirkin; Michal Jacovi; Tamar Lavee; Hong-Kwang Kuo; Samuel Thomas; Leslie Sager; Lili Kotlerman; Elad Venezian; Noam Slonim

arXiv:1709.06438·cs.CL·March 28, 2018·1 cites

TL;DR

This paper introduces an English dataset of debating speeches, comprising audio recordings together with automatic and manual transcriptions, to support research in computational argumentation and debating technologies.

Contribution

It provides a new debating speech dataset released at each stage of its production, with both automatic and manual transcriptions, so that it can serve a range of research applications.

Findings

01

Dataset includes 60 speeches on various controversial topics, recorded by professional debaters.

02

Each speech is released in five formats, one per stage of data production, to support various NLP tasks.

03

Resource aims to enhance debate-specific speech and argumentation research.

Abstract

This paper describes an English audio and textual dataset of debating speeches, a unique resource for the growing research field of computational argumentation and debating technologies. We detail the process of speech recording by professional debaters, the transcription of the speeches with an Automatic Speech Recognition (ASR) system, their consequent automatic processing to produce a text that is more "NLP-friendly", and in parallel -- the manual transcription of the speeches in order to produce gold-standard "reference" transcripts. We release 60 speeches on various controversial topics, each in five formats corresponding to the different stages in the production of the data. The intention is to allow utilizing this resource for multiple research purposes, be it the addition of in-domain training data for a debate-specific ASR system, or applying argumentation mining on either…
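Because the dataset pairs ASR output with manually produced reference transcripts, one natural use is to score the automatic transcript against the reference. The sketch below computes word error rate (WER), a standard ASR metric, using word-level edit distance. It is an illustration only: the function name, the lowercase-and-split normalization, and the sample sentences are assumptions made here, not the authors' procedure or the dataset's actual file format.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between hypothesis and reference,
    divided by the number of reference words.

    Normalization (lowercasing, whitespace tokenization) is an
    assumption for this sketch, not the paper's scoring setup.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences, not taken from the dataset.
reference_text = "we should ban the sale of violent video games"
asr_text = "we should ban sale of violin video games"
print(f"WER: {word_error_rate(reference_text, asr_text):.3f}")
```

In this made-up example the hypothesis drops one word and substitutes another, so two errors are counted against nine reference words. Real scoring pipelines usually apply more careful text normalization (punctuation, numerals, disfluencies) before alignment.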

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Natural Language Processing Techniques · Topic Modeling