Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models

Minh Nguyen; Franck Dernoncourt; Seunghyun Yoon; Hanieh Deilamsalehy; Hao Tan; Ryan Rossi; Quan Hung Tran; Trung Bui; Thien Huu Nguyen

arXiv:2407.12094·cs.CL·July 18, 2024

TL;DR

This paper presents a new large-scale dataset and transformer-based models for text-based speaker identification in dialogue transcripts. The best model reaches 80.3% precision, which the authors report as a new benchmark for the task.

Contribution

The paper introduces a large-scale dataset derived from the MediaSum corpus and develops transformer-based models tailored to speaker identification, a task that has so far lacked large, diverse datasets for model training.

Findings

01 Achieved 80.3% precision in speaker identification
02 Developed models leveraging contextual dialogue cues
03 Provided publicly available dataset and code

Abstract

We introduce an approach to identifying speaker names in dialogue transcripts, a crucial task for enhancing content accessibility and searchability in digital media archives. Despite advances in speech recognition, text-based speaker identification (SpeakerID) has received limited attention and lacks large-scale, diverse datasets for effective model training. To address these gaps, we present a novel, large-scale dataset derived from the MediaSum corpus, encompassing transcripts from a wide range of media sources. We propose novel transformer-based models tailored for SpeakerID that leverage contextual cues within dialogues to accurately attribute speaker names. In extensive experiments, our best model achieves a precision of 80.3%, setting a new benchmark for SpeakerID. The data and code are publicly available here:…
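
To make the task in the abstract concrete, here is a minimal Python sketch of text-based speaker identification on a made-up transcript. It is not the authors' transformer model: it swaps in a simple regular-expression heuristic that looks for one kind of contextual cue (self-introductions such as "I'm Jane Doe"). The speaker ids, the transcript, the gold names, and the functions `predict_speakers` and `precision` are all hypothetical, and precision is taken here as the fraction of predicted names that match the gold names, which may differ from the paper's exact metric.

```python
import re
from typing import Dict, List, Tuple

# Assumed cue pattern: a self-introduction followed by capitalized words.
# This is a toy heuristic, not the method proposed in the paper.
SELF_INTRO = re.compile(
    r"\b(?:I'm|I am|[Tt]his is)\s+([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)"
)


def predict_speakers(turns: List[Tuple[str, str]]) -> Dict[str, str]:
    """Map anonymized speaker ids to names using self-introduction cues only."""
    predictions: Dict[str, str] = {}
    for speaker_id, text in turns:
        if speaker_id in predictions:
            continue  # keep the first name found for each speaker
        match = SELF_INTRO.search(text)
        if match:
            predictions[speaker_id] = match.group(1)
    return predictions


def precision(predictions: Dict[str, str], gold: Dict[str, str]) -> float:
    """Fraction of predicted names that equal the gold name for that speaker."""
    if not predictions:
        return 0.0
    correct = sum(1 for sid, name in predictions.items() if gold.get(sid) == name)
    return correct / len(predictions)


# Made-up transcript with anonymized speaker ids (illustrative data only).
turns = [
    ("SPEAKER_1", "Good evening, I'm Jane Doe. Joining us now is John Smith."),
    ("SPEAKER_2", "Thanks for having me, Jane."),
    ("SPEAKER_3", "This is Evening Report, live from the capital."),
]
gold = {"SPEAKER_1": "Jane Doe", "SPEAKER_2": "John Smith", "SPEAKER_3": "Alex Lee"}

predicted = predict_speakers(turns)
print(predicted)                   # {'SPEAKER_1': 'Jane Doe', 'SPEAKER_3': 'Evening Report'}
print(precision(predicted, gold))  # 0.5
```

The heuristic names SPEAKER_1 correctly, mistakes a programme title for SPEAKER_3's name, and never uses the host's introduction of John Smith to label SPEAKER_2. Cues like these are spread across turns, which is the kind of dialogue context the abstract says the proposed models leverage.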

Code & Models

Repositories

adobe-research/speaker-identification
PyTorch · Official

Taxonomy

Topics: Speech and dialogue systems · Natural Language Processing Techniques