Diarization of Legal Proceedings. Identifying and Transcribing Judicial Speech from Recorded Court Audio
Jeffrey Tumminia, Amanda Kuznecov, Sophia Tsilerides, Ilana Weinstein, Brian McFee, Michael Picheny, Aaron R. Kaufman

TL;DR
This paper introduces a workflow for diarizing judicial speech in court recordings using speaker verification with speech embeddings, achieving a 13.8% Diarization Error Rate on US Supreme Court oral arguments for speakers covered by its reference audio library.
Contribution
It presents Reference-Dependent Speaker Verification, a method that diarizes judicial proceedings by comparing speech embeddings of the recording against a reference audio library built from annotated data, a new approach in legal audio analysis.
Findings
Achieved a 13.8% Diarization Error Rate on US Supreme Court audio data for speakers covered by the reference audio library.
Developed a workflow for speaker diarization in legal proceedings using speech embeddings.
Provided a publicly available code repository for the proposed method.
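For readers unfamiliar with the headline metric, Diarization Error Rate is conventionally the sum of missed speech, false-alarm speech, and speaker confusion, divided by the total reference speech time. The following is a minimal frame-level sketch of that definition, not the paper's scoring code: the function name `frame_der` and the per-frame label representation are assumptions for illustration, and standard scoring tools additionally apply a forgiveness collar and an optimal speaker mapping, both omitted here.

```python
def frame_der(reference, hypothesis):
    """Frame-level Diarization Error Rate (illustrative sketch).

    reference, hypothesis: equal-length sequences holding one speaker
    label per frame, with None marking non-speech. Labels are assumed
    to be already aligned (the same name means the same speaker).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    missed = false_alarm = confusion = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None and hyp is None:
            missed += 1          # speech in reference, none predicted
        elif ref is None and hyp is not None:
            false_alarm += 1     # speech predicted where there is none
        elif ref is not None and hyp != ref:
            confusion += 1       # speech attributed to the wrong speaker

    total_speech = sum(ref is not None for ref in reference)
    if total_speech == 0:
        raise ValueError("reference contains no speech frames")
    return (missed + false_alarm + confusion) / total_speech


# Toy example with hypothetical labels: one missed frame, one confused
# frame, and one false alarm over four reference speech frames.
reference = ["A", "A", "B", "B", None]
hypothesis = ["A", None, "A", "B", "B"]
der = frame_der(reference, hypothesis)
```

Under this definition a DER of 13.8% means the three error types together amount to 13.8% of the reference speech time.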
Abstract
United States Courts make audio recordings of oral arguments available as public record, but these recordings rarely include speaker annotations. This paper addresses the Speech Audio Diarization problem, answering the question of "Who spoke when?" in the domain of judicial oral argument proceedings. We present a workflow for diarizing the speech of judges using audio recordings of oral arguments, a process we call Reference-Dependent Speaker Verification. We utilize a speech embedding network trained with the Generalized End-to-End Loss to encode speech into d-vectors and a pre-defined reference audio library based on annotated data. We find that by encoding reference audio for speakers and full arguments and computing similarity scores we achieve a 13.8% Diarization Error Rate for speakers covered by the reference audio library on a held-out test set. We evaluate our method on the…
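The similarity-scoring step described in the abstract can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes d-vectors have already been produced by some embedding network (not included here), uses cosine similarity as the score, and introduces a hypothetical `threshold` parameter and function name `label_windows` that do not come from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row vector to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def label_windows(window_embs, reference_embs, threshold=0.75):
    """Assign each audio window to the most similar reference speaker.

    window_embs: array-like of shape (n_windows, dim), one d-vector per
        window of the full argument recording.
    reference_embs: dict mapping speaker name -> d-vector of shape (dim,),
        standing in for the reference audio library.
    threshold: hypothetical minimum cosine similarity; windows scoring
        below it are left unlabeled (None).
    """
    names = list(reference_embs)
    refs = l2_normalize(
        np.stack([np.asarray(reference_embs[n], dtype=float) for n in names])
    )
    wins = l2_normalize(np.asarray(window_embs, dtype=float))

    sims = wins @ refs.T                      # cosine similarities, (n_windows, n_refs)
    best = sims.argmax(axis=1)                # index of the closest reference
    best_sim = sims[np.arange(len(wins)), best]
    return [names[i] if s >= threshold else None for i, s in zip(best, best_sim)]


# Toy 3-dimensional vectors standing in for real d-vectors.
library = {"judge_a": [1.0, 0.0, 0.0], "judge_b": [0.0, 1.0, 0.0]}
windows = [[0.9, 0.1, 0.0], [0.1, 0.95, 0.0], [0.0, 0.0, 1.0]]
labels = label_windows(windows, library)
```

In this toy run the first two windows are assigned to `judge_a` and `judge_b`, and the third stays unlabeled because it resembles neither reference, which mirrors the abstract's restriction of results to speakers covered by the reference audio library.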
Taxonomy
Topics: Artificial Intelligence in Law · Law, logistics, and international trade · Law in Society and Culture
