Speech Recognition for Analysis of Police Radio Communication

Tejes Srivastava; Ju-Chieh Chou; Priyank Shroff; Karen Livescu; Christopher Graziul

arXiv:2409.10858·cs.SD·September 18, 2024

TL;DR

This study evaluates the feasibility of automatic speech recognition for police radio communications. It finds that off-the-shelf models perform poorly while fine-tuned models approach human performance, and it provides a new corpus for future research.

Contribution

The paper introduces a large corpus of manually transcribed police radio communications and assesses how off-the-shelf, fine-tuned, and customized end-to-end speech recognition models perform in this domain.

Findings

1. Off-the-shelf models perform poorly on police radio data.

2. Fine-tuned models approach human transcription accuracy.

3. Challenges remain in transcribing short utterances and detecting miscommunications.
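Comparisons such as "approaches human transcription accuracy" need a concrete metric. The summary here does not name one, so the sketch below assumes word error rate (WER), the conventional ASR metric: the word-level edit distance between a reference and a hypothesis transcript, divided by the number of reference words. The function name and the example transcripts are hypothetical and are not taken from the paper or its corpus.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: whitespace tokenization with no text normalization.
    A real evaluation would normally normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: the hypothesis drops one of six reference words.
example_wer = word_error_rate(
    "unit twelve respond to main street",
    "unit twelve respond main street",
)
print(f"WER: {example_wer:.3f}")
```

Under this metric, short utterances are penalized heavily by any single mistake: one wrong word in a three-word transmission already yields a WER of about 33%.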

Abstract

Police departments around the world use two-way radio for coordination. These broadcast police communications (BPC) are a unique source of information about everyday police activity and emergency response. Yet BPC are not transcribed, and their naturalistic audio properties make automatic transcription challenging. We collect a corpus of roughly 62,000 manually transcribed radio transmissions (~46 hours of audio) to evaluate the feasibility of automatic speech recognition (ASR) using modern recognition models. We evaluate the performance of off-the-shelf speech recognizers, models fine-tuned on BPC data, and customized end-to-end models. We find that both human and machine transcription are challenging in this domain. Large off-the-shelf ASR models perform poorly, but fine-tuned models can reach the approximate range of human performance. Our work suggests directions for future work,…
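The corpus figures quoted in the abstract imply a very short average transmission, which gives some context for the difficulty with short utterances noted in the findings. The calculation below is a rough back-of-the-envelope sketch: both inputs are the approximate values from the abstract, the variable names are illustrative, and the result is a mean only, with nothing implied about the distribution of durations.

```python
# Approximate figures quoted in the abstract.
num_transmissions = 62_000  # "roughly 62,000" transcribed transmissions
total_hours = 46            # "~46 hours of audio"

# Mean duration per transmission, in seconds.
avg_seconds = total_hours * 3600 / num_transmissions
print(f"average transmission length: {avg_seconds:.2f} s")
```

This works out to a little under three seconds per transmission on average.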

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis