Whisper: Courtside Edition Enhancing ASR Performance Through LLM-Driven Context Generation

Yonathan Ron; Shiri Gilboa; Tammuz Dubnov

arXiv:2602.18966·cs.CL·February 24, 2026

Open Access

TL;DR

Whisper: Courtside Edition employs a multi-agent LLM pipeline to improve domain-specific ASR accuracy by generating context-aware prompts, significantly reducing word error rates without retraining the base model.

Contribution

The paper introduces a novel LLM-driven prompt augmentation pipeline that enhances Whisper's ASR performance in specialized domains without retraining.

Findings

01

Achieved a 17.0% relative reduction in WER on NBA commentary data.

02

Improved transcription in 40.1% of segments.

03

Degradation occurred in only 7.1% of segments.
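
As a quick check on the first finding, the relative reduction can be recomputed from the two WER values quoted in the abstract (0.217 and 0.180). The sketch below is illustrative only: the function name is hypothetical, and because the published WERs are rounded to three decimals, the result agrees with the reported 17.0% only approximately.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the reduction of `improved` relative to `baseline`, as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# WER values as quoted (rounded) in the abstract.
baseline_wer = 0.217
pipeline_wer = 0.180

rel = relative_reduction(baseline_wer, pipeline_wer)
print(f"relative WER reduction: {rel:.3f}")  # roughly 0.17
```

The absolute reduction is 0.037 WER; dividing by the baseline of 0.217 gives the relative figure.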

Abstract

Domain-specific speech remains a persistent challenge for automatic speech recognition (ASR), even for state-of-the-art systems like OpenAI's Whisper. We introduce Whisper: Courtside Edition, a novel multi-agent large language model (LLM) pipeline that enhances Whisper transcriptions without retraining. The pipeline intercepts Whisper's initial transcript, applies specialized LLM agents for domain context identification, named entity recognition, and jargon detection, and generates compact prompts that guide Whisper's decoder. Evaluated on 421 NBA basketball commentary segments (a domain characterized by dense proper nouns and technical terminology), our best pipeline achieves a statistically significant 17.0% relative reduction in word error rate (WER; from 0.217 to 0.180, p<0.001). Improvements are observed in 40.1% of segments with degradation in only 7.1%, substantially outperforming…
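
The two-pass flow described in the abstract (draft transcript, agent analysis, compact prompt, second decoding pass) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `build_prompt`, `two_pass_transcribe`, the comma-separated prompt format, the term cap, and the toy stand-ins for the ASR model and the LLM agents are all hypothetical.

```python
from typing import Callable, List

# Hypothetical interfaces: a transcriber maps (audio_path, prompt) to text,
# and an agent maps a draft transcript to a list of suggested terms.
Transcriber = Callable[[str, str], str]
Agent = Callable[[str], List[str]]


def build_prompt(terms: List[str], max_terms: int = 20) -> str:
    """Deduplicate terms case-insensitively and join them into a compact prompt."""
    seen = set()
    kept = []
    for term in terms:
        term = term.strip()
        key = term.lower()
        if term and key not in seen:
            seen.add(key)
            kept.append(term)
    return ", ".join(kept[:max_terms])


def two_pass_transcribe(audio_path: str, transcribe: Transcriber,
                        agents: List[Agent]) -> str:
    """Transcribe once, collect agent suggestions, then re-transcribe with a prompt."""
    draft = transcribe(audio_path, "")          # first pass, no prompt
    terms: List[str] = []
    for agent in agents:                        # e.g. entity and jargon agents
        terms.extend(agent(draft))
    return transcribe(audio_path, build_prompt(terms))  # second, guided pass


# Toy stand-ins so the sketch runs without an ASR model or an LLM.
def fake_transcribe(audio_path: str, prompt: str) -> str:
    # Spells the name correctly only when the prompt contains it.
    return "LeBron drives baseline" if "LeBron" in prompt else "Le Bron drives baseline"


def fake_entity_agent(draft: str) -> List[str]:
    return ["LeBron"] if "Le Bron" in draft else []


def fake_jargon_agent(draft: str) -> List[str]:
    return ["baseline"] if "baseline" in draft else []


result = two_pass_transcribe("clip.wav", fake_transcribe,
                             [fake_entity_agent, fake_jargon_agent])
print(result)
```

With the open-source `openai-whisper` package, one way to pass such a prompt would be the `initial_prompt` argument of `model.transcribe`; whether the paper uses that exact mechanism is not stated in the abstract.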

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Speech and dialogue systems