Beyond Prompting: Efficient and Robust Contextual Biasing for Speech LLMs via Logit-Space Integration (LOGIC)

Peidong Wang

arXiv:2601.15397 · cs.AI · February 6, 2026

TL;DR

This paper introduces LOGIC, a novel logit-space integration method for speech LLMs that efficiently and robustly incorporates new entities during decoding, outperforming prompting and post-processing approaches in multilingual settings.

Contribution

LOGIC provides a scalable, decoding-layer approach to bias speech LLMs with new entities, overcoming prompt limitations and reducing errors without increasing inference time.
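
The summary here does not spell out how LOGIC modifies the decoder, so the following is only a generic sketch of logit-space biasing, not the paper's method. It assumes entities are pre-tokenized into integer token IDs, stores them in a prefix trie, and adds a fixed bonus to the logits of tokens that could start an entity or continue one that is partially matched at the end of the decoded history. The names `build_trie`, `bias_logits`, and `bonus`, along with the token IDs, are hypothetical.

```python
from typing import Dict, List, Sequence


def build_trie(entities: Sequence[Sequence[int]]) -> Dict:
    """Build a nested-dict prefix trie over tokenized entities (hypothetical token IDs)."""
    root: Dict = {}
    for tokens in entities:
        node = root
        for tok in tokens:
            node = node.setdefault(tok, {})
    return root


def bias_logits(
    logits: List[float],
    history: Sequence[int],
    trie: Dict,
    bonus: float = 2.0,  # assumed constant; a real system would tune or learn this
) -> List[float]:
    """Return a copy of `logits` with `bonus` added to entity-consistent tokens."""
    # Tokens that can begin any entity are always candidates.
    boosted = set(trie)
    # For every suffix of the history that is a path from the trie root,
    # also boost the tokens that would extend that partial entity.
    for start in range(len(history)):
        node = trie
        for tok in history[start:]:
            node = node.get(tok)
            if node is None:
                break  # this suffix is not a prefix of any entity
        else:
            boosted.update(node)
    return [x + bonus if i in boosted else x for i, x in enumerate(logits)]


if __name__ == "__main__":
    trie = build_trie([[1, 2], [3]])       # two toy entities
    step_logits = [0.0, 0.0, 0.0, 0.0, 0.0]  # toy vocabulary of 5 tokens
    print(bias_logits(step_logits, [1], trie))
```

Because the bias is applied per decoding step to the output distribution, the entity list never enters the prompt, so its size does not consume context window. How large the bonus can be before it triggers false alarms is exactly the trade-off the findings below quantify.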

Findings

01. 9% relative reduction in Entity WER

02. Negligible 0.30% increase in False Alarm Rate

03. Effective across 11 multilingual locales

Abstract

The rapid emergence of new entities -- driven by cultural shifts, evolving trends, and personalized user data -- poses a significant challenge for existing Speech Large Language Models (Speech LLMs). While these models excel at general conversational tasks, their static training knowledge limits their ability to recognize domain-specific terms such as contact names, playlists, or technical jargon. Existing solutions primarily rely on prompting, which suffers from poor scalability: as the entity list grows, prompting encounters context window limitations, increased inference latency, and the "lost-in-the-middle" phenomenon. An alternative approach, Generative Error Correction (GEC), attempts to rewrite transcripts via post-processing but frequently suffers from "over-correction", introducing hallucinations of entities that were never spoken. In this work, we introduce LOGIC…

Taxonomy

Topics: Topic Modeling · Speech Recognition and Synthesis · Natural Language Processing Techniques