Integrate Document Ranking Information into Confidence Measure Calculation for Spoken Term Detection
Quan Liu, Wu Guo, Zhen-Hua Ling

TL;DR
This paper introduces an algorithm that improves confidence measure calculation in spoken term detection (STD) by integrating document-level relevance information. It yields consistent gains across multiple languages and remains robust when resources are limited.
Contribution
The paper presents a method that combines a per-document ranking weight with occurrence-level confidence measures, improving STD performance, especially for resource-limited languages.
Findings
Consistent improvement over state-of-the-art methods
Effective across Tamil, Vietnamese, and English
Robust even with less accurate speech recognizers
Abstract
This paper proposes an algorithm to improve the calculation of confidence measures for spoken term detection (STD). Given an input query term, the algorithm first calculates a measurement named document ranking weight for each document in the speech database, reflecting the document's relevance to the query term, by summing the confidence measures of all hypothesized term occurrences in that document. The confidence measure of each term occurrence is then re-estimated through linear interpolation with the calculated document ranking weight, which improves its reliability by integrating document-level information. Experiments are conducted on three standard STD tasks for Tamil, Vietnamese and English respectively. The experimental results all demonstrate that the proposed algorithm achieves consistent improvements over the state-of-the-art method for confidence measure calculation. Furthermore, this…
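The two steps described in the abstract (sum the confidences per document, then interpolate) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `rescore_hits`, the interpolation weight `alpha`, and the scaling of document weights by their maximum are all assumptions, since the abstract does not specify the exact formula or any normalization.

```python
from collections import defaultdict


def rescore_hits(hits, alpha=0.3):
    """Re-estimate confidences for one query term.

    hits:  list of (doc_id, confidence) pairs, one per hypothesized occurrence.
    alpha: hypothetical interpolation weight in [0, 1] (not given in the abstract).
    """
    # Step 1: document ranking weight = sum of the confidence measures of
    # all hypothesized occurrences of the term in that document.
    doc_weight = defaultdict(float)
    for doc_id, conf in hits:
        doc_weight[doc_id] += conf

    # Assumption: scale weights by the largest one so they lie in [0, 1]
    # and are comparable to the occurrence-level confidences.
    max_weight = max(doc_weight.values(), default=0.0)

    # Step 2: linear interpolation of each occurrence's confidence with
    # the (scaled) ranking weight of the document it came from.
    rescored = []
    for doc_id, conf in hits:
        weight = doc_weight[doc_id] / max_weight if max_weight > 0 else 0.0
        rescored.append((doc_id, (1.0 - alpha) * conf + alpha * weight))
    return rescored


if __name__ == "__main__":
    example = [("doc_a", 0.5), ("doc_a", 0.5), ("doc_b", 0.5)]
    for doc_id, score in rescore_hits(example, alpha=0.5):
        print(doc_id, round(score, 3))
```

In this toy input, all three occurrences start with the same confidence, but the two in `doc_a` end up scored higher than the one in `doc_b` because `doc_a` accumulates more evidence for the term. That is the intuition behind using document-level information.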
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling
