Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation
G. Tur, D. Hakkani-Tur, A. Stolcke, E. Shriberg

TL;DR
This paper introduces a probabilistic model that combines prosodic and lexical cues to segment speech automatically into topically coherent units. On broadcast news data, the combined model makes fewer segmentation errors than either knowledge source alone.
Contribution
The paper proposes two methods for integrating prosodic and lexical information, using hidden Markov models and decision trees, and shows that the integration improves segmentation performance.
Findings
The prosodic model alone is competitive with word-based segmentation methods.
Combining prosodic and lexical cues significantly reduces segmentation error.
The approach is evaluated on the Broadcast News corpus using the DARPA-TDT evaluation metrics.
Abstract
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and prosodic information using hidden Markov models and decision trees. Lexical information is obtained from a speech recognizer, and prosodic features are extracted automatically from speech waveforms. We evaluate our approach on the Broadcast News corpus, using the DARPA-TDT evaluation metrics. Results show that the prosodic model alone is competitive with word-based segmentation methods. Furthermore, we achieve a significant reduction in error by combining the prosodic and word-based knowledge sources.
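As a rough illustration of what combining two knowledge sources can look like, the sketch below merges a per-position boundary posterior from a lexical model with one from a prosodic model by log-linear interpolation. This is not the paper's method: the abstract only states that the two methods use hidden Markov models and decision trees, and it gives no combination formula. The function names, the `weight` parameter, the threshold, and the example probabilities are all hypothetical.

```python
import math


def combine_boundary_posteriors(p_lexical, p_prosodic, weight=0.5):
    """Log-linearly interpolate two boundary posteriors (assumed scheme).

    `weight` is the share given to the lexical model; the result is
    renormalized over the two outcomes {boundary, no boundary}.
    """
    eps = 1e-12  # keep log() finite for probabilities of exactly 0 or 1

    def clip(p):
        return min(max(p, eps), 1.0 - eps)

    pl, pp = clip(p_lexical), clip(p_prosodic)
    log_b = weight * math.log(pl) + (1.0 - weight) * math.log(pp)
    log_nb = weight * math.log(1.0 - pl) + (1.0 - weight) * math.log(1.0 - pp)

    # Subtract the max before exponentiating for numerical stability.
    m = max(log_b, log_nb)
    e_b, e_nb = math.exp(log_b - m), math.exp(log_nb - m)
    return e_b / (e_b + e_nb)


def segment(p_lexical_seq, p_prosodic_seq, weight=0.5, threshold=0.5):
    """Return the indices whose combined posterior exceeds `threshold`."""
    return [
        i
        for i, (pl, pp) in enumerate(zip(p_lexical_seq, p_prosodic_seq))
        if combine_boundary_posteriors(pl, pp, weight) > threshold
    ]


# Hypothetical posteriors at three candidate boundary positions.
boundaries = segment([0.1, 0.8, 0.4], [0.2, 0.9, 0.7])
```

In a setup like this, `weight` and `threshold` would normally be tuned on held-out data. Setting `weight` to 1.0 or 0.0 recovers the lexical-only or prosodic-only decision, which gives a simple way to compare each source against the combination.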
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · Topic Modeling
