Can LLMs Help Localize Fake Words in Partially Fake Speech?
Lin Zhang, Thomas Thebaud, Zexin Cai, Sanjeev Khudanpur, Daniel Povey, Leibny Paola García-Perera, Matthew Wiesner, Nicholas Andrews

TL;DR
This paper explores whether large language models trained on text can effectively identify fake words in partially manipulated speech, highlighting their reliance on editing patterns and the challenge of generalization.
Contribution
It demonstrates that speech LLMs can localize fake words by leveraging learned editing patterns, revealing both their potential and limitations.
Findings
Model uses editing-style cues for fake word localization.
Performance depends on learned in-domain editing patterns.
Generalization to unseen editing styles remains challenging.
Abstract
Large language models (LLMs), trained on large-scale text, have recently attracted significant attention for their strong performance across many tasks. Motivated by this, we investigate whether a text-trained LLM can help localize fake words in partially fake speech, where only specific words within an utterance are edited. We build a speech LLM to perform fake word localization via next token prediction. Experiments and analyses on AV-Deepfake1M and PartialEdit indicate that the model frequently leverages editing-style patterns learned from the training data, particularly the word-level polarity substitutions found in these two databases, as cues for localizing fake words. Although such patterns provide useful information in an in-domain scenario, how to avoid over-reliance on them and improve generalization to unseen editing styles remains an open…
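To make the idea of fake word localization via next token prediction more concrete, here is a minimal sketch of one way the task could be framed as text generation: the target is the transcript with edited words wrapped in marker tags, and predicted tags are parsed back into word indices for scoring. The tag strings, the function names (build_target, parse_prediction, word_f1), the example sentence, and the word-level F1 metric are all assumptions made for illustration; the abstract does not specify the paper's actual output format or evaluation metric.

```python
from typing import List, Set

# Hypothetical marker tags; the paper's real output format is not given here.
FAKE_OPEN, FAKE_CLOSE = "<fake>", "</fake>"


def build_target(words: List[str], fake_idx: Set[int]) -> str:
    """Wrap each edited word in marker tags to form a text target
    that a speech LLM could be trained to generate token by token."""
    return " ".join(
        f"{FAKE_OPEN}{w}{FAKE_CLOSE}" if i in fake_idx else w
        for i, w in enumerate(words)
    )


def parse_prediction(text: str) -> Set[int]:
    """Recover the indices of words tagged as fake in a generated string."""
    return {
        i
        for i, tok in enumerate(text.split())
        if tok.startswith(FAKE_OPEN) and tok.endswith(FAKE_CLOSE)
    }


def word_f1(pred: Set[int], ref: Set[int]) -> float:
    """Word-level F1 between predicted and reference fake word indices."""
    if not pred and not ref:
        return 1.0  # nothing to find and nothing predicted
    tp = len(pred & ref)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(ref)
    return 2 * precision * recall / (precision + recall)


# Illustrative polarity substitution: an original "great" replaced by "terrible".
words = ["the", "movie", "was", "terrible"]
target = build_target(words, {3})
```

In this framing, a model that has memorized which kinds of words tend to be substituted (for example, sentiment-bearing adjectives) could score well in-domain without attending to acoustic evidence, which is the over-reliance risk the abstract describes.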
Taxonomy
Topics: Misinformation and Its Impacts · Topic Modeling · Text Readability and Simplification
