End-to-end named entity extraction from speech
Sahar Ghannay, Antoine Caubrière, Yannick Estève, Antoine Laurent, Emmanuel Morin

TL;DR
This paper introduces an end-to-end neural approach for extracting named entities directly from speech, outperforming traditional pipeline methods by jointly optimizing ASR and NER tasks.
Contribution
It presents a first study of an end-to-end model for speech-based NER, integrating ASR and NER into a single neural architecture so that both tasks can be optimized jointly.
Findings
The end-to-end approach achieves F-measure=0.69
It outperforms the pipeline approach, which reaches F-measure=0.65
Demonstrates benefits of joint optimization in speech NER
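For readers unfamiliar with the metric behind these figures, the following is a minimal sketch of how an F-measure over named entities is commonly computed, as the harmonic mean of precision and recall. It is not the paper's evaluation script: the span representation `(start, end, category)`, the exact-match criterion, the function name `f_measure`, and the toy data are all assumptions made for illustration.

```python
def f_measure(reference, hypothesis):
    """F-measure over entity spans, using exact match on (start, end, category).

    Assumption: each entity is a hashable tuple; the scoring protocol used in
    the paper may differ (e.g. partial matches or category-only detection).
    """
    ref, hyp = set(reference), set(hypothesis)
    true_positives = len(ref & hyp)
    precision = true_positives / len(hyp) if hyp else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)


# Toy example (invented data, not taken from the paper)
reference = [(0, 2, "pers"), (5, 6, "loc"), (9, 11, "org")]
hypothesis = [(0, 2, "pers"), (5, 6, "org"), (9, 11, "org")]
score = f_measure(reference, hypothesis)
print(f"F-measure = {score:.2f}")
```

In the toy example, two of the three hypothesized entities match the reference exactly, so precision and recall are both 2/3 and the F-measure is also 2/3.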
Abstract
Named entity recognition (NER) is among the spoken language understanding (SLU) tasks that usually extract semantic information from textual documents. Until now, NER from speech has been performed through a pipeline process that first applies automatic speech recognition (ASR) to the audio and then applies NER to the ASR outputs. Such an approach has some disadvantages (error propagation, a metric used to tune ASR systems that is sub-optimal with regard to the final task, a reduced search space at the ASR output level...), and it is known that more integrated approaches outperform sequential ones when they can be applied. In this paper, we present a first study of an end-to-end approach that directly extracts named entities from speech through a single neural architecture. In this way, joint optimization is possible for both ASR and NER. Experiments are carried out on easily accessible French data, composed of data distributed in…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
