TL;DR
This paper compares dependency parsing directly from the speech signal with parsing of transcriptions, using a large treebank of spontaneous spoken French. Graph-based parsing outperforms sequence labeling across the board, and direct speech parsing beats a pipeline of ASR followed by a syntactic parser despite having 30% fewer parameters.
Contribution
It provides a comparative assessment of graph-based and sequence labeling parsing paradigms directly on speech, highlighting the advantages of direct speech parsing over pipeline methods.
Findings
Graph-based parsing outperforms sequence labeling.
Direct speech parsing outperforms pipeline approaches.
Direct speech parsing achieves these gains with 30% fewer parameters than the pipeline.
Abstract
Direct dependency parsing of the speech signal -- as opposed to parsing speech transcriptions -- has recently been proposed as a task (Pupier et al. 2022), as a way of incorporating prosodic information in the parsing system and bypassing the limitations of a pipeline approach that would consist of using first an Automatic Speech Recognition (ASR) system and then a syntactic parser. In this article, we report on a set of experiments aiming at assessing the performance of two parsing paradigms (graph-based parsing and sequence labeling based parsing) on speech parsing. We perform this evaluation on a large treebank of spoken French, featuring realistic spontaneous conversations. Our findings show that (i) the graph-based approach obtains better results across the board, and (ii) parsing directly from speech outperforms a pipeline approach, despite having 30% fewer parameters.
Taxonomy
Methods: Sparse Evolutionary Training
