Uncovering the role of semantic and acoustic cues in normal and dichotic listening
Sai Samrat Kankanala, Akshara Soman, Sriram Ganapathy

TL;DR
This study investigates how acoustic and semantic cues contribute to speech comprehension in normal and dichotic listening, using EEG data and a novel deep learning model to quantify their roles.
Contribution
The paper introduces a multimodal deep learning model and a match-mismatch task to quantify the roles of acoustic and semantic cues in speech perception under different listening conditions.
Findings
Semantic cues improve match-mismatch (MM) classification in dichotic listening.
Speech perception is fragmented at word boundaries.
The proposed model outperforms previous MM models.
Abstract
Speech comprehension is an involuntary task for the healthy human brain, yet the mechanisms underlying this brain functionality remain obscure. In this paper, we aim to quantify the role of acoustic and semantic information streams in complex listening conditions. We propose a paradigm to understand the encoding of speech cues in electroencephalogram (EEG) data by designing a match-mismatch (MM) classification task. The MM task involves identifying whether the stimulus (speech) and the response (EEG) correspond to each other. We build a multimodal deep-learning-based sequence model, STEM, which takes as input the acoustic stimulus (speech envelope), the semantic stimulus (textual representations of speech), and the neural response (EEG data). We perform extensive experiments on two separate conditions: i) natural passive listening, and ii) a dichotic listening requiring…
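The match-mismatch task described in the abstract can be illustrated with a small data-preparation sketch. This is not the authors' STEM model or their pipeline: the function name `make_mm_pairs`, the fixed segment length, and the choice to draw the mismatched envelope from a different segment of the same recording are all assumptions made for illustration. It only shows how (EEG, speech envelope) pairs with binary match labels might be formed.

```python
import numpy as np

def make_mm_pairs(eeg, envelope, seg_len, rng):
    """Build matched/mismatched (EEG, envelope) segment pairs (hypothetical helper).

    eeg:      array of shape (T, C), time samples by channels
    envelope: array of shape (T,), speech envelope on the same time axis
    seg_len:  segment length in samples (an assumed, fixed window)
    Returns a list of (eeg_segment, envelope_segment, label) tuples,
    with label 1 for a matched pair and 0 for a mismatched pair.
    """
    n_seg = min(len(eeg), len(envelope)) // seg_len
    if n_seg < 2:
        raise ValueError("need at least two segments to form a mismatch")

    # Cut both signals into non-overlapping, time-aligned segments.
    eeg_segs = eeg[: n_seg * seg_len].reshape(n_seg, seg_len, -1)
    env_segs = envelope[: n_seg * seg_len].reshape(n_seg, seg_len)

    pairs = []
    for i in range(n_seg):
        # Matched pair: EEG and envelope from the same time window.
        pairs.append((eeg_segs[i], env_segs[i], 1))
        # Mismatched pair: envelope from a different window (assumption).
        j = (i + rng.integers(1, n_seg)) % n_seg  # offset in [1, n_seg-1], so j != i
        pairs.append((eeg_segs[i], env_segs[j], 0))
    return pairs

# Usage with synthetic data standing in for real recordings.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((1000, 4))   # 1000 samples, 4 channels
envelope = rng.standard_normal(1000)
pairs = make_mm_pairs(eeg, envelope, seg_len=100, rng=rng)
```

A classifier would then be trained to predict the label from each pair. According to the abstract, the model additionally receives textual representations of the speech as a semantic input, which this sketch leaves out.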
Taxonomy
Topics: Speech and Audio Processing
Methods: Softmax · Attention Is All You Need
