Artificial Neural Networks Trained on Noisy Speech Exhibit the McGurk Effect
Lukas Grasse, Matthew S. Tata

TL;DR
Artificial neural networks trained on audiovisual speech, even with only congruent data, can exhibit the McGurk effect, and noise during training influences the extent of audiovisual integration, providing insights into perception modeling.
Contribution
This study demonstrates that neural networks trained solely on congruent speech can produce the McGurk effect and explores how noisy training enhances audiovisual integration.
Findings
Networks trained on noisy speech show increased McGurk responses.
Increasing auditory noise during training initially boosts audiovisual integration.
Extreme noise levels impair the development of audiovisual speech integration.
Abstract
Humans are able to fuse information from both auditory and visual modalities to help with understanding speech. This is demonstrated through a phenomenon known as the McGurk Effect, during which a listener is presented with incongruent auditory and visual speech that fuse together into the percept of illusory intermediate phonemes. Building on a recent framework that proposes how to address developmental 'why' questions using artificial neural networks, we evaluated a set of recent artificial neural networks trained on audiovisual speech by testing them with audiovisually incongruent words designed to elicit the McGurk effect. We show that networks trained entirely on congruent audiovisual speech nevertheless exhibit the McGurk percept. We further investigated 'why' by comparing networks trained on clean speech to those trained on noisy speech, and discovered that training with noisy…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultisensory perception and integration
MethodsSparse Evolutionary Training
