Artificial Rigidities vs. Biological Noise: A Comparative Analysis of Multisensory Integration in AV-HuBERT and Human Observers

Francisco Portillo López

arXiv:2601.15869 · cs.CL · January 23, 2026

Open Access

TL;DR

This paper compares AV-HuBERT's responses to incongruent audiovisual speech (McGurk stimuli) with those of 44 human observers. The model matches the human auditory dominance rate but fuses the two streams far more often and shows none of the response variability seen in humans.

Contribution

It benchmarks AV-HuBERT's responses to incongruent audiovisual stimuli against human observers, revealing both similarities to and differences from human perception.

Findings

01

AV-HuBERT and human observers have nearly identical auditory dominance rates (32.0% vs. 31.8%).

02

AV-HuBERT exhibits a deterministic bias toward phonetic fusion (68.0% vs. 47.7% in humans).

03

Humans show perceptual stochasticity and diverse error profiles.

Abstract

This study evaluates AV-HuBERT's perceptual bio-fidelity by benchmarking its response to incongruent audiovisual stimuli (McGurk effect) against human observers (N=44). Results reveal a striking quantitative isomorphism: AI and humans exhibited nearly identical auditory dominance rates (32.0% vs. 31.8%), suggesting the model captures biological thresholds for auditory resistance. However, AV-HuBERT showed a deterministic bias toward phonetic fusion (68.0%), significantly exceeding human rates (47.7%). While humans displayed perceptual stochasticity and diverse error profiles, the model remained strictly categorical. Findings suggest that current self-supervised architectures mimic multisensory outcomes but lack the neural variability inherent to human speech perception.
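The contrast between the "strictly categorical" model and the more varied human responses can be read directly off the quoted rates. The following is a minimal sketch of that arithmetic, not the author's analysis: it assumes that auditory dominance and fusion are mutually exclusive response categories, so whatever is left over counts as "other" responses. The variable and function names are illustrative only.

```python
# Rates quoted in the abstract, in percent of responses.
rates = {
    "AV-HuBERT": {"auditory": 32.0, "fusion": 68.0},
    "human": {"auditory": 31.8, "fusion": 47.7},
}

def residual(r):
    """Share of responses that are neither auditory nor fusion.

    Assumption (not stated in the abstract): the two categories
    are mutually exclusive, so the remainder is 'other' responses.
    """
    return round(100.0 - r["auditory"] - r["fusion"], 1)

model_other = residual(rates["AV-HuBERT"])
human_other = residual(rates["human"])

# Percentage-point gaps between model and humans.
auditory_gap = round(rates["AV-HuBERT"]["auditory"] - rates["human"]["auditory"], 1)
fusion_gap = round(rates["AV-HuBERT"]["fusion"] - rates["human"]["fusion"], 1)

print(f"other responses: model {model_other}%, human {human_other}%")
print(f"gaps (points): auditory {auditory_gap}, fusion {fusion_gap}")
```

Under that assumption the model's two categories account for every response, while roughly a fifth of human responses fall outside them, which is in line with the abstract's description of diverse human error profiles.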

Taxonomy

Topics: Multisensory perception and integration · Tactile and Sensory Interactions · Neuroscience and Music Perception