Simulating Human Audiovisual Search Behavior
Hyunsung Cho, Xuejing Luo, Byungjoo Lee, David Lindlbauer, Antti Oulasvirta

TL;DR
This paper introduces Sensonaut, a computational model that simulates human audiovisual search behavior by balancing effort, time, and accuracy, validated against real human data to inform interface design.
Contribution
It presents a novel embodied, resource-rational model of audiovisual search that integrates perception and action, unlike earlier models that treat the two in isolation.
Findings
The model reproduces how human search effort and time adapt and scale.
It captures characteristic human errors in audiovisual search.
It can inform the design of interfaces that reduce search cost and cognitive load.
Abstract
Locating a target based on auditory and visual cues, such as finding a car in a crowded parking lot or identifying a speaker in a virtual meeting, requires balancing effort, time, and accuracy under uncertainty. Existing models of audiovisual search often treat perception and action in isolation, overlooking how people adaptively coordinate movement and sensory strategies. We present Sensonaut, a computational model of embodied audiovisual search. The core assumption is that people deploy their body and sensory systems in ways they believe will most efficiently improve their chances of locating a target, trading off time and effort under perceptual constraints. Our model formulates this as a resource-rational decision-making problem under partial observability. We validate the model against newly collected human data, showing that it reproduces both…
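To make the idea of resource-rational choice under partial observability concrete, here is a minimal toy sketch. It is not Sensonaut's actual formulation, which the abstract does not spell out. It assumes a searcher who keeps a belief over a few discrete target locations, updates it with Bayes' rule after a noisy cue, and picks the sensing action whose expected uncertainty reduction best outweighs its effort cost. The action names, hit rates, and costs are all hypothetical.

```python
import math

def normalize(p):
    total = sum(p)
    return [x / total for x in p]

def entropy(p):
    """Shannon entropy in bits; a simple proxy for remaining uncertainty."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def bayes_update(belief, likelihood):
    """Posterior over target locations after one observation."""
    return normalize([b * l for b, l in zip(belief, likelihood)])

def cue_likelihoods(n, cue_at, hit_rate):
    """P(cue points to `cue_at` | target at i), for each location i."""
    miss = (1.0 - hit_rate) / (n - 1)
    return [hit_rate if i == cue_at else miss for i in range(n)]

def expected_entropy_after(belief, hit_rate):
    """Average posterior entropy over every cue the action could produce."""
    n = len(belief)
    total = 0.0
    for cue_at in range(n):
        lik = cue_likelihoods(n, cue_at, hit_rate)
        p_cue = sum(b * l for b, l in zip(belief, lik))
        if p_cue > 0:
            total += p_cue * entropy(bayes_update(belief, lik))
    return total

# Hypothetical actions: listening is cheap but imprecise,
# turning the body to look is precise but effortful.
ACTIONS = {
    "listen": {"hit_rate": 0.5, "cost": 0.1},
    "turn_and_look": {"hit_rate": 0.9, "cost": 0.6},
}

def choose_action(belief, actions=ACTIONS):
    """Pick the action with the best expected information gain minus cost."""
    def net_value(name):
        a = actions[name]
        gain = entropy(belief) - expected_entropy_after(belief, a["hit_rate"])
        return gain - a["cost"]
    return max(actions, key=net_value)

uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.97, 0.01, 0.01, 0.01]
print(choose_action(uniform), choose_action(peaked))
```

With these made-up numbers, the searcher prefers the costly look when it is highly uncertain and falls back to cheap listening once its belief is already concentrated on one location. This is the kind of effort-versus-accuracy trade-off the abstract describes, in a much simpler form.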
Taxonomy
Topics: Multisensory perception and integration · Visual Attention and Saliency Detection · Speech and Audio Processing
