Debiasing Central Fixation Confounds Reveals a Peripheral "Sweet Spot" for Human-like Scanpaths in Hard-Attention Vision
Pengcheng Pan, Yonekura Shogo, Yasuo Kuniyoshi

TL;DR
This paper reveals how center bias inflates scanpath similarity metrics and introduces GCS, a debiased metric that uncovers a peripheral 'sweet spot' for human-like scanpaths in hard-attention vision models.
Contribution
The study identifies the confounding effect of center bias on scanpath evaluation and proposes GCS, a new metric that better isolates genuine behavioral similarity in gaze patterns.
Findings
Center bias can cause trivial baselines to score highly on scanpath metrics: on Gaze-CIFAR-10, a center-fixation baseline approaches the scores of many learned policies.
A peripheral 'sweet spot' exists where models produce human-like scanpaths after debiasing.
GCS reveals this sweet spot by separating genuine behavioral similarity from center bias.
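The center-bias confound can be illustrated with a toy simulation. The sketch below is not the paper's method or its GCS metric, whose definition is not given here. It assumes hypothetical "human" scanpaths drawn from a Gaussian around the image center of a 32x32 image, and a simple distance-based similarity score chosen only for illustration. The function names, the value of `sigma`, and the number of fixations are all assumptions.

```python
import numpy as np

def scanpath_similarity(pred, human, diag):
    """Toy similarity: 1 minus the mean pointwise fixation distance,
    normalized by the image diagonal (an illustrative metric only)."""
    mean_dist = np.linalg.norm(pred - human, axis=-1).mean()
    return 1.0 - mean_dist / diag

def simulate(n_trials=2000, n_fix=5, size=32, sigma=4.0, seed=0):
    """Compare a center-fixation baseline with a uniform-random baseline
    on synthetic, center-biased scanpaths (all parameters are assumptions)."""
    rng = np.random.default_rng(seed)
    center = np.array([size / 2, size / 2])
    diag = np.hypot(size, size)

    # Hypothetical center-biased "human" fixations, clipped to the image.
    human = rng.normal(center, sigma, size=(n_trials, n_fix, 2))
    human = np.clip(human, 0, size)

    # Trivial baseline: always fixate the image center.
    center_path = np.broadcast_to(center, human.shape)
    # Uninformed baseline: fixate uniformly at random.
    random_path = rng.uniform(0, size, size=human.shape)

    center_score = scanpath_similarity(center_path, human, diag)
    random_score = scanpath_similarity(random_path, human, diag)
    return center_score, random_score

if __name__ == "__main__":
    center_score, random_score = simulate()
    print(f"center baseline: {center_score:.3f}")
    print(f"random baseline: {random_score:.3f}")
```

Under these assumptions the center baseline scores well above the random one without modeling any image content, which is why a raw score is more informative when reported relative to a center-fixation reference.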
Abstract
Human eye movements in visual recognition reflect a balance between foveal sampling and peripheral context. Task-driven hard-attention models for vision are often evaluated by how well their scanpaths match human gaze. However, common scanpath metrics can be strongly confounded by dataset-specific center bias, especially on object-centric datasets. Using Gaze-CIFAR-10, we show that a trivial center-fixation baseline achieves surprisingly strong scanpath scores, approaching many learned policies. This makes standard metrics optimistic and blurs the distinction between genuine behavioral alignment and mere central tendency. We then analyze a hard-attention classifier under constrained vision by sweeping foveal patch size and peripheral context, revealing a peripheral sweet spot: only a narrow range of sensory constraints yields scanpaths that are simultaneously (i) above the center…
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Visual Attention and Saliency Detection · Face Recognition and Perception
