QualiaNet: An Experience-Before-Inference Network
Paul Linton

TL;DR
QualiaNet models human 3D vision by separating stereo experience from inference, using disparity gradients to estimate scene distance through a two-stage neural network architecture.
Contribution
This work introduces a two-stage neural network architecture, inspired by human stereo vision, that uses disparity gradients to estimate scene distance.
Findings
QualiaNet can recover scene distance from disparity gradients alone.
The two-stage architecture illustrates how stereo experience, which carries no explicit distance information, can still inform inferences about visual scale.
Disparity maps effectively simulate human stereo experience for distance estimation.
Abstract
Human 3D vision involves two distinct stages: an Experience Module, where stereo depth is extracted relative to fixation, and an Inference Module, where this experience is interpreted to estimate 3D scene properties. Paradoxically, although our experience of stereo vision does not provide us with distance information, it does affect our inferences about visual scale. We propose that the Inference Module exploits a natural scene statistic: near scenes produce vivid disparity gradients, while far scenes appear comparatively flat. QualiaNet implements this two-stage architecture computationally: disparity maps simulating human stereo experience are passed to a CNN trained to estimate distance. The network can recover distance from disparity gradients alone, validating this approach.
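The scene statistic described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's pipeline and contains no CNN. It assumes a small-angle model in which the vergence angle for a point at depth z is roughly ipd / z, takes disparity relative to fixation, and compares the mean disparity gradient of the same slanted plane viewed near and far. The function names, the interocular distance of 0.063 m, the slant, and the field of view are all hypothetical choices made for illustration.

```python
import numpy as np

IPD = 0.063  # assumed interocular distance in metres (illustrative value)


def slanted_plane_depth(distance, slant_deg=45.0, fov_deg=20.0, size=64):
    """Depth map (metres) of a plane through the fixation point.

    Columns sample visual angle uniformly across the field of view.
    Small-angle assumption: lateral position x ~= distance * theta,
    so depth = distance + x * tan(slant).
    """
    theta = np.deg2rad(np.linspace(-fov_deg / 2, fov_deg / 2, size))
    z = distance * (1.0 + theta * np.tan(np.deg2rad(slant_deg)))
    return np.tile(z, (size, 1))


def relative_disparity(depth, fixation_distance, ipd=IPD):
    """Angular disparity (radians) relative to fixation, small-angle model."""
    return ipd / depth - ipd / fixation_distance


def mean_gradient_magnitude(disparity):
    """Mean magnitude of the disparity gradient, per pixel of visual angle."""
    gy, gx = np.gradient(disparity)
    return float(np.mean(np.hypot(gx, gy)))


def gradient_at(distance):
    """Mean disparity gradient of the slanted plane fixated at `distance`."""
    depth = slanted_plane_depth(distance)
    disparity = relative_disparity(depth, fixation_distance=distance)
    return mean_gradient_magnitude(disparity)


near = gradient_at(0.5)  # plane fixated at 0.5 m
far = gradient_at(5.0)   # same plane fixated at 5 m
print(f"near: {near:.3e}  far: {far:.3e}  ratio: {near / far:.2f}")
```

Under these assumptions the disparity gradient of a fixed-slant surface scales as 1 / distance, so the near plane yields a gradient ten times larger than the far one. This is the kind of regularity an inference stage could in principle exploit, although real scenes vary in slant and relief, so the mapping from gradient to distance is statistical rather than exact.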
