THIRDEYE: Cue-Aware Monocular Depth Estimation via Brain-Inspired Multi-Stage Fusion
Calin Teodor Ioan

TL;DR
ThirdEye is a brain-inspired, cue-aware monocular depth estimation pipeline. It supplies human visual cues (such as occlusion boundaries, shading, and perspective) explicitly through specialised, frozen networks and fuses them in a hierarchy modelled on cortical processing.
Contribution
The paper presents a multi-stage fusion approach to monocular depth estimation that combines pre-trained cue networks with a neuroscience-inspired cortical hierarchy.
Findings
Effective fusion of multiple monocular cues improves depth accuracy.
The model leverages frozen cue networks to incorporate external supervision.
The approach demonstrates potential for more interpretable and biologically plausible depth estimation.
Abstract
Monocular depth estimation methods traditionally train deep models to infer depth directly from RGB pixels. This implicit learning often overlooks explicit monocular cues that the human visual system relies on, such as occlusion boundaries, shading, and perspective. Rather than expecting a network to discover these cues unaided, we present ThirdEye, a cue-aware pipeline that deliberately supplies each cue through specialised, pre-trained, and frozen networks. These cues are fused in a three-stage cortical hierarchy (V1->V2->V3) equipped with a key-value working-memory module that weights them by reliability. An adaptive-bins transformer head then produces a high-resolution disparity map. Because the cue experts are frozen, ThirdEye inherits large amounts of external supervision while requiring only modest fine-tuning. This extended version provides additional architectural detail,…
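To make the two mechanisms named in the abstract more concrete, here is a minimal, illustrative sketch in plain Python. It is not the authors' implementation: the function names (`fuse_cues`, `adaptive_bins_value`), the use of scaled dot-product attention as the "reliability" weighting, and the softmax normalisation of bin widths are all assumptions chosen for illustration. The first function fuses per-cue feature vectors at one location using key-value attention weights; the second turns per-pixel bin probabilities into a single value as an expectation over adaptively placed bin centres.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_cues(query, cue_keys, cue_values):
    """Fuse cue features at one location with key-value attention.

    query:      list[float], current feature at this location (hypothetical).
    cue_keys:   one key vector per cue, same length as query.
    cue_values: one value vector per cue (the cue's features).
    Returns (fused_vector, weights). The weights stand in for the
    per-cue reliability mentioned in the abstract (an assumption).
    """
    d = len(query)
    # Scaled dot-product score between the query and each cue's key.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in cue_keys
    ]
    weights = softmax(scores)
    dim = len(cue_values[0])
    # Weighted sum of the cue value vectors.
    fused = [
        sum(w * v[i] for w, v in zip(weights, cue_values))
        for i in range(dim)
    ]
    return fused, weights


def adaptive_bins_value(width_logits, pixel_logits, lo, hi):
    """Adaptive-bins style readout for a single pixel.

    width_logits: image-level logits, one per bin, giving relative widths.
    pixel_logits: per-pixel logits over the same bins.
    lo, hi:       range of the predicted quantity (depth or disparity).
    Returns the expected value over bin centres.
    """
    widths = softmax(width_logits)  # assumed normalisation; widths sum to 1
    edges = [lo]
    for w in widths:
        edges.append(edges[-1] + w * (hi - lo))
    centres = [(a + b) / 2 for a, b in zip(edges[:-1], edges[1:])]
    probs = softmax(pixel_logits)
    return sum(p * c for p, c in zip(probs, centres))


if __name__ == "__main__":
    # Two hypothetical cues; the first key matches the query better.
    fused, weights = fuse_cues(
        query=[1.0, 0.0],
        cue_keys=[[1.0, 0.0], [0.0, 0.0]],
        cue_values=[[2.0], [4.0]],
    )
    print(fused, weights)
    print(adaptive_bins_value([0.0, 0.0], [0.0, 0.0], lo=0.0, hi=10.0))
```

In this sketch a cue whose key aligns with the query receives a larger weight, so its features dominate the fused vector. The readout places bins unevenly across the range according to the predicted widths, which is the general idea behind adaptive-bins heads.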
Taxonomy
Topics: Image Processing Techniques and Applications · Advanced Vision and Imaging · Industrial Vision Systems and Defect Detection
