Monocular Depth Estimation via Neural Network with Learnable Algebraic Group and Ring Structures
Qianlei Wang, Kexun Chen, Shaolin Zhang, Hongli Gao, Chaoning Zhang, Xiaolin Qin

TL;DR
LAGRNet is a monocular depth estimation framework grounded in algebraic geometry: it embeds learnable group, ring, and sheaf structures into the network to improve accuracy and generalization.
Contribution
The paper presents LAGRNet as the first method to incorporate algebraic-geometric structures into deep learning for monocular depth estimation, with the aim of improving robustness and cross-scale consistency.
Findings
Outperforms state-of-the-art methods on the KITTI, NYU-Depth V2, and ETH3D benchmarks.
Achieves significant accuracy improvements in zero-shot evaluations.
Demonstrates robustness to view changes and cross-scale variations.
Abstract
Monocular depth estimation (MDE) has witnessed remarkable progress driven by Convolutional Neural Networks and transformer-based architectures. However, these approaches typically treat the problem as a generic image-to-image regression on Euclidean grids, thereby overlooking the intrinsic algebraic and geometric structures induced by perspective projection. To address this limitation, we propose LAGRNet, a novel framework that fundamentally grounds MDE in algebraic geometry by explicitly embedding learnable group, ring, and sheaf structures into the deep learning pipeline. Modeling feature maps as sections of a sheaf over an approximated image manifold, our method first establishes a Group-defined Feature Manifold (GFM) parameterized by a learned algebraic group action to enforce projective equivariance and robustness against view changes. To facilitate algebraically consistent…
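To make the phrase "group action" concrete, the following is a minimal sketch of a fixed (not learned) projective group action on image points. It is not LAGRNet's implementation: the abstract does not specify how the learned action is parameterized, so the 3x3 homographies, the function names `matmul`, `act`, and `is_compatible`, and the sample matrices are all assumptions chosen for illustration. The sketch only checks the defining compatibility property of an action: acting with the product of two transformations equals acting with them one after the other.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # 3x3 homography (hypothetical representation)
Point = Tuple[float, float]  # 2D image point (x, y)


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 3x3 matrices; this is the group operation."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def act(h: Matrix, p: Point) -> Point:
    """Apply homography h to point p using homogeneous coordinates."""
    x, y = p
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    # Divide by the third coordinate to return to the image plane.
    return (u / w, v / w)


def is_compatible(h1: Matrix, h2: Matrix, p: Point, tol: float = 1e-9) -> bool:
    """Check act(h2 * h1, p) == act(h2, act(h1, p)) up to tolerance."""
    lhs = act(matmul(h2, h1), p)
    rhs = act(h2, act(h1, p))
    return abs(lhs[0] - rhs[0]) < tol and abs(lhs[1] - rhs[1]) < tol


# Illustrative matrices only; they are not taken from the paper.
IDENTITY: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H1: Matrix = [[1.0, 0.1, 2.0], [0.0, 1.2, -1.0], [0.001, 0.0, 1.0]]
H2: Matrix = [[0.9, 0.0, 0.5], [0.2, 1.0, 0.0], [0.0, 0.002, 1.0]]

if __name__ == "__main__":
    print(is_compatible(H1, H2, (3.0, 4.0)))
    print(act(IDENTITY, (3.0, 4.0)))
```

A feature extractor f would be called equivariant under such an action if transforming the input and then applying f gives the same result as applying f and then transforming the output. How LAGRNet enforces this with a learned action is described in the paper itself, not in this sketch.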
