Monocular Depth Distribution Alignment with Low Computation
Fei Sheng, Feng Xue, Yicong Chang, Wenteng Liang, Anlong Ming

TL;DR
This paper introduces DANet, a lightweight monocular depth estimation model that aligns the predicted depth distribution with the ground truth to improve accuracy while using only 1% of the FLOPs of heavy-weight models.
Contribution
The paper proposes a novel distribution alignment network with a pyramid scene transformer and local-global optimization to reduce distribution drift in lightweight depth estimation models.
Findings
Achieves accuracy comparable to heavy-weight models with only 1% of their FLOPs.
Effectively alleviates distribution drift through distribution shape alignment.
Demonstrates superior performance on NYUDv2 and iBims-1 datasets.
Abstract
The performance of monocular depth estimation generally depends on the number of parameters and the computational cost. This leads to a large accuracy gap between light-weight and heavy-weight networks, which limits the application of light-weight networks in the real world. In this paper, we model the majority of this accuracy gap as a difference in depth distribution, which we call "distribution drift". To address it, a distribution alignment network (DANet) is proposed. We first design a pyramid scene transformer (PST) module to capture inter-region interaction at multiple scales. By perceiving the difference in depth features between every two regions, DANet tends to predict a reasonable scene structure, which fits the shape of the distribution to the ground truth. Then, we propose a local-global optimization (LGO) scheme to supervise the global range of scene depth. Thanks to…
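To make the idea of a difference between depth distributions concrete, here is a minimal sketch that compares the depth histogram of a prediction with that of the ground truth. It is an illustration only, not the paper's method or metric: the function names `depth_histogram` and `distribution_drift`, the choice of total variation distance, the bin count, and the 0 to 10 m depth range are all assumptions made for this example.

```python
def depth_histogram(depths, num_bins=10, d_min=0.0, d_max=10.0):
    """Normalized histogram of depth values (meters) over [d_min, d_max]."""
    counts = [0] * num_bins
    width = (d_max - d_min) / num_bins
    valid = 0
    for d in depths:
        if d < d_min or d > d_max:
            continue  # skip out-of-range (e.g. invalid) depth values
        idx = min(int((d - d_min) / width), num_bins - 1)
        counts[idx] += 1
        valid += 1
    if valid == 0:
        return [0.0] * num_bins
    return [c / valid for c in counts]


def distribution_drift(pred, gt, **kwargs):
    """Total variation distance between the two depth histograms, in [0, 1].

    Hypothetical stand-in for "distribution drift"; the paper may define
    and measure it differently.
    """
    hist_pred = depth_histogram(pred, **kwargs)
    hist_gt = depth_histogram(gt, **kwargs)
    return 0.5 * sum(abs(p - g) for p, g in zip(hist_pred, hist_gt))


# Toy flattened depth maps (meters), invented for illustration.
gt = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
pred_good = [1.1, 1.4, 2.1, 2.4, 3.1, 3.4]  # small per-pixel errors
pred_drifted = [d + 2.0 for d in gt]        # whole scene pushed 2 m deeper

drift_good = distribution_drift(pred_good, gt)
drift_bad = distribution_drift(pred_drifted, gt)
print(drift_good, drift_bad)
```

In this toy case the first prediction falls into the same histogram bins as the ground truth, so its drift is zero even though individual pixels are slightly off, while the shifted prediction yields a clearly larger value. A histogram comparison like this captures only the overall distribution, not per-pixel error, which is why it would complement rather than replace standard depth metrics.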
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Optical measurement and interference techniques
Methods: Dual Attention Network
