FA-Depth: Toward Fast and Accurate Self-supervised Monocular Depth Estimation
Fei Wang, Jun Cheng

TL;DR
FA-Depth introduces a fast, lightweight self-supervised monocular depth estimation method that balances high accuracy with real-time inference speed, suitable for deployment in resource-constrained environments.
Contribution
The paper proposes SmallDepth, a lightweight model, together with an equivalent transformation module (ETM), a pyramid loss, and a function approximation loss (APX) that improve depth estimation accuracy without increasing inference complexity.
Findings
Achieves state-of-the-art accuracy on the KITTI dataset.
Runs at over 500 frames per second with approximately 2 million parameters.
Improves robustness to illumination changes and left-right direction changes.
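To put the speed and size figures in perspective, the short sketch below converts them into a per-frame time budget and an approximate weight footprint. The helper names are hypothetical, and the 4 bytes per parameter figure assumes float32 storage, which the summary does not state.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def weight_size_mb(num_params, bytes_per_param=4):
    """Approximate weight storage in megabytes (assumes float32 by default)."""
    return num_params * bytes_per_param / 1e6


# Figures quoted in the findings: over 500 FPS, roughly 2 million parameters.
budget = frame_budget_ms(500)
size = weight_size_mb(2_000_000)

print(f"per-frame budget at 500 FPS: {budget} ms")
print(f"float32 weights for 2M parameters: {size} MB")
```

Running above 500 frames per second therefore means each frame takes under 2 ms, and about 2 million float32 parameters occupy roughly 8 MB.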
Abstract
Most existing methods rely on complex models to predict scene depth with high accuracy, resulting in slow inference that hinders deployment. To better balance precision and speed, we first design SmallDepth based on sparsity. Second, to enhance the feature representation ability of SmallDepth during training while keeping inference complexity unchanged, we propose an equivalent transformation module (ETM). Third, to improve the ability of each layer to perceive different context information with SmallDepth held fixed, and to improve the robustness of SmallDepth to left-right direction and illumination changes, we propose a pyramid loss. Fourth, to further improve the accuracy of SmallDepth, we use the proposed function approximation loss (APX) to transfer knowledge from the pretrained HQDecv2, obtained by optimizing the previous HQDec to address…
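The abstract does not describe how ETM is built. One common way to get a richer block during training at the same inference cost is structural re-parameterization, where parallel linear branches are folded into a single kernel after training. The sketch below illustrates only that general idea on a 1-D signal; it is an assumption, not the paper's confirmed design, and the names conv1d_same, training_forward, and merge_branches are hypothetical.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate a 1-D signal with an odd-length kernel, zero padded."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def training_forward(signal, k3, k1):
    """Training-time block: 3-tap branch + 1-tap branch + identity shortcut."""
    branch3 = conv1d_same(signal, k3)
    branch1 = conv1d_same(signal, k1)
    return [a + b + x for a, b, x in zip(branch3, branch1, signal)]


def merge_branches(k3, k1):
    """Fold the 1-tap branch and the identity into the centre of the 3-tap kernel."""
    merged = list(k3)
    # Both the 1-tap weight and the identity act only on the centre sample.
    merged[1] += k1[0] + 1.0
    return merged


signal = [1.0, 2.0, -1.0, 0.5, 3.0]
k3 = [0.25, -0.5, 0.75]  # illustrative weights, not learned values
k1 = [2.0]

train_out = training_forward(signal, k3, k1)              # three branches
infer_out = conv1d_same(signal, merge_branches(k3, k1))   # one merged kernel
max_gap = max(abs(a - b) for a, b in zip(train_out, infer_out))
print("largest difference between the two outputs:", max_gap)
```

Because every branch is linear, the merged kernel reproduces the three-branch output while the inference path runs a single convolution. Any nonlinearity or normalization between branches would have to be handled before such a merge.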
Taxonomy
Topics: Optical measurement and interference techniques · Image Processing Techniques and Applications · Advanced Vision and Imaging
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
