Real-time Monocular Depth Estimation with Sparse Supervision on Mobile
Mehmet Kerim Yucel, Valia Dimaridou, Anastasios Drosou, Albert Saà-Garriga

TL;DR
This paper presents a highly efficient monocular depth estimation model optimized for mobile devices, achieving competitive accuracy with minimal complexity through systematic design choices, knowledge distillation, and pruning.
Contribution
It demonstrates how to improve depth estimation models for mobile use without increasing complexity, using ablation studies and model optimization techniques.
Findings
Achieves 0.1156 WHDR on DIW with 2.6M parameters
Reaches 37 FPS on mobile GPU without hardware-specific optimization
Pruned model reaches 44 FPS with 1M parameters
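The WHDR figure above is the Weighted Human Disagreement Rate: the weighted share of human-annotated point pairs whose depth order the model predicts incorrectly, so lower is better. The following is a minimal sketch of how such a score could be computed, not the paper's evaluation code. It assumes one annotated pair per image, that larger predicted values mean farther away, and a hypothetical label convention (+1 if point A is farther than point B, -1 if closer). The names `whdr`, `Pair`, and `DepthMap` are illustrative only.

```python
from typing import Optional, Sequence, Tuple

# (row_a, col_a, row_b, col_b, label). The label convention is an assumption
# made for this sketch: +1 if annotators marked A as farther than B, -1 if closer.
Pair = Tuple[int, int, int, int, int]
DepthMap = Sequence[Sequence[float]]


def whdr(depth_maps: Sequence[DepthMap],
         pairs: Sequence[Pair],
         weights: Optional[Sequence[float]] = None) -> float:
    """Weighted fraction of annotated pairs whose predicted order is wrong."""
    if len(depth_maps) != len(pairs):
        raise ValueError("expected exactly one annotated pair per depth map")
    if weights is None:
        # With uniform weights the score is a plain disagreement rate.
        weights = [1.0] * len(pairs)
    if len(weights) != len(pairs):
        raise ValueError("expected exactly one weight per annotated pair")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    wrong = 0.0
    for depth, (ra, ca, rb, cb, label), w in zip(depth_maps, pairs, weights):
        # Assumption: a larger predicted value means the point is farther away.
        diff = depth[ra][ca] - depth[rb][cb]
        predicted = 1 if diff > 0 else -1 if diff < 0 else 0
        # A predicted tie (0) never matches a +1/-1 label, so it counts as wrong.
        if predicted != label:
            wrong += w
    return wrong / total


if __name__ == "__main__":
    toy = [[1.0, 2.0], [3.0, 4.0]]  # toy 2x2 relative depth map
    annotations = [(0, 0, 1, 1, -1), (1, 1, 0, 0, -1)]
    print(whdr([toy, toy], annotations))
```

Because only the sign of the depth difference is used, the score is unchanged by any positive rescaling of the prediction, which is why it suits relative (rather than metric) depth models.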
Abstract
Monocular (relative or metric) depth estimation is a critical task for various applications, such as autonomous vehicles, augmented reality and image editing. In recent years, with the increasing availability of mobile devices, accurate and mobile-friendly depth models have gained importance. Increasingly accurate models typically require more computational resources, which inhibits the use of such models on mobile devices. The mobile use case is arguably the most unrestricted one, which requires highly accurate yet mobile-friendly architectures. Therefore, we try to answer the following question: How can we improve a model without adding further complexity (i.e. parameters)? Towards this end, we systematically explore the design space of a relative depth estimation model from various dimensions and we show, with key design choices and ablation studies, even an existing architecture can…
Taxonomy
Methods: Pruning
