LMDepth: Lightweight Mamba-based Monocular Depth Estimation for Real-World Deployment
Jiahuan Long, Xin Zhou

TL;DR
LMDepth is a lightweight monocular depth estimation network that balances high accuracy with low computational cost, making it suitable for deployment on resource-constrained devices.
Contribution
The paper introduces LMDepth, a novel Mamba-based architecture with a pyramid spatial pooling module and depth Mamba blocks, offering a lightweight alternative to Transformer-based methods.
Findings
LMDepth outperforms previous lightweight methods on the NYUDv2 and KITTI datasets.
It achieves higher accuracy with fewer parameters and lower computational cost (GFLOPs).
It was successfully deployed on an embedded platform with INT8 quantization.
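To make the INT8 deployment finding concrete, here is a minimal sketch of symmetric per-tensor quantization, one common way to map floating-point weights to 8-bit integers. This summary does not say which quantization scheme or toolchain the authors used, so the function names `quantize_int8` and `dequantize_int8` and the sample weights are illustrative assumptions, not the paper's method.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the range [-127, 127].

    Assumption: a generic scheme for illustration, not the one used for LMDepth.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Guard against an all-zero tensor, where any positive scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Map int8 codes back to approximate float values."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only to show the round trip.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The round-trip error of each value is bounded by half the scale, which is why quantization tends to cost little accuracy when the weight range is narrow.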
Abstract
Monocular depth estimation provides an additional depth dimension to RGB images, making it widely applicable in various fields such as virtual reality, autonomous driving and robotic navigation. However, existing depth estimation algorithms often struggle to effectively balance performance and computational efficiency, which poses challenges for deployment on resource-constrained devices. To address this, we propose LMDepth, a lightweight Mamba-based monocular depth estimation network, designed to reconstruct high-precision depth information while maintaining low computational overhead. Specifically, we propose a modified pyramid spatial pooling module that serves as a multi-scale feature aggregator and context extractor, ensuring global spatial information for accurate depth estimation. Moreover, we integrate multiple depth Mamba blocks into the decoder. Designed with linear…
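The idea of a pyramid pooling module as a multi-scale aggregator can be sketched as follows: pool the feature map at several grid sizes, upsample each result back to the input resolution, and stack the results with the original map. This is a simplified, single-channel version of generic pyramid pooling. The abstract does not detail the authors' modifications, and real modules typically add convolutions after pooling, so the names `adaptive_avg_pool`, `upsample_nearest`, `pyramid_pool`, and the bin sizes `(1, 2, 4)` are assumptions made for illustration.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def adaptive_avg_pool(feature, bins):
    """Average-pool a 2D map (list of rows) down to a bins x bins grid."""
    h, w = len(feature), len(feature[0])
    pooled = []
    for i in range(bins):
        r0, r1 = (i * h) // bins, ceil_div((i + 1) * h, bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, ceil_div((j + 1) * w, bins)
            cells = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, h, w):
    """Nearest-neighbour upsampling of a square grid back to h x w."""
    bins = len(pooled)
    return [[pooled[r * bins // h][c * bins // w] for c in range(w)]
            for r in range(h)]


def pyramid_pool(feature, bin_sizes=(1, 2, 4)):
    """Stack the input with its pooled-and-upsampled versions, one per scale."""
    h, w = len(feature), len(feature[0])
    channels = [feature]
    for bins in bin_sizes:
        channels.append(upsample_nearest(adaptive_avg_pool(feature, bins), h, w))
    return channels


# Toy 4x4 feature map holding the values 0..15.
feature = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
stacked = pyramid_pool(feature)
```

The coarsest scale (one bin) carries the global average to every pixel, which is how this kind of module injects global context alongside local detail.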
Taxonomy
Topics: Hand Gesture Recognition Systems · Advanced Vision and Imaging · Robotic Mechanisms and Dynamics
Methods: Softmax · Attention Is All You Need · Mamba: Linear-Time Sequence Modeling with Selective State Spaces
