Multi-modal On-Device Learning for Monocular Depth Estimation on Ultra-low-power MCUs
Davide Nadalini, Manuele Rusci, Elia Cereda, Luca Benini, Francesco Conti, Daniele Palossi

TL;DR
This paper introduces a multi-modal on-device learning method for monocular depth estimation on ultra-low-power IoT devices, enabling in-field adaptation to new environments with low energy and memory use.
Contribution
It presents a novel on-device training scheme with a memory-efficient sparse update, allowing accurate depth estimation adaptation directly on IoT hardware.
Findings
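The memory-efficient sparse update incurs accuracy drops of only 2% and 1.5% on the KITTI and NYUv2 datasets, respectively.
Reduces the root mean squared error (RMSE) from 4.9 m to 0.6 m after 17.8 minutes of on-device learning.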
Uses only 3,000 self-labeled samples for effective in-field adaptation.
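For readers unfamiliar with the RMSE metric quoted above, the following is a minimal sketch of how it can be computed between predicted depths and reference depths, both in meters. The function name `rmse` and the sample values are hypothetical illustrations, not the authors' code or data.

```python
import math


def rmse(predicted, target):
    """Root mean squared error between two equal-length sequences of depths (meters)."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical depths in meters for four pixels; NOT data from the paper.
labels = [1.0, 2.0, 3.0, 4.0]   # reference depths, e.g. from a depth sensor
before = [2.0, 4.0, 1.0, 6.0]   # predictions before adaptation (illustrative)
after = [1.5, 2.5, 2.5, 4.5]    # predictions after adaptation (illustrative)

print(f"RMSE before: {rmse(before, labels):.3f} m")
print(f"RMSE after:  {rmse(after, labels):.3f} m")
```

Because the errors are squared before averaging, a lower RMSE means the predicted depths are closer to the reference on average, with large errors penalized more heavily than small ones.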
Abstract
Monocular depth estimation (MDE) plays a crucial role in enabling spatially-aware applications in Ultra-low-power (ULP) Internet-of-Things (IoT) platforms. However, the limited number of parameters of Deep Neural Networks for the MDE task, designed for IoT nodes, results in severe accuracy drops when the sensor data observed in the field shifts significantly from the training dataset. To address this domain shift problem, we present a multi-modal On-Device Learning (ODL) technique, deployed on an IoT device integrating a GreenWaves GAP9 MicroController Unit (MCU), an 80 mW monocular camera and an 8 × 8 pixel depth sensor, consuming 300 mW. In its normal operation, this setup feeds a tiny 107k-parameter PyD-Net model with monocular images for inference. The depth sensor, usually deactivated to minimize energy consumption, is only activated alongside the camera to collect…
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Optical Sensing Technologies · Robotics and Sensor-Based Localization
