A Computer Vision and Depth Sensor-Powered Smart Cane for Real-Time Obstacle Detection and Navigation Assistance for the Visually Impaired

Sunkalp Chandra; Umang Sharma; Devesh Khilnani

arXiv:2508.16698 · q-bio.OT · August 26, 2025

TL;DR

This paper presents the IoT Cane, a smart navigation aid for the visually impaired that combines real-time computer vision, depth sensing, and haptic/audio feedback, outperforming traditional ultrasound-based canes in obstacle detection.

Contribution

Introduction of a novel IoT-enabled smart cane integrating transformer-based vision models and depth sensors for improved obstacle detection and navigation assistance.
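
To make the idea of pairing a vision model with a depth sensor concrete, here is a minimal sketch of one common way to do it: take the median depth inside a detection's bounding box as the obstacle distance. This is not taken from the paper. The function name `obstacle_distance_m`, the box format `(x1, y1, x2, y2)`, the convention that a depth of zero means "no reading", and the `min_valid_m` cutoff are all assumptions made for illustration.

```python
from statistics import median


def obstacle_distance_m(depth_m, box, min_valid_m=0.1):
    """Estimate the distance to a detected object, in metres.

    depth_m: 2-D list of per-pixel depth in metres (rows of equal length);
             values below min_valid_m are treated as missing (assumption).
    box:     (x1, y1, x2, y2) pixel box from a hypothetical detector,
             with x2 and y2 exclusive.
    Returns the median valid depth inside the box, or None if there is none.
    """
    rows = len(depth_m)
    cols = len(depth_m[0]) if rows else 0
    x1, y1, x2, y2 = (int(v) for v in box)

    # Clamp the box to the image so slicing stays inside the depth map.
    x1, x2 = max(0, x1), min(cols, x2)
    y1, y2 = max(0, y1), min(rows, y2)
    if x2 <= x1 or y2 <= y1:
        return None

    # Collect valid readings; the median is robust to a few stray pixels.
    valid = [d for row in depth_m[y1:y2] for d in row[x1:x2] if d >= min_valid_m]
    return median(valid) if valid else None


# Toy 4x6 depth map: background at 2.0 m, a nearer object in the middle.
depth = [[2.0] * 6 for _ in range(4)]
depth[1][2], depth[1][3] = 0.0, 0.8   # one missing reading, one valid
depth[2][2], depth[2][3] = 0.8, 0.9
distance = obstacle_distance_m(depth, (2, 1, 4, 3))
```

The median is used rather than the mean so that background pixels near the box edge or isolated sensor dropouts do not dominate the estimate. A real pipeline would also need the depth and colour frames to be aligned, which this sketch assumes has already been done.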

Findings

01

Achieved 53.4% mAP and 71.7% AP50 on challenging datasets.

02

End-to-end latency of approximately 150 ms per frame.

03

Outperforms similar ultrasound-based systems in obstacle detection accuracy.
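
The AP50 figure in the first finding rests on Intersection over Union (IoU): a predicted box counts as a correct detection only if its IoU with a ground-truth box of the same class is at least 0.5. The sketch below shows that matching criterion for a single pair of boxes. The helper names `iou` and `is_match_at_50` and the `(x1, y1, x2, y2)` box format are illustrative assumptions, and the paper's actual evaluation code is not shown in the source.

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Width and height of the overlap region, floored at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred, truth):
    """True if the predicted box would count as a hit at the 0.5 IoU threshold."""
    return iou(pred, truth) >= 0.5


truth = (0, 0, 10, 10)
close_pred = (2, 0, 12, 10)   # overlap 80, union 120, IoU = 2/3
loose_pred = (5, 0, 15, 10)   # overlap 50, union 150, IoU = 1/3
```

Here `close_pred` would count as a hit and `loose_pred` would not. Full average precision additionally ranks detections by confidence and integrates precision over recall, which is omitted here.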

Abstract

Visual impairment affects more than 2.2 billion people worldwide and greatly restricts independent mobility and access. Conventional mobility aids, such as white canes and ultrasound-based intelligent canes, are inherently limited in the feedback they can offer and generally cannot differentiate among types of obstacles in dense or complex environments. Here, we introduce the IoT Cane, an Internet of Things assistive navigation tool that combines real-time computer vision, using a transformer-based RT-DETRv3-R50 model, with depth sensing through the Intel RealSense camera. Our prototype records an mAP of 53.4% and an AP50 of 71.7% when tested on difficult datasets with low Intersection over Union (IoU) thresholds, outperforming similar ultrasound-based systems. End-to-end latency is around 150 ms per frame, accounting for preprocessing (1-3 ms), inference (50-70…
