A scene perception system for visually impaired based on object detection and classification using multi-modal DCNN
Baljit Kaur, Jhilik Bhattacharya

TL;DR
This paper presents a cost-effective, multi-modal deep learning system that detects and classifies objects in outdoor scenes to assist visually impaired individuals, providing audio feedback based on real-time scene analysis.
Contribution
It introduces a novel multi-modal fusion framework using Faster R-CNN for improved object detection and classification in a wearable system for the visually impaired.
Findings
Effective object detection in outdoor traffic scenes
Real-time classification and distance estimation
Voice output system for user assistance
Abstract
This paper presents a cost-effective scene perception system aimed at visually impaired individuals. We use an ODROID system integrated with a USB camera and a USB laser that can be attached to the chest. The system classifies the detected objects, estimates their distance from the user, and provides a voice output. The experimental results provided in this paper use outdoor traffic scenes. The object detection and classification framework exploits a multi-modal fusion-based Faster R-CNN that uses motion, sharpening, and blurring filters for efficient feature representation.
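As a rough illustration of the filter-based modalities mentioned in the abstract, the sketch below derives three filtered views of a grayscale frame and stacks them into one array. It is not the authors' implementation: the abstract does not give kernel sizes or say how the views are fused inside Faster R-CNN, so the 3x3 kernels, the reading of "motion" as a horizontal motion-blur kernel, and the names `filter2d` and `build_modalities` are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical 3x3 kernels; the source does not specify filter parameters.
SHARPEN = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)
BLUR = np.full((3, 3), 1.0 / 9.0)  # simple box blur
MOTION = np.zeros((3, 3))
MOTION[1, :] = 1.0 / 3.0           # assumed horizontal motion blur


def filter2d(image, kernel):
    """Correlate a 2-D grayscale image with a small kernel (edge padding)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)), mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    # Accumulate one shifted, weighted copy of the image per kernel tap.
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def build_modalities(image):
    """Return an (H, W, 3) stack: motion-blurred, sharpened, and blurred views."""
    views = [filter2d(image, k) for k in (MOTION, SHARPEN, BLUR)]
    return np.stack(views, axis=-1)
```

A stack like this could be fed to a detector as separate input streams or as extra channels; which of these the paper uses is not stated in the abstract.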
