Interactive Attention AI to translate low light photos to captions for night scene understanding in women safety
Rajagopal A, Nirmala V, Arun Muthuraj Vedamanickam

TL;DR
This paper introduces a deep learning model that translates low-light night scenes into descriptive captions and lets the user steer the model's attention, aimed at safety applications for visually impaired women.
Contribution
The paper presents what the authors describe as the first interactive image captioning model for night scenes, in which user-directed attention is used to improve understanding of the environment in low-light conditions.
Findings
The model generates descriptive captions for night scenes.
User interaction steers the model's attention toward a specific person of interest.
The approach could enhance safety for visually impaired women in nighttime environments.
Abstract
There has been remarkable progress in Deep Learning based models for Image Captioning and Low Light image enhancement. For the first time in the literature, this paper develops a Deep Learning model that translates night scenes to sentences, opening new possibilities for AI applications in the safety of visually impaired women. Inspired by Image Captioning and Visual Question Answering, a novel Interactive Image Captioning approach is developed. A user can make the AI focus on any chosen person of interest by influencing the attention scoring. Attention context vectors are computed from CNN feature vectors and a user-provided start word. The Encoder-Attention-Decoder neural network learns to produce captions from low brightness images. This paper demonstrates how women's safety can be enabled by researching a novel AI capability in the Interactive Vision-Language model for perception of the environment in the…
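The abstract says only that attention context vectors are computed from CNN feature vectors and a user-provided start word. As a rough illustration of that idea, and not the authors' implementation, the sketch below uses standard additive (Bahdanau-style) attention, with the start-word embedding acting as the query. Everything here is an assumption made for illustration: the function names, the dimensions, the choice of additive scoring, and the random values standing in for trained weights and real CNN features.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def attention_context(features, query, w_feat, w_query, v):
    """Additive attention (assumed form, not taken from the paper).

    score_i = v . tanh(w_feat @ f_i + w_query @ query)
    weights = softmax(scores)
    context = sum_i weights_i * f_i
    """
    query_proj = matvec(w_query, query)
    scores = []
    for feat in features:
        feat_proj = matvec(w_feat, feat)
        hidden = [math.tanh(a + b) for a, b in zip(feat_proj, query_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, context


def rand_matrix(rows, cols):
    """Random matrix standing in for trained parameters or real features."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
# Hypothetical sizes chosen only to keep the example small.
FEAT_DIM, EMB_DIM, HID_DIM, NUM_REGIONS = 4, 3, 5, 6

features = rand_matrix(NUM_REGIONS, FEAT_DIM)  # stand-in for CNN region features
start_word = [random.uniform(-1, 1) for _ in range(EMB_DIM)]  # stand-in word embedding
w_feat = rand_matrix(HID_DIM, FEAT_DIM)
w_query = rand_matrix(HID_DIM, EMB_DIM)
v = [random.uniform(-1, 1) for _ in range(HID_DIM)]

weights, context = attention_context(features, start_word, w_feat, w_query, v)
print("attention weights:", [round(w, 3) for w in weights])
print("context vector:", [round(c, 3) for c in context])
```

Because the parameters here are random, the printed weights carry no meaning. In a trained model of this kind, changing the start word would change the query projection and therefore which image regions receive the most weight, which is the mechanism by which a user could direct the caption toward a chosen person.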
Taxonomy
Topics: Multimodal Machine Learning Applications · Video Surveillance and Tracking Methods · Human Pose and Action Recognition
