Saliency Driven Object recognition in egocentric videos with deep CNN
Philippe Pérez de San Roman, Jenny Benois-Pineau, Jean-Philippe Domenger, Florent Paclet, Daniel Cataert, Aymar de Rugy

TL;DR
This paper presents a real-time object recognition framework using deep CNNs driven by saliency maps from gaze fixations, specifically designed for egocentric videos to assist upper-limb amputees in controlling neuro-prostheses.
Contribution
It introduces a novel saliency-driven approach combining gaze-based attention maps with deep CNNs for real-time object recognition in egocentric videos.
Findings
Achieved a mean Average Precision (mAP) of 64.6%.
Recognition time is shorter than the duration of a gaze fixation.
Framework suitable for assistive neuro-prosthetic applications.
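For readers unfamiliar with the mAP figure quoted above, the following is a minimal sketch of one common way to compute it: average precision per class from ranked detections, then the mean over classes. The function names are hypothetical, and this summary does not say which AP variant the paper uses (for example, interpolated or non-interpolated), so treat the non-interpolated form below as an assumption.

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated AP for one class (assumed variant, not confirmed by the paper).

    scores: confidence for each candidate; labels: 1 if the candidate is a true
    instance of the class, 0 otherwise.
    """
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = np.asarray(labels, dtype=float)[order]
    if hits.sum() == 0:
        return 0.0  # no positives for this class
    # Precision measured at every rank, then averaged over the ranks that are hits.
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_k * hits).sum() / hits.sum())

def mean_average_precision(per_class):
    """per_class: list of (scores, labels) pairs, one pair per object class."""
    return float(np.mean([average_precision(s, l) for s, l in per_class]))
```

With two toy classes, `mean_average_precision([([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]), ([0.9, 0.1], [0, 1])])` averages the two per-class AP values into a single score.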
Abstract
The problem of object recognition in natural scenes has recently been successfully addressed with Deep Convolutional Neural Networks, giving a significant breakthrough in recognition scores. The computational efficiency of Deep CNNs, as a function of their depth, allows for their use in real-time applications. One of the key issues here is to reduce the number of windows selected from images to be submitted to a Deep CNN. This is usually solved by preliminary segmentation and selection of specific windows having outstanding "objectness" or other indicators of the possible location of objects. In this paper we propose a Deep CNN approach and a general framework for recognition of objects in a real-time scenario and from an egocentric perspective. Here the window of interest is built on the basis of a visual attention map computed over gaze fixations measured by a glass-worn…
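The idea of building a single window of interest from gaze fixations, instead of scoring many candidate windows, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the excerpt does not specify the saliency model, so the sum-of-Gaussians map, the `sigma` and `threshold` parameters, and both function names are hypothetical.

```python
import numpy as np

def gaze_saliency_map(fixations, height, width, sigma=20.0):
    """Sum of isotropic Gaussians centred on each (x, y) fixation, scaled to [0, 1].

    The Gaussian model and sigma (in pixels) are assumptions for illustration.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    saliency = np.zeros((height, width), dtype=np.float64)
    for fx, fy in fixations:
        saliency += np.exp(-((xs - fx) ** 2 + (ys - fy) ** 2) / (2.0 * sigma ** 2))
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency

def window_of_interest(saliency, threshold=0.5):
    """Bounding box (x0, y0, x1, y1) of pixels with saliency >= threshold.

    x1 and y1 are exclusive, so frame[y0:y1, x0:x1] crops the window.
    Returns None when no pixel reaches the threshold.
    """
    ys, xs = np.nonzero(saliency >= threshold)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1
```

The cropped region `frame[y0:y1, x0:x1]` would then be the only window submitted to the CNN for that frame, which is what keeps the per-frame cost low in this kind of design.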
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
