You-Do, I-Learn: Unsupervised Multi-User Egocentric Approach Towards Video-Based Guidance
Dima Damen, Teesid Leelasawassuk, Walterio Mayol-Cuevas

TL;DR
This paper introduces an unsupervised egocentric video analysis method that automatically extracts object usage guidance from multi-user data, enabling assistive task support and object interaction understanding.
Contribution
It presents a novel unsupervised multi-user approach for extracting object usage guidance from egocentric videos, including object discovery, modeling, and interaction dependency analysis.
Findings
Effective object discovery and modeling from egocentric videos.
Successful online and offline guidance generation.
Demonstrated on daily tasks like printer setup and coffee preparation.
Abstract
This paper presents an unsupervised approach towards automatically extracting video-based guidance on object usage, from egocentric video and wearable gaze tracking, collected from multiple users while performing tasks. The approach i) discovers task relevant objects, ii) builds a model for each, iii) distinguishes different ways in which each discovered object has been used and iv) discovers the dependencies between object interactions. The work investigates using appearance, position, motion and attention, and presents results using each and a combination of relevant features. Moreover, an online scalable approach is presented and is compared to offline results. The paper proposes a method for selecting a suitable video guide to be displayed to a novice user indicating how to use an object, purely triggered by the user's gaze. The potential assistive mode can also recommend an object…
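
Step iv) of the abstract, discovering dependencies between object interactions, can be illustrated with a small sketch. This is not the paper's algorithm: it assumes each user's recording has already been reduced to an ordered list of discovered-object labels, and it treats object `a` as a prerequisite of object `b` when `a` is first used before `b` in every sequence that contains both. The function name `precedence_dependencies` and the object labels are hypothetical.

```python
from itertools import permutations


def precedence_dependencies(sequences):
    """Return pairs (a, b) where a's first use precedes b's first use
    in every sequence that contains both objects.

    Assumption: each sequence is one user's ordered list of object labels.
    This is an illustrative rule, not the method from the paper.
    """
    objects = {obj for seq in sequences for obj in seq}
    deps = set()
    for a, b in permutations(sorted(objects), 2):
        # Only sequences in which both objects appear can support the pair.
        shared = [seq for seq in sequences if a in seq and b in seq]
        if shared and all(seq.index(a) < seq.index(b) for seq in shared):
            deps.add((a, b))
    return deps


# Hypothetical interaction sequences from three users making coffee.
sequences = [
    ["cup", "coffee_machine", "sugar"],
    ["cup", "sugar", "coffee_machine"],
    ["cup", "coffee_machine"],
]
deps = precedence_dependencies(sequences)
```

In this toy data the cup is always used first, so it precedes both other objects, while the coffee machine and sugar appear in either order across users and yield no dependency between them.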
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Visual Attention and Saliency Detection · Augmented Reality Applications
