Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze
I. Hipiny, H. Ujir, J.L. Minoi, S.F. Samson Juan, M.A. Khairuddin, M.S. Sunar

TL;DR
This paper introduces an unsupervised method for segmenting action segments in egocentric videos by leveraging gaze data and motion parameters, improving the identification of natural activity boundaries.
Contribution
The work presents a novel gaze-based approach that uses simple motion parameters and entropy measures to automatically segment egocentric videos without supervision.
Findings
Effective segmentation on BRISGAZE-ACTIONS dataset
Improved temporal cut quality with entropy measures
Demonstrated applicability to daily-living activities
Abstract
Unsupervised segmentation of action segments in egocentric videos is a desirable feature in tasks such as activity recognition and content-based video retrieval. Reducing the search space into a finite set of action segments facilitates faster and less noisy matching. However, there exists a substantial gap in machine understanding of natural temporal cuts during a continuous human activity. This work reports on a novel gaze-based approach for segmenting action segments in videos captured using an egocentric camera. Gaze is used to locate the region-of-interest inside a frame. By tracking two simple motion-based parameters inside successive regions-of-interest, we discover a finite set of temporal cuts. We present several results using combinations (of the two parameters) on a dataset, i.e., BRISGAZE-ACTIONS. The dataset contains egocentric videos depicting several daily-living…
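The pipeline outlined in the abstract (gaze selects a region-of-interest, two motion-based parameters are tracked across successive regions, and temporal cuts are derived from them) can be illustrated with a minimal sketch. The text here does not name the two parameters, so the code below substitutes two hypothetical stand-ins: gaze displacement between frames and mean intensity change between successive regions-of-interest. The function names, the square ROI, the thresholds, and the `min_gap` rule are likewise assumptions made for illustration, not the authors' method.

```python
import numpy as np

def crop_roi(frame, gaze_xy, half=16):
    """Crop a square ROI centred on the gaze point, clamped to the frame.
    Assumes a 2-D grayscale frame at least 2*half pixels on each side."""
    h, w = frame.shape
    x, y = int(round(gaze_xy[0])), int(round(gaze_xy[1]))
    x = min(max(x, half), w - half)
    y = min(max(y, half), h - half)
    return frame[y - half:y + half, x - half:x + half].astype(np.float64)

def motion_parameters(frames, gaze, half=16):
    """Return two per-transition signals (both are illustrative assumptions):
    gaze displacement in pixels, and mean absolute intensity change
    between successive ROIs."""
    rois = [crop_roi(f, g, half) for f, g in zip(frames, gaze)]
    gaze = np.asarray(gaze, dtype=np.float64)
    displacement = np.linalg.norm(np.diff(gaze, axis=0), axis=1)
    appearance = np.array(
        [np.abs(b - a).mean() for a, b in zip(rois[:-1], rois[1:])]
    )
    return displacement, appearance

def temporal_cuts(displacement, appearance, disp_thresh, app_thresh, min_gap=5):
    """Propose a cut at frame t+1 when both signals exceed their thresholds
    on the transition t -> t+1, keeping cuts at least min_gap frames apart."""
    candidates = np.flatnonzero(
        (displacement > disp_thresh) & (appearance > app_thresh)
    ) + 1
    cuts = []
    for t in candidates:
        if not cuts or t - cuts[-1] >= min_gap:
            cuts.append(int(t))
    return cuts
```

Calling `motion_parameters(frames, gaze)` on a list of grayscale frames and matching `(x, y)` gaze points yields the two signals, and `temporal_cuts` turns them into frame indices. Requiring both signals to fire is only one possible combination; the abstract reports results for several combinations of the two parameters.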
