Trajectory Aligned Features For First Person Action Recognition
Suriya Singh, Chetan Arora, C. V. Jawahar

TL;DR
This paper introduces a novel trajectory-based feature representation for first person action recognition in egocentric videos, improving accuracy without requiring hand/object segmentation.
Contribution
The proposed method uses simple point tracking to derive features that outperform existing techniques by over 11% in accuracy on public datasets.
Findings
Achieved an accuracy improvement of over 11% over existing techniques on public datasets.
Effective in recognizing actions even without visible hands or objects.
Does not require segmentation of hands or objects.
Abstract
Egocentric videos are characterised by their first person view. With the popularity of Google Glass and GoPro, the use of egocentric videos is on the rise. Recognizing the action of the wearer from egocentric videos is an important problem. Unstructured movement of the camera due to the natural head motion of the wearer causes sharp changes in the visual field of the egocentric camera, causing many standard third person action recognition techniques to perform poorly on such videos. Objects present in the scene and hand gestures of the wearer are the most important cues for first person action recognition but are difficult to segment and recognize in an egocentric video. We propose a novel representation of first person actions derived from feature trajectories. The features are simple to compute using standard point tracking and do not assume segmentation of hand/objects…
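To make the idea of a feature "derived from feature trajectories" concrete, here is a minimal sketch in Python. It is not the paper's descriptor: the abstract is truncated and does not specify the features, so this uses a normalized-displacement shape descriptor of the kind commonly used in trajectory-based action recognition. The function name `trajectory_shape_descriptor` and the sample track are hypothetical, and the input is assumed to be the (x, y) positions of one tracked point over consecutive frames, as any standard point tracker would produce.

```python
import math


def trajectory_shape_descriptor(points):
    """Illustrative descriptor for one tracked point (an assumption, not the paper's feature).

    `points` is a list of (x, y) positions over consecutive frames.
    Returns the frame-to-frame displacements, flattened and divided by
    the total path length.
    """
    if len(points) < 2:
        raise ValueError("a trajectory needs at least two points")

    # Frame-to-frame displacement vectors along the trajectory.
    displacements = [
        (x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]

    # Dividing by the total path length makes the descriptor depend on
    # the shape of the motion rather than on its overall magnitude.
    total = sum(math.hypot(dx, dy) for dx, dy in displacements)
    if total == 0:
        # A stationary point has no motion shape; return all zeros.
        return [0.0] * (2 * len(displacements))

    return [component / total for d in displacements for component in d]


# Hypothetical track: one point followed over three frames.
track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
descriptor = trajectory_shape_descriptor(track)
```

Nothing in this sketch needs a hand or object mask, only point positions over time, which matches the abstract's claim that the features do not assume segmentation.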
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Multimodal Machine Learning Applications
