Cooking in the kitchen: Recognizing and Segmenting Human Activities in Videos
Hilde Kuehne, Juergen Gall, Thomas Serre

TL;DR
This paper presents an end-to-end generative approach that combines Fisher vector encoding with temporal models to recognize complex human activities in videos, and reports significant improvements over simpler models on real-world datasets.
Contribution
It introduces a novel combination of Fisher vector encoding with HMM-based temporal modeling for activity recognition in videos.
Findings
Structured temporal models outperform bag-of-words models when sufficient training data is available.
Fisher vectors combined with HMMs significantly improve accuracy in activity recognition.
The approach is validated on multiple real-world datasets with strong results.
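As a rough illustration of the HMM-based temporal modeling mentioned above, the sketch below shows generic Viterbi decoding over per-frame state scores, followed by collapsing the decoded path into segments. It is not the authors' implementation: the two states, the transition probabilities, and the per-frame log-likelihoods are all made-up values for illustration, and the Fisher vector encoding that would produce real frame scores is not shown.

```python
import math

def viterbi(log_emissions, log_trans, log_start):
    """Return the most likely state index for each frame.

    log_emissions[t][s]: log-likelihood of frame t under state s
    log_trans[i][j]:     log-probability of moving from state i to state j
    log_start[s]:        log-probability of starting in state s
    """
    n_frames = len(log_emissions)
    n_states = len(log_start)
    score = [[-math.inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]

    for s in range(n_states):
        score[0][s] = log_start[s] + log_emissions[0][s]

    for t in range(1, n_frames):
        for j in range(n_states):
            # Best predecessor state for landing in state j at frame t.
            best_i = max(range(n_states),
                         key=lambda i: score[t - 1][i] + log_trans[i][j])
            score[t][j] = (score[t - 1][best_i] + log_trans[best_i][j]
                           + log_emissions[t][j])
            back[t][j] = best_i

    # Trace back from the best final state.
    path = [max(range(n_states), key=lambda s: score[-1][s])]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

def segments(path):
    """Collapse a per-frame path into (state, start, end) runs, end exclusive."""
    out, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            out.append((path[start], start, t))
            start = t
    return out

# Hypothetical two-state, left-to-right model (state 0 then state 1).
NEG_INF = -math.inf
log_start = [0.0, NEG_INF]                      # always start in state 0
log_trans = [[math.log(0.5), math.log(0.5)],    # state 0: stay or advance
             [NEG_INF, 0.0]]                    # state 1: absorbing
hi, lo = math.log(0.9), math.log(0.1)
# Made-up scores: the first three frames favor state 0, the last three state 1.
log_emissions = [[hi, lo]] * 3 + [[lo, hi]] * 3

path = viterbi(log_emissions, log_trans, log_start)
segs = segments(path)
print(path)
print(segs)
```

Working in log space avoids numerical underflow on long videos, and the left-to-right transition matrix is one simple way to encode that sub-actions occur in a fixed order; other topologies would only change `log_trans`.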
Abstract
As research on action recognition matures, the focus is shifting away from categorizing basic task-oriented actions using hand-segmented video datasets to understanding complex goal-oriented daily human activities in real-world settings. Temporally structured models would seem obvious to tackle this set of problems, but so far, cases where these models have outperformed simpler unstructured bag-of-word types of models are scarce. With the increasing availability of large human activity datasets, combined with the development of novel feature coding techniques that yield more compact representations, it is time to revisit structured generative approaches. Here, we describe an end-to-end generative approach from the encoding of features to the structural modeling of complex human activities by applying Fisher vectors and temporal models for the analysis of video sequences. We…
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Multimodal Machine Learning Applications
