Human Stone Toolmaking Action Grammar (HSTAG): A Challenging Benchmark for Fine-grained Motor Behavior Recognition

Cheng Liu; Xuyang Yan; Zekun Zhang; Cheng Ding; Tianhao Zhao; Shaya Jannati; Cynthia Martinez; and Dietrich Stout

arXiv:2410.08410·cs.CV·October 14, 2024

Open Access

TL;DR

HSTAG is a new, richly annotated video dataset of expert stone toolmaking, designed to challenge and advance fine-grained motor behavior recognition algorithms with its complex, rapid, and variable actions.

Contribution

The paper introduces HSTAG, a novel dataset with detailed annotations of stone toolmaking behaviors, addressing the lack of domain-specific benchmarks for complex motor actions.

Findings

01

Existing models face challenges with HSTAG's rapid, variable actions.

02

HSTAG's complexity reveals limitations of current action recognition algorithms.

03

The dataset highlights the need for more sophisticated models for fine-grained motor behavior recognition.

Abstract

Action recognition has seen a growing number of novel algorithms and datasets over the past decade. However, most public benchmarks are built around activities of daily living and annotated at a coarse-grained level, so domain-specific datasets remain scarce, especially for rarely seen domains. In this paper, we introduce Human Stone Toolmaking Action Grammar (HSTAG), a meticulously annotated video dataset of previously undocumented stone toolmaking behaviors. It can be used to investigate how advanced artificial intelligence techniques handle a rapid succession of complex interactions between two hand-held objects. HSTAG consists of 18,739 video clips recording 4.5 hours of experts' stone toolmaking activity. Its unique features include (i) brief action durations and frequent…
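To give a rough sense of how brief the actions are, the two figures quoted in the abstract can be combined into an average clip length. This is only a back-of-the-envelope sketch: it assumes the 18,739 clips together cover the full 4.5 hours with no overlap or gaps, which the abstract does not state, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
total_hours = 4.5
num_clips = 18_739

# Assumption (not stated in the source): the clips tile the full
# recording time with no overlap and no unannotated gaps.
total_seconds = total_hours * 3600
mean_clip_seconds = total_seconds / num_clips

print(f"Average clip length: {mean_clip_seconds:.2f} s")
```

Under that assumption the average clip lasts a little under one second, which is consistent with the abstract's description of brief action durations.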

Taxonomy

Topics: Human Pose and Action Recognition · Action Observation and Synchronization · Context-Aware Activity Recognition Systems