TL;DR
This paper introduces a multi-task learning framework using graph convolutional networks for more accurate human activity segmentation and ergonomic risk assessment in long videos, addressing limitations of previous scene-dependent methods.
Contribution
It presents a novel multi-task approach combining human activity segmentation with ergonomic risk assessment using graph-based models, improving accuracy and generalizability.
Findings
Effective segmentation of activities in long videos.
Improved activity assessment accuracy.
Analysis of success and failure cases.
Abstract
We propose a new approach to Human Activity Evaluation (HAE) in long videos using graph-based multi-task modeling. Previous works in activity evaluation either directly compute a metric using a detected skeleton or use the scene information to regress the activity score. These approaches are insufficient for accurate activity assessment since they only compute an average score over a clip and do not consider the correlation between the joints and body dynamics. Moreover, they are highly scene-dependent, which makes the generalizability of these methods questionable. We propose a novel multi-task framework for HAE that utilizes a Graph Convolutional Network backbone to embed the interconnections between human joints in the features. In this framework, we solve the Human Activity Segmentation (HAS) problem as an auxiliary task to improve activity assessment. The HAS head is powered by an…
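As a rough illustration of the idea described in the abstract, the sketch below shows one graph convolution over skeleton joints feeding two task heads: per-frame activity segmentation and a per-frame assessment score. This is not the authors' architecture. The toy 5-joint skeleton, the layer sizes, the mean pooling over joints, the per-frame score head, and all names (`normalized_adjacency`, `graph_conv`, `multi_task_forward`, `params`) are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

def normalized_adjacency(edges, num_joints):
    """Symmetric-normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    a = np.eye(num_joints)
    for i, j in edges:
        a[i, j] = 1.0
        a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, a_hat, w):
    """One graph convolution with ReLU.

    x: (frames, joints, in_dim), a_hat: (joints, joints), w: (in_dim, out_dim).
    Each joint's output mixes features from its skeletal neighbours.
    """
    return np.maximum(a_hat @ x @ w, 0.0)

def multi_task_forward(x, a_hat, params):
    """Shared graph backbone followed by two heads (hypothetical layout)."""
    h = graph_conv(x, a_hat, params["w_gcn"])    # (frames, joints, hidden)
    frame_feat = h.mean(axis=1)                  # pool joints -> (frames, hidden)
    seg_logits = frame_feat @ params["w_seg"]    # auxiliary head: activity class per frame
    score = (frame_feat @ params["w_score"])[:, 0]  # main head: one score per frame
    return seg_logits, score

# Toy setup (all sizes are assumptions): 5 joints, 8 frames, 3-D joint coordinates.
rng = np.random.default_rng(0)
num_joints, num_frames, in_dim, hidden, num_classes = 5, 8, 3, 16, 4
edges = [(0, 1), (1, 2), (1, 3), (1, 4)]         # hypothetical skeleton links
a_hat = normalized_adjacency(edges, num_joints)

params = {
    "w_gcn": rng.normal(size=(in_dim, hidden)),
    "w_seg": rng.normal(size=(hidden, num_classes)),
    "w_score": rng.normal(size=(hidden, 1)),
}
x = rng.normal(size=(num_frames, num_joints, in_dim))
seg_logits, score = multi_task_forward(x, a_hat, params)
print(seg_logits.shape, score.shape)
```

In a multi-task setup of this kind, both heads would typically be trained jointly with a weighted sum of a segmentation loss and a score-regression loss, so that the shared backbone benefits from the auxiliary task; the actual losses and head designs are not given in the excerpt above.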
Taxonomy
Methods: Convolution · Batch Normalization · 1x1 Convolution · Thinned U-shape Module
