Learning Asynchronous and Sparse Human-Object Interaction in Videos

Romero Morais; Vuong Le; Svetha Venkatesh; Truyen Tran

arXiv:2103.02758·cs.CV·March 5, 2021

TL;DR

This paper introduces ASSIGN, a recurrent graph network that automatically detects and models the asynchronous and sparse interactions in videos, improving human-object interaction recognition without external segmentation.

Contribution

The paper presents a novel graph network model that learns the dynamic, asynchronous, and sparse interactions in videos for human-object activity recognition.

Findings

01

Superior performance in segmenting and labeling sub-activities.

02

Eliminates the need for external segmentation.

03

Effective modeling of asynchronous and sparse interactions.

Abstract

Human activities can be learned from video. With effective modeling, it is possible to discover not only the action labels but also the temporal structure of the activities, such as the progression of sub-activities. Automatically recognizing such structure from the raw video signal is a new capability that promises authentic modeling and successful recognition of human-object interactions. Toward this goal, we introduce Asynchronous-Sparse Interaction Graph Networks (ASSIGN), a recurrent graph network that automatically detects the structure of interaction events associated with entities in a video scene. ASSIGN pioneers learning of the autonomous behavior of video entities, including their dynamic structure and their interactions with coexisting neighbors. Entities' lives in our model are asynchronous to those of others and are therefore more flexible in adapting to complex…
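To make the idea of asynchronous, sparse updates concrete, here is a toy sketch. It is not the ASSIGN architecture: it assumes scalar entity states, replaces any learned event detection with a fixed threshold, and replaces learned message passing with a plain average of neighbor states. The names `Entity`, `step`, `run`, and `threshold` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A video entity (human or object) with its own held state."""
    name: str
    state: float = 0.0
    update_frames: list = field(default_factory=list)


def step(entities, observations, frame, threshold=0.5):
    """Advance one frame. An entity updates only when its observation
    differs from its held state by more than `threshold` (assumed rule)."""
    # Snapshot states so updates within a frame do not see each other.
    previous = {e.name: e.state for e in entities}
    for e in entities:
        obs = observations[e.name]
        if abs(obs - previous[e.name]) <= threshold:
            continue  # no event: the entity keeps its state untouched
        # Sparse interaction: neighbors are read only at this entity's event.
        neighbours = [s for n, s in previous.items() if n != e.name]
        message = sum(neighbours) / len(neighbours) if neighbours else 0.0
        e.state = 0.5 * obs + 0.5 * message
        e.update_frames.append(frame)


def run(frames):
    """Process a list of per-frame observation dicts for two toy entities."""
    entities = [Entity("human"), Entity("cup")]
    for t, obs in enumerate(frames):
        step(entities, obs, t)
    return entities
```

Calling `run` on a short sequence and inspecting each entity's `update_frames` shows that the two entities change state at different frames, which is the asynchronous behavior the abstract describes, and that most frames trigger no update at all, which is the sparse part.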
