Target Adaptive Context Aggregation for Video Scene Graph Generation
Yao Teng, Limin Wang, Zhifeng Li, Gangshan Wu

TL;DR
This paper introduces TRACE, a novel framework for video scene graph generation that captures spatio-temporal context efficiently, achieving state-of-the-art results by decoupling relation prediction from entity tracking.
Contribution
The paper proposes a new detect-to-track paradigm with a hierarchical relation tree and target-adaptive context aggregation for improved VidSGG.
Findings
Achieves state-of-the-art performance on VidSGG benchmarks.
Introduces a modular framework with HRTree and context aggregation blocks.
Demonstrates effectiveness on ImageNet-VidVRD and Action Genome datasets.
Abstract
This paper deals with the challenging task of video scene graph generation (VidSGG), which could serve as a structured video representation for high-level understanding tasks. We present a new detect-to-track paradigm for this task by decoupling the context modeling for relation prediction from the complicated low-level entity tracking. Specifically, we design an efficient method for frame-level VidSGG, termed Target Adaptive Context Aggregation Network (TRACE), with a focus on capturing spatio-temporal context information for relation recognition. Our TRACE framework streamlines the VidSGG pipeline with a modular design, and presents two unique blocks: Hierarchical Relation Tree (HRTree) construction and Target-adaptive Context Aggregation. More specifically, our HRTree first provides an adaptive structure for organizing possible relation candidates efficiently, and guides…
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Machine Learning in Healthcare
