OED: Towards One-stage End-to-End Dynamic Scene Graph Generation

Guan Wang; Zhimin Li; Qingchao Chen; Yang Liu

arXiv:2405.16925·cs.CV·May 28, 2024·1 cites

TL;DR

This paper introduces OED, a one-stage end-to-end framework for dynamic scene graph generation in videos, improving efficiency and temporal context modeling over traditional multi-stage methods.

Contribution

It reformulates DSGG as a set prediction problem over pair-wise subject-object features and adds a Progressively Refined Module (PRM) to capture temporal dependencies, enabling fully end-to-end training.

Findings

01 Outperforms existing methods on the Action Genome benchmark

02 Effectively models temporal dependencies without additional trackers

03 Streamlines DSGG into a single end-to-end trainable framework

Abstract

Dynamic Scene Graph Generation (DSGG) focuses on identifying visual relationships within the spatial-temporal domain of videos. Conventional approaches often employ multi-stage pipelines, which typically consist of object detection, temporal association, and multi-relation classification. However, these methods exhibit inherent limitations due to the separation of multiple stages, and independent optimization of these sub-problems may yield sub-optimal solutions. To remedy these limitations, we propose a one-stage end-to-end framework, termed OED, which streamlines the DSGG pipeline. This framework reformulates the task as a set prediction problem and leverages pair-wise features to represent each subject-object pair within the scene graph. Moreover, to address another challenge of DSGG, capturing temporal dependencies, we introduce a Progressively Refined Module (PRM) for aggregating temporal…
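The set prediction formulation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the DETR-style convention of matching predicted subject-object pairs one-to-one to ground-truth pairs by minimum total cost, and it uses a brute-force search over assignments (practical only for tiny sets) in place of the Hungarian algorithm. The dictionary fields, the cost terms, and the function names `pair_cost` and `match` are all hypothetical.

```python
from itertools import permutations


def l1(a, b):
    """L1 distance between two boxes given as equal-length tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pair_cost(pred, gt):
    """Assumed matching cost for one (prediction, ground truth) pair.

    Box term: L1 distance on the subject box plus the object box.
    Class term: 1 minus the predicted probability of the true relation.
    """
    box_cost = (l1(pred["subject_box"], gt["subject_box"])
                + l1(pred["object_box"], gt["object_box"]))
    cls_cost = 1.0 - pred["relation_probs"].get(gt["relation"], 0.0)
    return box_cost + cls_cost


def match(preds, gts):
    """Find the one-to-one assignment of predictions to ground truths
    with the lowest total cost. Requires len(preds) >= len(gts).

    Returns (list of (pred_index, gt_index), total_cost).
    Brute force: tries every ordered choice of len(gts) predictions.
    """
    best, best_cost = (), float("inf")
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(pair_cost(preds[p], gts[g]) for g, p in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return sorted((p, g) for g, p in enumerate(best)), best_cost


# Toy frame: three predicted pairs, two annotated pairs (made-up values).
preds = [
    {"subject_box": (0.1, 0.1, 0.3, 0.5), "object_box": (0.6, 0.6, 0.8, 0.8),
     "relation_probs": {"holding": 0.2, "looking_at": 0.7}},
    {"subject_box": (0.1, 0.1, 0.3, 0.5), "object_box": (0.3, 0.3, 0.4, 0.4),
     "relation_probs": {"holding": 0.9}},
    {"subject_box": (0.5, 0.5, 0.9, 0.9), "object_box": (0.0, 0.0, 0.1, 0.1),
     "relation_probs": {"sitting_on": 0.5}},
]
gts = [
    {"subject_box": (0.1, 0.1, 0.3, 0.5), "object_box": (0.3, 0.3, 0.4, 0.4),
     "relation": "holding"},
    {"subject_box": (0.1, 0.1, 0.3, 0.5), "object_box": (0.6, 0.6, 0.8, 0.8),
     "relation": "looking_at"},
]

matches, total = match(preds, gts)
print(matches)  # [(0, 1), (1, 0)]
```

In this toy case the third prediction is left unmatched; in set prediction training such unmatched queries are typically supervised toward a "no relation" target. A practical system would replace the permutation search with a polynomial-time assignment solver.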

Code & Models

Repositories

guanw-pku/oed (PyTorch, official)

Taxonomy

Topics: Artificial Intelligence in Games · Topic Modeling · Human Motion and Animation

Methods: Sparse Evolutionary Training