Counterfactual Critic Multi-Agent Training for Scene Graph Generation
Long Chen, Hanwang Zhang, Jun Xiao, Xiangnan He, Shiliang Pu, Shih-Fu Chang

TL;DR
This paper introduces CMAT, a multi-agent training method for scene graph generation that directly optimizes graph-level metrics, addressing limitations of traditional supervised learning approaches.
Contribution
It proposes a novel counterfactual critic multi-agent training framework that improves scene graph generation by better modeling scene dynamics and relationships.
Findings
Achieves state-of-the-art results on the Visual Genome benchmark.
Significant performance improvements across various metrics.
Effectively models scene dynamics through multi-agent policy gradient.
Abstract
Scene graphs -- objects as nodes and visual relationships as edges -- describe the whereabouts and interactions of the things and stuff in an image for comprehensive scene understanding. To generate coherent scene graphs, almost all existing methods exploit the fruitful visual context by modeling message passing among objects, fitting the dynamic nature of reasoning with visual context, e.g., "person" on "bike" can help to determine the relationship "ride", which in turn contributes to the category confidence of the two objects. However, we argue that the scene dynamics is not properly learned by using the prevailing cross-entropy based supervised learning paradigm, which is not sensitive to graph inconsistency: errors at the hub or non-hub nodes are unfortunately penalized equally. To this end, we propose a Counterfactual critic Multi-Agent Training (CMAT) approach to resolve the…
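The abstract's claim that cross-entropy penalizes hub and non-hub errors equally can be illustrated with a toy example. The sketch below is not the paper's code: the four-node graph, the probabilities, and the helper names (node_cross_entropy, correct_triplets, make_prediction) are all hypothetical, and the triplet count is only a stand-in for a graph-level metric. It shows that misclassifying the hub node and misclassifying a leaf node yield the same summed per-node cross-entropy, while the number of triplets left intact differs.

```python
import math

# Hypothetical toy scene graph: node index -> ground-truth class.
gt_labels = {0: "person", 1: "bike", 2: "helmet", 3: "street"}
# Edges as (subject, predicate, object). Node 0 is the hub: it touches every edge.
edges = [(0, "ride", 1), (0, "wear", 2), (0, "on", 3)]


def node_cross_entropy(pred_probs, gt):
    """Sum over nodes of the negative log-probability of the true class."""
    return sum(-math.log(pred_probs[n][gt[n]]) for n in gt)


def correct_triplets(pred_labels, gt, edge_list):
    """Count edges whose subject and object are both labelled correctly."""
    return sum(
        1
        for s, _, o in edge_list
        if pred_labels[s] == gt[s] and pred_labels[o] == gt[o]
    )


def make_prediction(wrong_node):
    """Build a prediction that is confident and right everywhere except one node."""
    probs, labels = {}, {}
    for n, cls in gt_labels.items():
        if n == wrong_node:
            probs[n] = {cls: 0.2, "other": 0.8}  # assumed wrong, confident guess
            labels[n] = "other"
        else:
            probs[n] = {cls: 0.9, "other": 0.1}
            labels[n] = cls
    return probs, labels


hub_probs, hub_labels = make_prediction(wrong_node=0)    # error at the hub
leaf_probs, leaf_labels = make_prediction(wrong_node=3)  # error at a leaf

ce_hub = node_cross_entropy(hub_probs, gt_labels)
ce_leaf = node_cross_entropy(leaf_probs, gt_labels)
hits_hub = correct_triplets(hub_labels, gt_labels, edges)
hits_leaf = correct_triplets(leaf_labels, gt_labels, edges)

print(f"cross-entropy: hub error {ce_hub:.4f}, leaf error {ce_leaf:.4f}")
print(f"intact triplets: hub error {hits_hub}, leaf error {hits_leaf}")
```

Under these assumptions the two cross-entropy values coincide, yet the hub error breaks all three triplets while the leaf error breaks only one. This is the kind of gap a graph-level training objective is meant to expose.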
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
