Improving Scene Graph Classification by Exploiting Knowledge from Texts
Sahand Sharifzadeh, Sina Moayed Baharlou, Martin Schmitt, Hinrich Schütze, Volker Tresp

TL;DR
This paper demonstrates that leveraging textual scene descriptions and knowledge extraction significantly improves scene graph classification accuracy, reducing the need for extensive annotated image data.
Contribution
It introduces a framework that integrates symbolic textual knowledge into scene graph classification, achieving substantial accuracy gains with minimal annotated images.
Findings
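Roughly 8x improvement in scene graph classification accuracy over supervised baselines when few images are annotated
Roughly 3x improvement in object classification in the same setting
Roughly 1.5x improvement in predicate classification in the same setting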
Abstract
Training scene graph classification models requires a large amount of annotated image data. Meanwhile, scene graphs represent relational knowledge that can be modeled with symbolic data from texts or knowledge graphs. While image annotation demands extensive labor, collecting textual descriptions of natural scenes requires less effort. In this work, we investigate whether textual scene descriptions can substitute for annotated image data. To this end, we employ a scene graph classification framework that is trained not only from annotated images but also from symbolic data. In our architecture, the symbolic entities are first mapped to their correspondent image-grounded representations and then fed into the relational reasoning pipeline. Even though a structured form of knowledge, such as the form in knowledge graphs, is not always available, we can generate it from unstructured texts…
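The mapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that (subject, predicate, object) triples have already been extracted from text, uses a plain lookup table as a stand-in for the image-grounded representations, and swaps in a DistMult-style trilinear score as a placeholder for the relational reasoning pipeline. All names (`build_tables`, `score`, `predict_predicate`, `text_triples`) and the example triples are hypothetical.

```python
import random

DIM = 8  # embedding size, chosen arbitrarily for the sketch


def make_embedding(rng, dim=DIM):
    """Return a random vector; a stand-in for a learned representation."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def build_tables(triples, seed=0):
    """Map each symbolic entity and predicate to a vector.

    Assumption: in a real system the entity vectors would live in the same
    space as features computed from image regions. Here they are random.
    """
    rng = random.Random(seed)
    entities = sorted({t[0] for t in triples} | {t[2] for t in triples})
    predicates = sorted({t[1] for t in triples})
    entity_table = {name: make_embedding(rng) for name in entities}
    predicate_table = {name: make_embedding(rng) for name in predicates}
    return entity_table, predicate_table


def score(subj_vec, pred_vec, obj_vec):
    """DistMult-style trilinear score (placeholder relational model)."""
    return sum(s * p * o for s, p, o in zip(subj_vec, pred_vec, obj_vec))


def predict_predicate(subject, obj, entity_table, predicate_table):
    """Pick the predicate with the highest score for a (subject, object) pair."""
    s, o = entity_table[subject], entity_table[obj]
    return max(predicate_table, key=lambda p: score(s, predicate_table[p], o))


# Hypothetical triples, as if extracted from textual scene descriptions.
text_triples = [
    ("man", "riding", "horse"),
    ("man", "wearing", "hat"),
    ("horse", "on", "grass"),
]

entity_table, predicate_table = build_tables(text_triples)
prediction = predict_predicate("man", "horse", entity_table, predicate_table)
print(prediction)  # untrained embeddings, so the choice is arbitrary
```

Because the vectors are untrained, the prediction carries no meaning. The sketch only shows the data flow: symbolic triples go in, entities are looked up as vectors, and a relational scorer ranks candidate predicates.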
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Natural Language Processing Techniques
