Schema-Guided Scene-Graph Reasoning based on Multi-Agent Large Language Model System
Yiye Chen, Harpreet Sawhney, Nicholas Gydé, Yanan Jian, Jack Saunders, Patricio Vela, Ben Lundell

TL;DR
This paper introduces SG^2, a multi-agent LLM framework for scene graph reasoning. It improves accuracy and reduces hallucinations on spatial tasks through iterative, schema-guided reasoning and retrieval.
Contribution
The work presents a multi-agent, schema-guided framework for scene graph reasoning in which a reasoning module and a retrieval module collaborate iteratively; it outperforms existing LLM-based approaches.
Findings
Outperforms existing LLM-based scene graph reasoning methods.
Reduces hallucination through schema-guided information retrieval.
Improves accuracy in numerical Q&A and planning tasks.
Abstract
Scene graphs have emerged as a structured and serializable environment representation for grounded spatial reasoning with Large Language Models (LLMs). In this work, we propose SG^2, an iterative Schema-Guided Scene-Graph reasoning framework based on multi-agent LLMs. The agents are grouped into two modules: (1) a Reasoner module for abstract task planning and for generating graph information queries, and (2) a Retriever module that extracts the corresponding graph information by writing code that follows the queries. The two modules collaborate iteratively, enabling sequential reasoning and adaptive attention to graph information. The scene graph schema, prompted to both modules, serves not only to streamline both the reasoning and retrieval processes, but also to guide the cooperation between the two modules. This eliminates the need to prompt LLMs with full graph data, reducing the chance of…
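Example
The Reasoner/Retriever loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy graph, the schema layout, the query format, and the names `reasoner`, `retriever`, and `solve` are all assumptions made for illustration, and both agents are replaced by fixed functions where the paper uses LLMs (the Retriever there writes code on the fly). The sketch only shows the control flow: the Reasoner sees the task, the schema, and earlier retrieval results, but never the full graph.

```python
# Hypothetical toy scene graph: typed nodes plus (source, relation, target) edges.
SCENE_GRAPH = {
    "nodes": {
        "kitchen": {"type": "room"},
        "bedroom": {"type": "room"},
        "mug": {"type": "object"},
        "plate": {"type": "object"},
        "pillow": {"type": "object"},
    },
    "edges": [
        ("kitchen", "contains", "mug"),
        ("kitchen", "contains", "plate"),
        ("bedroom", "contains", "pillow"),
    ],
}

# Hypothetical schema: describes the graph's structure only, not its contents.
SCHEMA = {
    "node_types": ["room", "object"],
    "relations": [("room", "contains", "object")],
}


def retriever(query, graph):
    """Stand-in for the Retriever module: answers one query against the graph.

    In the paper an LLM writes retrieval code; here the logic is hard-coded.
    """
    if query["op"] == "children":
        return [
            dst
            for src, rel, dst in graph["edges"]
            if src == query["node"] and rel == query["relation"]
        ]
    raise ValueError(f"unsupported query op: {query['op']}")


def reasoner(task, schema, history):
    """Stand-in for the Reasoner module: plans from task, schema, and history.

    It never receives the graph itself, only what the Retriever returned.
    """
    relation_names = {rel for _, rel, _ in schema["relations"]}
    if "contains" not in relation_names:
        raise ValueError("schema has no 'contains' relation to query")
    if not history:
        # First round: ask for the objects contained in the room of interest.
        return {"query": {"op": "children", "node": task["room"], "relation": "contains"}}
    # Later round: enough information was retrieved, so count and answer.
    _, last_result = history[-1]
    return {"answer": len(last_result)}


def solve(task, graph, schema, max_rounds=5):
    """Alternate Reasoner and Retriever until the Reasoner returns an answer."""
    history = []
    for _ in range(max_rounds):
        step = reasoner(task, schema, history)
        if "answer" in step:
            return step["answer"]
        result = retriever(step["query"], graph)
        history.append((step["query"], result))
    raise RuntimeError("no answer within the round limit")
```

With this toy data, `solve({"room": "kitchen"}, SCENE_GRAPH, SCHEMA)` takes one retrieval round and then returns 2, the number of objects in the kitchen. The design point is that only the small retrieval result, not the serialized graph, reaches the Reasoner.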
Taxonomy
Topics: Topic Modeling · Semantic Web and Ontologies · Advanced Graph Neural Networks
Methods: Softmax · Attention Is All You Need
