Aligning Knowledge Graph with Visual Perception for Object-goal Navigation

Nuo Xu; Wen Wang; Rong Yang; Mengjie Qin; Zheyuan Lin; Wei Song; Chunlong Zhang; Jason Gu; Chao Li

arXiv:2402.18892·cs.CV·April 29, 2024·1 cites

Open Access · 1 Repo

TL;DR

This paper introduces AKGVP, a novel method that aligns knowledge graphs with visual perception to improve object-goal navigation, achieving better scene understanding and zero-shot capabilities in simulated environments.

Contribution

It proposes a continuous knowledge graph architecture combined with visual-language pre-training to address misalignment issues in scene representation for navigation.

Findings

01 Enhanced scene understanding through continuous knowledge graphs.

02 Improved zero-shot navigation performance.

03 Effective in the AI2-THOR simulation environment.

Abstract

Object-goal navigation is a challenging task that requires guiding an agent to specific objects based on first-person visual observations. The agent's ability to comprehend its surroundings plays a crucial role in finding objects successfully. However, existing knowledge-graph-based navigators often rely on discrete categorical one-hot vectors and a vote-counting strategy to construct graph representations of scenes, which results in misalignment with visual images. To provide more accurate and coherent scene descriptions and address this misalignment issue, we propose the Aligning Knowledge Graph with Visual Perception (AKGVP) method for object-goal navigation. Technically, our approach introduces continuous modeling of the hierarchical scene architecture and leverages visual-language pre-training to align natural language descriptions with visual perception. The integration of…
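To make the misalignment point concrete, here is a minimal sketch, not the authors' implementation, contrasting discrete one-hot node features with continuous embeddings. The category list, the three-dimensional vectors, and the helper names (`one_hot`, `cosine`, `rank_nodes`) are all hypothetical; the vectors are made-up stand-ins for what a pre-trained visual-language text encoder might produce.

```python
import math

# Hypothetical object categories for graph nodes (not taken from the paper).
CATEGORIES = ["mug", "cup", "sofa"]

def one_hot(category):
    """Discrete node feature: 1.0 at the category index, 0.0 elsewhere."""
    return [1.0 if c == category else 0.0 for c in CATEGORIES]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Made-up continuous embeddings, standing in for the output of a
# pre-trained visual-language text encoder (an assumption for illustration).
TEXT_EMBEDDINGS = {
    "mug": [0.9, 0.1, 0.3],
    "cup": [0.8, 0.2, 0.4],
    "sofa": [0.1, 0.9, 0.2],
}

def rank_nodes(image_feature):
    """Rank graph nodes by similarity to an image feature in the shared space."""
    scores = {name: cosine(image_feature, emb)
              for name, emb in TEXT_EMBEDDINGS.items()}
    return sorted(scores, key=scores.get, reverse=True)

# One-hot vectors treat every pair of distinct categories as unrelated.
print(cosine(one_hot("mug"), one_hot("cup")))
# Continuous embeddings can place related categories close together.
print(cosine(TEXT_EMBEDDINGS["mug"], TEXT_EMBEDDINGS["cup"]))
# A hypothetical image feature can be compared directly against node embeddings.
print(rank_nodes([0.85, 0.15, 0.35]))
```

Under these assumptions, one-hot features give zero similarity between any two distinct categories, while continuous embeddings that share a space with image features let a node be scored directly against what the agent sees.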

Code & Models

Repositories

nuoxu/akgvp · PyTorch · Official

Taxonomy

Topics: Robotic Path Planning Algorithms · Constraint Satisfaction and Optimization · Multimodal Machine Learning Applications

Methods: ALIGN