Semantic Enhancement for Object SLAM with Heterogeneous Multimodal Large Language Model Agents
Jungseok Hong, Ran Choi, John J. Leonard

TL;DR
SEO-SLAM enhances object SLAM by integrating heterogeneous multimodal large language model agents, improving semantic accuracy and efficiency in cluttered indoor environments, and demonstrating benefits for downstream robotic tasks.
Contribution
The paper introduces a novel framework that combines heterogeneous MLLM agents with an asynchronous processing scheme and a multi-data association strategy for improved semantic SLAM.
Findings
Higher semantic accuracy and fewer false positives compared to baselines.
Significant reduction in inference time with asynchronous agents.
Potential improvements in downstream robotic assistance tasks.
Abstract
Object Simultaneous Localization and Mapping (SLAM) systems struggle to correctly associate semantically similar objects in close proximity, especially in cluttered indoor environments and when scenes change. We present Semantic Enhancement for Object SLAM (SEO-SLAM), a novel framework that enhances semantic mapping by integrating heterogeneous multimodal large language model (MLLM) agents. Our method enables scene adaptation while maintaining a semantically rich map. To improve computational efficiency, we propose an asynchronous processing scheme that significantly reduces the agents' inference time without compromising semantic accuracy or SLAM performance. Additionally, we introduce a multi-data association strategy using a cost matrix that combines semantic and Mahalanobis distances, formulating the problem as a Linear Assignment Problem (LAP) to alleviate perceptual aliasing.…
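The multi-data association step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the cosine distance on label embeddings, the weights `w_sem` and `w_geo`, the 2D positions, and all function names are assumptions made for illustration. The assignment is solved by exhaustive search to keep the example dependency-free; a practical system would more likely use a Hungarian-style solver such as `scipy.optimize.linear_sum_assignment`.

```python
import math
from itertools import permutations


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance between a 2D point x and a Gaussian (mean, cov)."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Quadratic form with the closed-form inverse of a 2x2 matrix.
    q = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(q)


def cosine_distance(u, v):
    """1 - cosine similarity; a stand-in for the semantic distance (assumption)."""
    dot = sum(p * q for p, q in zip(u, v))
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(q * q for q in v))
    return 1.0 - dot / (norm_u * norm_v)


def build_cost(detections, landmarks, w_sem=1.0, w_geo=1.0):
    """Cost matrix: rows are detections, columns are map landmarks."""
    return [
        [
            w_sem * cosine_distance(det["emb"], lm["emb"])
            + w_geo * mahalanobis_2d(det["pos"], lm["pos"], lm["cov"])
            for lm in landmarks
        ]
        for det in detections
    ]


def solve_lap(cost):
    """Solve a small square LAP by exhaustive search over permutations.

    Returns a list where entry i is the landmark index assigned to detection i.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    return list(best)


# Hypothetical scene: a mug and a bowl 0.3 m apart, with toy 2D label embeddings.
cov = [[0.04, 0.0], [0.0, 0.04]]
landmarks = [
    {"emb": [1.0, 0.0], "pos": (0.0, 0.0), "cov": cov},  # mug
    {"emb": [0.0, 1.0], "pos": (0.3, 0.0), "cov": cov},  # bowl
]
# Each detection lies geometrically closer to the semantically wrong landmark.
detections = [
    {"emb": [0.1, 0.9], "pos": (0.1, 0.0)},  # bowl-like, nearer the mug
    {"emb": [0.9, 0.1], "pos": (0.2, 0.0)},  # mug-like, nearer the bowl
]

combined = solve_lap(build_cost(detections, landmarks))
geometry_only = solve_lap(build_cost(detections, landmarks, w_sem=0.0))
```

In this toy scene the geometry-only cost pairs each detection with the nearest landmark, while the combined cost assigns the bowl-like detection to the bowl and the mug-like detection to the mug, which is the kind of perceptual aliasing the combined cost is meant to alleviate.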
Taxonomy
Topics: Modular Robots and Swarm Intelligence · Semantic Web and Ontologies · Robotic Path Planning Algorithms
