LIEREx: Language-Image Embeddings for Robotic Exploration
Felix Igelbrink, Lennart Niecksch, Marian Renz, Martin Günther, Martin Atzmueller

TL;DR
LIEREx leverages vision-language foundation models to create flexible semantic maps for robotic exploration, enabling open-set object recognition and improved navigation in unknown environments.
Contribution
This work integrates vision-language foundation models with 3D semantic scene graphs to enhance robotic exploration capabilities beyond fixed object classes.
Findings
Enables open-set object recognition in robotic mapping.
Improves target-directed exploration in unknown environments.
Integrates vision-language foundation models (VLFMs) with 3D semantic scene graphs.
Abstract
Semantic maps allow a robot to reason about its surroundings to fulfill tasks such as navigating known environments, finding specific objects, and exploring unmapped areas. Traditional mapping approaches provide accurate geometric representations but are often constrained by pre-designed symbolic vocabularies. Their reliance on fixed object classes makes it impractical to handle out-of-distribution knowledge not defined at design time. Recent advances in Vision-Language Foundation Models (VLFMs), such as CLIP, enable open-set mapping, where objects are encoded as high-dimensional embeddings rather than fixed labels. In LIEREx, we integrate these VLFMs with established 3D Semantic Scene Graphs to enable target-directed exploration by an autonomous agent in partially unknown environments.
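To make the idea of open-set mapping concrete, the following is a minimal sketch of how a text query could be matched against objects stored as embeddings rather than fixed labels. It is not the LIEREx implementation: the `ObjectNode` class, the `rank_nodes` helper, and the three-dimensional toy vectors are all hypothetical stand-ins. In a real system the embeddings would come from a VLFM such as CLIP and have several hundred dimensions.

```python
import math
from dataclasses import dataclass


@dataclass
class ObjectNode:
    """Hypothetical scene-graph node: an ID, a 3D position, and an embedding."""
    node_id: str
    position: tuple[float, float, float]
    embedding: list[float]  # assumed to come from a VLFM image encoder


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_nodes(query_embedding, nodes):
    """Return (score, node) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(query_embedding, n.embedding), n) for n in nodes]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy data: made-up vectors standing in for real VLFM embeddings.
nodes = [
    ObjectNode("obj_0", (1.0, 0.5, 0.0), [0.9, 0.1, 0.0]),
    ObjectNode("obj_1", (4.0, 2.0, 0.0), [0.1, 0.9, 0.2]),
    ObjectNode("obj_2", (2.5, 3.0, 0.0), [0.0, 0.2, 0.9]),
]

# Stand-in for the text-encoder embedding of a free-form query.
query = [0.2, 0.8, 0.1]

best_score, best_node = rank_nodes(query, nodes)[0]
print(best_node.node_id, best_node.position)
```

Because the query is compared in embedding space, no object class has to be fixed at design time. The position of the best-matching node could then serve as a candidate goal for target-directed exploration.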
Taxonomy
Topics: Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization · Advanced Neural Network Applications
