LEXI-SG: Monocular 3D Scene Graph Mapping with Room-Guided Feed-Forward Reconstruction

Christina Kassab; Hyeonjae Gil; Matías Mattamala; Ayoung Kim; Maurice Fallon

arXiv:2605.13741·cs.RO·May 14, 2026

TL;DR

LEXI-SG is a novel monocular RGB-based system that creates dense, open-vocabulary 3D scene graphs for indoor environments, enabling scalable and accurate scene understanding for robot navigation.

Contribution

The paper presents what the authors describe as the first dense monocular visual mapping system for open-vocabulary 3D scene graphs, combining the semantic priors of open-vocabulary foundation models with a room-based factor graph formulation.

Findings

01. Improved trajectory estimation and dense reconstruction over existing methods.

02. Achieves competitive open-vocabulary segmentation performance.

03. Demonstrates scalable dense mapping without sliding-window inconsistencies.

Abstract

Scene graphs are becoming a standard representation for robot navigation, providing hierarchical geometric and semantic scene understanding. However, most scene graph mapping methods rely on depth cameras or LiDAR sensors. In this work, we present LEXI-SG, the first dense monocular visual mapping system for open-vocabulary 3D scene graphs using only RGB camera input. Our approach exploits the semantic priors of open-vocabulary foundation models to partition the scene into rooms, deferring feed-forward reconstruction to when each room is fully observed -- enabling scalable dense mapping without sliding-window scale inconsistencies. We propose a room-based factor graph formulation to globally align room reconstructions while preserving local map consistency and naturally imposing the semantic scene graph hierarchy. Within each room, we further support open-vocabulary object segmentation…
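
To make the idea of deferring reconstruction until a room is fully observed more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes each frame already carries a room label (in the paper this would come from an open-vocabulary foundation model), and it assumes, purely for illustration, that a room counts as "fully observed" once the label changes. The names `RoomBatch` and `batch_frames_by_room` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class RoomBatch:
    """Frames collected while the camera stayed in one room (hypothetical type)."""
    room_label: str
    frame_ids: List[int] = field(default_factory=list)


def batch_frames_by_room(labelled_frames: Iterable[Tuple[int, str]]) -> List[RoomBatch]:
    """Group consecutive frames that share a room label into one batch each.

    Assumption: a change of label marks the previous room as fully observed,
    so its batch is closed and could be handed to a reconstruction step.
    """
    batches: List[RoomBatch] = []
    current = None
    for frame_id, room_label in labelled_frames:
        if current is None or room_label != current.room_label:
            if current is not None:
                batches.append(current)  # previous room is complete
            current = RoomBatch(room_label)
        current.frame_ids.append(frame_id)
    if current is not None:
        batches.append(current)  # flush the last room at end of sequence
    return batches


if __name__ == "__main__":
    frames = [(0, "kitchen"), (1, "kitchen"), (2, "hallway"), (3, "kitchen")]
    for batch in batch_frames_by_room(frames):
        print(batch.room_label, batch.frame_ids)
```

In this sketch a revisited room starts a new batch rather than being merged with the earlier one. How the paper handles revisits, and how the room-level factor graph then aligns the per-room reconstructions, is not specified in the abstract excerpt above.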

Code & Models

Repositories

https://ori-drs.github.io/lexisg-web
github
