PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes

Ahmed Abdelreheem; Filippo Aleotti; Jamie Watson; Zawar Qureshi; Abdelrahman Eldesokey; Peter Wonka; Gabriel Brostow; Sara Vicente; Guillermo Garcia-Hernando

arXiv:2505.05288 · cs.CV · October 3, 2025

TL;DR

This paper introduces a new task and benchmark for language-guided object placement in real 3D scenes. The task requires reasoning about 3D geometric relationships and free space, and the paper also provides a training dataset and a baseline method.

Contribution

It defines the task of language-guided object placement in real 3D scenes and provides a benchmark, an evaluation protocol, a training dataset, and the first baseline method for it.

Findings

01. Proposed a new benchmark and evaluation protocol.

02. Created a dataset for training 3D language models.

03. Developed the first baseline method for the task.

Abstract

We introduce the novel task of Language-Guided Object Placement in Real 3D Scenes. Our model is given a 3D scene's point cloud, a 3D asset, and a textual prompt broadly describing where the 3D asset should be placed. The task here is to find a valid placement for the 3D asset that respects the prompt. Compared with other language-guided localization tasks in 3D scenes such as grounding, this task has specific challenges: it is ambiguous because it has multiple valid solutions, and it requires reasoning about 3D geometric relationships and free space. We inaugurate this task by proposing a new benchmark and evaluation protocol. We also introduce a new dataset for training 3D LLMs on this task, as well as the first method to serve as a non-trivial baseline. We believe that this challenging task and our new benchmark could become part of the suite of benchmarks used to evaluate and compare…
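The task's inputs and one of its requirements (the asset must land in free space) can be illustrated with a minimal sketch. This is not the paper's method or evaluation protocol: the names `PlacementQuery`, `Placement`, and `is_free` are hypothetical, the asset is approximated by an axis-aligned box, and rotation and the language constraint are ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class PlacementQuery:
    """Hypothetical container for the three task inputs named in the abstract."""
    scene_points: List[Point]  # the scene's point cloud
    asset_size: Point          # box extents standing in for the 3D asset (assumption)
    prompt: str                # text describing where the asset should go


@dataclass
class Placement:
    """Hypothetical output: only a position; rotation is omitted in this sketch."""
    center: Point


def is_free(query: PlacementQuery, placement: Placement, margin: float = 0.0) -> bool:
    """Return True if no scene point lies strictly inside the asset's box.

    This is a simplified free-space test, not the benchmark's validity criterion.
    """
    half = [size / 2.0 + margin for size in query.asset_size]
    for point in query.scene_points:
        inside = all(
            abs(point[axis] - placement.center[axis]) < half[axis]
            for axis in range(3)
        )
        if inside:
            return False
    return True


query = PlacementQuery(
    scene_points=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    asset_size=(0.5, 0.5, 0.5),
    prompt="next to the table",
)
free_far = is_free(query, Placement(center=(3.0, 0.0, 0.0)))
free_overlapping = is_free(query, Placement(center=(0.1, 0.0, 0.0)))
```

Because many different centers pass a check like this, the sketch also reflects the ambiguity the abstract mentions: a placement is one of several valid solutions, not a single ground-truth answer.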

Taxonomy

Topics: Multimodal Machine Learning Applications · 3D Shape Modeling and Analysis · Robot Manipulation and Learning