Box Pose and Shape Estimation and Domain Adaptation for Large-Scale Warehouse Automation

Xihang Yu; Rajat Talak; Jingnan Shi; Ulrich Viereck; Igor Gilitschenski; Luca Carlone

arXiv:2507.00984·cs.RO·July 2, 2025

TL;DR

This paper introduces a self-supervised domain adaptation pipeline for accurate box pose and shape estimation in warehouse automation, effectively leveraging unlabeled real-world data to enhance perception models.

Contribution

The paper presents a correct-and-certify pipeline for self-supervised box pose and shape estimation, which improves performance in both simulated and real industrial settings.

Findings

01. Outperforms simulation-only trained models

02. Significantly improves over zero-shot baseline

03. Effective on a large-scale dataset of 50,000 images

Abstract

Modern warehouse automation systems rely on fleets of intelligent robots that generate vast amounts of data -- most of which remains unannotated. This paper develops a self-supervised domain adaptation pipeline that leverages real-world, unlabeled data to improve perception models without requiring manual annotations. Our work focuses specifically on estimating the pose and shape of boxes and presents a correct-and-certify pipeline for self-supervised box pose and shape estimation. We extensively evaluate our approach across a range of simulated and real industrial settings, including adaptation to a large-scale real-world dataset of 50,000 images. The self-supervised model significantly outperforms models trained solely in simulation and shows substantial improvements over a zero-shot 3D bounding box estimation baseline.
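The abstract does not spell out how the "certify" step decides which self-generated labels to trust, so the following is only a minimal sketch of one plausible geometric check, not the authors' method. It assumes a box is described by a center, a yaw angle, and three side lengths, and that a depth sensor provides 3D points on the box surface. The names `Box`, `surface_distance`, `certify`, and `select_pseudo_labels`, and the thresholds `eps` and `min_inlier_ratio`, are all hypothetical. The "correct" step, which would refine an estimate before certification, is left out.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical box parameterization (not taken from the paper)."""
    center: tuple  # (x, y, z) in the sensor/world frame, meters
    yaw: float     # rotation about the z axis, radians
    size: tuple    # full side lengths along the box's own x, y, z axes


def surface_distance(box, point):
    """Distance from a 3D point to the surface of the box."""
    # Express the point in the box frame: local = R(yaw)^T * (point - center).
    dx, dy, dz = (point[i] - box.center[i] for i in range(3))
    c, s = math.cos(box.yaw), math.sin(box.yaw)
    local = (c * dx + s * dy, -s * dx + c * dy, dz)
    # Per-axis signed offset from the faces (positive means outside).
    q = [abs(local[i]) - box.size[i] / 2.0 for i in range(3)]
    outside = math.sqrt(sum(max(v, 0.0) ** 2 for v in q))
    inside = min(max(q), 0.0)  # negative depth when the point is inside
    return abs(outside + inside)


def certify(box, points, eps=0.01, min_inlier_ratio=0.9):
    """Accept the estimate if enough observed points lie near its surface."""
    if not points:
        return False
    inliers = sum(surface_distance(box, p) <= eps for p in points)
    return inliers / len(points) >= min_inlier_ratio


def select_pseudo_labels(samples, eps=0.01, min_inlier_ratio=0.9):
    """Keep only (box, points) pairs whose estimate passes certification."""
    return [(b, pts) for b, pts in samples
            if certify(b, pts, eps, min_inlier_ratio)]
```

In a self-training loop of this kind, only the estimates returned by `select_pseudo_labels` would be used as training targets on the unlabeled real-world images, so that poor predictions do not reinforce themselves.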

Taxonomy

Topics: Robot Manipulation and Learning · 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization