MVRackLay: Monocular Multi-View Layout Estimation for Warehouse Racks and Shelves

Pranjali Pathre, Anurag Sahu, Ashwin Rao, Avinash Prabhu, Meher, Shashwat Nigam, Tanvi Karandikar, Harit Pandya, and K. Madhava Krishna

arXiv:2211.16882·cs.CV·December 8, 2022

Open Access

TL;DR

This paper introduces MVRackLay, a monocular multi-view method that estimates the layout of warehouse racks, shelves, and the objects on them from image sequences, and reports higher accuracy than the single-view RackLay baseline.

Contribution

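MVRackLay is the first method to estimate multi-layered warehouse layouts from monocular multi-view image sequences, with one layer per shelf of a rack. Its output can be converted into a 3D rendering of the scene, and it is more accurate than single-view RackLay.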

Findings

01. MVRackLay outperforms single-view RackLay in accuracy.

02. The method generalizes across diverse warehouse scenes.

03. It enables 3D scene rendering from monocular image sequences.

Abstract

In this paper, we propose and showcase, for the first time, monocular multi-view layout estimation for warehouse racks and shelves. Unlike typical layout estimation methods, MVRackLay estimates multi-layered layouts, wherein each layer corresponds to the layout of a shelf within a rack. Given a sequence of images of a warehouse scene, a dual-headed Convolutional-LSTM architecture outputs segmented racks, the front and the top view layout of each shelf within a rack. With minimal effort, such an output is transformed into a 3D rendering of all racks, shelves and objects on the shelves, giving an accurate 3D depiction of the entire warehouse scene in terms of racks, shelves and the number of objects on each shelf. MVRackLay generalizes to a diverse set of warehouse scenes with varying number of objects on each shelf, number of shelves and in the presence of other such racks in the…
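
The abstract mentions that per-shelf front view and top view layouts can be turned into a 3D depiction of racks, shelves and objects. The Python sketch below illustrates one simple way such a lifting step could work; it is not the authors' implementation. It assumes each object has already been reduced to an axis-aligned rectangle in both views, expressed in a shared metric frame. The names `Box3D`, `lift_to_3d` and `count_objects_per_shelf` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned 3D box (hypothetical type, not from the paper)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z_min: float
    z_max: float


def lift_to_3d(top_rect, front_rect):
    """Combine a top-view and a front-view rectangle into one 3D box.

    top_rect:   (x_min, x_max, z_min, z_max), footprint seen from above
    front_rect: (x_min, x_max, y_min, y_max), silhouette seen from the front
    Assumption: both views share the x axis, so the x extent is taken as
    the overlap of the two x ranges.
    """
    tx0, tx1, z0, z1 = top_rect
    fx0, fx1, y0, y1 = front_rect
    x0, x1 = max(tx0, fx0), min(tx1, fx1)
    if x0 >= x1:
        raise ValueError("top and front views do not overlap along x")
    return Box3D(x0, x1, y0, y1, z0, z1)


def count_objects_per_shelf(shelves):
    """Return the number of liftable objects on each shelf.

    shelves: one list per shelf, each holding (top_rect, front_rect) pairs.
    """
    return [len([lift_to_3d(t, f) for t, f in shelf]) for shelf in shelves]
```

Taking the overlap along x is one design choice among several; averaging the two x ranges would be an equally plausible assumption when the two views disagree slightly.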

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging · Image and Video Stabilization