MVRackLay: Monocular Multi-View Layout Estimation for Warehouse Racks and Shelves
Pranjali Pathre, Anurag Sahu, Ashwin Rao, Avinash Prabhu, Meher, Shashwat Nigam, Tanvi Karandikar, Harit Pandya, and K. Madhava Krishna

TL;DR
This paper introduces MVRackLay, a monocular multi-view approach to 3D warehouse layout estimation. From a sequence of images it models racks, shelves, and the objects on each shelf, and it outperforms the single-view RackLay.
Contribution
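MVRackLay is the first method to estimate multi-layered warehouse layouts from monocular multi-view image sequences. Its per-shelf front-view and top-view layouts can be turned into a 3D rendering of the scene, and it is more accurate than the single-view RackLay.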
Findings
MVRackLay outperforms single-view RackLay in accuracy.
The method generalizes across diverse warehouse scenes.
It enables 3D scene rendering from monocular image sequences.
Abstract
In this paper, we propose and showcase, for the first time, monocular multi-view layout estimation for warehouse racks and shelves. Unlike typical layout estimation methods, MVRackLay estimates multi-layered layouts, wherein each layer corresponds to the layout of a shelf within a rack. Given a sequence of images of a warehouse scene, a dual-headed Convolutional-LSTM architecture outputs segmented racks, the front and the top view layout of each shelf within a rack. With minimal effort, such an output is transformed into a 3D rendering of all racks, shelves and objects on the shelves, giving an accurate 3D depiction of the entire warehouse scene in terms of racks, shelves and the number of objects on each shelf. MVRackLay generalizes to a diverse set of warehouse scenes with varying number of objects on each shelf, number of shelves and in the presence of other such racks in the…
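The abstract says the per-shelf top-view and front-view layouts can be turned into a 3D rendering with minimal effort. The following is a minimal sketch of one way that step could look, not the authors' implementation. It assumes each object is given as two binary occupancy grids (a hypothetical format): a top view whose rows index depth and columns index width, and a front view whose rows index height and columns index width. The names `extent`, `box_from_views`, and `cell_size` are illustrative.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # binary occupancy grid, 1 = cell covered by the object
Box = Tuple[float, float, float, float, float, float]


def extent(grid: Grid, axis: int) -> Optional[Tuple[int, int]]:
    """Return (min, max) occupied index along rows (axis=0) or columns (axis=1)."""
    indices = [
        (r if axis == 0 else c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value
    ]
    return (min(indices), max(indices)) if indices else None


def box_from_views(top: Grid, front: Grid, cell_size: float) -> Optional[Box]:
    """Combine one object's top-view and front-view masks into a 3D box.

    Assumed conventions (not from the paper):
      top view:   rows = depth (z), columns = width (x)
      front view: rows = height (y), columns = width (x)
    Width and depth are read from the top view, height from the front view.
    Returns (x_min, y_min, z_min, x_max, y_max, z_max) in metric units,
    or None if either mask is empty.
    """
    x = extent(top, axis=1)
    z = extent(top, axis=0)
    y = extent(front, axis=0)
    if x is None or z is None or y is None:
        return None
    # A cell with index i spans [i, i + 1) * cell_size.
    return (
        x[0] * cell_size, y[0] * cell_size, z[0] * cell_size,
        (x[1] + 1) * cell_size, (y[1] + 1) * cell_size, (z[1] + 1) * cell_size,
    )


top_view = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
front_view = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
box = box_from_views(top_view, front_view, cell_size=0.5)
```

Repeating this for every object on every shelf, and stacking shelves by their height within the rack, would give a box-level 3D scene. Counting the boxes on a shelf gives the number of objects on that shelf.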
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging · Image and Video Stabilization
