Mind the Shape Gap: A Benchmark and Baseline for Deformation-Aware 6D Pose Estimation of Agricultural Produce
Nikolas Chatzis, Angeliki Tsinouka, Katerina Papadimitriou, Niki Efthymiou, Marios Glytsos, George Retsinas, Paris Oikonomou, Gerasimos Potamianos, Petros Maragos, and Panagiotis Paraskevas Filntisis

TL;DR
This paper introduces PEAR, a benchmark for 6D pose and deformation estimation of agricultural produce, and proposes SEED, a method that jointly estimates pose and shape deformation from RGB images, improving accuracy in real-world scenarios.
Contribution
It provides the first benchmark with ground truth for joint pose and deformation estimation of produce, and introduces SEED, a novel RGB-only framework that models shape deformation explicitly.
Findings
State-of-the-art methods suffer up to 6x performance degradation on real produce due to deformation.
SEED outperforms MegaPose on 6 out of 8 categories with RGB-only input.
Explicit shape modeling significantly improves pose estimation robustness.
Abstract
Accurate 6D pose estimation for robotic harvesting is fundamentally hindered by the biological deformability and high intra-class shape variability of agricultural produce. Instance-level methods fail in this setting, as obtaining exact 3D models for every unique piece of produce is practically infeasible, while category-level approaches that rely on a fixed template suffer significant accuracy degradation when the prior deviates from the true instance geometry. To address this lack of robustness to deformation, we introduce PEAR (Pose and dEformation of Agricultural pRoduce), the first benchmark providing joint 6D pose and per-instance 3D deformation ground truth across 8 produce categories, acquired via a robotic manipulator for high annotation accuracy. Using PEAR, we show that state-of-the-art methods suffer up to 6x performance degradation when faced with the inherent geometric…
