Playing with Food: Learning Food Item Representations through Interactive Exploration
Amrita Sawhney, Steven Lee, Kevin Zhang, Manuela Veloso, Oliver Kroemer

TL;DR
This paper introduces a multimodal sensory approach in which a robot interacts with food items to learn embeddings of their properties, improving material and shape classification and offering a possible basis for planning in food manipulation.
Contribution
It presents a novel dataset and a learning framework that combines proprioceptive, audio, and visual data for food item representation through interaction.
Findings
Embeddings improve food material and shape classification.
Multimodal data enhances understanding of food properties.
Dataset facilitates future research in food robotics.
Abstract
A key challenge in robotic food manipulation is modeling the material properties of diverse and deformable food items. We propose using a multimodal sensory approach to interact and play with food that facilitates the ability to distinguish these properties across food items. First, we use a robotic arm and an array of sensors, which are synchronized using ROS, to collect a diverse dataset consisting of 21 unique food items with varying slices and properties. Afterwards, we learn visual embedding networks that utilize a combination of proprioceptive, audio, and visual data to encode similarities among food items using a triplet loss formulation. Our evaluations show that embeddings learned through interactions can successfully increase performance in a wide range of material and shape classification tasks. We envision that these learned embeddings can be utilized as a basis for planning…
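The abstract mentions a triplet loss formulation without giving details. As a rough illustration only, the sketch below shows the standard margin-based triplet loss on small embedding vectors. The Euclidean distance, the margin value of 1.0, and the names `euclidean` and `triplet_loss` are assumptions made for this example, not the authors' implementation. How the paper selects triplets from proprioceptive and audio data is not stated in the abstract and is not modeled here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (margin value is an assumption).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows linearly.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings, chosen only to make the arithmetic easy to follow.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

well_separated = triplet_loss(anchor, positive, negative)  # max(0, 1 - 5 + 1)
violating = triplet_loss(anchor, negative, positive)       # max(0, 5 - 1 + 1)
```

In this toy case the first triplet already satisfies the margin and contributes no loss, while the swapped triplet is penalized. During training, a gradient step on such a loss pulls similar items together and pushes dissimilar items apart in the embedding space.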
