Improving Generalization of Deep Networks for Estimating Physical Properties of Containers and Fillings
Hengyi Wang, Chaoran Zhu, Ziyin Ma, Changjae Oh

TL;DR
This paper introduces a multimodal approach using a lightweight neural network with data augmentation and consistency measurement to improve the generalization of physical property estimation of household containers and fillings, especially for unseen objects.
Contribution
It proposes a novel combination of audio-video data, data augmentation, and consistency measurement to enhance model generalization for estimating container properties.
Findings
Improved accuracy in estimating container capacity, dimensions, and mass.
Enhanced generalization to unseen containers in the CCM dataset.
Effective filling type and level classification using combined audio-video data.
Abstract
We present methods to estimate the physical properties of household containers and their fillings manipulated by humans. We use a lightweight, pre-trained convolutional neural network with coordinate attention as a backbone model of the pipelines to accurately locate the object of interest and estimate the physical properties in the CORSMAL Containers Manipulation (CCM) dataset. We address the filling type classification with audio data and then combine this information from audio with video modalities to address the filling level classification. For the container capacity, dimension, and mass estimation, we present a data augmentation and consistency measurement to alleviate the over-fitting issue in the CCM dataset caused by the limited number of containers. We augment the training data using an object-of-interest-based re-scaling that increases the variety of physical values of the…
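The object-of-interest-based re-scaling mentioned in the abstract can be illustrated with a minimal sketch. The abstract is truncated here, so this is not the authors' procedure: it assumes that enlarging or shrinking the cropped container by a factor s, with the camera distance treated as fixed, corresponds to a container whose linear dimensions scale by s and whose capacity scales by s^3. The names `ContainerLabels`, `rescale_crop`, and `rescale_labels` are hypothetical, and mass is left out because how it scales depends on material and wall thickness.

```python
from dataclasses import dataclass


@dataclass
class ContainerLabels:
    # Hypothetical label record; field names are illustrative only.
    width_mm: float
    height_mm: float
    capacity_ml: float


def rescale_crop(crop, scale):
    """Nearest-neighbour resize of a 2D crop (a list of pixel rows)."""
    src_h, src_w = len(crop), len(crop[0])
    dst_h = max(1, round(src_h * scale))
    dst_w = max(1, round(src_w * scale))
    return [
        [
            crop[min(src_h - 1, int(r / scale))][min(src_w - 1, int(c / scale))]
            for c in range(dst_w)
        ]
        for r in range(dst_h)
    ]


def rescale_labels(labels, scale):
    """Scale labels to match the resized crop.

    Assumption (not from the paper): lengths scale linearly and
    capacity scales with the cube of the factor.
    """
    return ContainerLabels(
        width_mm=labels.width_mm * scale,
        height_mm=labels.height_mm * scale,
        capacity_ml=labels.capacity_ml * scale ** 3,
    )


# Toy usage: a 2x2 "crop" of the container, enlarged by a factor of 2.
crop = [[1, 2], [3, 4]]
labels = ContainerLabels(width_mm=80.0, height_mm=100.0, capacity_ml=500.0)
big_crop = rescale_crop(crop, 2.0)
big_labels = rescale_labels(labels, 2.0)
```

Under these assumptions each training container yields a family of synthetic containers with different sizes and capacities, which is one way the variety of physical values in a small dataset could be increased.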
Taxonomy
Topics: Infrastructure Maintenance and Monitoring · Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection
Methods: Coordinate attention
