TL;DR
This paper introduces kinodynamic images, a visual representation for contact-rich manipulation tasks that makes neural network decisions interpretable through visual explanation methods such as Grad-CAM.
Contribution
It proposes a new kinodynamic image encoding for neural network interpretability in manipulation tasks, allowing visual analysis of model decisions and failure modes.
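This summary does not spell out the encoding itself, so the following is only a minimal sketch of the general idea of turning kinematic and dynamic time series into an image. It assumes one image row per variable, one column per time step, and per-row min-max scaling to 8-bit intensities. The function name `to_kinodynamic_image` and the example signals are hypothetical and are not taken from the paper.

```python
import numpy as np


def to_kinodynamic_image(signals: np.ndarray) -> np.ndarray:
    """Map a (num_variables, num_timesteps) array to a uint8 image of the same shape.

    Assumption: each row is scaled independently to [0, 255]. The paper's
    actual encoding may differ.
    """
    signals = np.asarray(signals, dtype=np.float64)
    lo = signals.min(axis=1, keepdims=True)
    hi = signals.max(axis=1, keepdims=True)
    # Constant rows would divide by zero, so give them a span of 1 (they map to 0).
    span = np.where(hi > lo, hi - lo, 1.0)
    scaled = (signals - lo) / span
    return np.round(scaled * 255).astype(np.uint8)


# Hypothetical trajectory: position, velocity, and contact force over 50 steps.
t = np.linspace(0.0, 1.0, 50)
position = 0.1 * t
velocity = np.full_like(t, 0.1)
force = np.sin(2 * np.pi * t)

image = to_kinodynamic_image(np.stack([position, velocity, force]))
```

Here `image` has one row per variable and one column per time step, so temporal evolution reads left to right and a convolutional network can consume it like any single-channel picture.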
Findings
Effective visualization of model decisions in pushing and cutting tasks.
Versatile approach applicable to various manipulation scenarios.
Enhanced understanding of model behavior through visual inspection.
Abstract
Deep Neural Networks (NNs) have been widely utilized in contact-rich manipulation tasks to model the complicated contact dynamics. However, NN-based models are often difficult to decipher, which can lead to seemingly inexplicable behaviors and unidentifiable failure cases. In this work, we address the interpretability of NN-based models by introducing kinodynamic images. We propose a methodology that creates images from the kinematic and dynamic data of a contact-rich manipulation task. Our formulation visually reflects the task's state by encoding its kinodynamic variations and temporal evolution. By using images as the state representation, we enable the application of interpretability modules that were previously limited to vision-based tasks. We use this representation to train convolution-based networks, and we extract interpretations of the model's decisions with Grad-CAM, a…
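For readers unfamiliar with Grad-CAM, the core computation can be sketched in a few lines. This is a generic illustration of the standard Grad-CAM formula, not the authors' code: it assumes the feature maps of a convolutional layer and the gradients of the target output with respect to them have already been extracted as NumPy arrays (in practice they would come from a deep-learning framework). The function name `grad_cam` and the toy arrays are hypothetical.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap from (channels, height, width) arrays.

    Each channel is weighted by the spatial mean of its gradient, the weighted
    maps are summed, negative evidence is clipped, and the result is scaled
    to [0, 1].
    """
    weights = gradients.mean(axis=(1, 2))             # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum, shape (H, W)
    cam = np.maximum(cam, 0.0)                        # ReLU keeps positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example: channel 0 supports the decision, channel 1 argues against it.
activations = np.zeros((2, 3, 4))
activations[0, 1, 2] = 2.0
activations[1, 0, 0] = 1.0
gradients = np.zeros((2, 3, 4))
gradients[0] = 1.0
gradients[1] = -1.0

heatmap = grad_cam(activations, gradients)
```

Upsampled to the input size and overlaid on a kinodynamic image, such a heatmap would indicate which variables and time steps contributed most to a given decision.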
