Multimodal Sensor Fusion with Differentiable Filters
Michelle A. Lee, Brent Yi, Roberto Martín-Martín, Silvio Savarese, Jeannette Bohg

TL;DR
This paper introduces new differentiable filtering architectures for multimodal sensor fusion, demonstrating their effectiveness in robotic manipulation tasks and providing an open-source library for end-to-end learning of sensor models.
Contribution
The paper proposes novel differentiable filter architectures for fusing heterogeneous sensors and evaluates them on manipulation tasks, highlighting their interpretability and accuracy comparable to LSTM models.
Findings
Differentiable filters achieve accuracy similar to LSTM models.
Fusion of visual and tactile data improves state estimation.
Open-source library facilitates development of learned Bayesian filters.
Abstract
Leveraging multimodal information with recursive Bayesian filters improves performance and robustness of state estimation, as recursive filters can combine different modalities according to their uncertainties. Prior work has studied how to optimally fuse different sensor modalities with analytical state estimation algorithms. However, deriving the dynamics and measurement models along with their noise profile can be difficult or lead to intractable models. Differentiable filters provide a way to learn these models end-to-end while retaining the algorithmic structure of recursive filters. This can be especially helpful when working with sensor modalities that are high dimensional and have very different characteristics. In contact-rich manipulation, we want to combine visual sensing (which gives us global information) with tactile sensing (which gives us local information). In this…
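The abstract's claim that recursive filters "combine different modalities according to their uncertainties" can be made concrete with a minimal scalar sketch. It is not the paper's architecture or its library's API: the function names `fuse_gaussian` and `kalman_update` and all numbers are hypothetical, and the variances are fixed by hand here, whereas a differentiable filter would learn the models that produce them end-to-end.

```python
def fuse_gaussian(mean_a, var_a, mean_b, var_b):
    """Fuse two independent Gaussian estimates of the same scalar state.

    Hypothetical helper for illustration only (not from the paper).
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    # Precision (inverse-variance) weighting: the less uncertain
    # estimate contributes more to the fused mean.
    precision = 1.0 / var_a + 1.0 / var_b
    fused_var = 1.0 / precision
    fused_mean = fused_var * (mean_a / var_a + mean_b / var_b)
    return fused_mean, fused_var


def kalman_update(prior_mean, prior_var, measurement, meas_var):
    """Scalar Kalman measurement update with an identity observation model."""
    # The gain approaches 1 when the measurement is far more certain
    # than the prior, and 0 when it is far less certain.
    gain = prior_var / (prior_var + meas_var)
    post_mean = prior_mean + gain * (measurement - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


if __name__ == "__main__":
    # Made-up numbers: a vision-based estimate of an object position
    # (global but noisier) and a tactile estimate (local but more precise).
    vision_mean, vision_var = 0.50, 0.04
    touch_mean, touch_var = 0.42, 0.01

    print(fuse_gaussian(vision_mean, vision_var, touch_mean, touch_var))
    # Treating the vision estimate as the prior and the tactile reading
    # as the measurement yields the same posterior.
    print(kalman_update(vision_mean, vision_var, touch_mean, touch_var))
```

With these illustrative values the fused mean is approximately 0.436, closer to the tactile estimate because its variance is four times smaller, and the fused variance (approximately 0.008) is lower than that of either input.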
