Multiscale Sensor Fusion and Continuous Control with Neural CDEs
Sumeet Singh, Francis McCann Ramirez, Jacob Varley, Andy Zeng, Vikas Sindhwani

TL;DR
This paper introduces InFuser, a continuous-time policy learning architecture using Neural CDEs that effectively integrates asynchronous multi-sensory data for robot control, outperforming traditional discrete methods.
Contribution
The paper presents a novel neural architecture, InFuser, that models latent state dynamics in continuous time to fuse multi-sensory data for improved robot policy learning.
Findings
InFuser outperforms baselines in dynamic tasks with asynchronous sensory data.
It enables continuous-time reactive policies without fixed-time discretization.
It demonstrates robustness in tasks such as swinging a ball into a cup.
Abstract
Though robot learning is often formulated in terms of discrete-time Markov decision processes (MDPs), physical robots require near-continuous multiscale feedback control. Machines operate on multiple asynchronous sensing modalities, each with a different frequency, e.g., video frames at 30 Hz, proprioceptive state at 100 Hz, force-torque data at 500 Hz, etc. While the classic approach is to batch observations into fixed-time windows and then pass them through feed-forward encoders (e.g., with deep networks), we show that there exists a more elegant approach -- one that treats policy learning as modeling latent state dynamics in continuous time. Specifically, we present 'InFuser', a unified architecture that trains continuous-time policies with Neural Controlled Differential Equations (CDEs). InFuser evolves a single latent state representation over time by (In)tegrating and (Fus)ing…
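The following is a minimal sketch of the general idea behind a controlled differential equation, not the InFuser implementation. A latent state z follows dz = f(z) dX, where X is the path traced by the sensor readings. The sketch takes one explicit Euler step each time any sensor reports a new sample, so streams with different rates update the same latent state without being batched into fixed windows. Everything named here (`merge_streams`, `cde_euler`, the `weights` layout, the tanh vector field) is a hypothetical choice made for illustration. A learned model would train the vector field, and Neural CDEs usually also include time as a channel of X, which is omitted here for brevity.

```python
import math


def merge_streams(streams):
    """Merge per-sensor lists of (time, value) into one time-ordered event list.

    streams[c] holds the samples of sensor channel c. Each event is
    (time, channel, value).
    """
    events = []
    for channel, samples in enumerate(streams):
        for t, value in samples:
            events.append((t, channel, value))
    events.sort()
    return events


def cde_euler(z0, streams, weights):
    """Euler discretization of dz = f(z) dX driven by asynchronous samples.

    weights[c] is a (hidden x hidden) matrix for channel c; column c of the
    illustrative vector field is tanh(weights[c] @ z). These weights are
    fixed here, standing in for a trained network.
    """
    z = list(z0)
    last = [None] * len(streams)  # last value seen on each channel
    for _t, c, value in merge_streams(streams):
        if last[c] is not None:
            dx = value - last[c]  # increment of the control path on channel c
            col = [math.tanh(sum(w * zj for w, zj in zip(row, z)))
                   for row in weights[c]]
            z = [zi + ci * dx for zi, ci in zip(z, col)]
        last[c] = value
    return z


if __name__ == "__main__":
    # Two hypothetical sensors sampled at different rates.
    fast = [(0.00, 0.0), (0.01, 0.1), (0.02, 0.3), (0.03, 0.2)]
    slow = [(0.00, 1.0), (0.03, 1.5)]
    weights = [
        [[0.5, -0.2], [0.1, 0.3]],   # channel 0
        [[-0.4, 0.6], [0.2, -0.1]],  # channel 1
    ]
    print(cde_euler([0.1, -0.1], [fast, slow], weights))
```

Because each update is scaled by the change in a sensor reading rather than by a fixed time step, a channel that has not changed contributes nothing, and a channel that reports more often simply produces more, smaller updates.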
