Learning rigid-body simulators over implicit shapes for large-scale scenes and vision
Yulia Rubanova, Tatiana Lopez-Guevara, Kelsey R. Allen, William F. Whitney, Kimberly Stachenfeld, Tobias Pfaff

TL;DR
This paper introduces SDF-Sim, a scalable learned rigid-body simulator using signed-distance functions to efficiently model large scenes with many objects, outperforming mesh-based methods in scale and real-world applicability.
Contribution
The paper presents the first scalable GNN-based rigid-body simulator that uses SDFs, enabling simulation of scenes with hundreds of objects and application to real-world scenes.
Findings
Scales to scenes with over 1 million nodes.
Outperforms mesh-based approaches in large-scale scenarios.
Can be applied to real-world scenes using multi-view images.
Abstract
Simulating large scenes with many rigid objects is crucial for a variety of applications, such as robotics, engineering, film and video games. Rigid interactions are notoriously hard to model: small changes to the initial state or the simulation parameters can lead to large changes in the final state. Recently, learned simulators based on graph networks (GNNs) were developed as an alternative to hand-designed simulators like MuJoCo and PyBullet. They are able to accurately capture dynamics of real objects directly from real-world observations. However, current state-of-the-art learned simulators operate on meshes and scale poorly to scenes with many objects or detailed shapes. Here we present SDF-Sim, the first learned rigid-body simulator designed for scale. We use learned signed-distance functions (SDFs) to represent the object shapes and to speed up distance computation. We design…
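To illustrate why an SDF can speed up distance computation, here is a minimal sketch of a distance query against an implicit shape. It is not the paper's implementation: SDF-Sim uses learned SDFs, while this example substitutes an analytic sphere, and the function names `sphere_sdf` and `closest_surface_point` are hypothetical. The point is that a single function evaluation returns the distance to the surface, and the closest surface point follows from the standard SDF property x - f(x) * grad f(x), with no search over mesh triangles.

```python
import math

def sphere_sdf(point, center, radius):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside.

    Analytic stand-in for a learned SDF (assumption made for illustration).
    """
    return math.dist(point, center) - radius

def closest_surface_point(point, center, radius):
    """Project `point` onto the surface using x - f(x) * grad f(x)."""
    d = math.dist(point, center)
    if d == 0.0:
        raise ValueError("closest point is undefined at the sphere center")
    sdf = d - radius
    # For a sphere, the SDF gradient is the unit vector from the center to the point.
    grad = tuple((p - c) / d for p, c in zip(point, center))
    return tuple(p - sdf * g for p, g in zip(point, grad))

origin = (0.0, 0.0, 0.0)
query = (3.0, 0.0, 0.0)

distance = sphere_sdf(query, origin, 1.0)
closest = closest_surface_point(query, origin, 1.0)
print(distance, closest)
```

With a learned SDF, the gradient would typically come from automatic differentiation rather than a closed form, but the query has the same shape: one evaluation per point, independent of how detailed the underlying surface is.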
Taxonomy
Topics: Human Motion and Animation · Advanced Vision and Imaging · Robotic Mechanisms and Dynamics
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
