KeyCLD: Learning Constrained Lagrangian Dynamics in Keypoint Coordinates from Images
Rembert Daems, Jeroen Taets, Francis wyffels, Guillaume Crevecoeur

TL;DR
KeyCLD is an unsupervised framework that learns Lagrangian dynamics directly from images using keypoints, enabling accurate long-term predictions and energy-based control for various mechanical systems.
Contribution
It introduces a novel end-to-end method to learn constrained Lagrangian dynamics from images, explicitly modeling energy components and constraints.
Findings
Achieves the highest valid prediction time on all benchmarks.
Successfully applies energy shaping control on fully actuated systems.
Demonstrates accurate long-term video predictions of system dynamics.
Abstract
We present KeyCLD, a framework to learn Lagrangian dynamics from images. Learned keypoints represent semantic landmarks in images and can directly represent state dynamics. We show that interpreting this state as Cartesian coordinates, coupled with explicit holonomic constraints, allows expressing the dynamics with a constrained Lagrangian. KeyCLD is trained unsupervised and end-to-end on sequences of images. Our method explicitly models the mass matrix, potential energy and the input matrix, thus allowing energy-based control. We demonstrate learning of Lagrangian dynamics from images on the dm_control pendulum, cartpole and acrobot environments. KeyCLD can be learned on these systems, whether they are unactuated, underactuated or fully actuated. Trained models are able to produce long-term video predictions, showing that the dynamics are accurately learned. We compare with Lag-VAE,…
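To make the idea of a constrained Lagrangian in Cartesian coordinates concrete, here is a minimal sketch for a single pendulum whose state is the Cartesian position of its bob. It uses the standard textbook formulation with a constant mass matrix M, generalized forces f = -∇V + g·u, and a holonomic constraint Φ(x) = 0 enforced through a Lagrange multiplier. This is not the paper's implementation: in KeyCLD the mass matrix, potential energy and input matrix are learned, whereas here they are fixed by hand. The function names, the parameter values and the tangential input matrix are assumptions chosen for illustration.

```python
import numpy as np

def constrained_acceleration(M, f, D, Ddot_xdot):
    """Solve M x'' = f + D^T lam subject to D x'' + Ddot x' = 0.

    M: (n, n) constant mass matrix, f: (n,) forces,
    D: (k, n) constraint Jacobian dPhi/dx, Ddot_xdot: (k,) the term Ddot @ x'.
    """
    Minv = np.linalg.inv(M)
    A = D @ Minv @ D.T
    # Multiplier that keeps the twice-differentiated constraint satisfied
    lam = -np.linalg.solve(A, D @ Minv @ f + Ddot_xdot)
    return Minv @ (f + D.T @ lam)

def pendulum_acceleration(x, v, m=1.0, l=1.0, grav=9.81, u=0.0):
    """Acceleration of a pendulum bob at position x = (x, y) with velocity v.

    Constraint: Phi(x) = x^2 + y^2 - l^2 = 0. All values are illustrative.
    """
    M = m * np.eye(2)                      # constant in Cartesian coordinates
    grad_V = np.array([0.0, m * grav])     # V(x) = m * grav * y
    g_in = np.array([-x[1], x[0]]) / l     # assumed input matrix: tangential force
    f = -grad_V + g_in * u
    D = 2.0 * x.reshape(1, 2)              # Jacobian of Phi
    Ddot_xdot = np.array([2.0 * (v @ v)])  # (d/dt D) @ v
    return constrained_acceleration(M, f, D, Ddot_xdot)

if __name__ == "__main__":
    # Bob held horizontally at rest: the rod carries no load, so it falls freely.
    print(pendulum_acceleration(np.array([1.0, 0.0]), np.zeros(2)))
    # Bob hanging straight down at rest: the constraint force cancels gravity.
    print(pendulum_acceleration(np.array([0.0, -1.0]), np.zeros(2)))
```

Because the mass matrix is constant in Cartesian coordinates, no Coriolis-type terms appear; the constraint force alone accounts for the curved motion. Integrating these accelerations over time would give a trajectory, with the usual caveat that constraint drift can accumulate numerically.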
Taxonomy
Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Image Processing and 3D Reconstruction
