Integrating Lagrangian Neural Networks into the Dyna Framework for Reinforcement Learning
Shreya Das, Kundan Kumar, Muhammad Iqbal, Outi Savolainen, Dominik Baumann, Laura Ruotsalainen, and Simo Särkkä

TL;DR
This paper introduces a Lagrangian neural network integrated into the Dyna framework for model-based reinforcement learning, improving physical consistency and training efficiency.
Contribution
It presents a novel approach combining Lagrangian neural networks with Dyna-based MBRL, enhancing model accuracy and training convergence.
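To make the Dyna side of this combination concrete, here is a minimal sketch of the classic tabular Dyna-Q loop. It is not the paper's implementation: the chain environment, the function names `chain_step` and `dyna_q`, and all hyperparameters are hypothetical, and a simple lookup table stands in for the learned dynamics model (the role the LNN plays in the paper). It shows only the three interleaved steps that define Dyna: learning from real transitions, updating the model, and planning with model-generated transitions.

```python
import random

def chain_step(state, action, n_states=6):
    """Hypothetical toy environment: a 1-D chain of states.
    Action 1 moves right, action 0 moves left (clamped at state 0).
    Reaching the right end yields reward 1 and ends the episode."""
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def dyna_q(episodes=50, planning_steps=20, n_states=6,
           alpha=0.5, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # Q-table: q[state][action]
    model = {}  # learned model: (state, action) -> (next_state, reward, done)

    def update(s, a, r, s2, done):
        # One-step Q-learning update, used for real and simulated transitions.
        target = r if done else r + gamma * max(q[s2])
        q[s][a] += alpha * (target - q[s][a])

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < eps or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2, r, done = chain_step(s, a, n_states)
            update(s, a, r, s2, done)        # 1) direct RL from real experience
            model[(s, a)] = (s2, r, done)    # 2) model learning (table stand-in)
            for _ in range(planning_steps):  # 3) planning with simulated experience
                ps, pa = rng.choice(list(model))
                ps2, pr, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)
            s = s2
    return q, model

if __name__ == "__main__":
    q, model = dyna_q()
    greedy = [max((0, 1), key=lambda a: q[s][a]) for s in range(5)]
    print("greedy actions for non-terminal states:", greedy)
```

In a continuous-control setting like the one the paper targets, step 2 would fit a parametric dynamics model to the observed transitions and step 3 would roll that model forward to generate training data for the policy; the table here is only the simplest possible placeholder for that model.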
Findings
LNN-based Dyna framework improves physical accuracy of learned models.
State-estimation-based training converges faster than stochastic-gradient-based training.
Simulation results demonstrate effectiveness of the proposed approach.
Abstract
Model-based reinforcement learning (MBRL) is sample-efficient but depends on the accuracy of the learned dynamics, which are often modeled using black-box methods that do not adhere to physical laws. Those methods tend to produce inaccurate predictions when presented with data that differ from the original training set. In this work, we employ Lagrangian neural networks (LNNs), which enforce an underlying Lagrangian structure to train the model within a Dyna-based MBRL framework. Furthermore, we train the LNN using stochastic gradient-based and state-estimation-based optimizers to learn the network's weights. The state-estimation-based method converges faster than the stochastic gradient-based method during neural network training. Simulation results are provided to illustrate the effectiveness of the proposed LNN-based Dyna framework for MBRL.
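For readers unfamiliar with the Lagrangian structure mentioned above, the following is a sketch of the standard LNN formulation, assuming the paper follows it. The symbols are our notation, not taken from the abstract: $q$ for generalized coordinates, $L_\theta$ for the scalar Lagrangian produced by the network, and $\tau$ for a generalized control force. Instead of predicting accelerations directly, the network outputs $L_\theta(q, \dot q)$, and the accelerations are obtained from the Euler-Lagrange equations, so every prediction is consistent with some Lagrangian system.

```latex
% Euler-Lagrange equations with a generalized force \tau (assumed form)
\frac{d}{dt}\,\frac{\partial L_\theta}{\partial \dot q_i}
  - \frac{\partial L_\theta}{\partial q_i} = \tau_i

% Expanding the total time derivative with the chain rule
\sum_j \frac{\partial^2 L_\theta}{\partial \dot q_i\,\partial \dot q_j}\,\ddot q_j
  + \sum_j \frac{\partial^2 L_\theta}{\partial \dot q_i\,\partial q_j}\,\dot q_j
  = \frac{\partial L_\theta}{\partial q_i} + \tau_i

% Solving for the accelerations, with
% H_{ij} = \partial^2 L_\theta / \partial \dot q_i \partial \dot q_j and
% C_{ij} = \partial^2 L_\theta / \partial \dot q_i \partial q_j
\ddot q = H^{-1}\left( \nabla_q L_\theta + \tau - C\,\dot q \right)
```

The derivatives of $L_\theta$ are typically computed by automatic differentiation, and the formulation requires the Hessian $H$ to be invertible.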
Taxonomy
Topics: Reinforcement Learning in Robotics · Model Reduction and Neural Networks · Neural Networks and Reservoir Computing
