Learning Principle of Least Action with Reinforcement Learning
Zehao Jin, Joshua Yao-Yu Lin, Siao-Fong Li

TL;DR
This paper introduces a reinforcement learning approach to physics, where agents learn physical trajectories by optimizing the action integral, demonstrating the method on light refraction and connecting it to classical principles.
Contribution
The paper proposes a novel reinforcement learning framework that incorporates the principle of least action to learn physical paths, verified through light refraction experiments.
Findings
Agent recovers minimal-time paths consistent with Fermat's principle
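The learned path agrees with the Snell's law solution for light crossing materials with different refractive indices
The authors discuss similarities between the reinforcement learning approach and the path integral formalism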
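For readers who want the classical result the agent is compared against, the following is a short worked derivation of Snell's law from Fermat's principle for a single flat interface. It is a textbook argument rather than something taken from the paper, and the geometry symbols (a, b, d, x) are assumptions introduced only for this sketch.

```latex
% Assumed geometry: source at height a above a flat interface, detector at
% depth b below it, horizontal separation d, crossing point at distance x.
% Angles \theta_1, \theta_2 are measured from the interface normal.
\begin{align}
T(x) &= \frac{n_1}{c}\sqrt{x^2 + a^2} + \frac{n_2}{c}\sqrt{(d - x)^2 + b^2}, \\
\frac{dT}{dx} &= \frac{n_1}{c}\,\frac{x}{\sqrt{x^2 + a^2}}
               - \frac{n_2}{c}\,\frac{d - x}{\sqrt{(d - x)^2 + b^2}} = 0, \\
n_1 \sin\theta_1 &= n_2 \sin\theta_2 ,
\qquad \sin\theta_1 = \frac{x}{\sqrt{x^2 + a^2}},
\quad \sin\theta_2 = \frac{d - x}{\sqrt{(d - x)^2 + b^2}} .
\end{align}
```

The crossing point that makes the travel time T stationary is therefore the one that satisfies Snell's law, which is why a minimal-time path found by an agent can be checked against it.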
Abstract
Nature provides a way to understand physics with reinforcement learning, since nature favors the most economical way for an object to propagate. In classical mechanics, an object moves along the path that extremizes the integral of the Lagrangian, called the action S. We consider setting the reward/penalty as a function of S, so that the agent can learn the physical trajectory of particles in various kinds of environments with reinforcement learning. In this work, we verify the idea by using a Q-learning-based algorithm to learn how light propagates in materials with different refractive indices, and show that the agent can recover the minimal-time path equivalent to the solution obtained from Snell's law or Fermat's principle. We also discuss the similarity of our reinforcement learning approach to the path integral formalism.
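The idea in the abstract can be illustrated with a minimal tabular Q-learning sketch in which the reward is the negative travel time of each path segment. This is not the authors' implementation: the layered grid, the slab indices in INDEX, the forward-only motion, the forced final segment to the detector, and every function name below are assumptions chosen to keep the example small.

```python
import math
import random

N_LAYERS = 4                      # horizontal slabs of unit thickness (assumed)
WIDTH = 11                        # lateral grid positions 0..WIDTH-1 (assumed)
INDEX = [1.0, 1.0, 1.5, 1.5]      # refractive index of each slab (assumed)
START, TARGET = 0, WIDTH - 1      # lateral position of source and detector


def segment_time(layer, x_from, x_to):
    """Optical path length (travel time with c = 1) across one slab."""
    return INDEX[layer] * math.hypot(x_to - x_from, 1.0)


def step(layer, x, action):
    """Move to lateral position `action` on the next interface.

    Returns (reward, done). The last slab is crossed by a forced
    segment to TARGET so that every episode ends at the detector.
    """
    reward = -segment_time(layer, x, action)
    done = layer + 1 == N_LAYERS - 1
    if done:
        reward -= segment_time(layer + 1, action, TARGET)
    return reward, done


def train(episodes=20000, alpha=1.0, seed=0):
    """Undiscounted tabular Q-learning with a uniformly random behaviour policy."""
    rng = random.Random(seed)
    # q[layer][x][action]: estimated return, i.e. minus the remaining travel time
    q = [[[0.0] * WIDTH for _ in range(WIDTH)] for _ in range(N_LAYERS - 1)]
    for _ in range(episodes):
        layer, x, done = 0, START, False
        while not done:
            action = rng.randrange(WIDTH)  # pure exploration; Q-learning is off-policy
            reward, done = step(layer, x, action)
            td_target = reward if done else reward + max(q[layer + 1][action])
            q[layer][x][action] += alpha * (td_target - q[layer][x][action])
            layer, x = layer + 1, action
    return q


def greedy_path(q):
    """Follow the highest-valued action at each interface."""
    x, path = START, [START]
    for layer in range(N_LAYERS - 1):
        x = max(range(WIDTH), key=lambda a: q[layer][x][a])
        path.append(x)
    path.append(TARGET)
    return path


def travel_time(path):
    """Total travel time of a path given as one lateral position per interface."""
    return sum(segment_time(i, path[i], path[i + 1]) for i in range(N_LAYERS))


if __name__ == "__main__":
    learned = greedy_path(train())
    print("learned path:", learned, "travel time:", round(travel_time(learned), 4))
```

Because this toy environment is deterministic, a learning rate of 1 suffices, and the greedy path can be compared with a brute-force search over all grid paths. On such a coarse grid the path only approximates the continuous Snell's law solution.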
Taxonomy
Topics: Neural Networks and Reservoir Computing · Reinforcement Learning in Robotics · Metaheuristic Optimization Algorithms Research
Methods: Q-Learning
