Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain
Zhuangyu Han, Abhronil Sengupta

TL;DR
This paper introduces an equilibrium-propagation-based PPO framework for quadruped locomotion that enables energy-efficient, adaptive control on uneven terrain by replacing traditional backpropagation with local learning.
Contribution
It develops a novel EP-compatible PPO method with a bio-inspired CPG policy and residual adjustments, facilitating stable, efficient on-robot adaptation for complex locomotion tasks.
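As a rough illustration of how a CPG policy and a residual adjustment might be combined, the sketch below generates nominal per-leg joint targets from a phase oscillator and then adds a bounded correction. It is a minimal sketch under stated assumptions, not the authors' controller: the trot phase offsets, the sinusoidal waveform, the parameter values, and the names `cpg_targets` and `apply_residual` are all hypothetical, and the EP-based PPO training that would produce the residual is not shown.

```python
import math

# Assumed trot gait: diagonal leg pairs share a phase (hypothetical, not from the paper).
TROT_OFFSETS = {"FL": 0.0, "RR": 0.0, "FR": 0.5, "RL": 0.5}


def cpg_targets(t, frequency_hz=2.0, amplitude_rad=0.3, offsets=TROT_OFFSETS):
    """Return a nominal hip-angle target (rad) per leg from a phase oscillator."""
    targets = {}
    for leg, offset in offsets.items():
        # Each leg follows the same sinusoid, shifted by its gait phase offset.
        phase = 2.0 * math.pi * (frequency_hz * t + offset)
        targets[leg] = amplitude_rad * math.sin(phase)
    return targets


def apply_residual(nominal, residual, max_residual_rad=0.1):
    """Add a clipped per-leg correction to the nominal CPG targets.

    `residual` stands in for the output of a learned adjustment policy;
    legs missing from it receive no correction.
    """
    adjusted = {}
    for leg, angle in nominal.items():
        delta = residual.get(leg, 0.0)
        # Bound the correction so the residual cannot override the gait.
        delta = max(-max_residual_rad, min(max_residual_rad, delta))
        adjusted[leg] = angle + delta
    return adjusted
```

Calling `apply_residual(cpg_targets(t), residual)` at each control step would yield the final joint targets. Clipping the residual is one possible design choice for keeping the rhythmic gait dominant while still allowing terrain-dependent corrections.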
Findings
The EP-based policy converges stably on uneven terrain in experiments.
It matches baseline performance in success rate, velocity, and stability.
It reduces GPU memory usage by a factor of 4.3 relative to backpropagation through time (BPTT).
Abstract
Reinforcement learning (RL) has enabled robust quadruped locomotion over complex terrain, but most learned controllers are trained offline with backpropagation in massively parallel simulation and deployed as fixed policies, limiting adaptation to terrain variation, payload changes, actuator wear, and other real-world conditions under onboard power constraints. Local learning provides a potential path toward energy-aware on-robot adaptation by replacing global backpropagation graphs with updates driven by local neural states, making the learning rule more compatible with neuromorphic and in-memory computing substrates. This work proposes an equilibrium-propagation (EP)-based proximal policy optimization (PPO) framework for uneven-terrain quadruped locomotion. The controller combines a bio-inspired central pattern generator (CPG) policy with a residual postural adjustment policy, while…
