RANDPOL: Parameter-Efficient End-to-End Quadruped Locomotion via Randomized Policy Learning

Zhuochen Liu; Rahul Jain; Quan Nguyen

arXiv:2505.19054·cs.LG·April 16, 2026

TL;DR

RANDPOL is a parameter-efficient end-to-end quadruped locomotion controller: the hidden layers of the actor and critic are randomly initialized and frozen, and only the final linear readout is trained, achieving competitive performance with fewer trainable parameters.

Contribution

The paper proposes RANDPOL, a randomized policy learning method that substantially reduces the number of trainable parameters while maintaining effective quadruped locomotion control.

Findings

01

RANDPOL achieves locomotion performance comparable to PPO with fewer trainable parameters.

02

RANDPOL enables faster learning iterations because the optimization problem has fewer dimensions.

03

Successful zero-shot sim-to-real transfer on a physical quadruped demonstrates practical effectiveness.
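
To give a rough sense of what "fewer trainable parameters" can mean when only the final linear readout is trained, the sketch below counts the parameters of a multilayer perceptron with biases and compares the full count against the readout alone. The layer sizes (48 observations, two hidden layers of 256 units, 12 action outputs) are hypothetical values chosen for illustration; they are not taken from the paper.

```python
def mlp_param_counts(layer_sizes):
    """Return (all parameters, final-layer parameters) for an MLP with biases."""
    pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
    # Each layer has an n_in x n_out weight matrix plus n_out biases.
    per_layer = [n_in * n_out + n_out for n_in, n_out in pairs]
    return sum(per_layer), per_layer[-1]


# Hypothetical sizes, not from the paper.
total, readout = mlp_param_counts([48, 256, 256, 12])
print(total, readout)  # 81420 3084
```

Under these assumed sizes, a readout-only policy would optimize 3,084 values instead of 81,420, roughly 4% of the network.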

Abstract

Modern learning-based locomotion controllers typically rely on fully trainable deep neural networks with a large number of parameters. This paper studies a different design point for end-to-end control: whether effective quadruped locomotion can be achieved with a drastically reduced trainable parameter space. We present RANDomized POlicy Learning (RANDPOL), a policy learning approach in which the hidden layers of the actor and critic are randomly initialized and fixed, while only the final linear readout is trained. This yields a parameter-efficient controller class that retains nonlinear expressiveness through a fixed random basis while substantially reducing the dimension of the optimization problem. RANDPOL is supported by the mathematical foundation of randomized function approximation, which provides a principled basis for using fixed random nonlinear features as expressive…
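
The structure described in the abstract, fixed random hidden layers followed by a trained linear readout, can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `RandomFeaturePolicy`, the `tanh` activation, the Gaussian initialization scales, the layer sizes, and the mean-squared-error objective standing in for the reinforcement learning loss are all hypothetical choices.

```python
import numpy as np


class RandomFeaturePolicy:
    """Hypothetical policy: frozen random hidden layers, trainable linear readout."""

    def __init__(self, obs_dim, act_dim, hidden=(256, 256), seed=0):
        rng = np.random.default_rng(seed)
        sizes = (obs_dim, *hidden)
        # Hidden layers are drawn once and never updated (assumed Gaussian init).
        self.hidden = [
            (rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out)),
             rng.normal(0.0, 0.1, size=n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])
        ]
        # The readout is the only trainable part.
        self.W = np.zeros((hidden[-1], act_dim))
        self.b = np.zeros(act_dim)

    def features(self, obs):
        """Map observations through the fixed random nonlinear basis."""
        h = obs
        for A, c in self.hidden:
            h = np.tanh(h @ A + c)
        return h

    def act(self, obs):
        return self.features(obs) @ self.W + self.b

    def readout_step(self, obs, grad_action, lr=5e-3):
        """One gradient step given dLoss/dAction; only W and b change."""
        phi = self.features(obs)
        self.W -= lr * phi.T @ grad_action / len(obs)
        self.b -= lr * grad_action.mean(axis=0)


# Stand-in objective: regress toward random target actions (not an RL loss).
rng = np.random.default_rng(1)
policy = RandomFeaturePolicy(obs_dim=48, act_dim=12)
obs = rng.normal(size=(64, 48))
target = rng.normal(size=(64, 12))

frozen_before = [A.copy() for A, _ in policy.hidden]
loss_before = np.mean((policy.act(obs) - target) ** 2)
for _ in range(200):
    policy.readout_step(obs, policy.act(obs) - target)
loss_after = np.mean((policy.act(obs) - target) ** 2)
print(loss_before, loss_after)
```

Because the features are fixed, the policy output is linear in the trainable parameters `W` and `b`, which is what shrinks the optimization problem; the nonlinearity lives entirely in the frozen basis.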
