STRIDE: Automating Reward Design, Deep Reinforcement Learning Training and Feedback Optimization in Humanoid Robotics Locomotion

Zhenwei Wu; Jinxiong Lu; Yuxiao Chen; Yunxin Liu; Yueting Zhuang; Luhui Hu

arXiv:2502.04692·cs.RO·February 13, 2025

TL;DR

STRIDE automates reward design and training in humanoid robot locomotion using agentic engineering and large language models, significantly improving efficiency and performance over existing methods.

Contribution

The paper introduces STRIDE, a novel framework that automates reward creation and optimization in humanoid robotics using LLMs, reducing manual effort and enhancing DRL outcomes.

Findings

01 STRIDE outperforms EUREKA, with a reported 250% improvement in efficiency.

02 Humanoid robots achieve sprint-level locomotion across complex terrains.

03 Automated reward design accelerates DRL training.

Abstract

Humanoid robotics presents significant challenges in artificial intelligence, requiring precise coordination and control of high-degree-of-freedom systems. Designing effective reward functions for deep reinforcement learning (DRL) in this domain remains a critical bottleneck, demanding extensive manual effort, domain expertise, and iterative refinement. To overcome these challenges, we introduce STRIDE, a novel framework built on agentic engineering to automate reward design, DRL training, and feedback optimization for humanoid robot locomotion tasks. By combining the structured principles of agentic engineering with large language models (LLMs) for code-writing, zero-shot generation, and in-context optimization, STRIDE generates, evaluates, and iteratively refines reward functions without relying on task-specific prompts or templates. Across diverse environments featuring humanoid…
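
The generate, evaluate, and refine cycle described in the abstract can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the STRIDE implementation: the abstract does not specify how STRIDE represents rewards or feeds results back, so `propose_reward` (a random perturbation standing in for the LLM step), `train_and_evaluate` (a closed-form score standing in for a full DRL training run), and the reward term names are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional

Weights = Dict[str, float]


@dataclass
class Candidate:
    weights: Weights  # coefficients of the reward terms
    score: float      # task score from evaluation (higher is better)


def propose_reward(best: Optional[Candidate], rng: random.Random) -> Weights:
    """Hypothetical stand-in for the LLM reward-writing step.

    With no feedback yet, return an initial guess; otherwise perturb the
    best candidate so far (a crude proxy for in-context refinement).
    """
    if best is None:
        return {"forward_velocity": 1.0, "energy_penalty": 1.0, "upright_bonus": 1.0}
    return {k: max(0.0, v + rng.uniform(-0.3, 0.3)) for k, v in best.weights.items()}


def train_and_evaluate(weights: Weights) -> float:
    """Hypothetical stand-in for DRL training plus evaluation.

    A real pipeline would train a policy under the candidate reward and
    measure task performance. Here the score is the negative squared
    distance to an arbitrary, made-up target weighting.
    """
    target = {"forward_velocity": 2.0, "energy_penalty": 0.5, "upright_bonus": 1.5}
    return -sum((weights[k] - target[k]) ** 2 for k in target)


def refine(iterations: int, samples: int, seed: int = 0) -> List[Candidate]:
    """Run the loop and return the best candidate after each iteration."""
    rng = random.Random(seed)
    best: Optional[Candidate] = None
    history: List[Candidate] = []
    for _ in range(iterations):
        for _ in range(samples):
            weights = propose_reward(best, rng)
            candidate = Candidate(weights, train_and_evaluate(weights))
            # Feedback step: keep a candidate only if it beats the best so far.
            if best is None or candidate.score > best.score:
                best = candidate
        history.append(best)
    return history


if __name__ == "__main__":
    for i, cand in enumerate(refine(iterations=5, samples=4), start=1):
        print(f"iteration {i}: best score {cand.score:.3f}")
```

Because a candidate is kept only when it scores higher than the current best, the best score never decreases from one iteration to the next. In a real system the evaluation step dominates the cost, since each candidate requires its own training run.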

Taxonomy

Topics: Robot Manipulation and Learning