Trajectory Optimization for Robust Humanoid Locomotion with Sample-Efficient Learning
Majid Khadiv, Mohammad Hasan Yeganegi, S. Ali A. Moosavian, Jia-Jie Zhu, and Ludovic Righetti

TL;DR
This paper introduces a sample-efficient Bayesian optimization approach to enhance the robustness of humanoid robot trajectories against uncertainties, achieving reliable motions with minimal simulation or experimental data.
Contribution
It presents a novel method combining trajectory optimization with Bayesian optimization to efficiently find robust motion parameters for humanoid robots.
Findings
Successfully generates robust motions under various disturbances.
Achieves robustness with fewer simulations or experiments.
Demonstrates effectiveness across different uncertainty scenarios.
Abstract
Trajectory optimization (TO) is one of the most powerful tools for generating feasible motions for humanoid robots. However, including uncertainties and stochasticity in the TO problem to generate robust motions can easily lead to an intractable problem. Furthermore, since the models used in TO always have some level of abstraction, it is hard to find a realistic set of uncertainties in the space of the abstract model. In this paper, we aim to leverage a sample-efficient learning technique (Bayesian optimization) to robustify trajectory optimization for humanoid locomotion. The main idea is to use Bayesian optimization to find the set of cost weights that best trades off performance against robustness, using only a few realistic simulations or experiments. The results show that the proposed approach is able to generate robust motions for different sets of disturbances and uncertainties.
Taxonomy
Topics: Robotic Locomotion and Control · Robot Manipulation and Learning · Robotic Mechanisms and Dynamics
Majid Khadiv (1), Mohammad Hasan Yeganegi (2), S. Ali A. Moosavian (2), Jia-Jie Zhu (1) and Ludovic Righetti (1, 3)
(1) Max Planck Institute for Intelligent Systems, Tuebingen, Germany.
(2) K. N. Toosi University of Technology, Tehran, Iran.
(3) Tandon School of Engineering, New York University, New York, USA.
This work is supported by New York University, the Max-Planck Society, the European Union’s Horizon 2020 research and innovation program (grant agreement No 780684 and European Research Council’s grant No 637935) and the National Science Foundation (grant CMMI-1825993).
I INTRODUCTION
Since humanoid robots are inherently both redundant (many degrees of freedom in the limbs’ structure) and underactuated (a floating base without direct actuation), generating feasible and optimal motions for them is very challenging. Trajectory optimization (TO) is a strong tool that takes into account all the physical and geometrical constraints and at the same time yields the optimal motion minimizing the cost function. However, the discrepancy between the model used in the TO problem and the real robot, as well as uncertainties in the constraint set, make the generated motion fragile.
One principled way to deal with this problem is to use stochastic or robust TO approaches to add the notion of uncertainties to the problem. Although systematic, this approach suffers from two problems: 1) identifying different types of uncertainties and projecting them to a realistic set of constraints is challenging and 2) adding stochastic uncertainties can easily lead to an intractable problem, and in most cases can be solved only for simplified worst case scenarios.
The main contribution of this work is to combine the strengths of trajectory optimization and sample-efficient learning (Bayesian optimization, BO) to generate motions that are robust to different kinds of uncertainties with a low number of experiments. Contrary to the available approaches for generating bipedal locomotion using BO [1, 2], we propose to use constrained gradient-based approaches to generate feasible motion using the robot's abstract model, and then use BO to tune the cost parameters of the TO problem in the presence of disturbances during simulation of the full robot. With a proper formulation of the problem in this setting, we can trade off robustness against performance, given a set of realistic disturbances/uncertainties in the full robot simulation. Fig. 1 shows the block diagram of our approach.
II Optimal control problem
II-A First Stage: Convex Trajectory Optimization for Walking
We use the TO approach proposed in [3] which is an extended version of a standard walking pattern generator [4] to compute center of mass (CoM) motion and footstep locations to achieve a desired walking velocity. In this approach, walking is formulated as a trade-off among three main cost terms: one for the task goal (desired velocity tracking) and two to increase motion robustness (foot tip-over avoidance and slippage avoidance).
[TABLE]
The quantities in this problem are the CoM position in the horizontal plane, the zero moment point (ZMP) position, and the required coefficient of friction (RCoF), together with the desired walking velocity and the desired ZMP, which is placed at the center of the foot to maximize the feasibility margin. As shown in [3], this optimization problem can be written as a quadratic program, assuming linear inverted pendulum dynamics and a polyhedral approximation of the friction cone.
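To make the two robustness quantities concrete, the following is a minimal sketch of how the ZMP and the RCoF follow from the CoM motion under the standard linear inverted pendulum model (constant CoM height, no vertical acceleration). The function names and the numeric values are illustrative assumptions, and the Euclidean norm used for the RCoF is a simplification, not the polyhedral friction-cone approximation of [3].

```python
import math

G = 9.81  # gravitational acceleration [m/s^2]


def zmp_from_com(com_xy, com_acc_xy, com_height):
    """ZMP under the linear inverted pendulum model.

    Per horizontal axis: z = c - (h / g) * c_ddot.
    """
    return tuple(c - (com_height / G) * a for c, a in zip(com_xy, com_acc_xy))


def required_friction(com_acc_xy):
    """Required coefficient of friction under the same model.

    With constant CoM height, the vertical contact force is m * g and the
    horizontal force is m * c_ddot, so the ratio is ||c_ddot|| / g.
    """
    return math.hypot(*com_acc_xy) / G


# Illustrative numbers (assumed): CoM at the origin, 0.8 m high,
# accelerating forward at 0.981 m/s^2.
zmp = zmp_from_com((0.0, 0.0), (0.981, 0.0), 0.8)
rcof = required_friction((0.981, 0.0))
print(zmp, rcof)
```

A larger CoM acceleration moves the ZMP farther from the CoM projection and raises the RCoF, which is why both terms compete with velocity tracking in the cost.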
II-B Second Stage: iLQG for generating whole-body torques
We use an iLQG controller with a short horizon of 0.2 s as a whole-body controller: it maps the desired CoM and feet trajectories from the first stage to whole-body torques, taking into account box inequality constraints on the controls [5]. For our walking problem, we use a 27-DoF humanoid robot model, and iLQG computes joint torques every 0.01 s.
III Computing cost weights for robust humanoid locomotion
We propose to close the loop of the system and automatically optimize the cost function of the abstract pattern generator to find plans that are robust to full robot dynamics and environment uncertainties. We formulate an overall optimization problem based on the quantities in Fig. 1 (and solve it using BO):
[TABLE]
The decision variable is the collection of hyper-parameters (cost weights) used in the optimization problem of Section II-A. We fix one weight to guarantee the viability of the gait [4] and optimize the other two to find the best trade-off between robustness and performance, where the required robustness depends on the environment. The cost is evaluated on the measured CoM velocity that results from applying the control (i.e., solving the QP of Section II-A and tracking it with iLQG) in simulation under unknown disturbances, and on the CoM height at the terminal time. The cost also includes a trade-off parameter, which is straightforward to set, and a term that penalizes the robot falling.
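As an illustration of what one black-box evaluation in this outer loop could look like, here is a hedged sketch. Everything in it is an assumption made for illustration: the names `rollout_cost`, `evaluate_weights`, and `toy_simulate`, the mean-squared tracking error, the indicator-style fall penalty multiplied by the trade-off parameter, and the fall threshold on the terminal CoM height. It is not the authors' cost, whose exact form is not recoverable from the text above.

```python
from typing import Callable, Sequence, Tuple

# A rollout returns (measured CoM speeds, terminal CoM height).
Rollout = Tuple[Sequence[float], float]


def rollout_cost(measured_vel: Sequence[float], desired_vel: float,
                 terminal_height: float, nominal_height: float,
                 trade_off: float = 100.0, fall_ratio: float = 0.5) -> float:
    """Velocity-tracking error plus a penalty if the robot fell (assumed form)."""
    tracking = sum((v - desired_vel) ** 2 for v in measured_vel) / len(measured_vel)
    # Assumption: a terminal CoM height far below nominal indicates a fall.
    fell = terminal_height < fall_ratio * nominal_height
    return tracking + (trade_off if fell else 0.0)


def evaluate_weights(weights: Tuple[float, float],
                     simulate: Callable[[Tuple[float, float]], Rollout],
                     desired_vel: float, nominal_height: float) -> float:
    """One evaluation: plan and track with the given weights, then score the rollout."""
    measured_vel, terminal_height = simulate(weights)
    return rollout_cost(measured_vel, desired_vel, terminal_height, nominal_height)


def toy_simulate(weights: Tuple[float, float]) -> Rollout:
    """Hypothetical stand-in for the full-robot simulation (QP + iLQG)."""
    w_zmp, w_rcof = weights
    if w_zmp + w_rcof < 1.0:          # toy rule: too little margin, robot falls
        return [0.0, 0.0], 0.1
    return [0.45, 0.5, 0.55], 0.8     # toy rule: robot walks near 0.5 m/s


cost_fall = evaluate_weights((0.1, 0.1), toy_simulate, 0.5, 0.8)
cost_walk = evaluate_weights((1.0, 1.0), toy_simulate, 0.5, 0.8)
print(cost_fall, cost_walk)
```

A Bayesian optimizer would call `evaluate_weights` as its objective, proposing new weight pairs from a surrogate model instead of from gradients, so that each expensive simulation or experiment is used sparingly.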
IV Preliminary results
In this section, we show that our approach can adapt cost functions to generate robust walking gaits in the presence of various disturbances. The range of weight values we consider is . In all cases we start with where we have high ZMP and RCoF margins. We investigate four cases: (a) without disturbance, (b) with external pushes on the upper body, (c) with an unknown decrease of the surface friction coefficient, and (d) with both external pushes and a decrease of the friction coefficient.
In Fig. 3 (original cost), we plot the evolution of the BO cost values at each iteration for all cases of this scenario. Large variations in the values correspond to the high penalty given to failed cases. As expected, the decrease in the cost is not monotonic, since we are not using gradient-based optimization. In Fig. 3 (minimum cost), we plot the best cost found so far, i.e., the minimum over the current and all previous function evaluations. The optimal values for each case are: (a) , (b) , (c) , (d) . Interestingly, in all cases the cost has already settled after a few iterations (around 15 calls for cases (b), (c), and (d), and 25 calls for case (a)). This suggests that the cost function can be optimized within a few experiments and lead to robust walking in uncertain environments, which is important for deployment on real humanoid robots, where experiments can be time-consuming.
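The difference between the two plotted curves can be sketched in a few lines: the raw curve is the sequence of evaluated costs, while the minimum curve is its running minimum. The cost history below is made up for illustration, with one large value standing in for a failed (fallen) trial.

```python
from itertools import accumulate


def best_so_far(costs):
    """Running minimum: best cost over the current and all previous evaluations."""
    return list(accumulate(costs, min))


# Made-up cost history; 100.0 mimics the high penalty of a failed trial.
history = [5.0, 7.5, 3.2, 100.0, 3.0, 4.1]
print(best_so_far(history))  # [5.0, 5.0, 3.2, 3.2, 3.0, 3.0]
```

The raw sequence jumps whenever a trial fails, whereas the running minimum is non-increasing by construction, which makes it easier to judge when the search has settled.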
V Ongoing and future work
These results are a proof of concept in which we tuned two cost variables using BO. We are currently applying the concept to more complicated problems: optimizing more complex cost weights for the RCoF and ZMP in the sagittal and lateral directions, as well as costs that keep step locations far from the boundaries of the reachable area. A longer-term goal of this project is to apply the approach to more complicated TO problems based on centroidal dynamics and full-body dynamics.
References
- [1] A. Rai, R. Antonova, S. Song, W. Martin, H. Geyer, and C. Atkeson, “Bayesian optimization using domain knowledge on the ATRIAS biped,” in 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2018, pp. 1771–1778.
- [2] R. Calandra, “Bayesian modeling for optimization and control in robotics,” Ph.D. dissertation, Technische Universität, 2017.
- [3] M. Khadiv, S. A. A. Moosavian, A. Herzog, and L. Righetti, “Pattern generation for walking on slippery terrains,” in 2017 5th RSI International Conference on Robotics and Mechatronics (ICRoM). IEEE, 2017, pp. 120–125.
- [4] A. Herdt, H. Diedam, P.-B. Wieber, D. Dimitrov, K. Mombaur, and M. Diehl, “Online walking motion generation with automatic footstep placement,” Advanced Robotics, vol. 24, no. 5-6, pp. 719–737, 2010.
- [5] Y. Tassa, N. Mansard, and E. Todorov, “Control-limited differential dynamic programming,” in 2014 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2014, pp. 1168–1175.
