Optimizing Simulations with Noise-Tolerant Structured Exploration

Krzysztof Choromanski; Atil Iscen; Vikas Sindhwani; Jie Tan; Erwin Coumans

arXiv:1805.07831·cs.RO·May 22, 2018

TL;DR

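This paper introduces a noise-tolerant structured finite difference method whose perturbation directions come from structured orthogonal matrices, applied at the cost of an FWHT/FFT. It gives better gradient and Jacobian approximations than coordinate or random Gaussian perturbations, and makes blackbox optimization and control tasks more efficient.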

Contribution

The paper presents a structured exploration approach, backed by theoretical bounds, that improves gradient estimation and control policy learning in noisy blackbox optimization.

Findings

01. Higher quality gradient and Jacobian approximations at the small cost of computing an FWHT/FFT.

02. Trajectory optimizers such as Iterative LQR and Differential Dynamic Programming need fewer iterations on classic continuous control tasks.

03. Successful transfer of learned policies from simulation to hardware.

Abstract

We propose a simple drop-in noise-tolerant replacement for the standard finite difference procedure used ubiquitously in blackbox optimization. In our approach, parameter perturbation directions are defined by a family of structured orthogonal matrices. We show that at the small cost of computing a Fast Walsh-Hadamard/Fourier Transform (FWHT/FFT), such structured finite differences consistently give higher quality approximation of gradients and Jacobians in comparison to vanilla approaches that use coordinate directions or random Gaussian perturbations. We find that trajectory optimizers like Iterative LQR and Differential Dynamic Programming require fewer iterations to solve several classic continuous control tasks when our methods are used to linearize noisy, blackbox dynamics instead of standard finite differences. By embedding structured exploration in a quasi-Newton optimizer…
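The idea of replacing coordinate or Gaussian perturbations with structured orthogonal directions can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes one particular family of structured orthogonal matrices (a normalized Hadamard matrix times a random diagonal sign matrix), uses plain forward differences, and requires the dimension to be a power of two. The names `fwht` and `structured_gradient` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fwht(a):
    """Unnormalized Fast Walsh-Hadamard Transform of a 1-D array.

    The length must be a power of two. Returns a new array equal to H @ a,
    where H is the Sylvester-type Hadamard matrix with entries +1/-1.
    """
    a = np.array(a, dtype=float)
    n = a.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            x = a[i:i + h].copy()
            y = a[i + h:i + 2 * h].copy()
            a[i:i + h] = x + y
            a[i + h:i + 2 * h] = x - y
        h *= 2
    return a


def structured_gradient(f, x, delta=1e-3, rng=None):
    """Estimate grad f(x) from forward differences along orthogonal directions.

    Assumption (illustrative only): the directions are the rows of
    M = (1 / sqrt(n)) * H * D, with H a Hadamard matrix and D a random
    diagonal matrix of +1/-1 signs, so that M is orthogonal.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]
    assert n > 0 and n & (n - 1) == 0, "dimension must be a power of two"

    signs = rng.choice([-1.0, 1.0], size=n)
    fx = f(x)
    diffs = np.empty(n)
    for i in range(n):
        e = np.zeros(n)
        e[i] = 1.0
        # Row i of M; H is symmetric, so row i of H equals H @ e_i.
        v = fwht(e) * signs / np.sqrt(n)
        diffs[i] = (f(x + delta * v) - fx) / delta

    # Combine the directional differences: M^T @ diffs = D @ H @ diffs / sqrt(n),
    # computed with a single FWHT instead of a dense matrix product.
    return signs * fwht(diffs) / np.sqrt(n)


if __name__ == "__main__":
    # Hypothetical blackbox: a quadratic with small additive evaluation noise.
    noise_rng = np.random.default_rng(1)
    target = np.arange(8, dtype=float)

    def noisy_f(z):
        return float(np.sum((z - target) ** 2) + 1e-6 * noise_rng.normal())

    x0 = np.zeros(8)
    print("estimate:", structured_gradient(noisy_f, x0, delta=1e-2))
    print("exact:   ", 2.0 * (x0 - target))
```

Because the directions are orthonormal, the estimate is exact for a linear function, and the final recombination step needs only one transform. Handling larger noise levels, non-power-of-two dimensions, or fewer function evaluations than dimensions would need additions that this sketch does not attempt.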
