ZORMS-LfD: Learning from Demonstrations with Zeroth-Order Random Matrix Search
Olivia Dry, Timothy L. Molloy, Wanxin Jin, Iman Shames

TL;DR
ZORMS-LfD is a zeroth-order optimization method for learning costs, constraints, and dynamics from demonstrations in both continuous and discrete time. It needs no gradient computations and reduces compute time relative to first-order methods.
Contribution
The paper presents ZORMS-LfD, a zeroth-order method that learns from demonstrations without requiring gradients or a smooth learning-loss landscape. It applies to constrained optimal control problems in both continuous and discrete time.
Findings
Matches or surpasses state-of-the-art methods in both learning loss and compute time across benchmark problems.
Achieves over 80% reduction in compute time on unconstrained problems.
Outperforms Nelder-Mead on constrained continuous-time benchmarks.
Abstract
We propose Zeroth-Order Random Matrix Search for Learning from Demonstrations (ZORMS-LfD). ZORMS-LfD enables the costs, constraints, and dynamics of constrained optimal control problems, in both continuous and discrete time, to be learned from expert demonstrations without requiring smoothness of the learning-loss landscape. In contrast, existing state-of-the-art first-order methods require the existence and computation of gradients of the costs, constraints, dynamics, and learning loss with respect to states, controls and/or parameters. Most existing methods are also tailored to discrete time, with constrained problems in continuous time receiving only cursory attention. We demonstrate that ZORMS-LfD matches or surpasses the performance of state-of-the-art methods in terms of both learning loss and compute time across a variety of benchmark problems. On unconstrained continuous-time…
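To make the zeroth-order idea concrete, here is a minimal sketch of a generic two-point random search that uses only loss evaluations and never a gradient. It is not the authors' ZORMS-LfD update, which searches over random matrices; the function `zeroth_order_search`, its parameters (`step`, `smoothing`, `iters`, `seed`), and the non-smooth `toy_loss` are all hypothetical names chosen for illustration.

```python
import random


def zeroth_order_search(loss, theta0, step=0.05, smoothing=1e-2, iters=500, seed=0):
    """Generic two-point random search (illustrative; not the paper's algorithm)."""
    rng = random.Random(seed)
    theta = list(theta0)
    for _ in range(iters):
        # Sample a random Gaussian search direction.
        u = [rng.gauss(0.0, 1.0) for _ in theta]
        # Estimate the slope of the loss along u from two loss evaluations.
        perturbed = [t + smoothing * d for t, d in zip(theta, u)]
        slope = (loss(perturbed) - loss(theta)) / smoothing
        # Move against the estimated slope along the sampled direction.
        theta = [t - step * slope * d for t, d in zip(theta, u)]
    return theta


def toy_loss(theta):
    # Hypothetical non-smooth stand-in for a learning loss, minimized at (1, -2).
    return abs(theta[0] - 1.0) + abs(theta[1] + 2.0)


theta_hat = zeroth_order_search(toy_loss, [0.0, 0.0])
print(theta_hat, toy_loss(theta_hat))
```

In a learning-from-demonstrations setting, `loss` would presumably solve the parameterized optimal control problem and compare the resulting trajectory with the expert demonstration, so each iteration costs two such solves. With a constant step size the iterate hovers near the minimizer rather than converging exactly.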
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Algorithms and Data Compression
