Task-Optimal Exploration in Linear Dynamical Systems
Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

TL;DR
This paper develops a task-guided exploration method for linear dynamical systems that achieves the instance- and task-optimal sample complexity and improves on exploration strategies that ignore the task, in reinforcement learning and control settings.
Contribution
It introduces a computationally efficient experiment-design based exploration algorithm that is instance- and task-optimal for linear dynamical systems, including the LQR problem.
Findings
Proposes a task-guided exploration algorithm with finite-time optimal bounds.
Shows task-aware exploration improves over non-task-aware schemes.
Establishes certainty equivalence as instance- and task-optimal in this setting.
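As a rough illustration of what certainty equivalence means for the LQR problem, the sketch below estimates the dynamics by least squares and then designs a controller as if the estimate were exact. It is a minimal sketch under stated assumptions, not the paper's algorithm: the inputs are isotropic Gaussian noise (a task-agnostic baseline, not the experiment-design exploration the paper proposes), and the example system, the cost matrices, and the helper names `simulate`, `least_squares_estimate`, and `lqr_gain` are all hypothetical.

```python
import numpy as np


def simulate(A, B, T, rng, noise_std=0.1):
    """Roll out x_{t+1} = A x_t + B u_t + w_t with i.i.d. Gaussian inputs.

    Assumption: isotropic Gaussian inputs stand in for the exploration
    policy; the paper's task-guided design is NOT implemented here.
    """
    n, m = B.shape
    X = np.zeros((T + 1, n))
    U = rng.normal(size=(T, m))
    for t in range(T):
        w = noise_std * rng.normal(size=n)
        X[t + 1] = A @ X[t] + B @ U[t] + w
    return X, U


def least_squares_estimate(X, U):
    """Regress x_{t+1} on (x_t, u_t) to estimate A and B."""
    n = X.shape[1]
    Z = np.hstack([X[:-1], U])  # regressors, shape (T, n + m)
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)  # shape (n + m, n)
    return Theta[:n].T, Theta[n:].T


def lqr_gain(A, B, Q, R, iters=500):
    """Approximate the infinite-horizon LQR gain K (u = -K x) by Riccati recursion."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


# Hypothetical two-state, one-input system used only for illustration.
A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

rng = np.random.default_rng(0)
X, U = simulate(A_true, B_true, T=2000, rng=rng)
A_hat, B_hat = least_squares_estimate(X, U)

# Certainty equivalence: plug the estimates into the LQR design as if exact.
K_hat = lqr_gain(A_hat, B_hat, Q, R)
K_true = lqr_gain(A_true, B_true, Q, R)

# Spectral radius of the true system under the estimated controller.
closed_loop_radius = max(abs(np.linalg.eigvals(A_true - B_true @ K_hat)))
```

The gap between `K_hat` and `K_true` depends on which directions of the parameter error matter for the LQR cost; a task-aware exploration scheme would choose inputs to shrink the error in those directions rather than uniformly.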
Abstract
Exploration in unknown environments is a fundamental problem in reinforcement learning and control. In this work, we study task-guided exploration and determine what precisely an agent must learn about their environment in order to complete a particular task. Formally, we study a broad class of decision-making problems in the setting of linear dynamical systems, a class that includes the linear quadratic regulator problem. We provide instance- and task-dependent lower bounds which explicitly quantify the difficulty of completing a task of interest. Motivated by our lower bound, we propose a computationally efficient experiment-design based exploration algorithm. We show that it optimally explores the environment, collecting precisely the information needed to complete the task, and provide finite-time bounds guaranteeing that it achieves the instance- and task-optimal sample complexity,…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Advanced Control Systems Optimization
