The Laplacian Keyboard: Beyond the Linear Span
Siddarth Chandrasekar, Marlos C. Machado

TL;DR
The paper introduces the Laplacian Keyboard (LK), a hierarchical framework that extends policies built from Laplacian eigenvectors in reinforcement learning beyond the linear span of those eigenvectors, improving zero-shot control and policy learning.
Contribution
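It builds a task-agnostic behavior library from Laplacian eigenvectors that is guaranteed to contain the optimal policy for any reward within their linear span, together with a meta-policy that stitches these behaviors to learn policies beyond that span.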
Findings
LK improves zero-shot approximation accuracy.
LK achieves better sample efficiency than standard RL methods.
Theoretical bounds on approximation error are established.
Abstract
Across scientific disciplines, Laplacian eigenvectors serve as a fundamental basis for simplifying complex systems, from signal processing to quantum mechanics. In reinforcement learning (RL), they similarly form a basis over the state space, enabling reward functions to be approximated by projection onto a small set of eigenvectors. This projection makes zero-shot control possible, but it also imposes a fundamental limitation: the induced policies are only as expressive as the linear span of the chosen eigenvectors. We introduce the Laplacian Keyboard (LK), a hierarchical framework that goes beyond this linear span. LK constructs a task-agnostic library of behaviors from these eigenvectors, forming a behavior basis guaranteed to contain the optimal policy for any reward within the linear span. A meta-policy learns to stitch these behaviors dynamically, enabling efficient learning of…
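The projection step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's setup: the environment is a hypothetical 10-state chain, the Laplacian is the combinatorial graph Laplacian L = D - A of that chain, and the helper names (path_graph_laplacian, project_reward) are invented. The sketch projects a goal-reaching reward onto the first k eigenvectors and records how the approximation error shrinks as k grows.

```python
import numpy as np


def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of an n-state chain (toy environment, an assumption)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0  # edge from state i to state i+1
    A[idx + 1, idx] = 1.0  # and back, so the graph is undirected
    D = np.diag(A.sum(axis=1))
    return D - A


def project_reward(reward, eigvecs, k):
    """Orthogonal projection of a reward vector onto the first k eigenvectors."""
    basis = eigvecs[:, :k]       # columns are orthonormal eigenvectors
    weights = basis.T @ reward   # coefficients of the reward in that basis
    return basis @ weights, weights


n = 10
L = path_graph_laplacian(n)
# eigh returns eigenvalues in ascending order with orthonormal eigenvectors,
# so the first columns are the smoothest functions over the state space.
eigvals, eigvecs = np.linalg.eigh(L)

reward = np.zeros(n)
reward[-1] = 1.0  # hypothetical task: reward only in the last state

errors = []
for k in range(1, n + 1):
    approx, _ = project_reward(reward, eigvecs, k)
    errors.append(float(np.linalg.norm(reward - approx)))

for k, err in enumerate(errors, start=1):
    print(f"k={k:2d}  approximation error={err:.4f}")
```

With few eigenvectors the sparse reward is only coarsely approximated, which is the linear-span limitation the abstract refers to; with all n eigenvectors the basis spans the whole state space and the reward is recovered exactly.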
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural Networks and Reservoir Computing · Domain Adaptation and Few-Shot Learning
