Automated Gadget Discovery in Science
Lea M. Trenkwalder, Andrea López Incera, Hendrik Poulsen Nautrup, Fulvio Flamini, Hans J. Briegel

TL;DR
This paper introduces a method to interpret reinforcement learning agents in scientific applications by extracting and clustering frequent subroutines, called gadgets, to understand their learned behaviors.
Contribution
It presents a novel post-hoc analysis technique using sequence mining and clustering to identify meaningful subroutines in RL policies across different scientific domains.
Findings
Gadgets correspond to real experimental setups in quantum optics.
The method successfully identifies quantum information processing subroutines.
Applicable to various agent architectures and environments.
Abstract
In recent years, reinforcement learning (RL) has become increasingly successful in its application to science and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent's learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery develops in three stages: first, we use an RL agent to generate data; then, we employ a mining algorithm to extract gadgets; and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two…
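The three stages named in the abstract (data generation, gadget mining, density-based clustering) can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the action labels are hypothetical, the miner counts contiguous subsequences as a simple stand-in for the paper's mining algorithm, and the clustering is a minimal DBSCAN-style routine using Jaccard distance on the sets of actions in each gadget.

```python
from collections import Counter

def mine_gadgets(sequences, min_len=2, max_len=3, min_support=0.5):
    """Return contiguous subsequences found in at least min_support of the sequences.

    Assumption: contiguous n-grams stand in for the paper's mining algorithm.
    """
    counts = Counter()
    for seq in sequences:
        seen = set()
        for n in range(min_len, max_len + 1):
            for i in range(len(seq) - n + 1):
                seen.add(tuple(seq[i:i + n]))
        counts.update(seen)  # count each pattern at most once per sequence
    threshold = min_support * len(sequences)
    return sorted(p for p, c in counts.items() if c >= threshold)

def jaccard_distance(a, b):
    """Distance between two gadgets based on the sets of actions they use."""
    sa, sb = set(a), set(b)
    return 1.0 - len(sa & sb) / len(sa | sb)

def dbscan(items, distance, eps, min_pts):
    """Minimal DBSCAN-style clustering; returns one label per item, -1 means noise."""
    labels = [None] * len(items)

    def neighbors(i):
        return [j for j in range(len(items)) if distance(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # only core points expand the cluster
                queue.extend(j_nbrs)
    return labels

# Stage 1 (assumed): action sequences that a trained agent might have produced.
# The labels "BS", "DP", "H", "M", "X", "Y" are hypothetical placeholders.
sequences = [
    ["BS", "DP", "H", "M"],
    ["H", "BS", "DP", "M"],
    ["BS", "DP", "M", "H"],
    ["X", "Y", "BS", "DP"],
]

# Stage 2: extract frequent subsequences as candidate gadgets.
gadgets = mine_gadgets(sequences, min_support=0.5)

# Stage 3: group the gadgets with the density-based routine.
labels = dbscan(gadgets, jaccard_distance, eps=0.4, min_pts=2)
for gadget, label in zip(gadgets, labels):
    print(label, gadget)
```

The support threshold, the distance function, and the `eps` and `min_pts` values are all illustrative choices; in practice they would be tuned to the environment and to the metrics by which gadgets are grouped.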
Taxonomy
Topics: Quantum Computing Algorithms and Architecture · Neural Networks and Reservoir Computing · Complex Systems and Time Series Analysis
