Scientific Discovery and the Cost of Measurement -- Balancing Information and Cost in Reinforcement Learning
Colin Bellinger, Andriy Drozdyuk, Mark Crowley, Isaac Tamblyn

TL;DR
This paper introduces a framework for reinforcement learning that explicitly accounts for measurement costs, enabling agents to learn policies that balance information gathering with cost reduction in scientific applications.
Contribution
It proposes a novel approach integrating measurement costs into RL, allowing off-the-shelf algorithms to learn cost-effective policies for scientific tasks.
Findings
Dueling DQN and PPO agents learn optimal action policies while making up to 50% fewer state measurements.
Recurrent neural networks achieve over 50% reduction in measurements.
The framework facilitates practical RL application in costly scientific environments.
Abstract
The use of reinforcement learning (RL) in scientific applications, such as materials design and automated chemistry, is increasing. A major challenge, however, lies in the fact that measuring the state of the system is often costly and time-consuming in scientific applications, whereas policy learning with RL requires a measurement after each time step. In this work, we make the measurement costs explicit in the form of a costed reward and propose a framework that enables off-the-shelf deep RL algorithms to learn a policy for both selecting actions and determining whether or not to measure the current state of the system at each time step. In this way, the agents learn to balance the need for information with the cost of information. Our results show that when trained under this regime, the Dueling DQN and PPO agents can learn optimal action policies whilst making up to 50% fewer state…
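The idea of a costed reward paired with a per-step measure-or-not decision can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `ChainEnv` is a toy task, `MeasurementCostWrapper` and `measurement_cost` are illustrative names, and returning the last measured observation when the agent skips a measurement is an assumption about how stale state might be handled.

```python
from typing import Tuple


class ChainEnv:
    """Hypothetical toy task: walk right along a chain until the goal is reached."""

    def __init__(self, length: int = 5):
        self.length = length
        self.pos = 0

    def reset(self) -> int:
        self.pos = 0
        return self.pos

    def step(self, action: int) -> Tuple[int, float, bool]:
        # Action 1 moves right; any other action moves left (clipped at 0).
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        done = self.pos >= self.length
        reward = 1.0 if done else 0.0
        return self.pos, reward, done


class MeasurementCostWrapper:
    """Illustrative wrapper: the agent picks (action, measure) at every step.

    Assumptions (not from the source): measuring subtracts a fixed cost from
    the task reward, and skipping a measurement returns the last measured
    observation instead of the true current state.
    """

    def __init__(self, env: ChainEnv, measurement_cost: float = 0.1):
        self.env = env
        self.measurement_cost = measurement_cost
        self.last_obs = None
        self.num_measurements = 0

    def reset(self) -> int:
        # Assumption: the initial state is observed at no cost.
        self.last_obs = self.env.reset()
        self.num_measurements = 0
        return self.last_obs

    def step(self, action_pair: Tuple[int, bool]) -> Tuple[int, float, bool]:
        action, measure = action_pair
        obs, reward, done = self.env.step(action)
        if measure:
            # Pay for fresh information about the state.
            self.last_obs = obs
            reward -= self.measurement_cost
            self.num_measurements += 1
        # Without a measurement the agent only sees the stale observation.
        return self.last_obs, reward, done


env = MeasurementCostWrapper(ChainEnv(length=3), measurement_cost=0.1)
env.reset()
print(env.step((1, False)))  # moves right without measuring
print(env.step((1, True)))   # moves right and pays to measure
```

Because the measurement choice is folded into the action, a standard agent could in principle be trained on the wrapped environment unchanged, with the cost term discouraging measurements that do not improve the return.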
Taxonomy
Topics: Software Engineering Research · Machine Learning in Materials Science · Fuel Cells and Related Materials
Methods: Q-Learning · Convolution · Entropy Regularization · Proximal Policy Optimization · Dense Connections · Deep Q-Network
