Changing Model Behavior at Test-Time Using Reinforcement Learning
Augustus Odena, Dieterich Lawson, Christopher Olah

TL;DR
This paper introduces a method that uses reinforcement learning to adjust a model's test-time resource usage on a per-input basis, demonstrated on a small MNIST-based example.
Contribution
It presents a mixture-of-experts model that dynamically adjusts test-time resource usage through reinforcement learning, a novel approach for resource-constrained inference.
Findings
Effective test-time resource adaptation demonstrated on MNIST
Reinforcement learning successfully modulates model behavior based on input
Potential for real-time, resource-aware model deployment
Abstract
Machine learning models are often used at test-time subject to constraints and trade-offs not present at training-time. For example, a computer vision model operating on an embedded device may need to perform real-time inference, or a translation model operating on a cell phone may wish to bound its average compute time in order to be power-efficient. In this work we describe a mixture-of-experts model and show how to change its test-time resource-usage on a per-input basis using reinforcement learning. We test our method on a small MNIST-based example.
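The abstract does not give the training rule, so the following is only a minimal sketch of one way per-input resource control could be set up, not the authors' implementation. It assumes a gate that makes an independent on/off (Bernoulli) decision for each expert, a reward equal to the negative task loss minus a penalty per expert executed, and a plain REINFORCE gradient estimate for the gate. The function names, the `cost_weight` parameter, and the baseline are all hypothetical and chosen for illustration.

```python
import numpy as np


def sample_expert_mask(gate_logits, rng):
    """Sample a 0/1 decision per expert from a Bernoulli policy (assumed form).

    gate_logits has shape (batch, n_experts); a 1 in the mask means "run this expert".
    """
    probs = 1.0 / (1.0 + np.exp(-gate_logits))
    mask = (rng.random(probs.shape) < probs).astype(np.float64)
    return mask, probs


def compute_reward(task_loss, mask, cost_weight):
    """Hypothetical reward: low task loss is good, each executed expert costs cost_weight."""
    return -task_loss - cost_weight * mask.sum(axis=-1)


def reinforce_gate_grad(mask, probs, rewards, baseline):
    """REINFORCE estimate of the gradient of expected reward w.r.t. the gate logits.

    For a Bernoulli policy with sigmoid probabilities,
    d log p(mask) / d logit = mask - probs.
    """
    advantage = (rewards - baseline)[:, None]
    return advantage * (mask - probs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gate_logits = np.zeros((4, 3))            # 4 inputs, 3 experts (toy sizes)
    mask, probs = sample_expert_mask(gate_logits, rng)
    task_loss = rng.random(4)                 # stand-in for a per-input loss
    rewards = compute_reward(task_loss, mask, cost_weight=0.1)
    grad = reinforce_gate_grad(mask, probs, rewards, baseline=rewards.mean())
    gate_logits += 0.5 * grad                 # gradient ascent on expected reward
    print(mask.sum(axis=-1))                  # experts executed per input
```

In this sketch, changing `cost_weight` is what shifts the trade-off: a larger value pushes the gate toward running fewer experts per input, while a value of zero removes the resource pressure entirely.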
Taxonomy
Topics: Machine Learning and Data Classification · Data Stream Mining Techniques · Machine Learning and Algorithms
