Contextual Exploration Using a Linear Approximation Method Based on Satisficing
Akane Minami, Yu Kono, and Tatsuji Takahashi

TL;DR
This paper introduces LinRS, a linear approximation method based on satisficing, which reduces exploration and runtime in contextual bandit problems and may benefit deep reinforcement learning in complex environments.
Contribution
The paper proposes LinRS, a linear extension of risk-sensitive satisficing, to improve exploration efficiency in reinforcement learning by approximating action values and selection proportions.
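To make the two quantities named above concrete, here is a minimal sketch of a tabular RS-style value in Python. This summary does not state the formula, so the form used here, (n_i / N) * (E_i - aleph), is an assumption based on how RS is commonly written in earlier satisficing work. The names rs_values, select_action, and aleph (the aspiration level) are hypothetical. LinRS itself is described as approximating the action values and selection proportions from context, and that step is not shown.

```python
def rs_values(counts, means, aleph):
    """Assumed tabular RS value per action: (n_i / N) * (E_i - aleph).

    counts: number of times each action has been selected (n_i)
    means:  current estimated value of each action (E_i)
    aleph:  aspiration level the agent is trying to reach
    """
    total = sum(counts)
    k = len(counts)
    values = []
    for n_i, e_i in zip(counts, means):
        # Selection proportion n_i / N; uniform if nothing has been tried yet.
        proportion = n_i / total if total > 0 else 1.0 / k
        values.append(proportion * (e_i - aleph))
    return values


def select_action(counts, means, aleph):
    """Greedy choice over the RS values."""
    values = rs_values(counts, means, aleph)
    return max(range(len(values)), key=values.__getitem__)


# No action meets the aspiration level: the less-tried action is preferred.
unsatisfied = select_action([8, 2], [0.4, 0.5], aleph=0.6)

# Both actions exceed the aspiration level: the well-tried action is kept.
satisfied = select_action([8, 2], [0.4, 0.5], aleph=0.3)
```

Under this assumed form, the sign of E_i - aleph decides whether the selection proportion pushes the agent toward less-tried actions (nothing is good enough yet) or toward the action it already relies on (the aspiration is met), which is one way to read the reduced-exploration claim.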
Findings
LinRS reduces exploration in contextual bandit problems.
LinRS decreases runtime compared to existing algorithms.
Satisficing-based methods may enhance deep reinforcement learning in complex environments.
Abstract
Deep reinforcement learning has enabled human-level or even super-human performance in various types of games. However, the amount of exploration required for learning is often quite large. Deep reinforcement learning also has super-human performance in that no human being would be able to achieve such amounts of exploration. To address this problem, we focus on the satisficing policy, which is a qualitatively different approach from that of existing optimization algorithms. Thus, we propose Linear RS (LinRS), which is a type of satisficing algorithm and a linear extension of risk-sensitive satisficing (RS), for application to a wider range of tasks. The generalization of RS provides an algorithm to reduce the volume of exploratory actions by adopting a different approach from existing optimization algorithms. LinRS utilizes linear regression and multiclass classification to…
Taxonomy
Topics: Decision-Making and Behavioral Economics
Methods: Linear Regression
