TL;DR
This paper empirically evaluates an active inference algorithm for multi-armed bandit problems, comparing its performance to traditional algorithms in stationary and switching environments, highlighting its strengths in dynamic settings.
Contribution
It introduces an efficient approximate active inference algorithm and demonstrates its superior performance in non-stationary bandit problems compared to existing methods.
Findings
Active inference performs poorly in stationary bandits.
It significantly outperforms the compared bandit algorithms in switching bandit scenarios.
Results support active inference as a promising approach for dynamic decision problems.
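To make the stationary versus switching distinction concrete, here is a minimal sketch of a Bernoulli bandit environment. It is not the paper's code. The class name, the `switch_prob` parameter, and the rule that all arm probabilities are redrawn uniformly at a switch are illustrative assumptions, and the random policy is only a placeholder for an actual bandit algorithm.

```python
import random


class SwitchingBernoulliBandit:
    """Bernoulli bandit whose arm reward probabilities may be redrawn over time.

    switch_prob = 0.0 gives a stationary bandit; switch_prob > 0 gives a
    switching bandit. The uniform redraw rule is an illustrative assumption,
    not the paper's task definition.
    """

    def __init__(self, n_arms, switch_prob=0.0, seed=0):
        self.rng = random.Random(seed)
        self.switch_prob = switch_prob
        self.probs = [self.rng.random() for _ in range(n_arms)]

    def pull(self, arm):
        # With probability switch_prob, redraw every arm's reward probability.
        if self.rng.random() < self.switch_prob:
            self.probs = [self.rng.random() for _ in self.probs]
        # Return a binary reward drawn from the chosen arm.
        return 1 if self.rng.random() < self.probs[arm] else 0


def run_random_policy(bandit, n_steps, seed=1):
    """Pull uniformly random arms and return the total reward collected."""
    rng = random.Random(seed)
    n_arms = len(bandit.probs)
    return sum(bandit.pull(rng.randrange(n_arms)) for _ in range(n_steps))
```

In the stationary case an agent can eventually stop exploring. With `switch_prob > 0`, the best arm can change at any time, so an agent has to keep tracking the arms.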
Abstract
A key feature of sequential decision making under uncertainty is the need to balance exploiting (choosing the best action according to current knowledge) against exploring (obtaining information about the values of other actions). The multi-armed bandit problem, a classical task that captures this trade-off, has served as a vehicle in machine learning for developing bandit algorithms that have proved useful in numerous industrial applications. The active inference framework, an approach to sequential decision making recently developed in neuroscience for understanding human and animal behaviour, is distinguished by its sophisticated strategy for resolving the exploration-exploitation trade-off. This makes active inference an exciting alternative to already established bandit algorithms. Here we derive an efficient and scalable approximate active inference algorithm and compare it to two…
