Tsetlin Machine for Solving Contextual Bandit Problems
Raihan Seraj, Jivitesh Sharma, Ole-Christoffer Granmo

TL;DR
This paper presents an interpretable Tsetlin Machine-based algorithm for contextual bandit problems. As a base learner it outperforms other popular base learners on eight of nine datasets, and its arm-selection decisions can be read directly as propositional logic expressions over the context.
Contribution
The paper introduces a Tsetlin Machine approach to contextual bandits, adds a mechanism for Thompson sampling with the Tsetlin Machine, and emphasizes interpretability and computational simplicity.
Findings
Outperforms other popular base learners on 8 of 9 datasets
Provides interpretable propositional logic explanations
Simplifies computation with bit manipulation
Abstract
This paper introduces an interpretable contextual bandit algorithm using Tsetlin Machines, which solves complex pattern recognition tasks using propositional logic. The proposed bandit learning algorithm relies on straightforward bit manipulation, thus simplifying computation and interpretation. We then present a mechanism for performing Thompson sampling with Tsetlin Machine, given its non-parametric nature. Our empirical analysis shows that Tsetlin Machine as a base contextual bandit learner outperforms other popular base learners on eight out of nine datasets. We further analyze the interpretability of our learner, investigating how arms are selected based on propositional expressions that model the context.
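To make the idea of selecting arms from propositional expressions via bit manipulation concrete, here is a minimal sketch. It is not the authors' implementation: the clauses are hand-written rather than learned, Thompson sampling is not covered, and every name in it (`Clause`, `clause_fires`, `arm_score`, `select_arm`, `clause_to_string`, `ARMS`) is a hypothetical chosen for illustration. It assumes a binary context packed into an integer (bit i holds feature x_i) and one set of positive and negative conjunctive clauses per arm, with the arm score taken as votes for minus votes against.

```python
from typing import List, Tuple

# Hypothetical representation: a clause is (include_mask, negate_mask).
# Bits set in include_mask require that feature to be 1;
# bits set in negate_mask require that feature to be 0.
Clause = Tuple[int, int]
# Each arm holds (positive_clauses, negative_clauses).
Arm = Tuple[List[Clause], List[Clause]]


def clause_fires(context: int, clause: Clause, n_features: int) -> bool:
    """Evaluate a conjunction of literals with two mask comparisons."""
    include, negate = clause
    full = (1 << n_features) - 1
    inverted = ~context & full  # 1 where the feature is 0
    return (context & include) == include and (inverted & negate) == negate


def arm_score(context: int, arm: Arm, n_features: int) -> int:
    """Votes from positive clauses minus votes from negative clauses."""
    positive, negative = arm
    votes_for = sum(clause_fires(context, c, n_features) for c in positive)
    votes_against = sum(clause_fires(context, c, n_features) for c in negative)
    return votes_for - votes_against


def select_arm(context: int, arms: List[Arm], n_features: int) -> int:
    """Greedy choice: index of the arm with the highest clause vote sum."""
    scores = [arm_score(context, arm, n_features) for arm in arms]
    return max(range(len(scores)), key=scores.__getitem__)


def clause_to_string(clause: Clause, n_features: int) -> str:
    """Render a clause as a readable propositional expression."""
    include, negate = clause
    literals = []
    for i in range(n_features):
        if (include >> i) & 1:
            literals.append(f"x{i}")
        if (negate >> i) & 1:
            literals.append(f"NOT x{i}")
    return " AND ".join(literals) if literals else "TRUE"


# Hand-written (not learned) clauses for two arms over three features.
ARMS: List[Arm] = [
    ([(0b001, 0b010)], [(0b100, 0b000)]),                 # arm 0
    ([(0b010, 0b000), (0b100, 0b000)], [(0b001, 0b100)]),  # arm 1
]

if __name__ == "__main__":
    for context in (0b001, 0b110):
        print(f"context={context:03b} -> arm {select_arm(context, ARMS, 3)}")
    print(clause_to_string(ARMS[0][0][0], 3))
```

Because each clause is a plain conjunction of literals, the expression printed by `clause_to_string` is itself the explanation for why a clause voted for or against an arm in a given context.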
Taxonomy
Topics: Machine Learning and Algorithms · Advanced Bandit Algorithms Research · Machine Learning and Data Classification
Methods: Balanced Selection
