Thompson sampling for improved exploration in GFlowNets

Jarrid Rector-Brooks; Kanika Madan; Moksh Jain; Maksym Korablyov; Cheng-Hao Liu; Sarath Chandar; Nikolay Malkin; Yoshua Bengio

arXiv:2306.17693 · cs.LG · July 3, 2023 · 1 citation

Open Access

TL;DR

This paper introduces TS-GFN, a Thompson sampling approach for GFlowNets that enhances exploration efficiency and accelerates convergence by actively selecting training trajectories using Bayesian methods.

Contribution

The paper proposes Thompson sampling GFlowNets (TS-GFN), an algorithm that treats the choice of training trajectories as an active learning problem, with the aim of improving exploration and speeding up convergence.

Findings

01. TS-GFN explores more effectively than previous off-policy exploration strategies.

02. TS-GFN converges to the target distribution faster in two domains.

03. Actively selecting training trajectories improves sampling efficiency.

Abstract

Generative flow networks (GFlowNets) are amortized variational inference algorithms that treat sampling from a distribution over compositional objects as a sequential decision-making problem with a learnable action policy. Unlike other algorithms for hierarchical sampling that optimize a variational bound, GFlowNet algorithms can stably run off-policy, which can be advantageous for discovering modes of the target distribution. Despite this flexibility in the choice of behaviour policy, the optimal way of efficiently selecting trajectories for training has not yet been systematically explored. In this paper, we view the choice of trajectories for training as an active learning problem and approach it using Bayesian techniques inspired by methods for multi-armed bandits. The proposed algorithm, Thompson sampling GFlowNets (TS-GFN), maintains an approximate posterior distribution over…
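The abstract describes TS-GFN as inspired by Bayesian methods for multi-armed bandits. As background only, here is a minimal sketch of classical Thompson sampling on a Bernoulli bandit with Beta posteriors. It is not the TS-GFN algorithm itself, and the function name `thompson_sampling`, its parameters, and the uniform Beta(1, 1) prior are assumptions chosen for illustration.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Classical Beta-Bernoulli Thompson sampling (illustrative, not TS-GFN).

    true_probs: hidden success probability of each arm (hypothetical setup).
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior on each arm's success probability (an assumption).
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one plausible success probability per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled values.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Observe a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate posterior update for the chosen arm only.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1

    return pulls
```

Calling `thompson_sampling([0.1, 0.5, 0.9], 2000)` returns per-arm pull counts. Because uncertain arms occasionally produce high posterior samples, they still get tried early on, while pulls concentrate on the best arm as the posteriors sharpen. This is the exploration behaviour that the abstract's bandit analogy refers to.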

Taxonomy

Topics: Machine Learning and Algorithms · Reinforcement Learning in Robotics · Machine Learning and Data Classification

Methods: Variational Inference