Pessimistic Backward Policy for GFlowNets

Hyosoon Jang, Yunhui Jang, Minsu Kim, Jinkyoo Park, and Sungsoo Ahn

arXiv:2405.16012·cs.LG·October 30, 2024

TL;DR

This paper introduces PBP-GFN, a pessimistic backward policy for GFlowNets. It targets the under-exploitation of high-reward objects that arises when training on too few trajectories, and reports improved performance across eight benchmarks.

Contribution

The paper proposes a pessimistic backward policy for GFlowNets that maximizes the observed flow so that it aligns closely with the true reward. This improves the discovery of high-reward objects while maintaining diversity, and outperforms existing methods.

Findings

1. PBP-GFN outperforms existing methods on multiple benchmarks.
2. It improves the discovery rate of high-reward objects.
3. It maintains diversity in generated objects.

Abstract

This paper studies Generative Flow Networks (GFlowNets), which learn to sample objects proportionally to a given reward function through the trajectory of state transitions. In this work, we observe that GFlowNets tend to under-exploit the high-reward objects due to training on an insufficient number of trajectories, which may lead to a large gap between the estimated flow and the (known) reward value. In response to this challenge, we propose a pessimistic backward policy for GFlowNets (PBP-GFN), which maximizes the observed flow to align closely with the true reward for the object. We extensively evaluate PBP-GFN across eight benchmarks, including a hyper-grid environment, bag generation, structured set generation, molecular generation, and four RNA sequence generation tasks. In particular, PBP-GFN enhances the discovery of high-reward objects, maintains the diversity of the objects, and…
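The gap between observed flow and reward described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a single object reachable by four trajectories, of which only one has been observed, and it treats the backward policy as a softmax over whole trajectories. The names `observed_flow` and `fit_pessimistic_backward` are hypothetical, and the objective used here (gradient ascent on the log of the total backward probability assigned to observed trajectories) is an assumption chosen to mirror "maximizes the observed flow".

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


def observed_flow(reward, traj_probs, observed):
    """Share of the reward that the backward policy routes through observed trajectories."""
    return reward * sum(traj_probs[i] for i in observed)


def fit_pessimistic_backward(num_trajs, observed, steps=200, lr=0.5):
    """Toy 'pessimistic' fit: push backward probability mass onto observed trajectories.

    Assumption: the policy is a softmax over trajectory logits, trained by
    gradient ascent on log(sum of probabilities of observed trajectories).
    """
    logits = [0.0] * num_trajs  # start from a uniform backward policy
    for _ in range(steps):
        probs = softmax(logits)
        mass = sum(probs[i] for i in observed)
        for j in range(num_trajs):
            # d/d logit_j of log(mass) = p_j * [j observed] / mass - p_j
            grad = (probs[j] / mass if j in observed else 0.0) - probs[j]
            logits[j] += lr * grad
    return softmax(logits)


reward = 8.0          # known reward of the object (illustrative value)
num_trajs = 4         # trajectories that terminate at the object (illustrative)
observed = {0}        # only one trajectory has been sampled so far

uniform = [1.0 / num_trajs] * num_trajs
flow_uniform = observed_flow(reward, uniform, observed)

pessimistic = fit_pessimistic_backward(num_trajs, observed)
flow_pessimistic = observed_flow(reward, pessimistic, observed)

print(flow_uniform, flow_pessimistic)
```

Under the uniform backward policy, the observed trajectory carries only a quarter of the reward (2.0 out of 8.0), so the flow seen during training underestimates the object's value. After the toy pessimistic fit, almost all backward probability sits on the observed trajectory and the observed flow approaches the true reward.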

Code & Models

Repositories

hsjang0/Pessimistic-Backward-Policy-for-GFlowNets (PyTorch, official)

Videos

Pessimistic Backward Policy for GFlowNets · SlidesLive

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Caching and Content Delivery · Opportunistic and Delay-Tolerant Networks

Methods: ALIGN · Sparse Evolutionary Training