Monte Carlo Policy Gradient Method for Binary Optimization
Cheng Chen, Ruitao Chen, Tianyou Li, Ruichen Ao, Zaiwen Wen

TL;DR
This paper introduces a Monte Carlo policy gradient approach for binary optimization problems, leveraging probabilistic models, MCMC sampling, and local search to efficiently find near-optimal solutions.
Contribution
It develops a novel stochastic optimization framework combining policy gradients, MCMC sampling, and local search for binary combinatorial optimization.
Findings
The method is effective on MaxCut, MIMO detection, and MaxSAT.
It comes with theoretical guarantees of convergence to stationary points.
It produces near-optimal solutions in numerical experiments.
Abstract
Binary optimization has a wide range of applications in combinatorial optimization problems such as MaxCut, MIMO detection, and MaxSAT. However, these problems are typically NP-hard due to the binary constraints. We develop a novel probabilistic model to sample the binary solution according to a parameterized policy distribution. Specifically, minimizing the KL divergence between the parameterized policy distribution and the Gibbs distribution of the function value leads to a stochastic optimization problem whose policy gradient can be derived explicitly, as in reinforcement learning. For coherent exploration in discrete spaces, parallel Markov Chain Monte Carlo (MCMC) methods are employed to draw diverse samples from the policy distribution and approximate the gradient efficiently. We further develop a filter scheme to replace the original objective function by the one with the…
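To make the idea in the abstract concrete, here is a minimal sketch of a policy gradient loop for MaxCut. It is not the authors' implementation, and it makes several assumptions the abstract does not confirm: the policy is a mean-field Bernoulli distribution with one logit per variable, samples are drawn directly from that policy rather than by parallel MCMC, the filter and local search steps are omitted, and the names `policy_gradient_maxcut`, `lam`, `lr`, and `n_samples` are hypothetical. Taking the Gibbs distribution proportional to exp(-f(x)/lam), minimizing the KL divergence from the policy to it is equivalent, up to a constant, to minimizing the expectation of f(x) + lam * log p(x) under the policy, which is what the score-function estimator below targets.

```python
import numpy as np


def cut_value(W, x):
    """Cut weight of x in {-1,+1}^n for a symmetric weight matrix W."""
    return 0.25 * np.sum(W * (1.0 - np.outer(x, x)))


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def policy_gradient_maxcut(W, n_iters=300, n_samples=64, lr=0.5, lam=0.1, seed=0):
    """Sketch only: mean-field policy with direct sampling (an assumption;
    the paper samples with parallel MCMC and adds a filter / local search)."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    theta = np.zeros(n)              # logits; P(x_i = +1) = sigmoid(theta_i)
    best_x, best_cut = None, -np.inf
    total_weight = W.sum()

    for _ in range(n_iters):
        mu = sigmoid(theta)
        bits = (rng.random((n_samples, n)) < mu).astype(float)  # 1 means x_i = +1
        X = 2.0 * bits - 1.0

        # Objective to minimize: f(x) = -cut(x), evaluated for the whole batch.
        cuts = 0.25 * (total_weight - np.einsum("ki,ij,kj->k", X, W, X))
        f = -cuts

        # log p_theta(x) for independent Bernoulli coordinates.
        log_p = np.sum(bits * np.log(mu) + (1.0 - bits) * np.log(1.0 - mu), axis=1)

        # Score-function (REINFORCE-style) estimate of the gradient of
        # E[f(x) + lam * log p(x)], with a mean baseline to reduce variance.
        signal = f + lam * log_p
        signal = signal - signal.mean()
        score = bits - mu            # d log p(x) / d theta_i
        grad = (signal[:, None] * score).mean(axis=0)

        # Gradient step; clipping keeps probabilities away from 0 and 1.
        theta = np.clip(theta - lr * grad, -10.0, 10.0)

        k = int(np.argmax(cuts))
        if cuts[k] > best_cut:
            best_cut, best_x = float(cuts[k]), X[k].copy()

    return best_x, best_cut


# Usage on a 4-cycle with unit edge weights.
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
best_x, best_cut = policy_gradient_maxcut(W)
print(best_x, best_cut)
```

The entropy weight `lam` plays the role of the Gibbs temperature: larger values keep the policy closer to uniform and favor exploration, while smaller values concentrate it on low-objective samples. Replacing the direct sampler with MCMC chains would be the natural next step toward the scheme the abstract describes.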
Taxonomy
Topics: Age of Information Optimization · Optimization and Search Problems · Distributed Sensor Networks and Detection Algorithms
