Planning in Branch-and-Bound: Model-Based Reinforcement Learning for Exact Combinatorial Optimization

Paul Strang; Zacharie Alès; Côme Bissuel; Olivier Juan; Safia Kedad-Sidhoum; Emmanuel Rachelson

arXiv:2511.09219·cs.LG·April 3, 2026

TL;DR

This paper introduces PlanB&B, a model-based reinforcement learning agent that learns to improve branching strategies in branch-and-bound algorithms for MILP problems, outperforming previous RL methods.

Contribution

It presents a novel MBRL approach that uses a learned internal model of B&B dynamics to enhance branching decisions in MILP solvers.

Findings

01. MBRL agent outperforms previous RL methods on four MILP benchmarks.

02. Learned internal model effectively captures B&B dynamics for better decision-making.

03. Experimental results demonstrate significant efficiency improvements in solving MILPs.

Abstract

Mixed-Integer Linear Programming (MILP) lies at the core of many real-world combinatorial optimization (CO) problems, traditionally solved by branch-and-bound (B&B). A key driver of B&B solver efficiency is the variable selection heuristic that guides branching decisions. Looking to move beyond static, hand-crafted heuristics, recent work has explored adapting traditional reinforcement learning (RL) algorithms to the B&B setting, aiming to learn branching strategies tailored to specific MILP distributions. In parallel, RL agents have achieved remarkable success in board games, a very specific type of combinatorial problem, by leveraging environment simulators to plan via Monte Carlo Tree Search (MCTS). Building on these developments, we introduce Plan-and-Branch-and-Bound (PlanB&B), a model-based reinforcement learning (MBRL) agent that leverages a learned internal model of…
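To make the role of the variable selection heuristic concrete, here is a minimal sketch of B&B on a toy 0/1 knapsack problem with a pluggable branching rule. It is not PlanB&B and not taken from the paper: the function names (`lp_relaxation`, `branch_and_bound`, `most_fractional`, `first_free`) and the knapsack instance are illustrative assumptions, and the two hand-crafted rules merely stand in for the kind of heuristic a learned policy would replace.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type alias: a branching rule maps (LP solution, free variables)
# to the index of the variable to branch on.
Selector = Callable[[List[float], List[int]], int]


def lp_relaxation(values: List[int], weights: List[int], capacity: int,
                  fixed: Dict[int, int]) -> Optional[Tuple[float, List[float]]]:
    """Solve the knapsack LP relaxation with some variables fixed to 0 or 1.

    Assumes positive weights. Returns (upper bound, solution) or None if the
    fixed variables already exceed the capacity.
    """
    x = [0.0] * len(values)
    remaining = capacity
    bound = 0.0
    for i, v in fixed.items():
        if v == 1:
            x[i] = 1.0
            remaining -= weights[i]
            bound += values[i]
    if remaining < 0:
        return None  # infeasible node
    free = sorted((i for i in range(len(values)) if i not in fixed),
                  key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if weights[i] <= remaining:
            x[i] = 1.0
            remaining -= weights[i]
            bound += values[i]
        else:
            # First item that does not fit is taken fractionally, then stop.
            x[i] = remaining / weights[i]
            bound += x[i] * values[i]
            break
    return bound, x


def most_fractional(x: List[float], free: List[int]) -> int:
    """Branch on the free variable whose LP value is closest to 0.5."""
    return min(free, key=lambda i: abs(x[i] - 0.5))


def first_free(x: List[float], free: List[int]) -> int:
    """Branch on the lowest-index free variable, ignoring the LP solution."""
    return free[0]


def branch_and_bound(values: List[int], weights: List[int], capacity: int,
                     select: Selector) -> Tuple[float, int]:
    """Depth-first B&B; returns (optimal value, number of nodes explored)."""
    best_value = 0.0  # the empty knapsack is always feasible
    nodes = 0
    stack: List[Dict[int, int]] = [{}]
    while stack:
        fixed = stack.pop()
        nodes += 1
        relaxed = lp_relaxation(values, weights, capacity, fixed)
        if relaxed is None:
            continue  # prune: infeasible
        bound, x = relaxed
        if bound <= best_value + 1e-9:
            continue  # prune: cannot beat the incumbent
        if all(xi < 1e-9 or xi > 1 - 1e-9 for xi in x):
            best_value = bound  # integral LP solution: new incumbent
            continue
        free = [i for i in range(len(values)) if i not in fixed]
        j = select(x, free)
        stack.append({**fixed, j: 0})
        stack.append({**fixed, j: 1})  # explored first (LIFO)
    return best_value, nodes


if __name__ == "__main__":
    values, weights, capacity = [60, 100, 120], [10, 20, 30], 50
    for rule in (most_fractional, first_free):
        value, nodes = branch_and_bound(values, weights, capacity, rule)
        print(f"{rule.__name__}: value={value}, nodes={nodes}")
```

Both rules reach the same optimal value because B&B is exact regardless of the branching rule; only the number of explored nodes can differ, and that is the quantity a learned branching strategy aims to reduce.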
