Learning to Control Stabilization in Column Generation
Olivia Wang, Reem Khir

TL;DR
This paper introduces a reinforcement learning framework for adaptively stabilizing column generation, significantly improving convergence speed and computational efficiency on large-scale linear programming problems.
Contribution
It unifies smoothing and penalization techniques under a common framework, derives parameter bounds that ensure progress, and designs RLSCG, a reinforcement learning method for adaptive stabilization.
Findings
RLSCG reduces iteration count and computation time on most instances.
It outperforms traditional and rule-based stabilization methods.
The largest improvements are observed on large-scale problems.
Abstract
Column generation is a widely used decomposition technique for large-scale linear programs, but it often suffers from slow convergence due to poor initial dual estimates and dual oscillations. Stabilization techniques such as smoothing and penalization can mitigate these issues, but their effectiveness depends heavily on parameter selection, which requires careful tuning to avoid degrading performance. This paper presents a common framework for smoothing and penalization, showing that despite their different mechanisms, both are governed by two design choices: a reference point in the dual space and stabilization parameters that regulate how strongly that reference influences pricing. Within this framework, we derive parameter bounds that ensure progress, analyze predicted duals as reference points, and establish convergence guarantees for both methods. These results motivate and guide…
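To make the two design choices concrete, here is a minimal sketch of how a reference point and a stabilization parameter can enter each technique. It is not taken from the paper: the smoothing rule is the standard convex-combination (Wentges-style) form, the penalty is a simple L1 deviation term chosen for illustration, and the names `smoothed_duals`, `l1_penalty`, `alpha`, and `epsilon` are hypothetical.

```python
from typing import List, Sequence


def smoothed_duals(
    reference: Sequence[float], current: Sequence[float], alpha: float
) -> List[float]:
    """Smoothing: price with a convex combination of the reference point
    and the current restricted-master duals.

    alpha is the stabilization parameter (assumed to lie in [0, 1));
    alpha = 0 recovers unstabilized column generation.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if len(reference) != len(current):
        raise ValueError("dual vectors must have the same length")
    return [alpha * r + (1.0 - alpha) * c for r, c in zip(reference, current)]


def l1_penalty(
    duals: Sequence[float], reference: Sequence[float], epsilon: float
) -> float:
    """Penalization (illustrative L1 form): cost charged for moving the
    duals away from the reference point, scaled by epsilon."""
    if len(duals) != len(reference):
        raise ValueError("dual vectors must have the same length")
    return epsilon * sum(abs(d - r) for d, r in zip(duals, reference))


# Example: a reference point of (2, 4) pulls duals of (0, 0) halfway toward it.
print(smoothed_duals([2.0, 4.0], [0.0, 0.0], alpha=0.5))
print(l1_penalty([1.0, 3.0], [0.0, 1.0], epsilon=0.5))
```

In both functions the reference point is the anchor and the parameter (`alpha` or `epsilon`) sets how strongly it influences the duals passed to pricing, which is the shared structure the abstract describes.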
