Reinforcement Learning Constrained Beam Search for Parameter Optimization of Paper Drying Under Flexible Constraints
Siyuan Chen, Hanshen Yu, Jamal Yagoobi, Chenhui Shao

TL;DR
This paper introduces Reinforcement Learning Constrained Beam Search (RLCBS), an inference-time method for optimizing process parameters in paper drying. It handles flexible constraints and is reported to outperform NSGA-II under complex constraints while running at least 2.58 times faster.
Contribution
The paper presents RLCBS, a new inference-time refinement technique for RL that enforces flexible constraints and improves optimization efficiency in combinatorial problems.
Findings
RLCBS outperforms NSGA-II under complex constraints.
RLCBS achieves a 2.58-fold or higher speed improvement over NSGA-II.
RLCBS effectively incorporates flexible constraints during inference.
Abstract
Existing approaches to enforcing design constraints in Reinforcement Learning (RL) applications often rely on training-time penalties in the reward function or training/inference-time invalid action masking, but these methods either cannot be modified after training, or are limited in the types of constraints that can be implemented. To address this limitation, we propose Reinforcement Learning Constrained Beam Search (RLCBS) for inference-time refinement in combinatorial optimization problems. This method respects flexible, inference-time constraints that support exclusion of invalid actions and forced inclusion of desired actions, and employs beam search to maximize sequence probability for more sensible constraint incorporation. RLCBS is extensible to RL-based planning and optimization problems that do not require real-time solution, and we apply the method to optimize process…
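The constraint handling described in the abstract (excluding invalid actions, forcing inclusion of desired actions, and using beam search to maximize sequence probability) can be illustrated with a small sketch. This is not the authors' implementation: the function `constrained_beam_search`, the `toy_policy` stand-in for a trained RL policy, and the feasibility-pruning rule used for forced inclusion are all assumptions made for illustration, and the actions are plain integers rather than dryer configurations.

```python
import math
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Hypothetical policy interface: maps an action prefix to log-probabilities
# over the next action. A trained RL policy would play this role.
Policy = Callable[[Tuple[int, ...]], Sequence[float]]


def constrained_beam_search(
    policy: Policy,
    horizon: int,
    beam_width: int,
    excluded: FrozenSet[int] = frozenset(),
    required: FrozenSet[int] = frozenset(),
) -> List[Tuple[float, Tuple[int, ...]]]:
    """Return up to beam_width (log_prob, sequence) pairs, best first.

    Constraints are supplied at inference time only:
    - excluded: actions that may never appear in a sequence
    - required: actions that must each appear at least once
    """
    beams: List[Tuple[float, Tuple[int, ...]]] = [(0.0, ())]
    for step in range(horizon):
        steps_left_after = horizon - step - 1
        candidates = []
        for score, prefix in beams:
            for action, log_p in enumerate(policy(prefix)):
                if action in excluded:
                    continue  # exclusion constraint: skip banned actions
                seq = prefix + (action,)
                unmet = len(required - set(seq))
                if unmet > steps_left_after:
                    continue  # could no longer satisfy forced inclusion
                candidates.append((score + log_p, seq))
        # Keep the most probable partial sequences (stable sort on score).
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams


def toy_policy(prefix: Tuple[int, ...]) -> List[float]:
    # Assumed fixed distribution over four actions, independent of the prefix.
    return [math.log(p) for p in (0.5, 0.3, 0.15, 0.05)]


unconstrained = constrained_beam_search(toy_policy, horizon=3, beam_width=3)
constrained = constrained_beam_search(
    toy_policy,
    horizon=3,
    beam_width=3,
    excluded=frozenset({0}),
    required=frozenset({3}),
)
print(unconstrained[0][1], constrained[0][1])
```

Because the constraints are passed as arguments rather than baked into the reward, the same policy can be queried under different constraint sets without retraining. The pruning rule here is a simplification: it discards any partial sequence that has too few steps left to include every required action, so all returned sequences satisfy the constraints.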
Taxonomy
Topics: Advanced Numerical Analysis Techniques · Textile materials and evaluations · Material Properties and Processing
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
