Avoiding Premature Collapse: Adaptive Annealing for Entropy-Regularized Structural Inference
Yizhi Liu

TL;DR
This paper introduces an adaptive annealing method called EPH-ASC to stabilize entropy-regularized optimal transport inference, preventing premature mode collapse and improving large-scale structural prediction.
Contribution
The work identifies Premature Mode Collapse as a cause of instability when annealing entropy-regularized OT, and proposes EPH-ASC, an adaptive scheduling algorithm that monitors the stability of the inference process.
Findings
EPH-ASC effectively prevents mode collapse during training.
The method stabilizes large-scale hyper-connection models on the FineWeb-Edu dataset.
It enforces a linear stability law to avoid gradient explosions.
Abstract
Differentiable matching layers and residual connection paradigms, often implemented via entropy-regularized Optimal Transport (OT), serve as critical mechanisms in structural prediction and architectural scaling. However, recovering discrete permutations or maintaining identity mappings via annealing is notoriously unstable. In this work, we identify a fundamental mechanism for this failure: Premature Mode Collapse. By analyzing the non-normal dynamics of the Sinkhorn fixed-point map, we reveal a theoretical thermodynamic speed limit: standard exponential cooling outpaces the contraction rate of the inference operator, which degrades as . To address this, we propose Efficient Piecewise Hybrid Adaptive Stability Control (EPH-ASC), an adaptive scheduling algorithm that monitors the stability of the inference process. We demonstrate that…
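To make the contrast between fixed exponential cooling and stability-gated cooling concrete, here is a minimal sketch in plain Python. It is not EPH-ASC: the abstract does not specify how the method monitors stability, so the gate used here (lower the temperature only once the row-marginal residual of the Sinkhorn iterate falls below a tolerance) is an assumed stand-in. All names and defaults (`sinkhorn_step`, `row_residual`, `anneal`, `decay`, `tol`) are hypothetical and chosen for illustration only.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sinkhorn_step(cost, f, g, eps):
    """One log-domain Sinkhorn update of the dual potentials (uniform marginals)."""
    n = len(cost)
    log_mu = -math.log(n)
    f = [eps * (log_mu - logsumexp([(g[j] - cost[i][j]) / eps for j in range(n)]))
         for i in range(n)]
    g = [eps * (log_mu - logsumexp([(f[i] - cost[i][j]) / eps for i in range(n)]))
         for j in range(n)]
    return f, g


def plan(cost, f, g, eps):
    """Transport plan implied by the potentials at temperature eps."""
    n = len(cost)
    return [[math.exp((f[i] + g[j] - cost[i][j]) / eps) for j in range(n)]
            for i in range(n)]


def row_residual(cost, f, g, eps):
    """L1 distance between the plan's row sums and the uniform marginal 1/n."""
    n = len(cost)
    return sum(abs(sum(row) - 1.0 / n) for row in plan(cost, f, g, eps))


def anneal(cost, eps_start=1.0, eps_end=0.05, decay=0.7, tol=1e-3,
           adaptive=True, max_steps=5000):
    """Anneal eps from eps_start down to eps_end while running Sinkhorn updates.

    adaptive=False: multiply eps by `decay` after every update (fixed
    exponential cooling). adaptive=True: cool only when the current iterate
    has settled, i.e. its row residual is below `tol` (assumed gate).
    """
    n = len(cost)
    f, g = [0.0] * n, [0.0] * n
    eps = eps_start
    for step in range(1, max_steps + 1):
        f, g = sinkhorn_step(cost, f, g, eps)
        settled = row_residual(cost, f, g, eps) < tol
        if eps <= eps_end:
            if settled:
                break  # final temperature reached and iterate has settled
        elif settled or not adaptive:
            eps = max(eps_end, eps * decay)
    return plan(cost, f, g, eps), eps, step
```

Calling `anneal(cost, adaptive=True)` and `anneal(cost, adaptive=False)` on the same cost matrix and comparing the returned step counts and plans gives a rough feel for how much extra work the gate spends at each temperature. The tolerance and decay factor are arbitrary, and a residual threshold is only one of many possible stability signals.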
Taxonomy
Topics: Model Reduction and Neural Networks · Machine Learning in Materials Science · Generative Adversarial Networks and Image Synthesis
