TL;DR
This paper presents LBW-Guard, a control layer that improves language model training stability and efficiency under stress by observing telemetry and applying bounded control without replacing the optimizer.
Contribution
The paper introduces LBW-Guard, a bounded autonomous control layer that operates above AdamW. It aims to improve training stability and speed without replacing the optimizer's update rule, especially under aggressive training conditions.
Findings
LBW-Guard reduces perplexity by 18.7% on WikiText-103.
It speeds up training by 1.10x relative to the baseline.
LBW-Guard maintains training stability under high learning-rate stress.
Abstract
Modern language-model training is increasingly exposed to instability, degraded runs, and wasted compute, especially under aggressive learning-rate, scale, and runtime-stress conditions. This paper introduces Learn-by-Wire Guard (LBW-Guard), a bounded autonomous training-control governance layer that operates above AdamW. Rather than replacing the optimizer update rule, LBW-Guard observes training telemetry, interprets instability-sensitive regimes, and applies bounded control to optimizer execution while preserving fixed training objectives. We evaluate LBW-Guard in a Qwen2.5-centered stress-and-robustness suite using WikiText-103, with Qwen2.5-7B as the empirical anchor, model-size comparisons against Qwen2.5-3B and Qwen2.5-14B, learning-rate stress tests, gradient-clipping baselines, and a no-LoRA TinyLlama-1B full-parameter sanity check. In the 7B reference setting, LBW-Guard…
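The observe, interpret, and bounded-control loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the class name `BoundedGuard`, the loss-spike heuristic, and every parameter (`window`, `spike_ratio`, `min_scale`, `max_scale`, `backoff`, `recovery`) are hypothetical, chosen only to show how a layer above the optimizer might scale the learning rate within fixed bounds while leaving the update rule itself untouched.

```python
from collections import deque
from statistics import mean


class BoundedGuard:
    """Hypothetical telemetry-driven controller (illustration only, not LBW-Guard)."""

    def __init__(self, window=5, spike_ratio=1.5, min_scale=0.25,
                 max_scale=1.0, backoff=0.5, recovery=1.1):
        self.history = deque(maxlen=window)  # recent "healthy" loss telemetry
        self.spike_ratio = spike_ratio       # assumed threshold for a loss spike
        self.min_scale = min_scale           # lower bound on the control signal
        self.max_scale = max_scale           # upper bound on the control signal
        self.backoff = backoff               # multiplier applied when unstable
        self.recovery = recovery             # multiplier applied when stable
        self.scale = max_scale

    def observe(self, loss):
        """Record one loss value and return a bounded learning-rate scale."""
        # Interpret: NaN/inf losses are always treated as unstable.
        unstable = loss != loss or loss == float("inf")
        # Interpret: once the window is full, a loss far above its mean is a spike.
        if not unstable and len(self.history) == self.history.maxlen:
            unstable = loss > self.spike_ratio * mean(self.history)

        if unstable:
            self.scale *= self.backoff
        else:
            self.scale *= self.recovery
            self.history.append(loss)  # only stable steps update the reference

        # Bounded control: clamp the scale to [min_scale, max_scale].
        self.scale = min(self.max_scale, max(self.min_scale, self.scale))
        return self.scale


# Usage: the optimizer is unchanged; only its learning rate is modulated.
base_lr = 1e-3
guard = BoundedGuard(window=3)
for step_loss in (2.0, 1.9, 1.8, 10.0, 1.7):
    effective_lr = base_lr * guard.observe(step_loss)
```

In a real training loop, `effective_lr` would be written into the optimizer's parameter groups before each step. The clamp is what makes the control bounded: no sequence of telemetry can push the scale outside the configured range.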
