FragileFlow: Spectral Control of Correct-but-Fragile Predictions for Foundation Model Robustness

Zhuoyun Li; Boxuan Wang; Jinwei Hu; Xiaowei Huang; Yi Dong

arXiv:2605.08896 · cs.CL · May 12, 2026

TL;DR

FragileFlow is a plug-in regularizer that improves foundation model robustness by identifying correct-but-fragile predictions near the decision boundary and applying spectral control to a class-wise matrix of their off-class probability mass.

Contribution

The paper formalizes margin-aware error flow, introduces a spectral regularizer built on it, and provides the first PAC-Bayes upper bound for this error-flow object.

Findings

01 FragileFlow improves worst-class accuracy on multiple benchmarks.

02 It preserves clean accuracy while enhancing robustness.

03 Theoretical bounds support the effectiveness of spectral control.

Abstract

Robust adaptation of LLMs and VLMs is often evaluated by average accuracy or average consistency under perturbations. However, these averages can hide a structured failure mode: a prediction may remain correct while probability mass already flows from particular true classes toward systematic wrong competitors near the decision boundary. In this paper, we formalize this phenomenon as margin-aware error flow and introduce FragileFlow, a plug-in regularizer that uses a calibrated margin buffer to identify correct-but-fragile predictions and organize their off-class probability mass into a class-wise vulnerable-risk matrix. Theoretically, we provide the first PAC-Bayes upper bound for this margin-aware error-flow object, showing how empirical spectral control yields a conservative route to deterministic worst-class robustness under a stability condition. Experiments on multiple-choice LLM…
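
To make the abstract's pipeline concrete, here is a minimal NumPy sketch of one way a class-wise vulnerable-risk matrix could be assembled and summarized by its spectral norm. It is an illustration under stated assumptions, not the authors' implementation: the margin definition (true-class probability minus the largest competing probability), the fixed `margin_buffer` threshold standing in for the calibrated buffer, the per-class normalization, and the function names `vulnerable_risk_matrix` and `spectral_norm` are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def vulnerable_risk_matrix(probs, labels, margin_buffer):
    """Aggregate off-class probability mass of correct-but-fragile samples.

    Assumption (not from the paper): a sample is "correct-but-fragile" when
    its margin p_true - max(p_other) is positive but below `margin_buffer`.
    Entry [i, j] is the mass flowing from true class i toward class j,
    averaged over all samples whose true class is i.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_samples, n_classes = probs.shape
    rows = np.arange(n_samples)

    # Margin between the true class and its strongest competitor.
    p_true = probs[rows, labels]
    rivals = probs.copy()
    rivals[rows, labels] = -np.inf
    margin = p_true - rivals.max(axis=1)

    # Correct (margin > 0) yet inside the buffer (margin < margin_buffer).
    fragile = (margin > 0) & (margin < margin_buffer)

    # Off-class mass: zero out the true-class probability of every sample.
    off_mass = probs.copy()
    off_mass[rows, labels] = 0.0

    # Accumulate fragile samples' off-class mass into the row of their class.
    risk = np.zeros((n_classes, n_classes))
    np.add.at(risk, labels[fragile], off_mass[fragile])

    # Normalize each row by the number of samples in that true class.
    counts = np.bincount(labels, minlength=n_classes)
    nonempty = counts > 0
    risk[nonempty] /= counts[nonempty, None]
    return risk, fragile


def spectral_norm(matrix):
    """Largest singular value, one possible scalar to penalize."""
    return np.linalg.norm(matrix, ord=2)


# Toy example: 4 samples, 3 classes, hypothetical buffer of 0.2.
probs = [
    [0.55, 0.40, 0.05],  # correct, margin 0.15: fragile
    [0.90, 0.05, 0.05],  # correct, margin 0.85: safe
    [0.30, 0.60, 0.10],  # correct, margin 0.30: safe
    [0.10, 0.48, 0.42],  # correct, margin 0.06: fragile
]
labels = [0, 0, 1, 1]
risk, fragile = vulnerable_risk_matrix(probs, labels, margin_buffer=0.2)
print(fragile)
print(risk)
print(spectral_norm(risk))
```

In this sketch the diagonal of the matrix is always zero, so each row records only where a class's fragile mass is heading. A large off-diagonal entry marks a systematic wrong competitor, which is the kind of structure an average-accuracy metric would hide. In a training setting the spectral norm would be computed in a differentiable framework and added to the loss with a weight; that wiring is omitted here.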
