Walking the Tightrope: Disentangling Beneficial and Detrimental Drifts in Non-Stationary Custom-Tuning
Xiaoyu Yang, Jie Lu, En Yu

TL;DR
This paper identifies and addresses the challenge of harmful concept drift in multi-modal large language models during non-stationary reinforcement fine-tuning, proposing a novel counterfactual approach to improve robustness and generalization.
Contribution
The paper introduces a theoretical framework linking concept drift theory with reinforcement fine-tuning (RFT), and proposes Counterfactual Preference Optimization (CPO) for stable fine-tuning under non-stationary conditions.
Findings
Enhanced robustness and generalization in RFT
Effective decoupling of beneficial and harmful distribution shifts
A large-scale counterfactual reasoning dataset, CXR-CounterFact
Abstract
This paper uncovers a critical yet overlooked phenomenon in multi-modal large language models (MLLMs): detrimental concept drift within chain-of-thought (CoT) reasoning during non-stationary reinforcement fine-tuning (RFT), where reasoning token distributions evolve unpredictably, thereby introducing significant biases in final predictions. To address this, we are pioneers in establishing the theoretical bridge between concept drift theory and RFT processes by formalizing CoT's autoregressive token streams as non-stationary distributions undergoing arbitrary temporal shifts. Leveraging this framework, we propose a novel counterfact-aware RFT that systematically decouples beneficial distribution adaptation from harmful concept drift through concept graph-empowered LLM experts generating counterfactual reasoning trajectories. Our solution, Counterfactual Preference Optimization (CPO),…
Taxonomy
Topics: Data Stream Mining Techniques · Advanced Graph Neural Networks · Reinforcement Learning in Robotics
