A Novel Self-Evolution Framework for Large Language Models
Haoran Sun, Zekun Zhang, Shaoning Zeng

TL;DR
This paper introduces a Dual-Phase Self-Evolution (DPSE) framework for Large Language Models that strengthens domain knowledge and user preference alignment through structured data expansion and a two-stage fine-tuning process.
Contribution
It presents a novel self-evolution approach with a Censor module and a dual-phase pipeline, improving LLMs' domain cognition and user alignment beyond existing post-training methods.
Findings
DPSE outperforms baseline methods on NLP benchmarks.
The framework improves domain-specific competence and user preference adaptation.
Ablation studies confirm the effectiveness of each module.
Abstract
The capabilities of Large Language Models (LLMs) are constrained by what they learn during pre-training, which has motivated work on optimizing them through post-training. Existing post-training strategies, such as memory-based retrieval or preference optimization, improve user alignment yet fail to enhance the model's domain cognition. To bridge this gap, we propose a novel Dual-Phase Self-Evolution (DPSE) framework that jointly optimizes user preference adaptation and domain-specific competence. DPSE introduces a Censor module that extracts multi-dimensional interaction signals and estimates satisfaction scores, which in turn guide structured data expansion via topic-aware and preference-driven strategies. The expanded datasets support a two-stage fine-tuning pipeline: supervised domain grounding followed by frequency-aware preference optimization. Experiments across general NLP benchmarks and long-term dialogue…
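The abstract does not say how the Censor module computes its satisfaction score or how scored interactions are routed into the two fine-tuning stages. The sketch below is only an illustration of the general idea, not the authors' implementation: the signal names, the weights, the 0.6 threshold, and the routing rule (high-scoring interactions grouped by topic for supervised grounding, low-scoring ones set aside as candidates for preference data) are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Interaction:
    topic: str
    prompt: str
    response: str
    # Hypothetical interaction signals, each assumed to lie in [0, 1].
    signals: dict[str, float]


# Assumed signal names and weights; the abstract does not list the real dimensions.
WEIGHTS = {"explicit_feedback": 0.5, "follow_up_depth": 0.3, "no_rephrase": 0.2}


def satisfaction_score(signals: dict[str, float]) -> float:
    """Weighted average of the known signals; missing signals count as 0."""
    total = sum(WEIGHTS.values())
    return sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items()) / total


def route(interactions: list[Interaction], threshold: float = 0.6):
    """Split interactions into topic-grouped SFT candidates and low-scored ones.

    The threshold and the split itself are illustrative assumptions.
    """
    sft_by_topic: dict[str, list[Interaction]] = {}
    low_scored: list[tuple[Interaction, float]] = []
    for item in interactions:
        score = satisfaction_score(item.signals)
        if score >= threshold:
            sft_by_topic.setdefault(item.topic, []).append(item)
        else:
            low_scored.append((item, score))
    return sft_by_topic, low_scored
```

In a pipeline of the kind the abstract describes, the topic-grouped set would feed the supervised domain-grounding stage and the low-scored set would be a starting point for building preference pairs; how DPSE actually expands and weights these sets is not recoverable from the text above.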
