A Post-Training Enhanced Optimization Approach for Small Language Models

Keke Zhai

arXiv:2411.02939 · cs.CL · December 24, 2024

Open Access

TL;DR

This paper introduces a novel continuous post-training optimization method for small language models, utilizing large model guidance to enhance data diversity and accuracy, leading to significant performance improvements.

Contribution

The paper proposes a new approach to constructing and optimizing alignment data for post-training small language models, and validates it through experiments with the Qwen2-0.5B-Instruct model.

Findings

01. Post-training optimization significantly improves model performance.

02. The proposed data guidance enhances alignment data quality.

03. Experimental results confirm effectiveness across multiple training strategies.

Abstract

This paper examines continuous post-training optimization for small language models and proposes a method for constructing the alignment data used in that post-training. The core of the method is data guidance from large models, which is used to improve the diversity and accuracy of the alignment data. To verify the method's effectiveness, we used the Qwen2-0.5B-Instruct model as the baseline small language model and trained it on the alignment dataset constructed by our proposed method. We ran and compared several groups of experiments: an SFT (Supervised Fine-Tuning) post-training experiment, a KTO (Kahneman-Tversky Optimization) post-training experiment, an SFT-KTO two-stage post-training experiment, and a model weight fusion experiment. Finally, we evaluated and analyzed the performance of post-training models,…
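The abstract mentions a model weight fusion experiment but, as excerpted here, does not say how the weights are combined. As a minimal sketch only, assuming fusion means element-wise linear interpolation between two checkpoints of the same architecture (a common choice, not confirmed by the source), the idea can be illustrated as follows. The function name `fuse_weights`, the mixing coefficient `alpha`, and the flat-list weight representation are all hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def fuse_weights(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Linearly interpolate two checkpoints: alpha * a + (1 - alpha) * b.

    Assumption: both checkpoints share the same parameter names and shapes,
    e.g. an SFT-trained and a KTO-trained copy of the same base model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if a.keys() != b.keys():
        raise ValueError("checkpoints must have identical parameter names")

    fused: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted average of the two parameter tensors.
        fused[name] = [
            alpha * x + (1.0 - alpha) * y for x, y in zip(a[name], b[name])
        ]
    return fused


# Toy stand-ins for two post-trained checkpoints.
sft_weights = {"layer.weight": [1.0, 3.0], "layer.bias": [0.0]}
kto_weights = {"layer.weight": [3.0, 1.0], "layer.bias": [2.0]}
fused = fuse_weights(sft_weights, kto_weights, alpha=0.5)
```

With real checkpoints the same interpolation would be applied per tensor to the models' state dictionaries; the choice of `alpha` is a tunable assumption here, not a value taken from the paper.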

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Shrink and Fine-Tune