Towards Efficient and Effective Alignment of Large Language Models
Yuxin Jiang

TL;DR
This thesis presents methods for aligning large language models more efficiently and effectively, spanning data collection, training, and evaluation, with the aim of improving model performance and constraint adherence.
Contribution
The thesis introduces Lion (an adversarial distillation framework for refining training data), WebR (automated synthesis of instruction-tuning data from raw web documents), LTE (knowledge updates), BMC (preference modeling), and FollowBench (comprehensive evaluation).
Findings
Lion improves zero-shot reasoning capabilities.
WebR enhances data diversity and scalability.
FollowBench reveals weaknesses in current models' constraint adherence.
Abstract
Large language models (LLMs) exhibit remarkable capabilities across diverse tasks, yet aligning them efficiently and effectively with human expectations remains a critical challenge. This thesis advances LLM alignment by introducing novel methodologies in data collection, training, and evaluation. We first address alignment data collection. Existing approaches rely heavily on manually curated datasets or proprietary models. To overcome these limitations, we propose Lion, an adversarial distillation framework that iteratively refines training data by identifying and generating challenging instructions, enabling state-of-the-art zero-shot reasoning. Additionally, we introduce Web Reconstruction (WebR), a fully automated framework that synthesizes instruction-tuning data directly from raw web documents, significantly improving data diversity and scalability over existing synthetic data…
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques
Methods: Evolved Sign Momentum
