Self-Specialization: Uncovering Latent Expertise within Large Language Models

Junmo Kang; Hongyin Luo; Yada Zhu; Jacob Hansen; James Glass; David Cox; Alan Ritter; Rogerio Feris; Leonid Karlinsky

arXiv:2310.00160·cs.CL·June 7, 2024·1 cites

TL;DR

This paper introduces self-specialization, a method for efficiently adapting large language models to specific expert domains like biomedicine and finance, outperforming generalist models and other adaptation techniques.

Contribution

It proposes a novel self-specialization approach that enables effective domain-specific model tuning with minimal data and parameters, improving upon existing instruction-tuning methods.

Findings

01 Self-specialized models outperform base models in biomedical and financial tasks.

02 Self-specialization achieves significant performance gains with few labeled seeds.

03 The method surpasses instruction-tuned and domain-adapted models in experiments.

Abstract

Recent works have demonstrated the effectiveness of self-alignment, in which a large language model is aligned to follow general instructions using instructional data generated from the model itself, starting from a handful of human-written seeds. Instead of general alignment, in this work, we focus on self-alignment for expert domain specialization (e.g., biomedicine, finance). As a preliminary, we quantitatively show the marginal effect that generic instruction-following training has on downstream expert domains' performance. To remedy this, we propose self-specialization - allowing for effective model specialization while achieving cross-task generalization by leveraging only a few labeled seeds. Self-specialization offers a data- and parameter-efficient way of "carving out" an expert model out of a generalist pre-trained LLM. Exploring a variety of popular open large models as a base…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Balanced Selection · Focus