DNA 1.0 Technical Report
Jungyup Lee, Jemin Kim, Sang Park, SeungJae Lee

TL;DR
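DNA 1.0 8B Instruct is a bilingual language model optimized for Korean and English. It is built from Llama 3.1 8B through continual pre-training, supervised fine-tuning, SLERP merging with Llama 3.1 8B Instruct, direct preference optimization, and knowledge distillation. It reports state-of-the-art results on Korean benchmarks and is openly available.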
Contribution
The paper introduces DNA 1.0 8B Instruct, a novel bilingual model combining continual pre-training, supervised fine-tuning, and merging techniques to enhance Korean and English language capabilities.
Findings
State-of-the-art Korean task performance (e.g., KMMLU 53.26%)
Strong English task results (e.g., MMLU 66.64%)
Open availability of the model
Abstract
In this report, we present DNA 1.0 8B Instruct, a state-of-the-art bilingual language model optimized for Korean and English language tasks. By applying continual pre-training (CPT) with high-quality Korean datasets to Llama 3.1 8B and subsequent supervised fine-tuning (SFT), we create an instruction-following model with enhanced Korean language capabilities. This model is then merged with Llama 3.1 8B Instruct via spherical linear interpolation (SLERP) and undergoes further optimization through direct preference optimization (DPO) and knowledge distillation (KD). DNA 1.0 8B Instruct achieves state-of-the-art results on Korean-specific tasks, including KMMLU (53.26%), KoBEST (83.40%), and BELEBELE (57.99%), while maintaining strong English capabilities on MMLU (66.64%), MMLU-Pro (43.05%) and GSM8K (80.52%). As an open model, DNA 1.0 8B Instruct represents a significant advancement in…
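The SLERP merge named in the abstract can be illustrated with a minimal sketch. It assumes each weight tensor is flattened to a plain list of floats and merged independently with a single interpolation factor t. The report excerpt above does not state the factor or the per-layer schedule, so the function names, the toy tensors, and t = 0.5 below are hypothetical and not the authors' implementation.

```python
import math


def slerp(v0, v1, t, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.

    t = 0 returns v0, t = 1 returns v1. Falls back to plain linear
    interpolation when either vector is near zero or the two vectors
    are nearly parallel, where the SLERP formula is ill-conditioned.
    """
    def lerp():
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return lerp()

    # Angle between the two vectors, computed from the normalized dot product.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        return lerp()

    # Standard SLERP weights applied to the original (unnormalized) vectors.
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def merge_state_dicts(sd_a, sd_b, t):
    """Hypothetical helper: SLERP every tensor that both models share by name."""
    return {name: slerp(sd_a[name], sd_b[name], t) for name in sd_a if name in sd_b}


# Toy stand-ins for the Korean-tuned model and Llama 3.1 8B Instruct weights.
korean_sft = {"layer0.weight": [1.0, 0.0], "layer1.weight": [0.5, 0.5]}
llama_instruct = {"layer0.weight": [0.0, 1.0], "layer1.weight": [0.5, 0.5]}
merged = merge_state_dicts(korean_sft, llama_instruct, t=0.5)
```

Unlike a plain weighted average, SLERP moves along the arc between the two weight vectors, so the midpoint of two orthogonal unit vectors keeps unit norm instead of shrinking toward the origin.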
Taxonomy
Topics: Bilingual (Korean and English) large language models
Methods: LLaMA · Knowledge Distillation
