HyperCLOVA X Technical Report
Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek

TL;DR
HyperCLOVA X is a family of large language models tailored to the Korean language and culture, with competitive capabilities in English, math, and coding. It was trained on a mix of Korean, English, and code data and evaluated on a range of benchmarks in both Korean and English, where it shows strong reasoning and cross-lingual abilities.
Contribution
The paper introduces HyperCLOVA X, a multilingual LLM with a focus on Korean, highlighting its training process, safety measures, and extensive evaluation across various tasks and languages.
Findings
Strong reasoning and cultural understanding in Korean.
Effective cross-lingual and multilingual generalization.
High performance on diverse benchmarks.
Abstract
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency…
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications
