Understanding LLM Development Through Longitudinal Study: Insights from the Open Ko-LLM Leaderboard
Chanjun Park, Hyeonwoo Kim

TL;DR
This longitudinal study over eleven months analyzes the development and performance trends of Korean large language models on the Open Ko-LLM Leaderboard, revealing insights into challenges, model size effects, and ranking dynamics.
Contribution
It extends prior research on the Open Ko-LLM Leaderboard with an eleven-month analysis of Korean LLM development, addressing the limitation that earlier studies covered observation periods of only five months.
Findings
Identifies key challenges in improving LLM performance over time
Shows how model size affects the correlations between task performances across benchmarks
Reveals shifts in leaderboard ranking patterns over eleven months
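The second finding concerns how correlations between task scores vary with model size. The following is a minimal sketch of one way such a comparison could be set up, not the authors' method: the rows, the task names `task_a` and `task_b`, the two size buckets, and the 7B threshold are all made-up assumptions for illustration and are not taken from the leaderboard.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical rows: (parameters in billions, task_a score, task_b score).
# These numbers are invented for the example, not leaderboard results.
rows = [
    (1.3, 30.0, 40.0),
    (2.8, 32.0, 38.0),
    (5.8, 35.0, 37.0),
    (7.0, 40.0, 50.0),
    (10.7, 45.0, 56.0),
    (13.0, 50.0, 60.0),
]


def size_bucket(params_b, threshold=7.0):
    """Assumed two-way split; the threshold is an arbitrary illustration."""
    return "small" if params_b < threshold else "large"


def correlation_by_size(rows):
    """Group models by size bucket, then correlate the two task scores."""
    groups = defaultdict(lambda: ([], []))
    for params_b, score_a, score_b in rows:
        xs, ys = groups[size_bucket(params_b)]
        xs.append(score_a)
        ys.append(score_b)
    return {bucket: pearson(xs, ys) for bucket, (xs, ys) in groups.items()}


result = correlation_by_size(rows)
for bucket, r in sorted(result.items()):
    print(f"{bucket}: r = {r:.3f}")
```

Comparing the per-bucket coefficients shows whether two tasks move together for larger models but not for smaller ones, which is the kind of question the study's second research question asks.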
Abstract
This paper conducts a longitudinal study over eleven months to address the limitations of prior research on the Open Ko-LLM Leaderboard, which has relied on empirical studies with restricted observation periods of only five months. By extending the analysis duration, we aim to provide a more comprehensive understanding of the progression in developing Korean large language models (LLMs). Our study is guided by three primary research questions: (1) What are the specific challenges in improving LLM performance across diverse tasks on the Open Ko-LLM Leaderboard over time? (2) How does model size impact task performance correlations across various benchmarks? (3) How have the patterns in leaderboard rankings shifted over time on the Open Ko-LLM Leaderboard? By analyzing 1,769 models over this period, our research offers a comprehensive examination of the ongoing advancements in LLMs and…
