A Latent Variable Framework for Scaling Laws in Large Language Models

Peiyao Cai; Chengyu Cui; Felipe Maia Polo; Seamus Somerstep; Leshem Choshen; Mikhail Yurochkin; Moulinath Banerjee; Yuekai Sun; Kean Ming Tan; Gongjun Xu

arXiv:2512.06553·stat.AP·December 9, 2025

Open Access

TL;DR

This paper introduces a latent variable modeling framework for scaling laws of large language models. It accounts for heterogeneity across LLM families and benchmarks, which a single global scaling curve cannot capture.

Contribution

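Each LLM family is assigned a latent variable that captures the features its members share. A model's benchmark performance is then driven by latent skills, which are determined jointly by that family-level variable and the model's own observable features. The paper also develops an estimation procedure for the model and validates the framework empirically.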

Findings

01. Effective modeling of performance across diverse LLMs

02. Supports estimation and downstream tasks

03. Validated on 12 benchmarks from Open LLM Leaderboard

Abstract

We propose a statistical framework built on latent variable modeling for scaling laws of large language models (LLMs). Our work is motivated by the rapid emergence of numerous new LLM families with distinct architectures and training strategies, evaluated on an increasing number of benchmarks. This heterogeneity makes a single global scaling curve inadequate for capturing how performance varies across families and benchmarks. To address this, we propose a latent variable modeling framework in which each LLM family is associated with a latent variable that captures the common underlying features in that family. An LLM's performance on different benchmarks is then driven by its latent skills, which are jointly determined by the latent variable and the model's own observable features. We develop an estimation procedure for this latent variable model and establish its statistical…
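To make the structure described in the abstract concrete, here is a minimal simulation sketch. It is not the paper's specification. The abstract does not state functional forms, so the sketch assumes a linear map from observable features to skills, an additive family-level latent vector, and a logistic link from skills to benchmark scores. The names `simulate_scores`, `B`, and `Lambda`, and the choice of features (log parameter count, log training tokens), are hypothetical.

```python
import numpy as np


def simulate_scores(n_families=3, models_per_family=4, n_skills=2,
                    n_benchmarks=5, seed=0):
    """Simulate benchmark scores from a toy latent-skill model (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical observable features per model:
    # column 0 = log parameter count, column 1 = log training tokens.
    x = rng.uniform(low=[20.0, 24.0], high=[26.0, 30.0],
                    size=(n_families, models_per_family, 2))
    x = x - x.mean(axis=(0, 1), keepdims=True)  # center for numerical stability

    # One latent vector per family, shared by every model in that family.
    alpha = rng.normal(size=(n_families, n_skills))

    # Assumed linear feature-to-skill weights (not taken from the paper).
    B = rng.uniform(0.1, 0.5, size=(2, n_skills))

    # Latent skills: jointly determined by observable features and the family latent.
    theta = x @ B + alpha[:, None, :]            # shape (families, models, skills)

    # Assumed benchmark loadings and intercepts, with a logistic link to (0, 1).
    Lambda = rng.uniform(0.2, 1.0, size=(n_skills, n_benchmarks))
    b = rng.normal(size=n_benchmarks)
    scores = 1.0 / (1.0 + np.exp(-(theta @ Lambda + b)))

    return {"x": x, "alpha": alpha, "B": B, "theta": theta, "scores": scores}


if __name__ == "__main__":
    out = simulate_scores()
    # One score per (family, model, benchmark) triple.
    print(out["scores"].shape)
```

In this toy setup, two models with identical features but different families receive different skills only through `alpha`, which is the kind of family-level heterogeneity a single global scaling curve would miss.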

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification