Scaling Laws for Precision in High-Dimensional Linear Regression

Dechen Zhang; Xuan Tang; Yingyu Liang; Difan Zou

arXiv:2602.19241 · stat.ML · February 27, 2026

TL;DR

This paper develops a theoretical framework for how low-precision quantization affects high-dimensional linear regression. Both multiplicative and additive quantization add an error term and shrink the effective data size, but they differ in their effect on the effective model size.

Contribution

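It initiates a theoretical analysis of scaling laws for low-precision training in a high-dimensional sketched linear regression setting, separating the effects of multiplicative (signal-dependent) and additive (signal-independent) quantization on effective model and data sizes.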

Findings

01. Both quantization schemes introduce an additive error and reduce the effective data size.

02. Multiplicative quantization preserves the full-precision model size, while additive quantization reduces the effective model size.

03. Numerical experiments confirm the theoretical predictions.
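
One way to read these findings is through a schematic loss decomposition. The form below is an illustrative assumption in the style of empirical scaling laws, not a formula taken from the paper: the constants A and B, the exponents \alpha and \beta, and the symbols M_eff, N_eff, and \varepsilon_q are hypothetical placeholders.

```latex
% Schematic only: symbols and exponents are placeholders, not the paper's theorem.
\[
  \mathcal{L} \;\approx\;
  \underbrace{\frac{A}{M_{\mathrm{eff}}^{\alpha}}}_{\text{model-size term}}
  + \underbrace{\frac{B}{N_{\mathrm{eff}}^{\beta}}}_{\text{data-size term}}
  + \underbrace{\varepsilon_q}_{\text{additive quantization error}}
\]
% Multiplicative quantization: M_eff = M,  N_eff < N,  \varepsilon_q > 0.
% Additive quantization:       M_eff < M,  N_eff < N,  \varepsilon_q > 0.
```

Under this reading, the two schemes differ only in whether the model-size term sees the full model size M or a reduced effective size.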

Abstract

Low-precision training is critical for optimizing the trade-off between model quality and training costs, necessitating the joint allocation of model size, dataset size, and numerical precision. While empirical scaling laws suggest that quantization impacts effective model and data capacities or acts as an additive error, the theoretical mechanisms governing these effects remain largely unexplored. In this work, we initiate a theoretical study of scaling laws for low-precision training within a high-dimensional sketched linear regression framework. By analyzing multiplicative (signal-dependent) and additive (signal-independent) quantization, we identify a critical dichotomy in their scaling behaviors. Our analysis reveals that while both schemes introduce an additive error and degrade the effective data size, they exhibit distinct effects on effective model size: multiplicative…
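
The distinction between signal-dependent and signal-independent quantization error can be illustrated with two toy quantizers. This is a minimal sketch under stated assumptions, not the paper's quantization model: `quantize_additive` (uniform fixed-point rounding) and `quantize_multiplicative` (rounding the mantissa to a few bits) are hypothetical helpers chosen because their errors are, respectively, bounded independently of the input magnitude and proportional to it.

```python
import numpy as np

def quantize_additive(x, step):
    """Round to a uniform grid: absolute error is at most step/2, whatever |x| is."""
    return step * np.round(x / step)

def quantize_multiplicative(x, mantissa_bits):
    """Keep only `mantissa_bits` bits of the mantissa: error scales with |x|."""
    mant, exp = np.frexp(x)            # x = mant * 2**exp, with 0.5 <= |mant| < 1
    scale = 2.0 ** mantissa_bits
    return np.ldexp(np.round(mant * scale) / scale, exp)

def max_abs_err(q, x):
    return np.max(np.abs(q - x))

def max_rel_err(q, x):
    return np.max(np.abs(q - x) / np.abs(x))

# Assumed toy inputs: one batch of small values, one batch of large values.
rng = np.random.default_rng(0)
small = rng.uniform(0.01, 0.02, size=10_000)
large = rng.uniform(100.0, 200.0, size=10_000)

step, bits = 2.0 ** -4, 4              # illustrative precision settings

for name, x in [("small", small), ("large", large)]:
    qa = quantize_additive(x, step)
    qm = quantize_multiplicative(x, bits)
    print(f"{name}: additive abs err = {max_abs_err(qa, x):.4g}, "
          f"multiplicative abs err = {max_abs_err(qm, x):.4g}, "
          f"multiplicative rel err = {max_rel_err(qm, x):.4g}")
```

In this sketch the additive quantizer's absolute error stays below `step / 2` for both batches, while the multiplicative quantizer's absolute error grows with the input and only its relative error stays bounded (by `2 ** -bits`).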

Taxonomy

Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Machine Learning and Data Classification