Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Sean McLeish, John Kirchenbauer, David Yu Miller, Siddharth Singh, Abhinav Bhatele, Micah Goldblum, Ashwinee Panda, Tom Goldstein

TL;DR
This paper introduces Gemstones, a comprehensive dataset of transformer models with diverse architectures and hyperparameters, to improve understanding of scaling laws and their sensitivity to experimental choices.
Contribution
The authors release Gemstones, an extensive open-source dataset of over 4000 transformer checkpoints, enabling more nuanced analysis of scaling laws across different architectures and hyperparameters.
Findings
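Scaling law prescriptions can be highly sensitive to the experimental design process and to the specific model checkpoints used during fitting.
Checkpoints with diverse architectural shapes enable more complex scaling studies, such as analyzing the relationship between width and depth.
Hyperparameter choices, including learning rate and cooldown, affect the prescriptions that a fitted scaling law produces.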
Abstract
Scaling laws are typically fit using a family of models with a narrow range of frozen hyperparameter choices. In this work, we study scaling laws using multiple architectural shapes and hyperparameter choices, highlighting their impact on the resulting prescriptions. As a primary artifact of our research, we release the Gemstones: an open-source scaling law dataset consisting of over 4000 checkpoints from transformers with up to 2 billion parameters and diverse architectural shapes, including ablations over learning rate and cooldown. Our checkpoints enable more complex studies of scaling, such as analyzing the relationship between width and depth. By examining our model suite, we find that the prescriptions of scaling laws can be highly sensitive to the experimental design process and the specific model checkpoints used during fitting.
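To make the sensitivity claim concrete, here is a minimal sketch of how the choice of fitting points can change a fitted scaling exponent. It is not the authors' fitting procedure. The helper name `fit_power_law`, the parameter counts, and the loss formula are all assumptions chosen for illustration, and the losses are synthetic values generated from that assumed formula, not Gemstones measurements.

```python
import numpy as np


def fit_power_law(params, losses):
    """Fit loss ~ a * params**(-alpha) by least squares in log-log space.

    Hypothetical helper for illustration; returns (a, alpha).
    """
    slope, intercept = np.polyfit(np.log(params), np.log(losses), deg=1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic data (an assumption, NOT Gemstones results): a power law plus a
# constant loss floor, so a pure power-law fit is slightly misspecified.
params = np.array([5e7, 1e8, 2e8, 5e8, 1e9, 2e9])
losses = 1.7 + 400.0 * params ** -0.35

# Fit the same functional form on three different subsets of the points.
a_all, alpha_all = fit_power_law(params, losses)
a_small, alpha_small = fit_power_law(params[:3], losses[:3])
a_large, alpha_large = fit_power_law(params[3:], losses[3:])

print(f"all models:   alpha = {alpha_all:.3f}")
print(f"small models: alpha = {alpha_small:.3f}")
print(f"large models: alpha = {alpha_large:.3f}")
```

Because the assumed losses flatten toward a floor, the exponent fitted on the three smallest models comes out larger than the one fitted on the three largest. This is a toy analogue of how the checkpoints selected for fitting can shift a law's prescriptions.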
Taxonomy
Topics: Mineralogy and Gemology Studies · Geology and Paleoclimatology Research · Paleontology and Stratigraphy of Fossils
