Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale

Ayush Kaushal; Tejas Vaidhya; Arnab Kumar Mondal; Tejas Pandey; Aaryan Bhagat; Irina Rish

arXiv:2407.12327 · cs.LG · October 14, 2024 · 1 citation

TL;DR

This paper demonstrates that ternary language models (TriLMs) trained at scale can outperform traditional floating-point and quantized models of similar bit-widths, offering a promising path for more efficient large language models.

Contribution

It introduces the Spectra LLM suite, the first open collection of models across multiple bit-widths, and shows that TriLMs outperform quantized and float models at large scales, challenging existing assumptions.

Findings

01. TriLMs outperform QuantLMs and FloatLMs at scales over one billion parameters.

02. The 3.9B TriLM matches the performance of the 3.9B FloatLM despite fewer bits.

03. Open release of 500+ checkpoints facilitates further research.

Abstract

Rapid advancements in GPU computational power have outpaced memory capacity and bandwidth growth, creating bottlenecks in Large Language Model (LLM) inference. Post-training quantization is the leading method for addressing memory-related bottlenecks in LLM inference, but it suffers from significant performance degradation below 4-bit precision. This paper addresses these challenges by investigating the pretraining of low-bitwidth models, specifically Ternary Language Models (TriLMs), as an alternative to traditional floating-point models (FloatLMs) and their post-training quantized versions (QuantLMs). We present the Spectra LLM suite, the first open suite of LLMs spanning multiple bit-widths, including FloatLMs, QuantLMs, and TriLMs, ranging from 99M to 3.9B parameters trained on 300B tokens. Our comprehensive evaluation demonstrates that TriLMs offer superior scaling behavior in terms of…
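To make the memory argument concrete, here is a minimal Python sketch of what "ternary" means for storage. It is an illustration under stated assumptions, not the paper's training procedure: the `ternarize` helper and its absmean scale are a commonly used ternarization rule chosen here for illustration, and `weight_memory_gb` is a hypothetical helper that counts only raw weight bits (no scales, embeddings, or activations). The one fixed fact is that a three-state weight carries at most log2(3) ≈ 1.58 bits.

```python
import math

def ternarize(weights):
    """Map floats to states in {-1, 0, +1} plus one shared scale.

    Assumption: the scale is the mean absolute weight (absmean); the
    source page does not state which rule the Spectra TriLMs use.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    # Round to the nearest integer, then clamp into the ternary range.
    states = [max(-1, min(1, round(w / scale))) for w in weights]
    return states, scale

def weight_memory_gb(n_params, bits_per_weight):
    """Raw weight storage in gigabytes (10^9 bytes), weights only."""
    return n_params * bits_per_weight / 8 / 1e9

states, scale = ternarize([0.9, -0.05, -1.2, 0.4])
print(states, scale)

# Rough weight-only footprint for a 3.9B-parameter model.
for label, bits in [("16-bit float", 16), ("4-bit quantized", 4),
                    ("ternary (ideal)", math.log2(3))]:
    print(f"{label}: {weight_memory_gb(3.9e9, bits):.2f} GB")
```

The footprint figures are upper-level estimates only; real checkpoints typically keep some tensors in higher precision and pack ternary states less tightly than the information-theoretic bound.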

Code & Models

Repositories

nolanoorg/spectrasuite · tf · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling