Quantifying the Capabilities of LLMs across Scale and Precision

Sher Badshah; Hassan Sajjad

arXiv:2405.03146·cs.LG·May 9, 2024·2 cites

Open Access

TL;DR

This paper evaluates how model size and quantization affect large language models' performance, showing larger models outperform smaller ones and maintain accuracy even at low-precision levels, highlighting scale's importance.

Contribution

It provides a comprehensive analysis of the impact of scale and quantization on open-source LLMs, demonstrating the resilience of larger models to low-precision quantization.

Findings

01 Larger models outperform smaller ones across various tasks.

02 Models maintain high accuracy at 4-bit quantization.

03 Scaling remains crucial for performance enhancement.

Abstract

Scale is often cited as one of the factors behind the improved performance of LLMs, resulting in models with billions and trillions of parameters. One limitation of such large models is their high computational requirements, which limit their usage, deployment, and debugging in resource-constrained scenarios. Two commonly used ways to work around these limitations are to use smaller versions of LLMs (e.g. Llama 7B instead of Llama 70B) and to lower memory requirements through quantization. While these approaches effectively address resource limitations, their impact on model performance needs thorough examination. In this study, we perform a comprehensive evaluation to investigate the effect of model scale and quantization on performance. We experiment with two major families of open-source instruct models ranging from 7 billion to 70 billion…
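To make the resource trade-off in the abstract concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy at different precisions. It is not taken from the paper: the function name `weight_memory_gb` is hypothetical, the bit widths other than 4-bit are illustrative, and the estimate ignores activations, the KV cache, and any per-block overhead that real quantization formats add.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_param` bits,
    with no scaling factors, zero points, or runtime buffers counted.
    """
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    # Parameter counts match the 7B and 70B sizes named in the abstract.
    for name, params in [("7B", 7e9), ("70B", 70e9)]:
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{name} model at {bits}-bit: ~{gb:.1f} GB of weights")
```

Under these simplifying assumptions, a 70B model at 4-bit needs about 35 GB for its weights, while a 7B model at 16-bit needs about 14 GB, which is why the choice between a smaller full-precision model and a larger quantized one is worth evaluating empirically.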

Taxonomy

Topics: Data Mining Algorithms and Applications · Semantic Web and Ontologies

Methods: LLaMA