Rethinking the Outlier Distribution in Large Language Models: An In-depth Study
Rahul Raman, Khushi Sharma, Sai Qian Zhang

TL;DR
This paper investigates the causes of two types of outliers in large language models, massive activations and channel-wise outliers, and proposes strategies to mitigate them, improving quantization and deployment efficiency.
Contribution
It provides an in-depth analysis of outlier formation mechanisms and introduces effective methods to eliminate most outliers with minimal accuracy loss.
Findings
Identification of root causes of outliers in LLMs
Development of strategies to mitigate massive activations
Effective removal of outliers with minimal impact on accuracy
Abstract
Investigating outliers in large language models (LLMs) is crucial due to their significant impact on various aspects of LLM performance, including quantization and compression. Outliers often cause considerable quantization errors, leading to degraded model performance. Identifying and addressing these outliers can enhance the accuracy and efficiency of the quantization process, enabling smoother deployment on edge devices or specialized hardware. Recent studies have identified two common types of outliers in LLMs: massive activations and channel-wise outliers. While numerous quantization algorithms have been proposed to mitigate their effects and maintain satisfactory accuracy, few have explored the root causes of these outliers in depth. In this paper, we conduct a comprehensive investigation into the formation mechanisms of these outliers and propose potential strategies…
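The abstract's claim that outliers cause considerable quantization error can be illustrated with a small sketch. It assumes symmetric per-tensor absmax quantization to 8 bits, synthetic unit-normal activations, and a single injected value of 1000.0; all of these are illustrative assumptions, not the paper's setup or method. The function names `quantize_dequantize` and `mean_abs_error` are hypothetical helpers.

```python
import random


def quantize_dequantize(values, num_bits=8):
    """Symmetric per-tensor absmax quantization followed by dequantization."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) * scale for v in values]


def mean_abs_error(a, b):
    """Mean absolute difference between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


random.seed(0)

# Hypothetical activations: 1024 values drawn from a unit normal.
acts = [random.gauss(0.0, 1.0) for _ in range(1024)]

# Same activations with one injected large-magnitude value (assumed outlier).
acts_outlier = acts[:]
acts_outlier[0] = 1000.0

err_clean = mean_abs_error(acts, quantize_dequantize(acts))

# Measure error only on the non-outlier entries so the comparison is fair.
deq = quantize_dequantize(acts_outlier)
err_outlier = mean_abs_error(acts_outlier[1:], deq[1:])

print(f"mean abs error without outlier: {err_clean:.4f}")
print(f"mean abs error with outlier:    {err_outlier:.4f}")
```

In this sketch the single large value stretches the quantization scale, so the ordinary values are mapped onto very few integer levels and their reconstruction error grows sharply. Finer-grained schemes (for example per-channel scales) reduce this effect, which is one reason the distinction between massive activations and channel-wise outliers matters for quantization.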
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling
