TL;DR
MoEITS is a novel, theoretically grounded algorithm that simplifies Mixture-of-Experts large language models, reducing computational cost while maintaining high accuracy and outperforming existing pruning methods.
Contribution
Introduces MoEITS, a simple, information-theoretic algorithm for simplifying MoE-LLMs that improves on current state-of-the-art pruning techniques in both efficiency and effectiveness.
Findings
MoEITS outperforms existing pruning methods on multiple benchmarks.
The algorithm achieves a significant reduction in model complexity and energy consumption.
Empirical results confirm the effectiveness of MoEITS across various large language models.
Abstract
Large language models are transforming all areas of academia and industry, attracting the attention of researchers, professionals, and the general public. In the quest for more powerful architectures, Mixture-of-Experts (MoE) models, inspired by ensemble models, have emerged as one of the most effective directions to follow. However, they impose a high computational burden for both training and inference. To reduce the computing and memory footprint as well as the energy consumption, simplification methods have arisen as very effective procedures. In this paper, an original algorithm, MoEITS, for MoE-LLM simplification is presented. The algorithm is characterized by a refined simplicity, underpinned by standardized information-theoretic frameworks. MoEITS is analyzed in depth from theoretical and practical points of view. Its computational complexity is studied. Its performance on the accuracy…
