Research on Personalized Compression Algorithm for Pre-trained Models Based on Homomorphic Entropy Increase

Yicong Li; Xing Guo; Haohua Du

arXiv:2408.08684·cs.LG·August 19, 2024

Open Access

TL;DR

This paper proposes a layered pruning strategy using compressed sensing and random sampling to reduce model size for personalized AI models, enhancing deployment on mobile devices.

Contribution

It introduces a novel layered pruning method that distinguishes personalized layers, improving model efficiency without sacrificing accuracy.

Findings

01. Significant reduction in model parameters achieved.

02. Step buffering mechanism improves post-pruning accuracy.

03. Enables efficient deployment of personalized models on mobile devices.

Abstract

In this article, we explore the challenges and evolution of two key technologies in the current field of AI: the Vision Transformer (ViT) and the Large Language Model (LLM). The Vision Transformer captures global information by splitting an image into small patches and applying the Transformer's multi-head attention mechanism, but its high parameter count and compute overhead limit deployment on mobile devices. At the same time, the rapid development of LLMs has revolutionized natural language processing, but these models also face huge deployment challenges. To address these issues, we investigate model pruning techniques, with a particular focus on how to reduce redundant parameters without losing accuracy, so that models can accommodate personalized data and resource-constrained environments. In this paper, a new layered pruning strategy is proposed to distinguish the personalized layer from the common layer by compressed…
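
To make the idea of layered pruning concrete, here is a minimal sketch in plain Python. It is not the paper's method: the abstract is truncated before the compressed-sensing criterion is described, so this sketch swaps in ordinary magnitude pruning and simply assumes the personalized layers are already known. The function names (`prune_by_magnitude`, `layered_prune`), the layer names, and the sparsity ratios are all hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Set


def prune_by_magnitude(weights: List[float], sparsity: float) -> List[float]:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def layered_prune(
    model: Dict[str, List[float]],
    personalized: Set[str],
    common_sparsity: float,
    personalized_sparsity: float,
) -> Dict[str, List[float]]:
    """Apply a different pruning ratio to personalized and common layers.

    Assumption: `personalized` is given. The paper identifies these layers
    with compressed sensing and random sampling, which is not shown here.
    """
    pruned = {}
    for name, layer_weights in model.items():
        ratio = personalized_sparsity if name in personalized else common_sparsity
        pruned[name] = prune_by_magnitude(layer_weights, ratio)
    return pruned


# Toy "model": each layer is a flat list of weights (hypothetical values).
model = {
    "block0.mlp": [0.9, -0.05, 0.4, 0.01],
    "head": [0.3, -0.02, 0.7, 0.15],
}
pruned = layered_prune(
    model,
    personalized={"head"},
    common_sparsity=0.5,         # hypothetical ratio for common layers
    personalized_sparsity=0.25,  # hypothetical ratio for personalized layers
)
```

Which group of layers should be pruned more aggressively is a design choice the visible part of the abstract does not settle, so the two ratios above should be read as placeholders.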

Taxonomy

Topics: Advanced Algorithms and Applications

Methods: Residual Connection · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Attention Is All You Need · Byte Pair Encoding · Absolute Position Encodings · Vision Transformer · Pruning · Softmax