Research on Personalized Compression Algorithm for Pre-trained Models Based on Homomorphic Entropy Increase
Yicong Li, Xing Guo, Haohua Du

TL;DR
This paper proposes a layered pruning strategy using compressed sensing and random sampling to reduce model size for personalized AI models, enhancing deployment on mobile devices.
Contribution
It introduces a novel layered pruning method that distinguishes personalized layers, improving model efficiency without sacrificing accuracy.
Findings
Significant reduction in model parameters achieved.
Step buffering mechanism improves post-pruning accuracy.
Enables efficient deployment of personalized models on mobile devices.
Abstract
In this article, we explore the challenges and evolution of two key technologies in the current field of AI: the Vision Transformer (ViT) and the Large Language Model (LLM). The Vision Transformer captures global information by splitting images into small patches and applying the Transformer's multi-head attention mechanism, but its high parameter count and compute overhead limit deployment on mobile devices. At the same time, the rapid development of LLMs has revolutionized natural language processing, but LLMs also face huge deployment challenges. To address these issues, we investigate model pruning techniques, with a particular focus on how to reduce redundant parameters without losing accuracy, so that models can accommodate personalized data and resource-constrained environments. In this paper, a new layered pruning strategy is proposed to distinguish the personalized layer from the common layer by compressed…
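As an illustration of the image-splitting step the abstract mentions, here is a minimal NumPy sketch of how a Vision Transformer typically turns an image into a sequence of flattened patches before attention is applied. It is not taken from the paper: the function name `split_into_patches`, the 224×224×3 input, and the patch size of 16 are assumptions chosen only for the example.

```python
import numpy as np


def split_into_patches(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Split an (H, W, C) image into flattened, non-overlapping square patches.

    Hypothetical helper for illustration; returns an array of shape
    (num_patches, patch_size * patch_size * C).
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    grid_h, grid_w = h // patch_size, w // patch_size
    # Expose the patch grid: (grid_h, p, grid_w, p, C)
    patches = image.reshape(grid_h, patch_size, grid_w, patch_size, c)
    # Group the two grid axes together: (grid_h, grid_w, p, p, C)
    patches = patches.transpose(0, 2, 1, 3, 4)
    # One row per patch, pixels flattened into a single vector
    return patches.reshape(grid_h * grid_w, patch_size * patch_size * c)


# Assumed example input: a 224x224 RGB image and 16x16 patches.
image = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
tokens = split_into_patches(image, patch_size=16)
print(tokens.shape)  # (196, 768)
```

Each row of `tokens` would then be linearly projected and passed to the attention layers. Every one of those layers adds parameters, which is the overhead that pruning aims to reduce.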
Taxonomy
Topics: Advanced Algorithms and Applications
Methods: Residual Connection · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Attention Is All You Need · Byte Pair Encoding · Absolute Position Encodings · Vision Transformer · Pruning · Softmax
