SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing
Sheng Li, Geng Yuan, Yue Dai, Youtao Zhang, Yanzhi Wang, Xulong Tang

TL;DR
SmartFRZ introduces an attention-guided layer freezing framework that automatically selects layers to freeze during training, significantly reducing computation and training time while maintaining accuracy across different neural network architectures.
Contribution
It proposes a novel, generic, and automatic layer freezing method using attention mechanisms, overcoming limitations of manual and heuristic approaches.
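As a rough illustration of what layer freezing means during training, here is a minimal pure-Python sketch. It is not the SmartFRZ method: the attention-based selector described above is replaced by a simple stand-in rule (freeze a layer once its weights stop moving). The names `Layer`, `should_freeze`, `train_step`, and the tolerance `tol` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Layer:
    """Toy layer: a weight vector plus a flag saying whether it still trains."""
    weights: List[float]
    frozen: bool = False
    # Snapshots of the weights after each update, used by the freezing rule.
    history: List[List[float]] = field(default_factory=list)


def should_freeze(history: List[List[float]], tol: float = 1e-3) -> bool:
    """Stand-in criterion (an assumption, NOT the paper's attention-based
    selector): freeze once the total absolute weight change between the
    last two snapshots drops below `tol`."""
    if len(history) < 2:
        return False
    prev, curr = history[-2], history[-1]
    change = sum(abs(c - p) for p, c in zip(prev, curr))
    return change < tol


def train_step(
    layers: List[Layer],
    grads: List[List[float]],
    lr: float = 0.1,
    selector: Callable[[List[List[float]]], bool] = should_freeze,
) -> int:
    """Apply one SGD-style update to every unfrozen layer, then ask the
    selector whether that layer should be frozen from now on.
    Returns the number of layers that were updated in this step."""
    updated = 0
    for layer, grad in zip(layers, grads):
        if layer.frozen:
            continue  # frozen layers are skipped: no update is applied
        layer.weights = [w - lr * g for w, g in zip(layer.weights, grad)]
        layer.history.append(list(layer.weights))
        updated += 1
        if selector(layer.history):
            layer.frozen = True
    return updated


if __name__ == "__main__":
    # The first layer receives tiny gradients, so the stand-in rule freezes it.
    layers = [Layer([1.0, 2.0]), Layer([0.5])]
    grads = [[0.0001, 0.0001], [1.0]]
    for step in range(3):
        n = train_step(layers, grads)
        print(step, n, [layer.frozen for layer in layers])
```

The `selector` parameter is where a learned predictor could be plugged in. In a real framework, skipping a frozen layer would also avoid computing its gradients, which is where the savings in computation and training time come from.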
Findings
Reduces training computation and time significantly.
Maintains high accuracy with automatic layer freezing.
Outperforms existing layer freezing methods.
Abstract
There has been a proliferation of artificial intelligence applications, where model training is key to delivering high-quality services for these applications. However, the model training process is both time-intensive and energy-intensive, which inevitably affects users' demand for application efficiency. Layer freezing, an efficient model training technique, has been proposed to improve training efficiency. Although existing layer freezing methods demonstrate great potential to reduce model training costs, they still have shortcomings such as limited generalizability and compromised accuracy. For instance, existing layer freezing methods either require the freeze configurations to be manually defined before training, which does not generalize to different networks, or use heuristic freezing criteria that cannot reliably guarantee decent accuracy in different scenarios. Therefore, there…
Taxonomy
Topics: Neural Networks and Applications · Evolutionary Algorithms and Applications
