The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models