Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient