Prune-OPD: Efficient and Reliable On-Policy Distillation for Long-Horizon Reasoning