Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization