SAPO: Step-Aligned Policy Optimization for Reasoning-Based Generative Recommendation