APR: Penalizing Structural Redundancy in Large Reasoning Models via Anchor-based Process Rewards