SHAPE: Stage-aware Hierarchical Advantage via Potential Estimation for LLM Reasoning