Spontaneous Reward Hacking in Iterative Self-Refinement