Reward Hacking as Equilibrium under Finite Evaluation | Tomesphere