When Reward Hacking Rebounds: Understanding and Mitigating It with Representation-Level Signals