CUARewardBench: A Benchmark for Evaluating Reward Models on Computer-using Agent