Joint Reward Modeling: Internalizing Chain-of-Thought for Efficient Visual Reward Models