S2H-DPO: Hardness-Aware Preference Optimization for Vision-Language Models