Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation