Do Vision-Language Models Truly Perform Vision Reasoning? A Rigorous Study of the Modality Gap