Loading paper
VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text | Tomesphere