Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs