Loading paper
Revealing the Gap in Human and VLM Scene Perception through Counterfactual Semantic Saliency | Tomesphere