Loading paper
MVP-Bench: Can Large Vision--Language Models Conduct Multi-level Visual Perception Like Humans? | Tomesphere