How Visual Representations Map to Language Feature Space in Multimodal LLMs