Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines