Loading paper
EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models | Tomesphere