Extracting Multimodal Learngene in CLIP: Unveiling the Multimodal Generalizable Knowledge