Loading paper
Mind the Modality Gap: Towards a Remote Sensing Vision-Language Model via Cross-modal Alignment | Tomesphere