CAST: Cross-modal Alignment Similarity Test for Vision Language Models