Equivariant Similarity for Vision-Language Foundation Models