Cross-Modal Attribute Insertions for Assessing the Robustness of Vision-and-Language Learning