Evaluation of Multilingual Image Captioning: How far can we get with CLIP models?