Contrastive Language-Image Pre-training for the Italian Language

Federico Bianchi; Giuseppe Attanasio; Raphael Pisoni; Silvia Terragni; Gabriele Sarti; Sri Lakshmi

arXiv:2108.08688·cs.CL·August 20, 2021·21 cites

TL;DR

This paper introduces CLIP-Italian, a CLIP model for Italian trained on more than 1.4 million image-text pairs. It outperforms the multilingual CLIP model on image retrieval and zero-shot classification.

Contribution

The first CLIP model for the Italian language, trained on more than 1.4 million image-text pairs, which outperforms the multilingual CLIP model on image retrieval and zero-shot classification.

Findings

01

CLIP-Italian outperforms multilingual CLIP in image retrieval.

02

CLIP-Italian achieves better zero-shot classification results than multilingual CLIP.

03

The model is trained on more than 1.4 million image-text pairs.
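
Zero-shot classification with a CLIP-style model generally works by embedding the image and one text prompt per candidate label, then picking the label whose text embedding is most similar to the image embedding. The sketch below illustrates only that scoring step. The three-dimensional vectors, the Italian prompts, and the function name `zero_shot_classify` are made-up placeholders, not outputs or APIs of CLIP-Italian.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_emb, label_embs):
    """Return the label whose text embedding is closest to the image embedding."""
    scores = {label: cosine(image_emb, emb) for label, emb in label_embs.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy embeddings; a real model would produce these from
# its text and image encoders.
label_embs = {
    "un gatto": [0.9, 0.1, 0.0],
    "un cane": [0.1, 0.9, 0.0],
    "una bicicletta": [0.0, 0.1, 0.9],
}
image_emb = [0.8, 0.2, 0.1]

predicted, scores = zero_shot_classify(image_emb, label_embs)
print(predicted)
```

Image retrieval uses the same similarity in the other direction: a single text query is scored against many image embeddings, and the images are ranked by score.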

Abstract

CLIP (Contrastive Language-Image Pre-training) is a very recent multi-modal model that jointly learns representations of images and texts. The model is trained on a massive amount of English data and shows impressive performance on zero-shot classification tasks. Training the same model on a different language is not trivial, since the data available in other languages might not be sufficient and the model needs high-quality translations of the texts to guarantee good performance. In this paper, we present the first CLIP model for the Italian language (CLIP-Italian), trained on more than 1.4 million image-text pairs. Results show that CLIP-Italian outperforms the multilingual CLIP model on the tasks of image retrieval and zero-shot classification.
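
The joint learning of image and text representations mentioned in the abstract is usually driven by a symmetric contrastive objective: within a batch, each image should be most similar to its own caption and each caption to its own image. The following is a minimal sketch of that objective in plain Python, not the CLIP-Italian training code. The fixed temperature of 0.07, the two-dimensional toy embeddings, and the helper names are assumptions chosen for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over the image-text similarity matrix.

    Pair k of image_embs and text_embs is treated as the matching pair.
    The temperature is a fixed illustrative value (an assumption).
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = scaled cosine similarity between image i and text j
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    # Image-to-text direction: each row should peak on the diagonal.
    loss_i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image direction: each column should peak on the diagonal.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return (loss_i2t + loss_t2i) / 2


# Toy batch: correctly paired embeddings versus the same embeddings mismatched.
aligned = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                [[1.0, 0.0], [0.0, 1.0]])
shuffled = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                 [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

The loss is near zero when every image matches its own caption and grows when the pairs are mismatched, which is the signal that pulls matching image and text embeddings together during training.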

Code & Models

Repositories

clip-italian/clip-italian
JAX · Official

Models

clip-italian/clip-italian (Hugging Face)
model · 533 downloads · 16 likes

Taxonomy

Topics: Linguistic Studies and Language Acquisition · Natural Language Processing Techniques · Second Language Learning and Teaching

Methods: Linear Layer · Contrastive Language-Image Pre-training · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Dropout · Byte Pair Encoding · Adam · Dense Connections