LEALLA: Learning Lightweight Language-agnostic Sentence Embeddings with Knowledge Distillation
Zhuoyuan Mao, Tetsuji Nakagawa

TL;DR
This paper introduces LEALLA, a lightweight, language-agnostic sentence embedding model that uses knowledge distillation to achieve competitive performance across 109 languages while reducing inference overhead.
Contribution
The paper proposes a novel lightweight model architecture and distillation techniques for efficient, multilingual sentence embeddings, outperforming existing large models in speed and resource usage.
Findings
LEALLA achieves strong performance on multiple benchmarks.
The lightweight model significantly reduces inference time.
Knowledge distillation improves embedding quality.
Abstract
Large-scale language-agnostic sentence embedding models such as LaBSE (Feng et al., 2022) obtain state-of-the-art performance for parallel sentence alignment. However, these large-scale models can suffer from inference speed and computation overhead. This study systematically explores learning language-agnostic sentence embeddings with lightweight models. We demonstrate that a thin-deep encoder can construct robust low-dimensional sentence embeddings for 109 languages. With our proposed distillation methods, we achieve further improvements by incorporating knowledge from a teacher model. Empirical results on Tatoeba, United Nations, and BUCC show the effectiveness of our lightweight models. We release our lightweight language-agnostic sentence embedding models LEALLA on TensorFlow Hub.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
