2D Matryoshka Sentence Embeddings
Xianming Li, Zongxi Li, Jing Li, Haoran Xie, Qing Li

TL;DR
This paper introduces 2D Matryoshka Sentence Embeddings, a flexible model that adaptively adjusts embedding sizes and Transformer layers, improving efficiency and applicability across diverse NLP tasks.
Contribution
It proposes a novel 2D embedding model supporting elastic sizes and layers, enhancing flexibility and efficiency over previous fixed-layer, fixed-size methods.
Findings
Achieves comparable accuracy with smaller embeddings
Supports dynamic adjustment of layers and sizes
Demonstrates effectiveness on STS and downstream tasks
Abstract
Common approaches rely on fixed-length embedding vectors from language models as sentence embeddings for downstream tasks such as semantic textual similarity (STS). Such methods are limited in their flexibility due to unknown computational constraints and budgets across various applications. Matryoshka Representation Learning (MRL) (Kusupati et al., 2022) encodes information at finer granularities, i.e., with lower embedding dimensions, to adaptively accommodate ad hoc tasks. Similar accuracy can be achieved with a smaller embedding size, leading to speedups in downstream tasks. Despite its improved efficiency, MRL still requires traversing all Transformer layers before obtaining the embedding, which remains the dominant factor in time and memory consumption. This prompts consideration of whether the fixed number of Transformer layers affects representation quality and…
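The two axes described in the abstract (embedding size and Transformer layer) can be illustrated with a minimal inference-time sketch. It is not the authors' implementation and does not cover training. It assumes a hypothetical list `layer_embeddings` that already holds one pooled sentence vector per layer, and the helper names `truncate_and_normalize` and `select_embedding` are invented for illustration. A real model would gain its speedup by stopping the forward pass at the chosen layer, which this sketch does not do.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and len(embedding)")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale for an all-zero vector
    return [x / norm for x in head]


def select_embedding(layer_embeddings, layer, dim):
    """Pick the vector from `layer` (1-indexed), then truncate it to `dim`.

    `layer_embeddings` is an assumed input: one pooled sentence vector
    per Transformer layer, ordered from the first layer to the last.
    """
    if not 1 <= layer <= len(layer_embeddings):
        raise ValueError("layer is out of range")
    return truncate_and_normalize(layer_embeddings[layer - 1], dim)


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy, made-up numbers: two "layers" of 4-dimensional sentence vectors.
toy_layers = [[0.5, 0.5, 0.1, 0.0], [3.0, 4.0, 1.0, 2.0]]
small = select_embedding(toy_layers, layer=2, dim=2)
full = select_embedding(toy_layers, layer=2, dim=4)
```

Truncated vectors are renormalized here so that cosine similarity stays meaningful at every size. Whether that step is needed depends on how a given model was trained, so treat it as an assumption.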
Code & Models
- 🤗 mixedbread-ai/mxbai-embed-2d-large-v1 · model · 1.8k dl · ♡ 41
- 🤗 tomaarsen/distilroberta-base-nli-adaptive-layer · model · 1 dl
- 🤗 tomaarsen/distilbert-base-uncased-sts-adaptive-layer · model · 1 dl
- 🤗 tomaarsen/distilroberta-base-nli-2d-matryoshka · model · 1 dl
- 🤗 tomaarsen/distilbert-base-uncased-sts-2d-matryoshka · model · 2 dl
- 🤗 bobox/DeBERTaV3-small-SentenceTransformer-AdaptiveLayerBaseline · model
- 🤗 bobox/DeBERTaV3-small-SentenceTransformer-AdaptiveLayerAll · model · 2 dl · ♡ 1
- 🤗 bobox/DeBERTaV3-small-ST-AdaptiveLayerAllNormalized · model
- 🤗 bobox/DeBERTaV3-small-ST-AdaptiveLayers-ep2 · model · 2 dl
- 🤗 bobox/DeBERTaV3-small-ST-AdaptiveLayer-Norm-ep2 · model
Taxonomy
Topics: Advanced Research in Systems and Signal Processing
Methods: Attention Is All You Need · Attention Dropout · Weight Decay · WordPiece · BART · Linear Warmup With Linear Decay · BERT · RAG · Linear Layer
