MIPIC: Matryoshka Representation Learning via Self-Distilled Intra-Relational and Progressive Information Chaining

Phung Gia Huy; Hai An Vu; Minh-Phuc Truong; Thang Duc Tran; Linh Ngo Van; Thanh Hong Nguyen; Trung Le

arXiv:2604.24374 · cs.CL · April 28, 2026

TL;DR

MIPIC is a unified training framework for nested, structurally coherent embeddings in NLP. It improves embedding quality across model sizes, particularly in low-dimensional settings.

Contribution

The paper proposes MIPIC, a method that combines self-distillation with progressive information chaining to learn Matryoshka representations that are structurally and semantically coherent.

Findings

01 MIPIC improves representation quality across multiple NLP benchmarks.

02 It achieves significant performance gains in low-dimensional settings.

03 MIPIC produces highly competitive embeddings across diverse model capacities.

Abstract

Representation learning is fundamental to NLP, but building embeddings that work well at different computational budgets is challenging. Matryoshka Representation Learning (MRL) offers a flexible inference paradigm through nested embeddings; however, learning such structures requires explicit coordination of how information is arranged across embedding dimensionality and model depth. In this work, we propose MIPIC (Matryoshka Representation Learning via Self-Distilled Intra-Relational Alignment and Progressive Information Chaining), a unified training framework designed to produce structurally coherent and semantically compact Matryoshka representations. MIPIC promotes cross-dimensional structural consistency through Self-Distilled Intra-Relational Alignment (SIA), which aligns token-level geometric and attention-driven relations between full and truncated representations using top-k…
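To make the "nested embeddings" idea concrete, here is a minimal sketch of how a Matryoshka-style embedding is typically consumed at inference time: keep a prefix of the full vector and renormalize it. It illustrates the general MRL inference pattern only, not MIPIC's training procedure (SIA or the chaining objective). The function names `truncate_embedding` and `cosine` and the example vectors are hypothetical and chosen purely for illustration.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` coordinates of `vec` and L2-normalize the prefix.

    Hypothetical helper: in a Matryoshka-style embedding, each prefix is
    meant to be usable on its own as a lower-dimensional embedding.
    """
    if not 1 <= dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    prefix = vec[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return list(prefix)
    return [x / norm for x in prefix]


def cosine(a, b):
    """Cosine similarity for two vectors that are already unit-norm."""
    return sum(x * y for x, y in zip(a, b))


# Made-up 8-dimensional "full" embeddings, for illustration only.
full_a = [0.9, 0.1, 0.3, -0.2, 0.05, 0.4, -0.1, 0.2]
full_b = [0.8, 0.2, 0.1, -0.3, 0.5, -0.4, 0.3, 0.1]

# Compare the same pair of items at several nested budgets.
for dim in (2, 4, 8):
    sim = cosine(truncate_embedding(full_a, dim), truncate_embedding(full_b, dim))
    print(dim, round(sim, 3))
```

The similarity between the two items generally changes with the budget. How well the small prefixes preserve the structure of the full embedding is what a Matryoshka training objective is meant to control.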
