PEARL: Prototype-Enhanced Alignment for Label-Efficient Representation Learning with Deployment-Driven Insights from Digital Governance Communication Systems
Ruiyu Zhang, Lin Nie, Wai-Fung Lam, Qihao Wang, Xin Zhao

TL;DR
PEARL is a label-efficient method that softly aligns fixed embeddings to class prototypes, improving nearest-neighbor retrieval in digital governance systems when labels are scarce.
Contribution
PEARL introduces a prototype-enhanced alignment technique that improves embedding neighborhood quality with limited supervision, bridging the gap between unsupervised and supervised methods.
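The summary does not spell out PEARL's actual formulation, so the following is only a minimal sketch of what "softly aligning embeddings to class prototypes" could look like, not the authors' method. It assumes prototypes are class means of the few labeled embeddings, and that each embedding is blended with a softmax-weighted mix of prototypes. The function names and the parameters `alpha` (blend strength) and `temperature` (softmax sharpness) are hypothetical.

```python
import numpy as np

def l2_normalize(X, eps=1e-12):
    """Scale each row to unit length."""
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)

def class_prototypes(X, y):
    """Assumed prototype definition: mean embedding of each labeled class."""
    classes = np.unique(y)
    protos = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, protos

def soft_align(X, protos, alpha=0.5, temperature=0.1):
    """Pull each embedding toward a softmax-weighted mix of prototypes.

    alpha=0 leaves the (normalized) embeddings unchanged; alpha=1 replaces
    each embedding with its soft prototype target. Both knobs are
    illustrative assumptions, not values from the paper.
    """
    Xn = l2_normalize(X)
    Pn = l2_normalize(protos)
    logits = (Xn @ Pn.T) / temperature           # cosine similarity / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    target = weights @ Pn                        # soft prototype per row
    return l2_normalize((1.0 - alpha) * Xn + alpha * target)

# Toy usage: four labeled 2-D embeddings from two classes.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
y = np.array([0, 0, 1, 1])
classes, protos = class_prototypes(X, y)
Z = soft_align(X, protos, alpha=0.5, temperature=0.1)
```

Because the base encoder is never touched, a step of this kind can be applied as post-processing on stored embeddings, which matches the setting the abstract describes (scarce labels, retraining expensive or infeasible).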
Findings
PEARL achieves 25.7% gains over raw embeddings in neighborhood quality.
PEARL outperforms unsupervised post-processing by over 21.1%.
Significant improvements are observed under extreme label scarcity.
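The summary does not say how neighborhood quality is measured. One common proxy, assumed here purely for illustration, is k-nearest-neighbor label purity: the fraction of each point's k nearest neighbors (by cosine similarity) that share its label. The function name `knn_label_purity` and the choice of cosine similarity are assumptions, not details from the paper.

```python
import numpy as np

def knn_label_purity(X, y, k=5):
    """Average fraction of each point's k nearest neighbors with the same label.

    Illustrative proxy for "neighborhood quality"; the paper's actual
    metric may differ.
    """
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    sims = Xn @ Xn.T
    np.fill_diagonal(sims, -np.inf)          # exclude each point from its own neighbors
    nn = np.argsort(-sims, axis=1)[:, :k]    # indices of the k most similar points
    return float((y[nn] == y[:, None]).mean())

# Toy usage: two tight clusters with matching labels.
X_demo = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.05, 0.99]])
y_demo = np.array([0, 0, 1, 1])
purity = knn_label_purity(X_demo, y_demo, k=1)
```

Computing a score like this on raw embeddings and again after an alignment step gives the kind of before/after comparison the findings report.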
Abstract
In many deployed systems, new text inputs are handled by retrieving similar past cases, for example when routing and responding to citizen messages in digital governance platforms. When these systems fail, the problem is often not the language model itself, but that the nearest neighbors in the embedding space correspond to the wrong cases. Modern machine learning systems increasingly rely on fixed, high-dimensional embeddings produced by large pretrained models and sentence encoders. In real-world deployments, labels are scarce, domains shift over time, and retraining the base encoder is expensive or infeasible. As a result, downstream performance depends heavily on embedding geometry. Yet raw embeddings are often poorly aligned with the local neighborhood structure required by nearest-neighbor retrieval, similarity search, and lightweight classifiers that operate directly on…
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Text and Document Classification Technologies
