LM-Lexicon: Improving Definition Modeling via Harmonizing Semantic Experts
Yang Liu, Jiaye Yang, Weikang Li, Jiahui Liang, Yang Li, Lingyong Yan

TL;DR
LM-Lexicon introduces a semantic-expert approach to definition modeling that combines data clustering and model merging, improving BLEU by 7% over the prior state-of-the-art model on five widely used benchmarks.
Contribution
It proposes a novel sparse mixture-of-experts architecture with semantic domain specialization, achieving state-of-the-art results in definition modeling.
Findings
Clustering enables fine-grained expert specialization, improving definition quality by nearly 10%.
Semantic-aware domain-level routing achieves 1% higher expert efficacy than conventional token-level routing.
Scaling experts and compute at test time further improves performance.
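To make the routing distinction in the findings concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that each expert is represented by a vector (a gate vector for token-level routing, a cluster centroid for domain-level routing) and that domain-level routing mean-pools the input before picking one expert. The function names, the pooling choice, and the toy vectors are all hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def argmax(scores):
    # Index of the largest score (first one wins on ties).
    return max(range(len(scores)), key=scores.__getitem__)


def token_level_route(token_vecs, gate_vecs):
    """Conventional top-1 routing: every token picks its own expert."""
    return [argmax([dot(t, g) for g in gate_vecs]) for t in token_vecs]


def domain_level_route(token_vecs, centroids):
    """Assumed domain-level routing: one expert for the whole input.

    The tokens are mean-pooled into a single vector, which is compared
    with each (hypothetical) semantic-cluster centroid by cosine similarity.
    """
    dim = len(token_vecs[0])
    pooled = [sum(t[i] for t in token_vecs) / len(token_vecs) for i in range(dim)]

    def cosine(a, b):
        return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

    expert = argmax([cosine(pooled, c) for c in centroids])
    return [expert] * len(token_vecs)


# Toy 2-d example with two experts; the vectors are made up for illustration.
centroids = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]

token_choice = token_level_route(tokens, centroids)
domain_choice = domain_level_route(tokens, centroids)
print(token_choice, domain_choice)
```

In this toy case the token-level router sends the middle token to a different expert than its neighbours, while the domain-level router keeps the whole input on a single expert, which is the behaviour the second finding contrasts.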
Abstract
We introduce LM-Lexicon, an innovative definition modeling approach that incorporates data clustering, semantic expert learning, and model merging using a sparse mixture-of-experts architecture. By decomposing the definition modeling task into specialized semantic domains, where small language models are trained as domain experts, LM-Lexicon achieves substantial improvements (+7% BLEU score compared with the prior state-of-the-art model) over existing methods on five widely used benchmarks. Empirically, we demonstrate that 1) the clustering strategy enables fine-grained expert specialization with nearly 10% improvement in definition quality; 2) the semantic-aware domain-level routing mechanism achieves higher expert efficacy (+1%) than conventional token-level routing; and 3) further performance gains can be obtained through test-time compute and semantic expert scaling. Our work…
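The first stage described in the abstract, splitting the training data into semantic domains so that each domain can train its own expert, can be sketched as follows. This is an illustration under stated assumptions: plain k-means stands in for whatever clustering method the paper actually uses, the embeddings are hand-written 2-d toy vectors rather than outputs of a real encoder, and `kmeans` and `partition_by_domain` are hypothetical helper names.

```python
def squared_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def kmeans(points, k, iters=10):
    """Lloyd's k-means with a deterministic init (the first k points)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    return labels, centroids


def partition_by_domain(examples, labels, k):
    """Group training examples into one shard per cluster (one per expert)."""
    shards = [[] for _ in range(k)]
    for example, label in zip(examples, labels):
        shards[label].append(example)
    return shards


# Hypothetical term/definition examples with made-up 2-d embeddings.
examples = ["salty: upset", "enzyme: a catalyst", "lit: exciting", "quark: a particle"]
embeddings = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]

labels, centroids = kmeans(embeddings, k=2)
shards = partition_by_domain(examples, labels, k=2)
print(shards)
```

Each shard would then be used to fine-tune one small language model as a domain expert before the experts are merged into the sparse mixture-of-experts model; that training and merging step is not shown here.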
Code & Models
- 🤗 LM-Lexicon/LM-Lexicon-4x8B-MoE-Slang (model)
- 🤗 LM-Lexicon/LM-Lexicon-4x8B-MoE-Wordnet (model)
- 🤗 LM-Lexicon/LM-Lexicon-4x8B-MoE-Oxford (model)
- 🤗 LM-Lexicon/LM-Lexicon-4x8B-MoE-Wiki (model)
- 🤗 LM-Lexicon/LM-Lexicon-4x8B-MoE-3D-EX (model)
- 🤗 LM-Lexicon/LM-Lexicon-8B-Dense-3D-EX (model) · 5 dl
- 🤗 LM-Lexicon/LM-Lexicon-8B-Dense-Slang (model) · 6 dl
- 🤗 LM-Lexicon/LM-Lexicon-8B-Dense-Wiki (model) · 3 dl
- 🤗 LM-Lexicon/LM-Lexicon-8B-Dense-Oxford (model)
- 🤗 LM-Lexicon/LM-Lexicon-8B-Dense-Wordnet (model) · 4 dl
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Healthcare
