Using Deep Mixture-of-Experts to Detect Word Meaning Shift for TempoWiC
Ze Chen, Kangxu Wang, Zijian Cai, Jiewen Zheng, Jiarong He, Max Gao, Jason Zhang

TL;DR
This paper presents a deep Mixture-of-Experts (MoE) approach for detecting word meaning shift in the TempoWiC task. It ranks first with a macro-F1 score of 77.05% by combining data cleaning, data augmentation, adversarial training, POS information, and model ensembling.
Contribution
The paper introduces a novel MoE-based method that effectively integrates context, POS, and semantic features for word sense change detection, improving robustness and accuracy.
Findings
Achieved a macro-F1 score of 77.05% and first place in TempoWiC.
Demonstrated the effectiveness of MoE in combining diverse linguistic features.
Showed that ensemble methods further enhance prediction performance.
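For readers unfamiliar with the reported metric, the following is a minimal sketch of how macro-F1 is computed: the F1 score is calculated per class and then averaged with equal weight per class. The function name `macro_f1` and the toy labels are illustrative assumptions, not taken from the paper or from the TempoWiC evaluation script.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy binary example: 1 = same meaning, 0 = meaning shifted (hypothetical labels)
gold = [1, 1, 1, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 1]
score = macro_f1(gold, pred)
print(f"macro-F1 = {score:.4f}")
```

Because every class contributes equally, macro-F1 penalizes a system that does well only on the majority class.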
Abstract
This paper describes the dma submission to the TempoWiC task, which achieves a macro-F1 score of 77.05% and attains first place in the task. We first explore the impact of different pre-trained language models. We then adopt data cleaning, data augmentation, and adversarial training strategies to improve model generalization and robustness. For further improvement, we integrate POS information and word semantic representations using a Mixture-of-Experts (MoE) approach. The experimental results show that MoE can overcome the feature overuse issue and combine the context, POS, and word semantic features well. Additionally, we use a model ensemble method for the final prediction, an approach that many prior works have shown to be effective.
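The abstract does not spell out the MoE architecture, so the following is only a generic sketch of the gating idea, under stated assumptions: three hypothetical experts (context, POS, word semantics) each emit class logits, and a softmax gate decides how much weight each expert receives. The names `softmax`, `moe_combine`, and all numeric values are illustrative, and in a trained model the gate scores would be produced by a learned network rather than fixed by hand.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_combine(expert_logits, gate_scores):
    """Mix per-expert class logits using softmax gate weights.

    expert_logits: one list of class logits per expert
    gate_scores:   one raw gate score per expert (assumed given here)
    """
    weights = softmax(gate_scores)
    n_classes = len(expert_logits[0])
    mixed = [
        sum(w * logits[c] for w, logits in zip(weights, expert_logits))
        for c in range(n_classes)
    ]
    return mixed, weights


# Hypothetical experts: context, POS, word semantics (2 classes each)
expert_logits = [[3.0, 0.0], [0.0, 3.0], [0.0, 0.0]]
gate_scores = [0.0, 0.0, 0.0]  # equal scores give equal weights

mixed, weights = moe_combine(expert_logits, gate_scores)
print("gate weights:", weights)
print("mixed logits:", mixed)
```

Letting the gate weight each feature source per example, instead of concatenating all features unconditionally, is one plausible reading of how an MoE could limit over-reliance on any single feature.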
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Dual Multimodal Attention
