SPMoE: Generate Multiple Pattern-Aware Outputs with Sparse Pattern Mixture of Experts
Shaobo Cui, Xintong Bao, Xuming Lin, Zhongzhou Zhao, Ji Zhang, Wei Zhou, Haiqing Chen

TL;DR
This paper introduces SPMoE, a novel model that decomposes one-to-many language generation into multiple pattern-specific experts, improving explainability and diversity in outputs.
Contribution
The paper proposes SPMoE, a sparse mixture of experts model that uses linguistic patterns to enhance controllability and diversity in language generation tasks.
Findings
SPMoE achieves a good balance of quality and diversity.
The model improves explainability by associating experts with specific patterns.
Experimental results on paraphrase generation validate the effectiveness of SPMoE.
Abstract
Many generation tasks follow a one-to-many mapping relationship: each input could be associated with multiple outputs. Existing methods like Conditional Variational AutoEncoder (CVAE) employ a latent variable to model this one-to-many relationship. However, this high-dimensional and dense latent variable lacks explainability and usually leads to poor and uncontrollable generations. In this paper, we innovatively introduce the linguistic concept of pattern to decompose the one-to-many mapping into multiple one-to-one mappings and further propose a model named Sparse Pattern Mixture of Experts (SPMoE). Each one-to-one mapping is associated with a conditional generation pattern and is modeled with an expert in SPMoE. To ensure each language pattern can be exclusively handled with an expert model for better explainability and diversity, a sparse mechanism is employed to coordinate all the…
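The idea of routing each input to a single pattern-specific expert can be illustrated with a toy sketch. This is not the authors' architecture: the abstract is truncated before it describes the sparse mechanism, so the top-k gate below is a generic stand-in, and the names `sparse_gate`, `generate`, and the three template "experts" are hypothetical placeholders for learned generators.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_gate(scores, k=1):
    """Keep the k highest-scoring experts, renormalise, zero out the rest.

    Assumption: a generic top-k gate, not the mechanism from the paper.
    """
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    kept = softmax([scores[i] for i in top])
    weights = [0.0] * len(scores)
    for i, w in zip(top, kept):
        weights[i] = w
    return weights


# Hypothetical pattern-specific experts: each maps one input to one output,
# so the one-to-many task becomes several one-to-one mappings.
EXPERTS = {
    "question": lambda s: f"Could you tell me {s}?",
    "imperative": lambda s: f"Tell me {s}.",
    "declarative": lambda s: f"I want to know {s}.",
}


def generate(source, scores, k=1):
    """Return (pattern, output) pairs for the experts the gate selected."""
    names = list(EXPERTS)
    weights = sparse_gate(scores, k)
    return [
        (names[i], EXPERTS[names[i]](source))
        for i, w in enumerate(weights)
        if w > 0
    ]


if __name__ == "__main__":
    # Gate scores would come from a learned router; these are made up.
    print(generate("the time", [0.1, 2.0, -1.0], k=1))
```

With `k=1` exactly one expert receives non-zero weight, which is what makes each output attributable to a single pattern. Raising `k` trades that exclusivity for blended outputs.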
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
