Flexible and Adaptable Summarization via Expertise Separation
Xiuying Chen, Mingzhe Li, Shen Gao, Xin Cheng, Qingqing Zhu, Rui Yan, Xin Gao, Xiangliang Zhang

TL;DR
This paper introduces MoeSumm, a parameter-efficient Mixture-of-Expert architecture that separates general and domain-specific summarization abilities, enhancing flexibility and adaptability across multiple tasks and domains.
Contribution
The paper proposes MoeSumm, a novel mixture-of-experts model with a max-margin loss to explicitly separate general and domain-specific summarization skills, improving multi-domain performance.
Findings
Outperforms recent baselines and LLMs on 11 datasets
Demonstrates effective separation of general and domain-specific abilities
Maintains parameter efficiency while enhancing flexibility and adaptability
Abstract
A proficient summarization model should exhibit both flexibility -- the capacity to handle a range of in-domain summarization tasks, and adaptability -- the competence to acquire new knowledge and adjust to unseen out-of-domain tasks. Unlike large language models (LLMs) that achieve this through parameter scaling, we propose a more parameter-efficient approach in this study. Our motivation rests on the principle that the general summarization ability to capture salient information can be shared across different tasks, while the domain-specific summarization abilities need to be distinct and tailored. Concretely, we propose MoeSumm, a Mixture-of-Expert Summarization architecture, which utilizes a main expert for gaining the general summarization capability and deputy experts that selectively collaborate to meet specific summarization task requirements. We further propose a max-margin…
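The split between a shared main expert and selectively used deputy experts can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the class name `MainDeputyLayer`, the softmax gate, the top-k routing rule, the additive combination of expert outputs, and the random weights are all hypothetical. The sketch covers only the forward pass and does not attempt the max-margin loss, which the abstract above does not specify.

```python
import math
import random


def linear(weights, x):
    """Apply a bias-free dense layer; `weights` is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights, standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MainDeputyLayer:
    """Hypothetical layer: one always-active main expert plus gated deputies."""

    def __init__(self, dim, num_deputies, top_k=1, seed=0):
        rng = random.Random(seed)
        self.main = random_matrix(dim, dim, rng)  # shared, general ability
        self.deputies = [random_matrix(dim, dim, rng) for _ in range(num_deputies)]
        self.gate = random_matrix(num_deputies, dim, rng)  # scores each deputy
        self.top_k = top_k

    def route(self, x):
        """Return (deputy_index, weight) pairs for the top-k deputies.

        Weights are renormalized so the selected deputies sum to 1.
        """
        probs = softmax(linear(self.gate, x))
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        chosen = ranked[: self.top_k]
        norm = sum(probs[i] for i in chosen)
        return [(i, probs[i] / norm) for i in chosen]

    def forward(self, x):
        """Main expert output plus the weighted outputs of the chosen deputies."""
        out = linear(self.main, x)
        for idx, weight in self.route(x):
            deputy_out = linear(self.deputies[idx], x)
            out = [o + weight * d for o, d in zip(out, deputy_out)]
        return out


# Usage: a 4-dimensional input routed to 2 of 3 deputies.
layer = MainDeputyLayer(dim=4, num_deputies=3, top_k=2)
hidden = layer.forward([1.0, 0.5, -0.5, 2.0])
```

In this sketch the main expert processes every input, so its weights would receive gradient from all domains, while each deputy is touched only by the inputs routed to it. That is one plausible way to realize the shared-versus-tailored split the abstract motivates; the paper's actual routing and combination rules may differ.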
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Advanced Text Analysis Techniques
