ConVerSum: A Contrastive Learning-based Approach for Data-Scarce Solution of Cross-Lingual Summarization Beyond Direct Equivalents
Sanzana Karim Lora, M. Sohel Rahman, Rifat Shahriyar

TL;DR
ConVerSum introduces a contrastive learning-based method for cross-lingual summarization that is effective even with limited data, outperforming large language models in low-resource language scenarios.
Contribution
The paper presents a novel contrastive learning approach for data-efficient cross-lingual summarization, addressing the lack of high-quality CLS data and improving performance over existing methods.
Findings
Outperforms current methodologies in low-resource CLS tasks
Achieves better results than GPT-3.5 and GPT-4 on low-resource languages
Demonstrates significant improvement in data-scarce cross-lingual summarization
Abstract
Cross-lingual summarization (CLS) is a sophisticated branch of Natural Language Processing that requires models to accurately translate and summarize articles from different source languages. Despite improvements in subsequent studies, this area still needs data-efficient solutions along with effective training methodologies. To the best of our knowledge, there is no feasible solution for CLS when no high-quality CLS data is available. In this paper, we propose a novel data-efficient approach, ConVerSum, for CLS that leverages the power of contrastive learning: it generates versatile candidate summaries in different languages from the given source document and contrasts these summaries with the reference summaries for that document. After that, we train the model with a contrastive ranking loss. Then, we rigorously evaluate the proposed approach against current…
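The abstract names a contrastive ranking loss over candidate summaries but does not give its formula in this excerpt. The following is a minimal sketch of one common form of such a loss, a pairwise margin ranking loss with a rank-scaled margin. The function name `contrastive_ranking_loss`, the `margin` parameter, and the assumption that candidates arrive pre-sorted from best to worst by similarity to the reference are all illustrative choices, not details confirmed by the paper.

```python
from typing import Sequence


def contrastive_ranking_loss(scores: Sequence[float], margin: float = 0.01) -> float:
    """Pairwise margin ranking loss over candidates sorted best-first.

    scores[i] is the model's score for the i-th candidate summary.
    Assumption (not from the source): candidates are pre-sorted by
    quality against the reference summary, with index 0 the best.
    """
    loss = 0.0
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            # The better candidate i should outscore candidate j by at
            # least (j - i) * margin; only violations are penalized.
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss


# Hypothetical model scores for three candidates, listed best-first.
well_ordered = contrastive_ranking_loss([-0.1, -0.5, -0.9])
mis_ordered = contrastive_ranking_loss([-0.9, -0.5, -0.1])
print(well_ordered, mis_ordered)
```

When the model's scores already agree with the quality ordering by more than the margin, the loss is zero; when the model scores worse candidates higher, each violated pair contributes a positive penalty, which is the signal that pushes the model toward ranking better summaries first.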
Taxonomy
Topics: Natural Language Processing Techniques · Text and Document Classification Technologies · Topic Modeling
Methods: Cosine Annealing · Softmax · Linear Layer · Attention Dropout · Dropout · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning · Residual Connection · Layer Normalization
