Cross-lingual Cross-temporal Summarization: Dataset, Models, Evaluation
Ran Zhang, Jihed Ouni, Steffen Eger

TL;DR
This paper introduces the first dataset and evaluation framework for cross-lingual cross-temporal summarization (CLCTS). It compares transformer-based end-to-end models with GPT-3.5 and finds GPT-3.5 comparatively effective, while summarizing historical texts across languages remains challenging.
Contribution
The paper builds a novel CLCTS dataset, evaluates multiple models including GPT-3.5, and analyzes both model performance and the challenges of summarizing historical texts across languages.
Findings
GPT-3.5 produces summaries of moderate to good quality.
End-to-end models with intermediate finetuning perform poorly.
Longer, older texts are more difficult to summarize.
Abstract
While summarization has been extensively researched in natural language processing (NLP), cross-lingual cross-temporal summarization (CLCTS) is a largely unexplored area that has the potential to improve cross-cultural accessibility and understanding. This paper comprehensively addresses the CLCTS task, including dataset creation, modeling, and evaluation. We (1) build the first CLCTS corpus with 328 instances for hDe-En (extended version with 455 instances) and 289 for hEn-De (extended version with 501 instances), leveraging historical fiction texts and Wikipedia summaries in English and German; (2) examine the effectiveness of popular transformer end-to-end models with different intermediate finetuning tasks; (3) explore the potential of GPT-3.5 as a summarizer; (4) report evaluations from humans, GPT-4, and several recent automatic evaluation metrics. Our results indicate that…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Attention Is All You Need · Cosine Annealing · Softmax · Layer Normalization · Weight Decay · Attention Dropout · Linear Layer
