GAE-ISumm: Unsupervised Graph-Based Summarization of Indian Languages
Lakshmi Sireesha Vakada, Anudeep Ch, Mounika Marreddy, Subba Reddy Oota, Radhika Mamidi

TL;DR
GAE-ISumm is an unsupervised graph autoencoder-based model for summarizing Indian languages, effectively handling low-resource challenges and outperforming existing methods across multiple datasets.
Contribution
The paper introduces GAE-ISumm, a novel unsupervised graph autoencoder model for Indian language summarization, along with a new Telugu dataset TELSUM.
Findings
GAE-ISumm outperforms state-of-the-art models on multiple datasets.
It establishes benchmark results on the new TELSUM dataset.
Inclusion of positional and cluster information improves summarization performance.
Abstract
Document summarization aims to create a precise and coherent summary of a text document. Many deep learning summarization models are developed mainly for English and often require a large training corpus as well as efficient pre-trained language models and tools. However, applying English summarization models to low-resource Indian languages is often limited by rich morphological variation and by syntactic and semantic differences. In this paper, we propose GAE-ISumm, an unsupervised Indic summarization model that extracts summaries from text documents. In particular, GAE-ISumm uses a Graph Autoencoder (GAE) to jointly learn text representations and a document summary. We also provide a manually annotated Telugu summarization dataset, TELSUM, to experiment with GAE-ISumm. Further, we experiment with most of the publicly available Indian language summarization datasets to investigate…
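The abstract's core idea, a graph autoencoder over a document's sentences, can be illustrated with a minimal sketch. This is not the authors' architecture: the similarity threshold, the single untrained GCN-style layer, the random weights, and the centroid-based ranking in `top_k_sentences` are all assumptions chosen for illustration, and every function name is hypothetical. It only shows the general shape of encoding a sentence graph, reconstructing its adjacency with an inner-product decoder, and picking sentences from the learned embeddings.

```python
import numpy as np

def sentence_graph(features, threshold=0.1):
    """Link sentences whose cosine similarity exceeds a threshold (assumed rule)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    adj = (unit @ unit.T > threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # self-loops are added during normalization
    return adj

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, as in GCN-style encoders."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gae_forward(features, adj, weight):
    """One encoder layer plus an inner-product decoder (forward pass only)."""
    z = np.maximum(normalize_adjacency(adj) @ features @ weight, 0.0)  # ReLU
    recon = 1.0 / (1.0 + np.exp(-(z @ z.T)))  # sigmoid of pairwise dot products
    return z, recon

def top_k_sentences(z, k):
    """Rank sentences by dot product with the mean embedding (assumed heuristic)."""
    scores = z @ z.mean(axis=0)
    return sorted(np.argsort(-scores)[:k].tolist())

# Toy usage: 5 sentences with 8-dimensional features and random, untrained weights.
rng = np.random.default_rng(0)
X = rng.random((5, 8))
A = sentence_graph(X)
Z, A_rec = gae_forward(X, A, rng.standard_normal((8, 4)))
summary_indices = top_k_sentences(Z, k=2)
```

In a trained model the weights would be fit by minimizing a reconstruction loss between `A_rec` and the input graph. The findings above suggest that positional and cluster information would also feed into the representation, which this sketch omits.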
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
