PerSum: Novel Systems for Document Summarization in Persian

Saeid Parvandeh; Shibamouli Lahiri; Fahimeh Boroumand

arXiv:1606.03143·cs.CL·June 13, 2016·1 cites

Open Access

TL;DR

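This paper investigates Persian document summarization from two angles. It first adapts an existing Persian summarization framework to a corpus of news articles and compares it with graph-based methods in a human evaluation. It then builds summarizers based on centrality measures, a small class of which outperform three strong unsupervised baselines in ROUGE evaluation.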

Contribution

It introduces and evaluates novel graph-based and centrality-based summarization systems for the Persian language.

Findings

01

Graph-based methods outperform the systems derived from the modified framework in human evaluation of the generated summaries.

02

A small class of centrality measures yields better ROUGE scores than three strong unsupervised baselines.

03

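The study approaches Persian summarization from two directions: a modified existing framework tested on realistic news articles, and newly designed summarizers built on graph centrality measures.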

Abstract

In this paper, we explore the problem of document summarization in the Persian language from two distinct angles. In our first approach, we modify a popular and widely cited Persian document summarization framework to see how it works on a realistic corpus of news articles. Human evaluation of the generated summaries shows that graph-based methods perform better than the modified systems. We carry this intuition forward in our second approach and probe deeper into the nature of graph-based systems by designing several summarizers based on centrality measures. Ad hoc evaluation of these summarizers using ROUGE scores suggests that there is a small class of centrality measures that perform better than three strong unsupervised baselines.
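To make the idea of a centrality-based summarizer concrete, here is a minimal sketch, not the authors' implementation. It assumes sentences are graph nodes, Jaccard word overlap is the edge weight, and weighted degree is the centrality measure; the abstract does not say which measures or similarity functions the paper uses. The names `tokenize`, `jaccard`, `summarize`, and the `threshold` parameter are hypothetical.

```python
import re
from itertools import combinations


def tokenize(sentence):
    # Simplistic tokenizer (an assumption): \w matches Unicode word
    # characters in Python 3, so Persian letters are kept as well.
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(a, b):
    # Word-overlap similarity between two token sets.
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def summarize(sentences, k=2, threshold=0.1):
    """Pick the k sentences with the highest weighted degree centrality."""
    tokens = [tokenize(s) for s in sentences]
    degree = [0.0] * len(sentences)
    # Build the sentence graph implicitly: an edge exists when the
    # similarity reaches the (hypothetical) threshold.
    for i, j in combinations(range(len(sentences)), 2):
        weight = jaccard(tokens[i], tokens[j])
        if weight >= threshold:
            degree[i] += weight
            degree[j] += weight
    # Rank by centrality, breaking ties by original position.
    top = sorted(range(len(sentences)), key=lambda i: (-degree[i], i))[:k]
    # Return the chosen sentences in document order.
    return [sentences[i] for i in sorted(top)]
```

Swapping the `degree` computation for another centrality measure (for example closeness or PageRank) would give a different summarizer over the same graph, which is the kind of variation the abstract describes comparing.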

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques