Using Statistical and Semantic Models for Multi-Document Summarization
Divyanshu Daiya, Anukarsh Singh, Mukesh Jadon

TL;DR
This paper explores combining statistical and semantic models for extractive multi-document summarization, demonstrating that weighted integration and pre-trained vectors improve summary quality on benchmark datasets.
Contribution
It introduces a novel approach of tuning weights between statistical and semantic models, including new semantic models based on GloVe and InferSent, to enhance summarization performance.
Findings
Semantic models improve context understanding over statistical models.
Weighted combination of models yields significant ROUGE score improvements.
Further training the pre-trained vectors on the given corpus gives an additional performance gain.
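The weighted combination named in the findings can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `combine_scores` and `top_k`, the min-max normalization, and the single `weight` parameter are all assumptions made for illustration. The sketch assumes each model has already produced one score per sentence.

```python
def normalize(scores):
    """Min-max scale scores to [0, 1] so the two models are comparable (an assumed choice)."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def combine_scores(statistical, semantic, weight):
    """Blend per-sentence scores; `weight` is the share given to the statistical model."""
    if len(statistical) != len(semantic):
        raise ValueError("both models must score the same sentences")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    stat = normalize(statistical)
    sem = normalize(semantic)
    return [weight * a + (1.0 - weight) * b for a, b in zip(stat, sem)]


def top_k(scores, k):
    """Return indices of the k highest-scoring sentences, in document order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

In this sketch, tuning amounts to trying several values of `weight` on held-out data and keeping the one whose extracted summary scores best on the chosen metric.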
Abstract
We report a series of experiments with different semantic models on top of various statistical models for extractive text summarization. Though statistical models may better capture word co-occurrences and distribution around the text, they fail to detect the context and the sense of sentences/words as a whole. Semantic models help us gain better insight into the context of sentences. We show how tuning weights between different models can help us achieve significant results on various benchmarks. Further training the pre-trained vectors used in semantic models on the given corpus can give an additional spike in performance. Using weighting techniques between different statistical models further refines our results. For statistical models, we have used TF-IDF, TextRank, and Jaccard/cosine similarities. For semantic models, we have used a WordNet-based model and proposed two models based on…
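As a rough illustration of the Jaccard and cosine similarities the abstract lists among the statistical models, the sketch below compares two sentences as bags of words. The whitespace tokenizer and the function names are assumptions for illustration only; the paper's own preprocessing is not described in this summary.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return sentence.lower().split()


def jaccard(a, b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Cosine of the angle between the two term-count vectors."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Both measures depend only on surface word overlap, which is the limitation the abstract attributes to statistical models: two sentences with the same meaning but no shared words score 0.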
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
Methods: GloVe Embeddings
