Large-Margin Learning of Submodular Summarization Methods

Ruben Sipos; Pannaga Shivaswamy; Thorsten Joachims

arXiv:1110.2162 · cs.AI · October 14, 2011 · 2 cites

TL;DR

This paper introduces a supervised large-margin learning approach for submodular functions in extractive multi-document summarization, improving performance over manually tuned methods by directly optimizing a convex relaxation of the performance measure.

Contribution

It presents a novel large-margin training method applicable to all submodular summarization functions, enabling automatic learning of high-fidelity models with many parameters.

Findings

1. Significant performance improvements over state-of-the-art manually tuned functions
2. Effective for both pairwise and coverage-based scoring functions
3. Applicable across multiple datasets

Abstract

In this paper, we present a supervised learning approach to training submodular scoring functions for extractive multi-document summarization. By taking a structured prediction approach, we provide a large-margin method that directly optimizes a convex relaxation of the desired performance measure. The learning method applies to all submodular summarization methods, and we demonstrate its effectiveness for both pairwise and coverage-based scoring functions on multiple datasets. Compared to state-of-the-art functions that were tuned manually, our method significantly improves performance and enables high-fidelity models with numbers of parameters well beyond what could reasonably be tuned by hand.
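As a rough illustration of what a coverage-based submodular scoring function looks like at prediction time, the sketch below greedily builds a summary that maximizes the total weight of the distinct words it covers. It is not the authors' implementation: the names `coverage_score` and `greedy_summary`, the bag-of-words sentence representation, the sentence-count budget, and the toy weights are all assumptions made for this example. In the setting the abstract describes, the weights would be learned by the large-margin method rather than set by hand; the learning step itself is not shown here.

```python
from typing import Dict, List, Set


def coverage_score(selected: List[int],
                   sentences: List[Set[str]],
                   word_weights: Dict[str, float]) -> float:
    """Sum of weights of the distinct words covered by the selected sentences.

    With non-negative weights this set function is monotone and submodular:
    a word contributes its weight only the first time it is covered.
    """
    covered: Set[str] = set()
    for i in selected:
        covered |= sentences[i]
    return sum(word_weights.get(w, 0.0) for w in covered)


def greedy_summary(sentences: List[Set[str]],
                   word_weights: Dict[str, float],
                   budget: int) -> List[int]:
    """Pick up to `budget` sentences, each time adding the largest marginal gain."""
    selected: List[int] = []
    while len(selected) < budget:
        base = coverage_score(selected, sentences, word_weights)
        best_gain, best_i = 0.0, None
        for i in range(len(sentences)):
            if i in selected:
                continue
            gain = coverage_score(selected + [i], sentences, word_weights) - base
            if gain > best_gain:  # strict: ties keep the earlier sentence
                best_gain, best_i = gain, i
        if best_i is None:  # no remaining sentence adds any value
            break
        selected.append(best_i)
    return selected


# Toy input (hypothetical): three sentences as word sets, hand-picked weights.
toy_sentences = [{"a", "b"}, {"b", "c"}, {"a"}]
toy_weights = {"a": 1.0, "b": 2.0, "c": 1.5}
toy_summary = greedy_summary(toy_sentences, toy_weights, budget=2)
```

The greedy loop is a standard way to approximately maximize a monotone submodular function under a cardinality constraint. A real summarizer would typically budget by summary length in words or bytes rather than by sentence count.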

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques