Towards Interpretable Summary Evaluation via Allocation of Contextual Embeddings to Reference Text Topics
Ben Schaper, Christopher Lohse, Marcell Streile, Andrea Giovannini, Richard Osuala

TL;DR
This paper introduces MISEM, a new interpretable method for evaluating summaries by allocating contextual embeddings to reference text topics, enhancing transparency and qualitative analysis.
Contribution
The paper presents MISEM, a novel evaluation approach that assigns summary embeddings to reference topics, along with an interpretability toolbox for automated and interactive analysis.
Findings
MISEM achieves a Pearson correlation of .404 with human judgment on the TAC'08 dataset.
The interpretability toolbox enables transparent assessment of summaries.
Allocation of embeddings improves qualitative understanding of summary quality.
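For readers unfamiliar with the reported metric, the Pearson correlation between a metric's scores and human ratings can be sketched as below. This is a minimal illustration, not the authors' evaluation code: the function name `pearson_r` and the toy score lists are assumptions made up for the example and are not TAC'08 data.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance of x and y over the product of their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Centre both score vectors around their means.
    xc = x - x.mean()
    yc = y - y.mean()
    # Normalise the summed co-deviation by the two vector norms.
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Invented toy values, one entry per summary (NOT from the paper).
metric_scores = [0.2, 0.5, 0.4, 0.9]
human_scores = [1.0, 3.0, 2.0, 4.0]

r = pearson_r(metric_scores, human_scores)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the metric ranks and spaces summaries much as the human raters do; a value near 0 means no linear relationship.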
Abstract
Despite extensive recent advances in summary generation models, evaluation of auto-generated summaries still widely relies on single-score systems insufficient for transparent assessment and in-depth qualitative analysis. Towards bridging this gap, we propose the multifaceted interpretable summary evaluation method (MISEM), which is based on allocation of a summary's contextual token embeddings to semantic topics identified in the reference text. We further contribute an interpretability toolbox for automated summary evaluation and interactive visual analysis of summary scoring, topic identification, and token-topic allocation. MISEM achieves a promising .404 Pearson correlation with human judgment on the TAC'08 dataset.
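The allocation idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the allocation rule, so everything here is an assumption for illustration: topics are represented as centroid vectors, each summary token embedding is hard-assigned to its most similar centroid by cosine similarity, and a per-topic score is the mean similarity of the tokens assigned to it. The names `allocate_tokens_to_topics` and `topic_scores` are hypothetical, and the 2-dimensional vectors stand in for real contextual embeddings.

```python
import numpy as np

def allocate_tokens_to_topics(token_embs, topic_centroids):
    """Assign each token embedding to its most similar topic centroid (assumed rule)."""
    # Normalise rows so that a dot product equals cosine similarity.
    t = token_embs / np.linalg.norm(token_embs, axis=1, keepdims=True)
    c = topic_centroids / np.linalg.norm(topic_centroids, axis=1, keepdims=True)
    sims = t @ c.T                    # shape: (n_tokens, n_topics)
    assignment = sims.argmax(axis=1)  # index of the best topic per token
    best_sim = sims.max(axis=1)       # similarity to that topic
    return assignment, best_sim

def topic_scores(assignment, best_sim, n_topics):
    """Mean similarity of the tokens allocated to each topic (0 if none)."""
    scores = np.zeros(n_topics)
    for k in range(n_topics):
        mask = assignment == k
        if mask.any():
            scores[k] = best_sim[mask].mean()
    return scores

# Toy stand-ins: three reference-text topics and three summary tokens.
topic_centroids = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
token_embs = np.array([[0.9, 0.1], [0.1, 0.8], [0.7, -0.2]])

assignment, best_sim = allocate_tokens_to_topics(token_embs, topic_centroids)
scores = topic_scores(assignment, best_sim, n_topics=3)
print(assignment, scores)
```

In this toy case no token is allocated to the third topic, so its score stays at zero. A per-topic breakdown of this kind is what lets an evaluator see which reference topics a summary covers instead of reading a single aggregate score.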
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
