LIDS: LLM Summary Inference Under the Layered Lens
Dylan Park, Yingying Fan, and Jinchi Lv

TL;DR
This paper introduces LIDS, a novel method for evaluating LLM-generated summaries using BERT-SVD-based metrics and SOFARI, providing interpretable key themes and improved accuracy in assessing summary quality.
Contribution
The paper proposes a new layered inference method combining BERT-SVD metrics and SOFARI for better summary evaluation and interpretability in NLP.
Findings
LIDS effectively measures summary accuracy with high correlation to human judgment.
LIDS uncovers key thematic words with controlled false discovery rate.
Empirical results show robustness across different LLMs and datasets.
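To illustrate what "controlled false discovery rate" means when selecting key words, here is a minimal sketch of the Benjamini-Hochberg procedure. The page does not say which procedure LIDS or SOFARI uses, so the choice of Benjamini-Hochberg, the function name `benjamini_hochberg`, and the p-values below are assumptions made purely for illustration.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.1):
    """Return a boolean mask of rejected hypotheses at target FDR level alpha.

    Illustrative only: a standard FDR procedure, not necessarily the one
    used by the paper.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices sorting p ascending
    thresholds = alpha * np.arange(1, m + 1) / m   # step-up thresholds i * alpha / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])           # largest rank passing its threshold
        rejected[order[:k + 1]] = True             # reject everything up to that rank
    return rejected

# Hypothetical p-values, one per candidate key word.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.27, 0.6]
selected = benjamini_hochberg(p_vals, alpha=0.05)
```

In this toy run only the two smallest p-values pass their step-up thresholds, so only those two candidate words would be reported as key words.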
Abstract
Large language models (LLMs) have attracted significant attention from researchers and practitioners in natural language processing (NLP) since the introduction of ChatGPT in 2022. One notable feature of ChatGPT is its ability to generate summaries based on prompts. Yet evaluating the quality of these summaries remains challenging due to the complexity of language. To this end, in this paper we suggest a new method of LLM summary inference with a BERT-SVD-based direction metric and SOFARI (LIDS) that assesses summary accuracy and provides interpretable key words for layered themes. LIDS uses a latent SVD-based direction metric to measure the similarity between the summaries and the original text, leveraging BERT embeddings and repeated prompts to quantify the statistical uncertainty. As a result, LIDS gives a natural embedding of each summary for large text reduction. We further…
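The idea of an SVD-based direction metric with repeated prompts can be sketched roughly as follows. This is not the authors' implementation: the abstract does not give the exact definition of the metric, so comparing leading right singular vectors, the helper names, and the use of random arrays in place of real BERT embeddings are all assumptions made for illustration.

```python
import numpy as np

def leading_direction(embeddings):
    """Top right singular vector of an (n_tokens x dim) embedding matrix."""
    # Assumption: the dominant latent direction is the first right singular vector.
    _, _, vt = np.linalg.svd(embeddings, full_matrices=False)
    return vt[0]

def direction_similarity(text_emb, summary_emb):
    """Absolute cosine between the leading directions of two embedding matrices."""
    u = leading_direction(text_emb)
    v = leading_direction(summary_emb)
    # Singular vectors have unit norm and arbitrary sign, hence the abs().
    return float(abs(u @ v))

def repeated_prompt_uncertainty(text_emb, summary_embs):
    """Mean similarity and its standard error over summaries from repeated prompts."""
    sims = np.array([direction_similarity(text_emb, s) for s in summary_embs])
    return sims.mean(), sims.std(ddof=1) / np.sqrt(len(sims))

# Synthetic stand-ins for BERT embeddings (hypothetical data, not real model output).
rng = np.random.default_rng(0)
dim = 16
text_emb = rng.normal(size=(200, dim))
summary_embs = [
    text_emb[rng.choice(200, size=40, replace=False)] + 0.1 * rng.normal(size=(40, dim))
    for _ in range(5)  # five "repeated prompts"
]
mean_sim, se_sim = repeated_prompt_uncertainty(text_emb, summary_embs)
```

In practice the synthetic arrays would be replaced by token or sentence embeddings from a BERT model, and the spread across repeated prompts gives a simple handle on the uncertainty of the similarity score.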
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Artificial Intelligence in Healthcare and Education
