Leveraging Language Models for Interpretable Analysis of Narratives in a Large Corpus

Eric A. Bai; Minling Zhou; Ricardo Henao; Kyle M. Schwing; and Lawrence Carin

arXiv:2511.18599 · eess.SP · November 25, 2025

Open Access

TL;DR

This paper introduces an interpretable model combining traditional bag-of-words and novel LLM-based question answering to analyze narratives in large corpora, improving interpretability and efficiency.

Contribution

It presents a new hybrid approach that integrates BoW and LLMs within a shared latent space, enabling scalable and interpretable narrative analysis.

Findings

01. Efficient functional gradient descent updates that are structurally analogous to self-attention.

02. Effective Q&A extrapolation for unqueried documents.

03. Improved interpretability and scalability in narrative analysis.

Abstract

Narratives drive human behavior and lie at the core of geopolitics, but have eluded quantification that would permit measurement of their overlap and evolution. We present an interpretable model that integrates an established bag-of-words (BoW) topical representation and a novel LLM-based question answering (Q&A) narrative model, which share a latent Reproducing Kernel Hilbert Space representation, to quantify written documents. Our approach mitigates the cost, interpretability, and generalization challenges of using an LLM to analyze large corpora without full inference. We derive efficient functional gradient descent updates that are interpretable and structurally analogous to the self-attention mechanism in Transformers. We further introduce an in-context Q&A extrapolation method inspired by Transformer architectures, enabling accurate prediction of Q&A outcomes for unqueried…
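The abstract's remark that functional gradient descent in a Reproducing Kernel Hilbert Space resembles self-attention can be illustrated with a generic textbook case. The sketch below is not the paper's model: the squared loss, the Gaussian (RBF) kernel, and the names `rbf_kernel` and `functional_gd_step` are all assumptions chosen for illustration. It shows only that one functional gradient step evaluates, at any new point, to a kernel-weighted sum of training residuals, which has the same "scores times values" shape as an attention layer without the softmax.

```python
import numpy as np

def rbf_kernel(A, B, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B (assumed kernel)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def functional_gd_step(f_train, f_query, X_train, y_train, X_query, step=0.1):
    """One functional gradient descent step on squared loss in an RKHS.

    The functional gradient of 0.5 * sum_i (f(x_i) - y_i)^2 is
    sum_i (f(x_i) - y_i) k(x_i, .), so the update at any point is a
    kernel-weighted sum of the training residuals.
    """
    residual = y_train - f_train            # plays the role of attention "values"
    K_tt = rbf_kernel(X_train, X_train)     # train-to-train similarity scores
    K_qt = rbf_kernel(X_query, X_train)     # "query"-to-"key" similarity scores
    new_f_train = f_train + step * K_tt @ residual
    new_f_query = f_query + step * K_qt @ residual
    return new_f_train, new_f_query

if __name__ == "__main__":
    # Hypothetical toy data: fit sin(x) on a grid and predict at one unseen point.
    X = np.linspace(-3.0, 3.0, 20)[:, None]
    y = np.sin(X[:, 0])
    X_new = np.array([[0.5]])
    f_t, f_q = np.zeros(20), np.zeros(1)
    for _ in range(200):
        f_t, f_q = functional_gd_step(f_t, f_q, X, y, X_new)
    print("training loss:", 0.5 * np.sum((y - f_t) ** 2))
    print("prediction at 0.5:", f_q[0], "target:", np.sin(0.5))
```

In this toy setting the prediction at an unseen point is built entirely from similarities to observed points, which is the general idea behind extrapolating outcomes to items that were never queried directly; how the paper realizes this for Q&A outcomes is not shown here.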

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Multimodal Machine Learning Applications