Leveraging Language Models for Interpretable Analysis of Narratives in a Large Corpus
Eric A. Bai, Minling Zhou, Ricardo Henao, Kyle M. Schwing, and Lawrence Carin

TL;DR
This paper introduces an interpretable model that combines a traditional bag-of-words (BoW) topic representation with a novel LLM-based question-answering (Q&A) model to analyze narratives in large corpora, with the aim of improving interpretability and efficiency.
Contribution
It presents a hybrid approach that integrates BoW and LLM-based Q&A representations in a shared latent space (a Reproducing Kernel Hilbert Space), enabling scalable and interpretable narrative analysis.
Findings
Efficient, interpretable functional gradient descent updates that are structurally analogous to self-attention in Transformers.
In-context Q&A extrapolation that predicts Q&A outcomes for unqueried documents.
Improved interpretability and scalability in narrative analysis.
Abstract
Narratives drive human behavior and lie at the core of geopolitics, but have eluded quantification that would permit measurement of their overlap and evolution. We present an interpretable model that integrates an established bag-of-words (BoW) topical representation and a novel LLM-based question answering (Q&A) narrative model, which share a latent Reproducing Kernel Hilbert Space representation, to quantify written documents. Our approach mitigates the cost, interpretability, and generalization challenges of using an LLM to analyze large corpora without full inference. We derive efficient functional gradient descent updates that are interpretable and structurally analogous to the self-attention mechanism in Transformers. We further introduce an in-context Q&A extrapolation method inspired by Transformer architectures, enabling accurate prediction of Q&A outcomes for unqueried…
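The abstract does not give the paper's update equations, so the following is only a generic sketch of functional gradient descent in a Reproducing Kernel Hilbert Space, not the authors' method. It assumes a squared-error loss and an RBF kernel, and the names `rbf_kernel`, `functional_gd`, and `predict` are hypothetical. It is meant to show why such updates can resemble attention: each step adds residual-weighted kernel functions centered on the training points.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=1.0):
    """RBF kernel matrix between rows of A (n, d) and rows of B (m, d)."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * lengthscale**2))


def functional_gd(X, y, steps=200, lr=0.5, lengthscale=1.0):
    """Functional gradient descent for squared loss (illustrative assumption).

    The function is represented as f(x) = sum_i alpha_i * k(x_i, x).
    The functional gradient of the mean squared loss is
    (1/n) * sum_i (f(x_i) - y_i) * k(x_i, .), so each step only
    changes the coefficients alpha by the scaled residuals.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, lengthscale)
    alpha = np.zeros(n)
    losses = []
    for _ in range(steps):
        residual = K @ alpha - y              # f(x_i) - y_i at training points
        losses.append(0.5 * np.mean(residual**2))
        alpha -= lr * residual / n            # step along the functional gradient
    return alpha, losses


def predict(X_query, X_train, alpha, lengthscale=1.0):
    """Evaluate f at the query points: kernel(queries, keys) @ values."""
    return rbf_kernel(X_query, X_train, lengthscale) @ alpha
```

In `predict`, the query points play the role of queries, the training points act as keys, and the learned coefficients `alpha` act as values. This mirrors the query-key-value form of self-attention, although this sketch uses an unnormalized RBF kernel rather than a softmax.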
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Multimodal Machine Learning Applications
