Cultural Analytics for Good: Building Inclusive Evaluation Frameworks for Historical IR
Suchana Datta, Dwaipayan Roy, Derek Greene, Gerardine Meaney, Karen Wade, Philipp Mayr

TL;DR
This paper develops an inclusive evaluation framework combining expert input, LLM assistance, and cultural analytics to improve historical IR, focusing on 19th-century British literature and fostering equitable access to digital archives.
Contribution
It introduces a novel interdisciplinary approach integrating expert-driven queries, relevance annotation, and LLMs to create scalable, culturally inclusive IR evaluation benchmarks for historical collections.
Findings
Enhanced retrieval accuracy for 19th-century texts
Improved interpretability and transparency in IR systems
Fostered cultural inclusivity in digital archives
Abstract
This work bridges the fields of information retrieval and cultural analytics to support equitable access to historical knowledge. Using the British Library BL19 digital collection (more than 35,000 works from 1700-1899), we construct a benchmark for studying changes in language, terminology, and retrieval in 19th-century fiction and non-fiction. Our approach combines expert-driven query design, paragraph-level relevance annotation, and Large Language Model (LLM) assistance to create a scalable evaluation framework grounded in human expertise. We focus on knowledge transfer from fiction to non-fiction, investigating how narrative understanding and semantic richness in fiction can improve retrieval for scholarly and factual materials. This interdisciplinary framework not only improves retrieval accuracy but also fosters interpretability, transparency, and cultural inclusivity in…
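To make the idea of a benchmark built on paragraph-level relevance annotation more concrete, here is a minimal sketch of how a ranked list of paragraphs might be scored against expert judgments. The abstract does not say which metrics the authors report, so the use of nDCG@k and precision@k is an assumption, as are the function names, the paragraph identifiers, and the graded relevance scale (0 = not relevant, 1 = partially relevant, 2 = highly relevant).

```python
import math


def dcg(gains):
    """Discounted cumulative gain for a list of graded relevance values."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranked_ids, qrels, k=10):
    """nDCG@k for one query.

    ranked_ids: paragraph ids in the order the system returned them.
    qrels: dict mapping paragraph id -> graded relevance (hypothetical scale 0-2).
    Unjudged paragraphs are treated as non-relevant (an assumption).
    """
    gains = [qrels.get(pid, 0) for pid in ranked_ids[:k]]
    ideal = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def precision_at_k(ranked_ids, qrels, k=10):
    """Fraction of the top-k paragraphs judged relevant (grade > 0)."""
    if k <= 0:
        return 0.0
    hits = sum(1 for pid in ranked_ids[:k] if qrels.get(pid, 0) > 0)
    return hits / k


# Hypothetical judgments for a single expert-written query.
example_qrels = {"para_001": 2, "para_002": 1, "para_003": 0}
example_run = ["para_003", "para_002", "para_001"]

if __name__ == "__main__":
    print("nDCG@3:", round(ndcg_at_k(example_run, example_qrels, k=3), 4))
    print("P@2:", precision_at_k(example_run, example_qrels, k=2))
```

Averaging these per-query scores over all expert queries would give a collection-level figure. Whether the annotations come from experts or from LLM assistance, the scoring step stays the same, which is one reason a shared judgment format helps such a framework scale.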
Taxonomy
Topics: Information Retrieval and Search Behavior · Computational and Text Analysis Methods · Digital Humanities and Scholarship
