Computational Hermeneutics: Evaluating generative AI as a cultural technology

Cody Kommers; Ruth Ahnert; Maria Antoniak; Emmanouil Benetos; Steve Benford; Mercedes Bunz; Baptiste Caramiaux; Shauna Concannon; Martin Disley; James Dobson; Yali Du; Edgar Duéñez-Guzmán; Kerry Francksen; Evelyn Gius; Jonathan W. Y. Gray; Ryan Heuser; Sarah Immel; Richard Jean So; Sang Leigh; Dalaki Livingston; Hoyt Long; Meredith Martin; Georgia Meyer; Daniela Mihai; Ashley Noel-Hirst; Kirsten Ostherr; Deven Parker; Yipeng Qin; Jessica Ratcliff; Emily Robinson; Karina Rodriguez; Adam Sobey; Ted Underwood; Aditya Vashistha; Matthew Wilkens; Youyou Wu; Yuan Zheng; Drew Hemment

arXiv:2604.16403·cs.AI·April 21, 2026

TL;DR

This paper proposes a hermeneutic framework for evaluating generative AI as a cultural technology, emphasizing context, plurality, and ambiguity in interpretation.

Contribution

It introduces computational hermeneutics as a new interpretive approach and offers principles for culturally aware AI evaluation.

Findings

01 Benchmarks should be iterative and context-aware.

02 Evaluation should include human perspectives, not just models.

03 Focus on cultural context rather than only output accuracy.

Abstract

Generative AI systems are increasingly recognized as cultural technologies, yet current evaluation frameworks often treat culture as a variable to be measured rather than fundamental to the system's operation. Drawing on hermeneutic theory from the humanities, we argue that GenAI systems function as "context machines" that must inherently address three interpretive challenges: situatedness (meaning only emerges in context), plurality (multiple valid interpretations coexist), and ambiguity (interpretations naturally conflict). We present computational hermeneutics as an emerging framework offering an interpretive account of what GenAI systems do, and how they might do it better. We offer three principles for hermeneutic evaluation -- that benchmarks should be iterative, not one-off; include people, not just machines; and measure cultural context, not just model output. This perspective…
