SAGE: Hierarchical LLM-Based Literary Evaluation through Ontology-Grounded Interpretive Dimensions
Tianyu Wang, Nianjun Zhou

TL;DR
This paper introduces SAGE, a hierarchical LLM-based framework that evaluates literary quality along interpretive dimensions. On 100 short stories spanning canonical works, pulp fiction, and LLM-generated narratives, it reports high reliability and clear discrimination between genres.
Contribution
The paper presents an ontology-grounded, multi-layered evaluation framework built on structured LLM assessments with multi-round iterative reflection and independent validation, reporting high agreement and discriminative power.
Findings
98.8% score convergence across 600 evaluations
Greater than 94% inter-rater agreement
Clear genre hierarchy with significant effect sizes
Abstract
Evaluating literary quality requires assessing interpretive dimensions such as cultural representation, emotional depth, and philosophical sophistication that resist straightforward computational measurement. We introduce SAGE, a hierarchical evaluation framework that decomposes literary quality into ontology-grounded interpretive dimensions assessed through structured large language model evaluation with multi-round iterative reflection and independent validation. We validate the framework on 100 short stories (50 canonical works, 30 pulp fiction, 20 LLM-generated narratives) across three analytical layers (cultural, emotional-psychological, existential-philosophical) using dual-mode assessment. Across 600 evaluations, the framework achieves 98.8% score convergence and greater than 94% inter-rater agreement, with near-perfect mode invariance between content-based and metadata-based…
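The figure of 600 evaluations is consistent with scoring every story at every analytical layer in both assessment modes. The short Python sketch below checks that arithmetic. The factorization (stories × layers × modes) is an inference from the numbers in the abstract, not a protocol the authors state here, and all variable names are illustrative.

```python
from itertools import product

# Corpus composition as stated in the abstract.
corpus = {"canonical": 50, "pulp": 30, "llm_generated": 20}

# The three analytical layers named in the abstract.
layers = ["cultural", "emotional-psychological", "existential-philosophical"]

# Assumption: "dual-mode" refers to the content-based and metadata-based modes.
modes = ["content", "metadata"]

n_stories = sum(corpus.values())

# One evaluation per (story, layer, mode) combination (assumed design).
grid = list(product(range(n_stories), layers, modes))
n_evaluations = len(grid)

# Evaluations per genre under the same assumption.
per_genre = {genre: count * len(layers) * len(modes) for genre, count in corpus.items()}

print(n_stories, n_evaluations)  # 100 600
print(per_genre)
```

Under this reading, the canonical works would account for 300 of the evaluations, pulp fiction for 180, and the LLM-generated narratives for 120.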
