Evergreen: Efficient Claim Verification for Semantic Aggregates

Alexander W. Lee; Benjamin Han; Shayak Sen; Sam Yeom; Ugur Cetintemel; Anupam Datta

arXiv:2604.26180 · cs.DB · April 30, 2026

TL;DR

Evergreen verifies the claims in LLM-generated semantic aggregates against the underlying relation. By avoiding unnecessary LLM calls, it cuts cost and latency while maintaining high accuracy on real-world datasets.

Contribution

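Evergreen recasts claim verification as a semantic query processing task with tailored optimizations and provenance capture. It compiles each claim into a declarative semantic verification query and executes it on the same engine that produced the aggregate, improving both efficiency and accuracy.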

Findings

01

Achieves F1 = 1.00 with a strong LLM, reducing cost by 3.2x and latency by 4.0x.

02

Outperforms an LLM-as-a-judge baseline in F1 at 48x lower cost and 2.3x lower latency.

03

Matches F1 with a weaker LLM at 63x lower cost and 4.2x lower latency.

Abstract

With recent semantic query processing engines, semantic aggregation has become a primitive operator, enabling the reduction of a relation into a natural language aggregate using an LLM. However, the resulting semantic aggregate may contain claims that are not grounded in the underlying relation. Verifying such claims is challenging: they often involve quantifiers, groupings, and comparisons over relations that far exceed LLM context windows and require a costly combination of semantic and symbolic processing. We present Evergreen, a system that recasts claim verification as a semantic query processing task with tailored optimizations and provenance capture. Evergreen compiles each claim into a declarative semantic verification query and executes it on the same engine that produced the aggregate. To reduce cost and latency, Evergreen avoids unnecessary LLM calls through…
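
To make the idea of treating a claim as a query more concrete, here is a minimal sketch of checking a quantified claim ("at least k rows satisfy a semantic predicate") against a relation. Everything in it is an assumption for illustration: the names `verify_at_least`, `VerificationResult`, and `keyword_predicate` are hypothetical, the keyword match stands in for an LLM call, and the early-exit logic is a generic way to skip calls, not a technique the abstract attributes to Evergreen (the text above is cut off before it names its optimizations).

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class VerificationResult:
    verdict: bool          # whether the claim holds over the relation
    llm_calls: int         # how many predicate evaluations were needed
    provenance: List[int]  # indices of rows found to satisfy the predicate


def verify_at_least(
    rows: Sequence[str],
    predicate: Callable[[str], bool],
    threshold: int,
) -> VerificationResult:
    """Check the claim 'at least `threshold` rows satisfy `predicate`'.

    Hypothetical helper: stops as soon as the verdict is decided, so the
    (expensive) predicate is not evaluated on rows that cannot change it.
    """
    matches: List[int] = []
    calls = 0
    for i, row in enumerate(rows):
        remaining = len(rows) - i
        # Even if every remaining row matched, the claim would still fail.
        if len(matches) + remaining < threshold:
            break
        calls += 1
        if predicate(row):
            matches.append(i)
            # Enough supporting rows found; no further calls are needed.
            if len(matches) >= threshold:
                return VerificationResult(True, calls, matches)
    return VerificationResult(False, calls, matches)


def keyword_predicate(row: str) -> bool:
    # Stand-in for an LLM judgment such as "does this review discuss battery life?"
    return "battery" in row.lower()


reviews = [
    "Battery drains fast",
    "Great screen",
    "The battery died in a day",
    "Shipping was slow",
    "Love the camera",
]

# Claim: "at least two reviews mention the battery".
result = verify_at_least(reviews, keyword_predicate, threshold=2)
print(result)
```

In this toy run the claim is confirmed after three predicate evaluations instead of five, and the recorded row indices serve as a simple form of provenance for the verdict.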
