AgentEval: Generative Agents as Reliable Proxies for Human Evaluation of AI-Generated Content

Thanh Vu, Richi Nayak, and Thiru Balasubramaniam

arXiv:2512.08273 · cs.AI · December 10, 2025

Open Access

TL;DR

This paper presents Generative Agents that reliably simulate human judgment to evaluate AI-generated content, reducing costs and time compared to traditional human assessments.

Contribution

The paper introduces a novel automated evaluation method that uses Generative Agents to accurately assess the quality of AI-generated content, streamlining the evaluation process.

Findings

01 Agents effectively mimic human ratings on content quality

02 Evaluation process is faster and more cost-efficient

03 Supports improved content generation for business use

Abstract

Modern businesses are increasingly challenged by the time and expense required to generate and assess high-quality content. Human writers face time constraints, and extrinsic evaluations can be costly. While Large Language Models (LLMs) offer potential in content creation, concerns about the quality of AI-generated content persist. Traditional evaluation methods, like human surveys, further add operational costs, highlighting the need for efficient, automated solutions. This research introduces Generative Agents as a means to tackle these challenges. These agents can rapidly and cost-effectively evaluate AI-generated content, simulating human judgment by rating aspects such as coherence, interestingness, clarity, fairness, and relevance. By incorporating these agents, businesses can streamline content generation and ensure consistent, high-quality output while minimizing reliance on…
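The abstract describes agents that rate content on coherence, interestingness, clarity, fairness, and relevance as a stand-in for human judgment. One way to check how closely such agent ratings track human ratings is sketched below. This is a minimal illustration under stated assumptions, not the authors' evaluation protocol: the numeric rating scale, the function names `pearson` and `agreement`, and the choice of mean absolute error and Pearson correlation as agreement measures are all hypothetical.

```python
from statistics import mean

# Aspect names taken from the abstract; everything else is an assumption.
ASPECTS = ["coherence", "interestingness", "clarity", "fairness", "relevance"]


def pearson(xs, ys):
    """Pearson correlation of two equal-length rating lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined when one rater gives constant scores.
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def agreement(human, agent):
    """Per-aspect agreement between human and agent ratings.

    `human` and `agent` map each aspect name to a list of ratings,
    one rating per evaluated item, in the same item order.
    """
    report = {}
    for aspect in ASPECTS:
        h, a = human[aspect], agent[aspect]
        if len(h) != len(a):
            raise ValueError(f"rating counts differ for {aspect!r}")
        report[aspect] = {
            "mae": mean(abs(x - y) for x, y in zip(h, a)),
            "pearson": pearson(h, a),
        }
    return report


if __name__ == "__main__":
    # Made-up toy ratings for illustration only, not data from the paper.
    human = {aspect: [1, 2, 3, 4, 5] for aspect in ASPECTS}
    agent = {aspect: [2, 3, 4, 5, 5] for aspect in ASPECTS}
    for aspect, scores in agreement(human, agent).items():
        print(aspect, scores)
```

A low mean absolute error together with a high correlation on each aspect would suggest that agent ratings follow human ratings closely; other agreement measures, such as rank correlation, could be swapped in without changing the structure.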

Taxonomy

Topics: AI in Service Interactions · Artificial Intelligence in Healthcare and Education · Topic Modeling