Conversation for Non-verifiable Learning: Self-Evolving LLMs through Meta-Evaluation
Yuan Sui, Bryan Hooi

TL;DR
The paper introduces CoNL, a self-evolving framework for LLMs that uses multi-agent self-play to improve evaluation and generation capabilities without external ground truth.
Contribution
It presents a novel meta-evaluation framework enabling LLMs to self-improve through structured multi-agent conversations and critique-based training.
Findings
CoNL outperforms self-rewarding baselines on several benchmarks.
The framework maintains stable training while enhancing evaluation and generation.
Meta-evaluation improves LLM performance without external labels.
Abstract
Training large language models (LLMs) for non-verifiable tasks, such as creative writing, dialogue, and ethical reasoning, remains challenging due to the absence of ground-truth labels. While LLM-as-Judge approaches offer a scalable alternative to human feedback, they face a fundamental limitation: performance is constrained by the evaluator's own quality. If the judge cannot recognize good solutions, it cannot provide useful training signals, and evaluation biases (e.g., favoring verbosity over quality) remain unaddressed. This motivates meta-evaluation: the ability to evaluate and improve the evaluator itself. We introduce CoNL, a framework that unifies generation, evaluation, and meta-evaluation through multi-agent self-play. Our key insight: critique quality can be measured by whether it helps others improve their solutions. In CoNL, multiple agents sharing the same policy engage in…
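The key insight above, that a critique is good if it helps its recipient improve, can be illustrated with a small sketch. This is not the authors' implementation: the `Round` record, the `critique_rewards` function, and the assumption that each solution already has a numeric quality score before and after revision (for example, from some peer rating among the agents) are all hypothetical and introduced only for illustration. The abstract as excerpted here does not say how solutions are scored.

```python
from dataclasses import dataclass


@dataclass
class Round:
    """One critique exchange (hypothetical record, not from the paper)."""
    critic: str          # agent that wrote the critique
    score_before: float  # assumed quality score of the solution before revision
    score_after: float   # assumed quality score after the author revised it


def critique_rewards(rounds):
    """Average score gain attributed to each critic.

    A positive value means the critic's feedback tended to make
    solutions better; a negative value means it tended to hurt.
    """
    totals = {}
    counts = {}
    for r in rounds:
        gain = r.score_after - r.score_before
        totals[r.critic] = totals.get(r.critic, 0.0) + gain
        counts[r.critic] = counts.get(r.critic, 0) + 1
    return {critic: totals[critic] / counts[critic] for critic in totals}


# Made-up scores purely for demonstration.
rounds = [
    Round(critic="agent_a", score_before=0.4, score_after=0.7),
    Round(critic="agent_a", score_before=0.5, score_after=0.6),
    Round(critic="agent_b", score_before=0.6, score_after=0.5),
]
rewards = critique_rewards(rounds)
```

In this toy data, `agent_a` receives a positive reward because the solutions it critiqued improved, while `agent_b` receives a negative one. A signal of this shape rewards critiques for their downstream effect rather than for their length or tone, which is the property the abstract highlights.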
