Monocle: Hybrid Local-Global In-Context Evaluation for Long-Text Generation with Uncertainty-Based Active Learning
Xiaorong Wang, Ting Yang, Zhu Zhang, Shuo Wang, Zihan Zhou, Liner Yang, Zhiyuan Liu, Maosong Sun

TL;DR
This paper introduces Monocle, a hybrid local-global evaluation framework for long-text generation that uses divide-and-conquer scoring, human-in-the-loop learning, and active learning to improve assessment accuracy and reduce annotation costs.
Contribution
Monocle combines a divide-and-conquer evaluation method with hybrid in-context learning and active learning, integrating human feedback to improve long-text assessment.
Findings
Outperforms baseline evaluation methods in accuracy.
Effectively reduces annotation costs through active learning.
Improves correlation with human judgment in long-text evaluation.
Abstract
Assessing the quality of long-form, model-generated text is challenging, even with advanced LLM-as-a-Judge methods, due to performance degradation as input length increases. To address this issue, we propose a divide-and-conquer approach, which breaks down the comprehensive evaluation task into a series of localized scoring tasks, followed by a final global assessment. This strategy allows for more granular and manageable evaluations, ensuring that each segment of the text is assessed in isolation for both coherence and quality, while also accounting for the overall structure and consistency of the entire piece. Moreover, we introduce a hybrid in-context learning approach that leverages human annotations to enhance the performance of both local and global evaluations. By incorporating human-generated feedback directly into the evaluation process, this method allows the model to better…
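The divide-and-conquer idea described in the abstract (score localized segments first, then make a final global assessment) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the word-based chunking, the function names `split_into_chunks` and `evaluate_long_text`, and the toy scorers, which merely stand in for LLM judge calls.

```python
from statistics import mean
from typing import Callable, List, Tuple


def split_into_chunks(text: str, max_words: int = 200) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Word-count chunking is an assumption; the paper may segment differently.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def evaluate_long_text(
    text: str,
    local_scorer: Callable[[str], float],
    global_scorer: Callable[[str, List[float]], float],
    max_words: int = 200,
) -> Tuple[List[float], float]:
    """Score each chunk locally, then produce one global score.

    Both scorers are hypothetical callables; in practice each would wrap
    an LLM-as-a-Judge prompt, optionally with human-annotated examples.
    """
    chunks = split_into_chunks(text, max_words)
    local_scores = [local_scorer(chunk) for chunk in chunks]
    global_score = global_scorer(text, local_scores)
    return local_scores, global_score


def toy_local_scorer(chunk: str) -> float:
    """Placeholder judge: lexical diversity in [0, 1], not a real quality metric."""
    words = chunk.lower().split()
    return len(set(words)) / len(words) if words else 0.0


def toy_global_scorer(text: str, local_scores: List[float]) -> float:
    """Placeholder global judge: the mean of the local scores."""
    return mean(local_scores) if local_scores else 0.0
```

Passing the scorers in as callables keeps the local and global stages independent, so either one could be replaced by a real judge model without changing the control flow.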
Taxonomy
Topics: Education and Critical Thinking Development · Topic Modeling · Advanced Text Analysis Techniques
Methods: ALIGN
