CriterAlign: Criterion-Centric Rationale Alignment for Code Preference Judging

Zhenyu Li; Aleksandar Cvejic; Zehui Chen; Peter Wonka

arXiv:2605.19665·cs.SE·May 20, 2026

TL;DR

CriterAlign is a framework for pairwise code preference prediction. It replaces pointwise rubric scoring with criterion-level pairwise judgments and adds Human-Preference-Aligned Guidance (HPAG), and it outperforms monolithic judges.

Contribution

The paper introduces a criterion-centric pairwise judging pipeline and Human-Preference-Aligned Guidance (HPAG), which together improve pairwise code preference accuracy over monolithic judges.

Findings

01. CriterAlign improves accuracy from 60.4% to 66.3% on BigCodeReward.

02. Pairwise criterion design significantly contributes to performance gains.

03. HPAG effectively aligns judgments with human preferences.
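
To put the reported BigCodeReward numbers in context, the short calculation below derives the absolute gain, the relative gain, and the relative reduction in error rate from the two accuracies quoted above. The variable names are illustrative, and only the two accuracy figures come from this page.

```python
# Accuracies quoted in the findings (percent). Everything else is derived.
acc_before = 60.4
acc_after = 66.3

# Absolute gain in percentage points.
absolute_gain = acc_after - acc_before

# Gain relative to the starting accuracy.
relative_gain = absolute_gain / acc_before * 100

# Share of the original errors that were eliminated.
error_reduction = absolute_gain / (100 - acc_before) * 100

print(f"absolute gain:   {absolute_gain:.1f} points")
print(f"relative gain:   {relative_gain:.1f}%")
print(f"error reduction: {error_reduction:.1f}%")
```

The gain is 5.9 percentage points, which is roughly a 9.8% relative improvement in accuracy and roughly a 14.9% relative reduction in error rate.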

Abstract

Pairwise human preference prediction is central to evaluating code-generation systems, where quality often depends on task-specific trade-offs beyond functional correctness. While rubric-based LLM judges improve interpretability by decomposing evaluation into explicit criteria, most existing pipelines remain pointwise: they score each response independently and derive preferences by comparing aggregated scores. We show that this design is poorly matched to pairwise code preference prediction and can underperform a strong monolithic judge. We propose CriterAlign, a criterion-centric framework that adapts rubric-based judging to pairwise preference evaluation through direct criterion-level pairwise judgments, tie-driven criterion refinement, swap-consistency filtering, and final pairwise synthesis. We further introduce Human-Preference-Aligned Guidance (HPAG), synthesized offline from…
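
The abstract names swap-consistency filtering as one stage of the pipeline but does not spell out its mechanics. The following is a minimal sketch of one plausible reading: each criterion is judged twice with the response order swapped, and a criterion's verdict is kept only if both orderings agree. All names here (`toy_judge`, `swap_consistent_verdict`, `predict_preference`, the criterion labels) are hypothetical. The toy judge is a string heuristic standing in for an LLM call, and the majority vote is a simple stand-in for the paper's final pairwise synthesis step. Tie-driven criterion refinement and HPAG are not modeled.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A verdict is "first", "second", or "tie", relative to argument order.
Judge = Callable[[str, str, str], str]

FLIP = {"first": "second", "second": "first", "tie": "tie"}


def swap_consistent_verdict(judge: Judge, criterion: str, a: str, b: str) -> Optional[str]:
    """Judge (a, b) and (b, a); return the verdict only if both orderings agree."""
    forward = judge(criterion, a, b)
    # Map the swapped-order verdict back into the (a, b) frame of reference.
    backward = FLIP[judge(criterion, b, a)]
    return forward if forward == backward else None


def predict_preference(
    judge: Judge, criteria: List[str], a: str, b: str
) -> Tuple[str, Dict[str, str]]:
    """Keep swap-consistent criteria, then majority-vote (assumed aggregation)."""
    kept: Dict[str, str] = {}
    for criterion in criteria:
        verdict = swap_consistent_verdict(judge, criterion, a, b)
        if verdict is not None:
            kept[criterion] = verdict
    wins_a = sum(1 for v in kept.values() if v == "first")
    wins_b = sum(1 for v in kept.values() if v == "second")
    overall = "first" if wins_a > wins_b else "second" if wins_b > wins_a else "tie"
    return overall, kept


def toy_judge(criterion: str, first: str, second: str) -> str:
    """Hypothetical stand-in for an LLM judge, using simple string checks."""
    if criterion == "position_biased":
        # Simulates a judge that always favors whichever response comes first.
        return "first"
    marker = {"has_docstring": '"""', "has_type_hints": "->"}[criterion]
    in_first, in_second = marker in first, marker in second
    if in_first == in_second:
        return "tie"
    return "first" if in_first else "second"


response_a = 'def add(x: int, y: int) -> int:\n    """Add two integers."""\n    return x + y'
response_b = "def add(x, y):\n    return x + y"
criteria = ["has_docstring", "has_type_hints", "position_biased"]

overall, kept = predict_preference(toy_judge, criteria, response_a, response_b)
print(overall, kept)
```

In this toy run the `position_biased` criterion gives contradictory verdicts under the two orderings and is dropped, so the final preference rests only on the two criteria whose verdicts survive the swap.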
