Are the confidence scores of reviewers consistent with the review content? Evidence from top conference proceedings in AI

Wenqing Wu; Haixu Xi; Chengzhi Zhang

arXiv:2505.15031 · cs.CL · May 22, 2025 · Scientometrics

TL;DR

This study investigates the alignment between reviewer confidence scores and review content in top AI conference papers, using NLP techniques to analyze text-score consistency and its relation to paper outcomes.

Contribution

It introduces a deep learning-based framework for fine-grained analysis of review content and confidence scores, revealing high consistency and insights into review fairness.

Findings

01 High text-score consistency at multiple levels

02 Higher confidence scores correlate with paper rejection

03 Validation of review fairness and expert assessment

Abstract

Peer review is vital in academia for evaluating research quality. Top AI conferences use reviewer confidence scores to ensure review reliability, but existing studies lack a fine-grained analysis of text-score consistency and may therefore miss key details. This work assesses consistency at the word, sentence, and aspect levels, applying deep learning to review data from NLP conferences. We use deep learning models to detect hedge sentences and aspects, then analyze report length, hedge word and sentence frequency, aspect mentions, and sentiment to evaluate text-score alignment. Correlation, significance, and regression tests examine how confidence scores relate to paper outcomes. The results show high text-score consistency at all three levels, and the regression shows that higher confidence scores correlate with paper rejection, a pattern that supports the validity of expert assessments and the fairness of peer review.
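The word-level part of this analysis can be illustrated with a minimal sketch. It is not the authors' pipeline: the paper detects hedges with deep learning models, whereas this example swaps in a simple lexicon lookup. The `HEDGE_WORDS` set, the helper names, and the toy reviews are all hypothetical and exist only for illustration. The sketch computes the share of hedge words in each review and a Spearman rank correlation between that share and the reviewer's confidence score.

```python
import re
from statistics import mean

# Hypothetical hedge lexicon (an assumption; the paper uses learned detectors).
HEDGE_WORDS = {"might", "may", "perhaps", "possibly", "seems",
               "unclear", "unsure", "probably"}


def hedge_ratio(review: str) -> float:
    """Fraction of tokens in the review that appear in the hedge lexicon."""
    tokens = re.findall(r"[a-z']+", review.lower())
    if not tokens:
        return 0.0
    return sum(t in HEDGE_WORDS for t in tokens) / len(tokens)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Toy data, invented for illustration: (review text, confidence score).
toy_reviews = [
    ("The method may possibly work but the evaluation seems unclear.", 2),
    ("The results might generalize, although I am unsure.", 3),
    ("The proof is correct and the experiments are thorough.", 5),
    ("The baseline comparison is incomplete and the claims are overstated.", 4),
]

ratios = [hedge_ratio(text) for text, _ in toy_reviews]
scores = [score for _, score in toy_reviews]
print("hedge ratios:", ratios)
print("Spearman rho:", spearman(ratios, scores))
```

A negative coefficient on real data would mean that less confident reviewers hedge more, which is the kind of text-score consistency the abstract describes. A rank correlation is used here because confidence scores are ordinal; the paper's own choice of correlation test is not stated in the abstract.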

Code & Models

Repositories

njust-winchy/confidence_score (PyTorch, official)
