The ICML 2023 Ranking Experiment: Examining Author Self-Assessment in ML/AI Peer Review

Buxin Su; Jiayao Zhang; Natalie Collina; Yuling Yan; Didong Li; Kyunghyun Cho; Jianqing Fan; Aaron Roth; Weijie Su

arXiv:2408.13430·stat.AP·September 24, 2025

TL;DR

This study analyzes how author-provided rankings collected during the ICML 2023 review process can be used to calibrate review scores, improving the accuracy of peer review assessments and supporting conference decision processes.

Contribution

The paper presents an empirical analysis of author-provided rankings, demonstrates their effectiveness in calibrating review scores, and proposes practical applications for conference review management.

Findings

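01 Author-provided rankings, applied through the Isotonic Mechanism, improve the calibration of review scores.

02 Ranking-calibrated scores outperform raw review scores in estimating the ground-truth "expected review scores", under both squared and absolute error.

03 Proposed practical applications include aiding senior chairs and award decisions.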

Abstract

We conducted an experiment during the review process of the 2023 International Conference on Machine Learning (ICML), asking authors with multiple submissions to rank their papers based on perceived quality. In total, we received 1,342 rankings, each from a different author, covering 2,592 submissions. In this paper, we present an empirical analysis of how author-provided rankings could be leveraged to improve peer review processes at machine learning conferences. We focus on the Isotonic Mechanism, which calibrates raw review scores using the author-provided rankings. Our analysis shows that these ranking-calibrated scores outperform the raw review scores in estimating the ground truth "expected review scores" in terms of both squared and absolute error metrics. Furthermore, we propose several cautious, low-risk applications of the Isotonic Mechanism and author-provided rankings in…
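The calibration step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the Isotonic Mechanism amounts to a least-squares projection of one author's raw scores onto the set of score vectors that respect that author's ranking, computed here with the pool-adjacent-violators algorithm. The function name `isotonic_calibrate`, its parameters, and the example scores are hypothetical and are not taken from the authors' repository.

```python
from typing import List, Sequence


def isotonic_calibrate(raw_scores: Sequence[float], ranking: Sequence[int]) -> List[float]:
    """Least-squares fit of raw scores subject to the author's ranking.

    `ranking` lists paper indices from the author's perceived best to worst
    (an assumed input convention). Returns scores in the original paper order.
    """
    if sorted(ranking) != list(range(len(raw_scores))):
        raise ValueError("ranking must be a permutation of the paper indices")

    # Arrange scores best-to-worst; the fitted values must be non-increasing.
    ordered = [float(raw_scores[i]) for i in ranking]

    # Pool Adjacent Violators: each block stores [sum of scores, count].
    blocks: List[List[float]] = []
    for value in ordered:
        blocks.append([value, 1])
        # Merge while an earlier block's mean is below the following block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    # Expand each block into its mean, one value per pooled paper.
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))

    # Map the fitted values back to the original paper indices.
    calibrated = [0.0] * len(raw_scores)
    for position, paper in enumerate(ranking):
        calibrated[paper] = fitted[position]
    return calibrated
```

For example, with made-up raw scores `[5.0, 7.0, 3.0]` and the ranking `[0, 1, 2]` (paper 0 claimed best), the first two scores contradict the ranking and are pooled to their mean, so the function returns `[6.0, 6.0, 3.0]`. Scores that already agree with the ranking are returned unchanged.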

Code & Models

Repositories

BuxinSu/ICML_Ranking (official)

Taxonomy

Topics: Reliability and Agreement in Measurement · Computational and Text Analysis Methods · Natural Language Processing Techniques
