Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach
Jiayao Zhang, Hongming Zhang, Zhun Deng, Dan Roth

TL;DR
This paper investigates fairness disparities in peer review using large language models, analyzing biases related to author attributes and proposing baseline models for review generation and scoring.
Contribution
It introduces a comprehensive database for ICLR reviews, studies fairness disparities with language models, and provides baseline models for review automation tasks.
Findings
Bias varies across author attributes
Textual features help reduce predictive biases
Baseline models for review generation and scoring are established
Abstract
The double-blind peer review mechanism has become the skeleton of academic research across multiple disciplines, including computer science, yet several studies have questioned the quality of peer reviews and raised concerns about potential biases in the process. In this paper, we conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs). We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) from 2017 to date by aggregating data from OpenReview, Google Scholar, arXiv, and CSRankings, and extracting high-level features using language models. We postulate and study fairness disparities on multiple protected attributes of interest, including author gender, geography, and author and institutional prestige. We observe that the level of disparity…
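To make the notion of a fairness disparity concrete, one common measure is the demographic parity gap: the difference in acceptance rates between groups defined by a protected attribute. The sketch below is illustrative only and is not necessarily the measure used in the paper; the function name `acceptance_rate_gap`, the record fields `region` and `accepted`, and the toy data are all hypothetical.

```python
from collections import defaultdict


def acceptance_rate_gap(records, attribute):
    """Return (gap, rates) for a list of submission records.

    rates maps each group value of `attribute` to its acceptance rate;
    gap is the largest rate minus the smallest rate across groups.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [accepted, total]
    for rec in records:
        group = rec[attribute]
        counts[group][0] += int(rec["accepted"])
        counts[group][1] += 1
    rates = {g: acc / total for g, (acc, total) in counts.items()}
    gap = max(rates.values()) - min(rates.values())
    return gap, rates


# Hypothetical toy data, not drawn from the ICLR database.
toy = [
    {"region": "A", "accepted": True},
    {"region": "A", "accepted": False},
    {"region": "B", "accepted": True},
    {"region": "B", "accepted": True},
]
gap, rates = acceptance_rate_gap(toy, "region")
print(gap, rates)
```

A gap of zero would mean every group is accepted at the same rate. A raw gap like this does not control for differences in submission quality, which is why a study of this kind would also condition on other features.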
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems
