Limitations of Automatic Relevance Assessments with Large Language Models for Fair and Reliable Retrieval Evaluation