RULERS: Locked Rubrics and Evidence-Anchored Scoring for Robust LLM Evaluation