More Human, More Efficient: Aligning Annotations with Quantized SLMs

Jiayu Wang; Junyoung Lee

arXiv:2604.00586·cs.CL·April 2, 2026

TL;DR

This paper presents a method for fine-tuning a quantized 1.7B-parameter Small Language Model (SLM) to serve as a reliable, open-source evaluator and annotator that outperforms proprietary models in alignment with human annotations and in reproducibility.

Contribution

It introduces a novel fine-tuning pipeline for small, quantized LLMs that improves alignment with human annotations and offers a reproducible, privacy-preserving alternative to proprietary models.

Findings

01

Achieved a 0.23-point increase in Krippendorff's α over the best-performing state-of-the-art proprietary LLM.

02

Demonstrated the approach's effectiveness on a separate emotion classification task.

03

Provided an open-source implementation at https://github.com/jylee-k/slm-judge.

Abstract

As Large Language Model (LLM) capabilities advance, the demand for high-quality annotation of exponentially increasing text corpora has outpaced human capacity, leading to the widespread adoption of LLMs for automatic evaluation and annotation. However, proprietary LLMs often exhibit systematic biases that diverge from human expert consensus, lack reproducibility, and raise data privacy concerns. Our work examines the viability of fine-tuning a quantized Small Language Model (SLM) with 1.7B parameters on limited human-annotated data to serve as a highly aligned, deterministic evaluator and annotator. By implementing a custom, multi-dimensional rubric framework and simple augmentation and regularization techniques, the proposed approach achieves higher inter-annotator agreement (a 0.23-point increase in Krippendorff's $\alpha$) than the best-performing state-of-the-art proprietary LLM. We…
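
For readers unfamiliar with the agreement statistic quoted in the abstract, Krippendorff's α can be computed from a coincidence matrix of paired ratings. The sketch below covers the nominal-label case only. The function name `krippendorff_alpha_nominal` and the toy ratings are illustrative assumptions, not code or data from the paper or its repository, and this excerpt does not state which distance metric (nominal, ordinal, or interval) the authors use.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels (illustrative sketch).

    units: iterable of per-item rating lists, one entry per rater;
           None marks a missing rating.
    """
    # Coincidence matrix: each ordered pair of ratings within an item
    # contributes 1 / (m - 1), where m is the number of ratings on that item.
    coincidences = Counter()
    for ratings in units:
        values = [r for r in ratings if r is not None]
        m = len(values)
        if m < 2:
            continue  # items with fewer than two ratings cannot be paired
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per category and the overall number of pairable ratings.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least two pairable ratings")

    # Nominal distance: 0 for matching labels, 1 for any mismatch.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1.0 - observed / expected


# Toy example (hypothetical labels): ten items, a human rater and a model rater.
human = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
model = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
alpha = krippendorff_alpha_nominal(zip(human, model))
print(round(alpha, 3))  # 0.095
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so a 0.23-point gain is a shift on that scale. Rubric scores with a natural ordering would normally call for an ordinal or interval distance in place of the 0/1 mismatch used here.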

Code & Models

Repositories

jylee-k/slm-judge
github