RAPT: Retrieval-Augmented Post-hoc Thresholding for Multi-Label Classification
Lasal Jayawardena, Nirmalie Wiratunga, Ikechukwu Nkisi-Orji, Darren Nicol

TL;DR
RAPT is a versatile, post-hoc thresholding method that enhances multi-label classification accuracy by adaptively selecting label thresholds based on retrieval of similar cases, without retraining classifiers.
Contribution
It introduces a model-agnostic, retrieval-augmented wrapper that improves label set selection in industrial multi-label document understanding tasks.
Findings
RAPT outperforms static thresholding baselines across multiple benchmarks.
RAPT achieves 0.87 Macro-F1 with metric learners, surpassing transformer variants and LLMs.
RAPT requires significantly less inference time and GPU memory than LLM baselines.
Abstract
Industrial multi-label document understanding pipelines score candidate labels and threshold or rank them to form a label set per document. This early selection step directly affects the accuracy of downstream information extraction from the document, as well as the associated verification effort. In practice, OCR noise, label imbalance, instance-dependent label cardinality, and asymmetric error costs make global score thresholds brittle and hard to maintain as document formats evolve. We present RAPT, a deployment-oriented, retrieval-augmented score thresholding wrapper, applied post-hoc to improve label set selection without retraining the underlying classifier. RAPT is model-agnostic: any predictor that provides document representations for similarity search and per-label confidence scores can be used, including metric learning encoders and fine-tuned transformer…
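
The abstract is cut off before it states how RAPT derives thresholds from retrieved cases, so the following is only a minimal sketch of the general idea (retrieve similar cases, then set per-label thresholds from them), not the authors' algorithm. Everything in it is an assumption made for illustration: the function names, cosine similarity as the retrieval metric, the threshold grid, the choice of maximizing F1 over the retrieved neighbours, and the fallback to a global default when a label does not occur among the neighbours.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_neighbors(query_emb, case_embs, k):
    """Indices of the k stored cases most similar to the query (assumed metric: cosine)."""
    ranked = sorted(range(len(case_embs)),
                    key=lambda i: cosine(query_emb, case_embs[i]),
                    reverse=True)
    return ranked[:k]


def f1(pred, true):
    """Binary F1 over parallel lists of booleans; defined as 1.0 when both are all False."""
    tp = sum(p and t for p, t in zip(pred, true))
    fp = sum(p and not t for p, t in zip(pred, true))
    fn = sum(t and not p for p, t in zip(pred, true))
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def local_thresholds(nbr_scores, nbr_labels, grid, default):
    """Hypothetical rule: per label, pick the grid threshold with the best F1 on the neighbours.

    Ties resolve to the lowest threshold in the grid. Labels that never occur
    among the neighbours keep the global default threshold.
    """
    n_labels = len(nbr_scores[0])
    thresholds = [default] * n_labels
    for j in range(n_labels):
        true = [bool(row[j]) for row in nbr_labels]
        if not any(true):
            continue  # no local evidence for this label
        col = [row[j] for row in nbr_scores]
        thresholds[j] = max(grid, key=lambda t: f1([s >= t for s in col], true))
    return thresholds


def predict_label_set(query_emb, query_scores, case_embs, case_scores, case_labels,
                      k=3, grid=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
                      default=0.5):
    """Threshold the classifier's scores for one document using retrieved cases.

    The classifier itself is untouched: only its per-label scores and the
    document embeddings are consumed, which is what makes the step post-hoc.
    """
    idx = retrieve_neighbors(query_emb, case_embs, k)
    thresholds = local_thresholds([case_scores[i] for i in idx],
                                  [case_labels[i] for i in idx],
                                  grid, default)
    predicted = [s >= t for s, t in zip(query_scores, thresholds)]
    return predicted, thresholds
```

In this sketch, a document whose nearest stored cases all carry a label with low classifier scores gets a lower threshold for that label than a global cut-off of 0.5 would give it, so the label can still be selected. This illustrates the instance-dependent behaviour the abstract contrasts with brittle global thresholds.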
