H-COAL: Human Correction of AI-Generated Labels for Biomedical Named Entity Recognition
Xiaojing Duan, John P. Lalor

TL;DR
H-COAL is a framework that leverages human correction of AI-generated labels to efficiently improve biomedical named entity recognition, significantly reducing human effort while approaching expert-level performance.
Contribution
This work introduces a novel framework for selectively correcting AI-generated labels, demonstrating substantial performance gains with minimal human effort.
Findings
Correcting 5% of labels closes up to 64% of the AI-human performance gap.
Correcting 20% of labels closes up to 86% of the AI-human performance gap.
Selective correction approaches gold-standard (100% human-labeled) performance with significantly less human effort.
Abstract
With the rapid advancement of machine learning models for NLP tasks, collecting high-fidelity labels from AI models is a realistic possibility. Firms now make AI available to customers via predictions as a service (PaaS). This includes PaaS products for healthcare. It is unclear whether these labels can be used for training a local model without expensive annotation checking by in-house experts. In this work, we propose a new framework for Human Correction of AI-Generated Labels (H-COAL). By ranking AI-generated outputs, one can selectively correct labels and approach gold standard performance (100% human labeling) with significantly less human effort. We show that correcting 5% of labels can close the AI-human performance gap by up to 64% relative improvement, and correcting 20% of labels can close the performance gap by up to 86% relative improvement.
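The selective-correction idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the ranking criterion, so this sketch assumes each AI-generated label comes with a confidence score and that the least-confident labels are sent to a human first. The function names (`select_for_correction`, `apply_corrections`, `gap_closed`) are hypothetical and not taken from the paper. The gap formula is one plausible reading of "closing the performance gap": the share of the distance between the AI-only score and the gold-standard score that is recovered after correction.

```python
from typing import List, Sequence


def select_for_correction(confidences: Sequence[float], budget: float) -> List[int]:
    """Return indices of the least-confident AI labels, limited to a
    `budget` fraction of all labels (assumed ranking rule, not the paper's)."""
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must be between 0 and 1")
    k = int(len(confidences) * budget)  # number of labels a human will check
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return ranked[:k]


def apply_corrections(ai_labels: Sequence[str],
                      human_labels: Sequence[str],
                      indices: Sequence[int]) -> List[str]:
    """Replace the AI label with the human label at the selected indices."""
    corrected = list(ai_labels)
    for i in indices:
        corrected[i] = human_labels[i]
    return corrected


def gap_closed(ai_score: float, corrected_score: float, gold_score: float) -> float:
    """Fraction of the AI-to-gold performance gap recovered after correction."""
    gap = gold_score - ai_score
    if gap == 0:
        raise ValueError("AI and gold scores are equal; there is no gap to close")
    return (corrected_score - ai_score) / gap
```

With a budget of 0.05, only the 5% of labels ranked lowest would be reviewed; the remaining AI-generated labels are used unchanged to train the local model.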
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
