Active PETs: Active Data Annotation Prioritisation for Few-Shot Claim Verification with Pattern Exploiting Training
Xia Zeng, Arkaitz Zubiaga

TL;DR
This paper introduces Active PETs, a method for prioritising data annotation in few-shot claim verification. It uses a weighted ensemble of Pattern Exploiting Training (PET) models to choose which unlabelled examples to annotate, improving performance when labelled data is limited.
Contribution
The paper presents Active PETs, a new weighted ensemble approach that actively selects unlabelled data for annotation, enhancing few-shot claim verification performance.
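One way to picture committee-based selection of this kind is weighted vote entropy: each ensemble member predicts a label for every unlabelled claim, votes are combined using per-model weights, and the claims on which the committee disagrees most are sent for annotation. The sketch below is a generic query-by-committee illustration, not the authors' exact scoring function. The function names, the example weights, and the use of entropy as the disagreement measure are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def weighted_vote_entropy(predictions, weights):
    """Entropy of the weighted label votes cast by a committee.

    predictions: one predicted label per committee member.
    weights: one vote weight per committee member (hypothetical values).
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    total = sum(totals.values())
    entropy = 0.0
    for votes in totals.values():
        p = votes / total
        entropy -= p * math.log(p)
    return entropy


def select_for_annotation(committee_predictions, weights, k):
    """Return the ids of the k examples with the highest committee disagreement.

    committee_predictions: dict mapping example id -> list of predicted labels,
    in the same member order as `weights`.
    """
    ranked = sorted(
        committee_predictions.items(),
        key=lambda item: weighted_vote_entropy(item[1], weights),
        reverse=True,
    )
    return [example_id for example_id, _ in ranked[:k]]


# Illustrative data: three committee members, the third given double weight.
predictions = {
    "claim_a": ["SUPPORTS", "SUPPORTS", "SUPPORTS"],
    "claim_b": ["SUPPORTS", "REFUTES", "NOT_ENOUGH_INFO"],
    "claim_c": ["SUPPORTS", "SUPPORTS", "REFUTES"],
}
member_weights = [1, 1, 2]
to_annotate = select_for_annotation(predictions, member_weights, k=2)
```

In this toy input the committee agrees fully on `claim_a`, so it scores zero and is not selected; the two claims with split votes are prioritised instead.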
Findings
Active PETs outperforms baseline data selection methods.
Active PETs-o further improves results with oversampling.
The method is effective across two technical fact-checking datasets and six pretrained language models.
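The oversampling variant mentioned in the findings can be illustrated with plain random oversampling, which duplicates examples from under-represented classes until every class matches the largest one. This summary does not describe how Active PETs-o applies oversampling, so the sketch below is an assumption about the general technique rather than the paper's procedure. The `oversample` helper and the sample data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(examples, seed=0):
    """Balance classes by duplicating randomly chosen minority-class examples.

    examples: list of (text, label) pairs.
    Returns a new list in which every label appears as often as the most
    frequent label in the input.
    """
    if not examples:
        return []
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for example in examples:
        by_label[example[1]].append(example)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw extra copies with replacement until the class reaches `target`.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Illustrative labelled set: three SUPPORTS claims and one REFUTES claim.
labelled = [
    ("claim 1", "SUPPORTS"),
    ("claim 2", "SUPPORTS"),
    ("claim 3", "SUPPORTS"),
    ("claim 4", "REFUTES"),
]
balanced = oversample(labelled)
label_counts = Counter(label for _, label in balanced)
```

Duplicating examples adds no new information, but it keeps a few-shot training set from being dominated by whichever class happened to be selected most often.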
Abstract
To mitigate the impact of the scarcity of labelled data on fact-checking systems, we focus on few-shot claim verification. Despite recent work on few-shot classification by proposing advanced language models, there is a dearth of research in data annotation prioritisation that improves the selection of the few shots to be labelled for optimal model performance. We propose Active PETs, a novel weighted approach that utilises an ensemble of Pattern Exploiting Training (PET) models based on various language models, to actively select unlabelled data as candidates for annotation. Using Active PETs for few-shot data selection shows consistent improvement over the baseline methods, on two technical fact-checking datasets and using six different pretrained language models. We show further improvement with Active PETs-o, which further integrates an oversampling strategy. Our approach enables…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Software Engineering Research · Topic Modeling
