TL;DR
This paper shows that decomposing claims into questions for fact-checking can be automated: smaller fine-tuned models outperform large language models at question generation, and in some cases machine-generated questions retrieve more effective evidence than human-written ones.
Contribution
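It shows that smaller generative models, fine-tuned for question generation with data augmentation from various datasets, can outperform large language models by up to 8%, and that the questions they generate can in some cases retrieve more effective evidence than human-written questions.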
Findings
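Smaller fine-tuned generative models outperform large language models by up to 8% in question generation.
In some cases, evidence retrieved with machine-generated questions is significantly more effective for fact-checking than evidence retrieved with human-written questions.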
A manual evaluation of the decomposed questions assesses the quality of the generated questions.
Abstract
Verifying fact-checking claims poses a significant challenge, even for humans. Recent approaches have demonstrated that decomposing claims into relevant questions to gather evidence enhances the efficiency of the fact-checking process. In this paper, we provide empirical evidence showing that this question decomposition can be effectively automated. We demonstrate that smaller generative models, fine-tuned for the question generation task using data augmentation from various datasets, outperform large language models by up to 8%. Surprisingly, in some cases, the evidence retrieved using machine-generated questions proves to be significantly more effective for fact-checking than that obtained from human-written questions. We also perform manual evaluation of the decomposed questions to assess the quality of the questions generated.
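The question-driven evidence gathering described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the claim, the questions (which stand in for the output of a question generation model), the passages, and the token-overlap scoring in `overlap_score` are all hypothetical, chosen only to show how each decomposed question selects its own evidence.

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(question, passage):
    # Toy relevance score: number of tokens shared by question and passage.
    # A real system would use a proper retriever; this is an assumption.
    q = Counter(tokenize(question))
    p = Counter(tokenize(passage))
    return sum((q & p).values())

def retrieve_evidence(questions, passages, top_k=1):
    # For each question, keep the top_k passages by overlap score.
    evidence = {}
    for question in questions:
        ranked = sorted(
            passages,
            key=lambda passage: overlap_score(question, passage),
            reverse=True,
        )
        evidence[question] = ranked[:top_k]
    return evidence

# Hypothetical claim and its decomposition into questions.
claim = "The city doubled its police budget in 2020."
questions = [
    "What was the city police budget in 2019?",
    "What was the city police budget in 2020?",
]

# Hypothetical evidence pool.
passages = [
    "The 2019 police budget was 50 million dollars.",
    "The 2020 police budget was 55 million dollars.",
    "The city opened a new library in 2021.",
]

evidence = retrieve_evidence(questions, passages)
for question, hits in evidence.items():
    print(question, "->", hits)
```

Each question pulls the passage for its own year, so the two retrieved figures can be compared against the claim. Retrieving with the undecomposed claim alone would not distinguish the 2019 passage from the unrelated one under this scoring.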
