Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference
William Thorne, Ambrose Robinson, Bohua Peng, Chenghua Lin, Diana Maynard

TL;DR
This paper introduces a reinforcement learning approach for generating more challenging domain-specific question-answering datasets for cultural heritage. It relies on synthetic preference data and existing models, so that heritage institutions can build evaluation datasets at low cost.
Contribution
It proposes a method that applies Reinforcement Learning from Human Feedback (RLHF) to synthetic preference data in order to increase question difficulty, addressing the lack of specialised Machine Reading Comprehension (MRC) datasets for cultural heritage.
Findings
The method effectively increases question difficulty as validated by human evaluation.
Empirical results show improved question complexity without sacrificing answerability.
Open-source tools facilitate reproducibility and adaptation for future research.
Abstract
As the cultural heritage sector increasingly adopts technologies like Retrieval-Augmented Generation (RAG) to provide more personalised search experiences and enable conversations with collections data, the demand for specialised evaluation datasets has grown. While end-to-end system testing is essential, it's equally important to assess individual components. We target the final, answering task, which is well-suited to Machine Reading Comprehension (MRC). Although existing MRC datasets address general domains, they lack the specificity needed for cultural heritage information. Unfortunately, the manual creation of such datasets is prohibitively expensive for most heritage institutions. This paper presents a cost-effective approach for generating domain-specific MRC datasets with increased difficulty using Reinforcement Learning from Human Feedback (RLHF) from synthetic preference data.…
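To make the idea of synthetic preference data concrete, here is a minimal sketch. The abstract is truncated and does not describe the pipeline, so everything below is an assumption rather than the authors' method: the `Candidate` type, the `solver_correct` flag (whether some hypothetical reference QA model answered the question), and the rule "a question the solver fails is preferred over one it answers" are all illustrative. The loss shown is the standard Bradley-Terry objective commonly used to train reward models in RLHF.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Candidate:
    """A generated question (hypothetical record type, not from the paper)."""
    context_id: str
    question: str
    solver_correct: bool  # assumed signal: did a reference QA model answer it?


def build_preference_pairs(candidates):
    """Return (chosen, rejected) question pairs within each context.

    Assumed rule: a question the solver failed is treated as harder and is
    therefore "chosen" over a question on the same context that it answered.
    """
    by_context = {}
    for c in candidates:
        by_context.setdefault(c.context_id, []).append(c)

    pairs = []
    for group in by_context.values():
        hard = [c for c in group if not c.solver_correct]
        easy = [c for c in group if c.solver_correct]
        pairs.extend((h.question, e.question) for h, e in product(hard, easy))
    return pairs


def bradley_terry_loss(reward_chosen, reward_rejected):
    """Negative log-likelihood that the chosen item outranks the rejected one."""
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)), written in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

Contexts with no hard/easy contrast contribute no pairs, which keeps the preference signal tied to a difference in difficulty on the same passage. A reward model trained on such pairs could then score generated questions during policy optimisation, again assuming this style of setup.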
Taxonomy
Topics: Speech and dialogue systems · Advanced Text Analysis Techniques · Multi-Agent Systems and Negotiation
Methods: Entropy Regularization · Sparse Evolutionary Training · Proximal Policy Optimization
