Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference

William Thorne; Ambrose Robinson; Bohua Peng; Chenghua Lin; Diana Maynard

arXiv:2410.08289·cs.CL·October 14, 2024

Open Access

TL;DR

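This paper introduces a reinforcement learning approach for generating more challenging domain-specific question-answering datasets for the cultural heritage sector. It relies on synthetic preference data and existing models, so that heritage institutions can evaluate the answering component of their systems without costly manual annotation.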

Contribution

It proposes a method that applies Reinforcement Learning from Human Feedback (RLHF) with synthetic preference data to increase question difficulty, addressing the lack of specialised Machine Reading Comprehension (MRC) datasets for cultural heritage.

Findings

01. The method effectively increases question difficulty as validated by human evaluation.

02. Empirical results show improved question complexity without sacrificing answerability.

03. Open-source tools facilitate reproducibility and adaptation for future research.

Abstract

As the cultural heritage sector increasingly adopts technologies like Retrieval-Augmented Generation (RAG) to provide more personalised search experiences and enable conversations with collections data, the demand for specialised evaluation datasets has grown. While end-to-end system testing is essential, it's equally important to assess individual components. We target the final, answering task, which is well-suited to Machine Reading Comprehension (MRC). Although existing MRC datasets address general domains, they lack the specificity needed for cultural heritage information. Unfortunately, the manual creation of such datasets is prohibitively expensive for most heritage institutions. This paper presents a cost-effective approach for generating domain-specific MRC datasets with increased difficulty using Reinforcement Learning from Human Feedback (RLHF) from synthetic preference data.…
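
The abstract is truncated here and does not spell out how the synthetic preference data is built. As a minimal sketch only, and not the authors' confirmed procedure, one plausible reading consistent with the TL;DR's mention of existing models is to treat a question as harder when fewer existing question-answering models answer it correctly, and then to pair questions so that the harder one is the preferred ("chosen") item. The function names, the `margin` parameter, and the toy questions below are all hypothetical.

```python
from itertools import combinations


def difficulty(correct_flags):
    """Assumed difficulty proxy: fraction of QA models that got the question wrong."""
    return 1.0 - sum(correct_flags) / len(correct_flags)


def build_preference_pairs(questions, margin=0.0):
    """Turn per-model correctness into (chosen, rejected) preference pairs.

    questions: dict mapping question text -> list of bools, one per QA model,
               True if that model answered correctly (hypothetical input format).
    margin:    pairs whose difficulty gap is <= margin are skipped as ties.
    """
    scored = {q: difficulty(flags) for q, flags in questions.items()}
    pairs = []
    for a, b in combinations(scored, 2):
        gap = scored[a] - scored[b]
        if abs(gap) <= margin:
            continue  # no clear preference between these two questions
        # The harder question is the preferred one.
        pairs.append((a, b) if gap > 0 else (b, a))
    return pairs


# Toy, made-up data: three questions scored against three QA models.
toy_questions = {
    "Who made the vase?": [True, True, True],
    "Which workshop's glaze technique does the vase reflect?": [False, True, False],
    "Why was the vase moved between collections?": [False, False, True],
}
toy_pairs = build_preference_pairs(toy_questions)
```

Pairs in this form could then feed a reward model or a preference-based fine-tuning step; that downstream choice is left open here because the truncated abstract does not specify it.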

Taxonomy

Topics: Speech and dialogue systems · Advanced Text Analysis Techniques · Multi-Agent Systems and Negotiation

Methods: Entropy Regularization · Sparse Evolutionary Training · Proximal Policy Optimization