DualReward: A Dynamic Reinforcement Learning Framework for Cloze Tests Distractor Generation

Tianyou Huang; Xinglu Chen; Jingshen Zhang; Xinying Qiu; Ruiying Niu

arXiv:2507.11875·cs.CL·July 17, 2025

TL;DR

DualReward is a reinforcement learning framework for generating distractors for cloze tests. It rewards human-written gold-standard distractors and model-generated candidates differently and scales the reward adaptively, outperforming existing methods, most clearly on diverse, cross-domain data.

Contribution

The paper introduces a dual reward structure with adaptive scaling for distractor generation, improving on state-of-the-art baselines on both a passage-level dataset (CLOTH-F) and a sentence-level dataset (MCQ).

Findings

01. Modest but consistent improvement on the homogeneous, passage-level CLOTH-F dataset.

02. More substantial gains (3.48-3.86% in P@1) on the diverse, sentence-level MCQ dataset.

03. Effective handling of varied question types and domains.
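The P@1 figure in the findings above is usually read as precision at rank 1: the share of questions whose top-ranked generated distractor matches one of the human-written gold distractors. The sketch below illustrates that reading only. The function names, the case-insensitive exact matching rule, and the toy data are assumptions for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def precision_at_k(generated: Sequence[str], gold: Sequence[str], k: int = 1) -> float:
    """Fraction of the top-k generated distractors found in the gold set.

    Assumption: matching is exact after lowercasing and stripping whitespace.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    gold_set = {g.strip().lower() for g in gold}
    top_k = generated[:k]
    hits = sum(1 for d in top_k if d.strip().lower() in gold_set)
    # Dividing by k (not len(top_k)) penalises models that return too few candidates.
    return hits / k


def mean_precision_at_k(items: List[Tuple[Sequence[str], Sequence[str]]], k: int = 1) -> float:
    """Average precision@k over (generated, gold) pairs; 0.0 for an empty list."""
    if not items:
        return 0.0
    return sum(precision_at_k(gen, gold, k) for gen, gold in items) / len(items)


# Toy data, invented for illustration only.
toy_items = [
    (["cat", "dog"], ["cat", "bird", "fish"]),   # top-1 is a gold distractor
    (["red", "blue"], ["blue", "green", "pink"]),  # top-1 is not
]
p_at_1 = mean_precision_at_k(toy_items, k=1)
```

On the two toy items, one top-ranked candidate matches a gold distractor and one does not, so `p_at_1` is 0.5.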

Abstract

This paper introduces DualReward, a novel reinforcement learning framework for automatic distractor generation in cloze tests. Unlike conventional approaches that rely primarily on supervised learning or static generative models, our method employs a dual reward structure with adaptive scaling that differentiates between human-created gold standard distractors and model-generated candidates. The framework dynamically adjusts reward signal intensity based on model performance and confidence. We evaluate our approach on both passage-level (CLOTH-F) and sentence-level (MCQ) cloze test datasets, demonstrating consistent improvements over state-of-the-art baselines. Experimental results show that our adaptive reward scaling mechanism provides modest but consistent benefits on homogeneous datasets (CLOTH-F) and more substantial improvements (3.48-3.86% in P@1) on diverse, cross-domain data…
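The abstract names two ingredients: different rewards for gold and model-generated distractors, and a reward intensity that adapts to model confidence. A minimal sketch of how such a scheme could be wired into a REINFORCE-style loss follows. Everything specific in it is an assumption made for illustration: the base reward values, the linear scaling rule, the `alpha` parameter, and all function names are hypothetical and are not the paper's actual formulation.

```python
import math

# Hypothetical base rewards: gold (human-written) distractors are rewarded
# more than model-generated candidates. The values are illustrative only.
GOLD_BASE_REWARD = 1.0
MODEL_BASE_REWARD = 0.5


def adaptive_scale(confidence: float, alpha: float = 1.0) -> float:
    """Hypothetical scaling rule: amplify the reward when confidence is low.

    `confidence` is the probability the model assigns to the distractor.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return 1.0 + alpha * (1.0 - confidence)


def dual_reward(is_gold: bool, confidence: float, alpha: float = 1.0) -> float:
    """Base reward chosen by distractor source, then scaled by confidence."""
    base = GOLD_BASE_REWARD if is_gold else MODEL_BASE_REWARD
    return base * adaptive_scale(confidence, alpha)


def policy_gradient_loss(log_prob: float, reward: float) -> float:
    """REINFORCE-style loss for one sample: minimise -reward * log p(sample)."""
    return -reward * log_prob


# Example: a gold distractor the model currently assigns probability 0.2.
confidence = 0.2
reward = dual_reward(is_gold=True, confidence=confidence)
loss = policy_gradient_loss(math.log(confidence), reward)
```

Under this assumed rule, a gold distractor the model is unsure about yields a larger reward than one it already predicts confidently, so the update pushes harder where the model is weakest. The paper may define the scaling quite differently.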

Taxonomy

Topics: Reinforcement Learning in Robotics · Fuzzy Logic and Control Systems · Robot Manipulation and Learning