From Fragments to Facts: A Curriculum-Driven DPO Approach for Generating Hindi News Veracity Explanations
Pulkit Bansal, Raghvendra Kumar, Shakti Singh, Adam Jatowt, Sriparna Saha

TL;DR
This paper presents a curriculum-driven DPO framework for generating reliable Hindi news explanations, addressing misinformation in low-resource languages by aligning machine outputs with human reasoning.
Contribution
It introduces a DPO-based approach that combines curriculum learning with two new loss-function parameters, Actuality and Finesse, to improve explanation quality for Hindi news verification.
Findings
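The framework generates coherent explanations with both LLMs (Mistral, Llama, Gemma) and PLMs (mBART, mT5)
The Actuality and Finesse parameters added to the DPO loss improve explanation quality and consistency
The method is presented as a scalable approach to misinformation detection in low-resource languages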
Abstract
In an era of rampant misinformation, generating reliable news explanations is vital, especially for under-represented languages like Hindi. Lacking robust automated tools, Hindi faces challenges in scaling misinformation detection. To bridge this gap, we propose a novel framework integrating Direct Preference Optimization (DPO) with curriculum learning to align machine-generated explanations with human reasoning. Fact-checked explanations from credible sources serve as preferred responses, while LLM outputs highlight system limitations and serve as non-preferred responses. To refine task-specific alignment, we introduce two key parameters -- Actuality and Finesse -- into the DPO loss function, enhancing explanation quality and consistency. Experiments with LLMs (Mistral, Llama, Gemma) and PLMs (mBART, mT5) confirm the framework's effectiveness in generating coherent, contextually…
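For readers unfamiliar with the preference setup the abstract describes, here is a minimal sketch of the standard DPO objective for a single preference pair. It covers only the baseline DPO loss. The paper's Actuality and Finesse terms and its curriculum schedule are not specified in this summary, so they are not reproduced. The function name `dpo_loss`, the `beta` default, and the placeholder pair are illustrative assumptions, not the authors' code.

```python
import math

# Hypothetical preference pair in the shape the abstract describes:
# the fact-checker's explanation is preferred, an LLM output is non-preferred.
pair = {
    "prompt": "<Hindi news claim>",
    "chosen": "<fact-checked explanation from a credible source>",
    "rejected": "<LLM-generated explanation>",
}

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one pair, given sequence log-probabilities.

    Inputs are log pi(y | x) under the trainable policy and under the
    frozen reference model. beta=0.1 is an assumed, commonly used value.
    """
    # How much more (or less) likely each response is under the policy
    # than under the reference model.
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    z = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

The loss shrinks as the policy assigns relatively more probability to the preferred explanation than the reference model does, and grows when it favors the non-preferred one. When the policy and reference agree exactly, the loss equals log 2.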
