Suvach -- Generated Hindi QA benchmark

Vaishak Narayanan; Prabin Raj KP; Saifudheen Nouphal

arXiv:2404.19254 · cs.CL · May 1, 2024

TL;DR

Suvach is a Hindi question answering benchmark generated with large language models rather than by machine-translating English datasets. It is intended as a more accurate, less biased evaluation tool for Hindi NLP research.

Contribution

The paper presents a novel Hindi QA benchmark generated by LLMs, addressing biases of existing translation-based datasets and offering a methodology applicable to other tasks.

Findings

01 New high-quality Hindi QA dataset created with LLMs

02 Addresses bias issues in existing benchmarks

03 Facilitates more accurate evaluation of Hindi NLP models

Abstract

Current evaluation benchmarks for question answering (QA) in Indic languages often rely on machine translation of existing English datasets. This approach suffers from the bias and inaccuracies inherent in machine translation, leading to datasets that may not reflect the true capabilities of extractive question answering (EQA) models for Indic languages. This paper proposes a new benchmark specifically designed for evaluating Hindi EQA models and discusses a methodology for building such a benchmark for any task. The method leverages large language models (LLMs) to generate a high-quality dataset in an extractive setting, ensuring its relevance for the target language. We believe this new resource will foster advancements in Hindi NLP research by providing a more accurate and reliable evaluation tool.
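
The extractive setting mentioned in the abstract means each answer should be a literal span of its context passage. The following is a minimal Python sketch of checking that property and scoring a prediction. It assumes a hypothetical record layout with `context`, `question`, and `answer` fields; the actual Suvach schema is not given here. The token-level F1 shown is a metric commonly used for extractive QA, not one the source confirms for this benchmark.

```python
from collections import Counter


def is_extractive(context: str, answer: str) -> bool:
    """Return True when the non-empty answer appears verbatim in the context."""
    return bool(answer) and answer in context


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 over whitespace-split tokens (SQuAD-style overlap)."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical record: the field names are assumptions, not the Suvach schema.
example = {
    "context": "ताजमहल आगरा में यमुना नदी के किनारे स्थित है।",
    "question": "ताजमहल किस शहर में स्थित है?",
    "answer": "आगरा",
}

if __name__ == "__main__":
    print(is_extractive(example["context"], example["answer"]))
    print(token_f1("आगरा में", example["answer"]))
```

A check like `is_extractive` could serve as a filter on LLM-generated records, discarding any whose answer is not found in the passage. Whether the authors apply such a filter is not stated here.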

Code & Models

Datasets

Vaishak11a/Suvach
dataset · 7 dl

Taxonomy

Topics: Blind Source Separation Techniques · Neural Networks and Applications · Speech Recognition and Synthesis