Counter Turing Test ($CT^2$): Investigating AI-Generated Text Detection for Hindi -- Ranking LLMs based on Hindi AI Detectability Index ($ADI_{hi}$)
Ishan Kavathekar, Anku Rani, Ashmit Chamoli, Ponnurangam Kumaraguru, Amit Sheth, Amitava Das

TL;DR
This paper investigates the detection of AI-generated Hindi text by evaluating multiple LLMs, introducing a new dataset, assessing detection techniques, and proposing a Hindi AI Detectability Index to measure text eloquence.
Contribution
It introduces a Hindi-specific AI detectability index and evaluates multiple detection methods on a new Hindi AI-generated text dataset, advancing multilingual AI detection research.
Findings
26 LLMs evaluated for Hindi text generation proficiency
Five detection techniques tested for Hindi AI text detection
Proposed Hindi AI Detectability Index ($ADI_{hi}$) to measure AI text eloquence
Abstract
The widespread adoption of Large Language Models (LLMs) and growing awareness of multilingual LLMs have raised concerns about the potential risks and repercussions of misusing AI-generated text, necessitating increased vigilance. While these models are primarily trained for English, their extensive training on vast datasets covering almost the entire web equips them to perform well in numerous other languages. AI-Generated Text Detection (AGTD) has received immediate research attention, with some initial methods proposed and soon followed by techniques to bypass detection. In this paper, we report our investigation of AGTD for an Indic language, Hindi. Our major contributions are fourfold: i) examining 26 LLMs to evaluate their proficiency in generating Hindi text, ii) introducing…
Taxonomy
Topics: Natural Language Processing Techniques · Handwritten Text Recognition Techniques · Topic Modeling
Methods: Softmax · Attention Is All You Need
