RTI-Bench: A Structured Dataset for Indian Right-to-Information Decision Analysis
Joy Bose

TL;DR
RTI-Bench is a structured dataset of Indian Central Information Commission (CIC) decisions for analyzing and predicting RTI outcomes, with 89% label coverage on the instruction-response corpus and a zero-shot baseline accuracy of 57.3% on outcome prediction.
Contribution
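To the authors' knowledge, RTI-Bench is the first publicly released structured dataset of Indian RTI administrative decisions. It combines two sources, a public instruction-response corpus and CIC decision PDFs collected from the Commission portal, and provides a foundation for future research.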
Findings
Label coverage reaches 89% on the instruction-response corpus.
A zero-shot baseline model achieves 57.3% accuracy on outcome prediction.
The dataset is available at https://huggingface.co/datasets/joyboseroy/rti-bench.
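The two headline numbers above are simple ratios, and a minimal sketch may help make them concrete. The outcome label names (`allowed`, `rejected`, `partly_allowed`), the toy records, and both function names are assumptions for illustration only; they are not taken from the dataset or the paper, and the paper may define coverage and accuracy over a different set of cases.

```python
from typing import Optional, Sequence


def label_coverage(labels: Sequence[Optional[str]]) -> float:
    """Fraction of cases for which a label could be assigned (None = no label)."""
    if not labels:
        return 0.0
    return sum(label is not None for label in labels) / len(labels)


def accuracy(gold: Sequence[Optional[str]], predicted: Sequence[str]) -> float:
    """Fraction of labeled cases where the prediction matches the gold label.

    Cases without a gold label are skipped (an assumption of this sketch).
    """
    pairs = [(g, p) for g, p in zip(gold, predicted) if g is not None]
    if not pairs:
        return 0.0
    return sum(g == p for g, p in pairs) / len(pairs)


# Hypothetical toy data: five cases, one of which has no extracted label.
gold = ["allowed", "rejected", None, "partly_allowed", "rejected"]
pred = ["allowed", "allowed", "rejected", "partly_allowed", "rejected"]

coverage = label_coverage(gold)  # 4 of 5 cases carry a label
acc = accuracy(gold, pred)       # 3 of the 4 labeled cases are predicted correctly
print(f"coverage={coverage:.2f}, accuracy={acc:.2f}")
```

Under this reading, coverage describes how many cases the labeling step could handle, while accuracy describes how often a model's prediction agrees with those labels; the two should not be conflated.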
Abstract
India's Right to Information Act, 2005 gives every citizen the right to demand information from public authorities, yet in practice most people cannot make sense of the dense administrative language used in Central Information Commission (CIC) decisions, let alone predict whether an appeal is worth filing. This paper introduces RTI-Bench, a structured dataset of CIC decisions with outcome labels, exemption citations, IRAC-style reasoning components, and procedural timelines. To the best of our knowledge it is the first publicly released structured dataset for Indian RTI administrative decisions. The dataset draws from two sources: 1,218 cases from a publicly available instruction-response corpus (with structured fields added through rule-based extraction), and 298 CIC decision PDFs collected directly from the Commission portal, spanning five commissioners and three document format…
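The abstract mentions structured fields added through rule-based extraction, including exemption citations. As a rough illustration of what one such rule could look like, the sketch below pulls RTI Act section references out of decision text with a regular expression. The pattern, the function name `extract_citations`, and the sample sentence are all assumptions made for this example; the paper's actual extraction rules are not described in the text above.

```python
import re

# Matches "Section 8(1)(j)", "section 8 (1) (j)", "Sec. 19(3)", and similar forms.
# Group 1 is the section number; group 2 is any run of parenthesized clauses.
SECTION_PATTERN = re.compile(r"\b[Ss]ec(?:tion|\.)\s*(\d+)((?:\s*\(\w+\))*)")


def extract_citations(text: str) -> list[str]:
    """Return normalized section citations in order of first appearance."""
    citations: list[str] = []
    for match in SECTION_PATTERN.finditer(text):
        number = match.group(1)
        # Remove internal whitespace so "8 (1) (j)" and "8(1)(j)" compare equal.
        clauses = re.sub(r"\s+", "", match.group(2))
        citation = f"{number}{clauses}"
        if citation not in citations:
            citations.append(citation)
    return citations


# Hypothetical decision excerpt, written for this example.
sample = (
    "The CPIO denied the information under Section 8(1)(j) of the RTI Act. "
    "The appellant argued that section 8 (1) (j) did not apply and relied on Sec. 19(3)."
)
found = extract_citations(sample)
print(found)
```

A rule like this is cheap and transparent, but it will miss citations written in other forms (for example "u/s 8(1)(j)"), so a real pipeline would likely need several patterns and manual spot checks.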
