Learning from AVA: Early Lessons from a Curated and Trustworthy Generative AI for Policy and Development Research
Nimisha Karnatak, Mohamad Chatila, Daniel Alejandro Pinzón Hernández, Reza Yazdanfar, Michelle Dugas, Renos Vakis

TL;DR
This paper introduces AVA, a trustworthy generative AI platform for policy research that emphasizes evidence-based responses, epistemic humility, and operational transparency, evaluated in the wild with over 2,200 individuals across 116 countries.
Contribution
It presents AVA, a novel GenAI system built on curated sources with mechanisms for citation verification and abstention, along with design guidelines for trustworthy AI in policy contexts.
Findings
Difference-in-Differences estimates associate sustained engagement with 2.4-3.9 hours saved per week.
Participants used AVA as a specialized evidence engine.
Trust was enhanced through provenance and citation anchoring.
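The time-savings finding above rests on a Difference-in-Differences (DiD) estimate. As a rough illustration of the idea only, the sketch below computes the classic 2x2 DiD on group means: the change in the sustained-engagement group minus the change in a comparison group. The function name, the group labels, and all numbers are hypothetical and not taken from the paper, whose actual specification (controls, fixed effects, how sustained engagement is defined) is not described in this summary.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Classic 2x2 Difference-in-Differences on group means.

    Returns (change in treated group) - (change in control group).
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical weekly hours spent on evidence search.
# These are made-up illustrative values, NOT data from the paper.
sustained_pre = [10.0, 12.0, 11.0]   # sustained users, before adoption
sustained_post = [7.0, 8.0, 9.0]     # sustained users, after adoption
other_pre = [10.0, 11.0, 12.0]       # comparison group, before
other_post = [10.0, 10.0, 10.0]      # comparison group, after

effect = did_estimate(sustained_pre, sustained_post, other_pre, other_post)
print(effect)  # negative value = fewer hours spent, i.e. hours saved
```

With these illustrative inputs the sustained group's hours fall by 3 while the comparison group's fall by 1, so the estimate attributes a 2-hour weekly reduction to sustained engagement. The subtraction of the comparison group's change is what separates the association of interest from a trend shared by both groups.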
Abstract
General-purpose LLMs pose misinformation risks for development and policy experts, lacking epistemic humility for verifiable outputs. We present AVA (AI + Verified Analysis), a GenAI platform built on a curated library of over 4,000 World Bank Reports with multilingual capabilities. AVA's multi-agent pipeline enables users to query and receive evidence-based syntheses. It operationalizes epistemic humility through two mechanisms: citation verifiability (tracing claims to sources) and reasoned abstention (declining unsupported queries with justification and redirection). We conducted an in-the-wild evaluation with over 2,200 individuals from heterogeneous organisations and roles in 116 countries, via log analysis, surveys, and 20 interviews. Difference-in-Differences estimates associate sustained engagement with 2.4-3.9 hours saved weekly. Qualitatively, participants used AVA as a…
