ESGBench: A Benchmark for Explainable ESG Question Answering in Corporate Sustainability Reports
Sherine George, Nithish Saji

TL;DR
ESGBench is a new benchmark dataset and evaluation framework for assessing explainable ESG question answering systems using corporate sustainability reports, emphasizing model reasoning, factual accuracy, and domain relevance.
Contribution
ESGBench introduces a benchmark of domain-grounded questions across multiple ESG themes, each paired with a human-curated answer and supporting evidence, enabling fine-grained evaluation of ESG-focused AI models.
Findings
State-of-the-art LLMs face challenges in factual consistency.
Traceability and domain alignment remain key issues.
ESGBench aims to accelerate research in transparent and accountable ESG-focused AI systems.
Abstract
We present ESGBench, a benchmark dataset and evaluation framework designed to assess explainable ESG question answering systems using corporate sustainability reports. The benchmark consists of domain-grounded questions across multiple ESG themes, paired with human-curated answers and supporting evidence to enable fine-grained evaluation of model reasoning. We analyze the performance of state-of-the-art LLMs on ESGBench, highlighting key challenges in factual consistency, traceability, and domain alignment. ESGBench aims to accelerate research in transparent and accountable ESG-focused AI systems.
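To make the pairing of questions, answers, and evidence concrete, the following is a minimal sketch of how one benchmark item and two simple scores could be represented. It is not the authors' schema or evaluation method: the `ESGItem` field names, the `token_f1` answer score, and the `evidence_recall` traceability score are all assumptions chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ESGItem:
    """Hypothetical benchmark record; field names are assumptions, not the paper's schema."""
    question: str
    theme: str      # e.g. "Environmental", "Social", or "Governance"
    answer: str     # human-curated reference answer
    evidence: str   # supporting passage taken from the sustainability report


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a model answer and the reference answer."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        return 0.0
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evidence_recall(cited: str, gold: str) -> float:
    """Fraction of gold-evidence tokens that also appear in the cited passage."""
    gold_tokens = normalize(gold)
    if not gold_tokens:
        return 0.0
    overlap = sum((Counter(normalize(cited)) & Counter(gold_tokens)).values())
    return overlap / len(gold_tokens)
```

Scoring the answer and the cited evidence separately reflects the distinction the abstract draws between factual consistency and traceability: a model can state the right figure while pointing to the wrong passage, and a single combined score would hide that.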
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications
