Climate Finance Bench
Rafik Mankour, Yassine Chafai, Hamada Saleh, Ghassen Ben Hassine, Thibaud Barreau, Peter Tankov

TL;DR
Climate Finance Bench is an open benchmark for question-answering over corporate climate disclosures. It identifies passage retrieval as the main obstacle to accurate answers and argues for transparent carbon reporting in AI-for-climate applications.
Contribution
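The paper introduces a dataset of 330 expert-validated question-answer pairs annotated over 33 English-language sustainability reports, and uses it to compare RAG (retrieval-augmented generation) approaches, identifying retrieval as the key bottleneck.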
Findings
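Retrieval quality strongly affects end-to-end RAG performance.
The chief bottleneck is the retriever's ability to locate passages that actually contain the answer.
The authors argue for transparent carbon reporting in AI-for-climate applications and highlight the advantages of techniques such as weight quantization.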
Abstract
Climate Finance Bench introduces an open benchmark that targets question-answering over corporate climate disclosures using Large Language Models. We curate 33 recent sustainability reports in English drawn from companies across all 11 GICS sectors and annotate 330 expert-validated question-answer pairs that span pure extraction, numerical reasoning, and logical reasoning. Building on this dataset, we propose a comparison of RAG (retrieval-augmented generation) approaches. We show that the retriever's ability to locate passages that actually contain the answer is the chief performance bottleneck. We further argue for transparent carbon reporting in AI-for-climate applications, highlighting advantages of techniques such as Weight Quantization.
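The abstract's central claim, that answers fail mainly when the retriever does not surface a passage containing them, can be made concrete with a retrieval hit rate. The following is a minimal sketch, not the paper's pipeline: the token-overlap scorer, the function names (`tokenize`, `score`, `retrieve`, `hit_rate_at_k`), and the three passages and questions are all invented for illustration and are not taken from the benchmark.

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def score(question, passage):
    # Toy relevance score: number of overlapping tokens (with multiplicity).
    q = Counter(tokenize(question))
    p = Counter(tokenize(passage))
    return sum(min(q[t], p[t]) for t in q)

def retrieve(question, passages, k):
    # Return the k passages with the highest overlap score.
    ranked = sorted(passages, key=lambda p: score(question, p), reverse=True)
    return ranked[:k]

def hit_rate_at_k(qa_pairs, passages, k):
    # Fraction of questions for which at least one of the top-k
    # retrieved passages literally contains the reference answer.
    hits = 0
    for question, answer in qa_pairs:
        top = retrieve(question, passages, k)
        if any(answer.lower() in p.lower() for p in top):
            hits += 1
    return hits / len(qa_pairs)

# Hypothetical data, made up for this example only.
passages = [
    "Scope 1 emissions were 120 ktCO2e in the reporting year.",
    "The company targets net zero by 2050 across its value chain.",
    "Board oversight of climate risk is handled by the sustainability committee.",
]
qa_pairs = [
    ("What were the Scope 1 emissions?", "120 ktCO2e"),
    ("Which year is the net zero target?", "2050"),
    ("Who oversees climate risk at board level?", "sustainability committee"),
]

if __name__ == "__main__":
    print(hit_rate_at_k(qa_pairs, passages, k=1))
```

A question whose answer never appears in the top-k passages cannot be answered correctly by the generator, whatever model is used, which is why a metric of this kind is a natural first diagnostic when comparing RAG variants.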
Taxonomy
Topics: Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining · Forecasting Techniques and Applications
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Attention Dropout · Softmax · WordPiece · BART · Weight Decay · Multi-Head Attention
