Towards Answering Climate Questionnaires from Unstructured Climate Reports

Daniel Spokoyny; Tanmay Laud; Tom Corringham; Taylor Berg-Kirkpatrick

arXiv:2301.04253 · cs.CL · July 31, 2023 · 5 citations

TL;DR

This paper introduces new datasets and models to extract structured information from unstructured climate reports, aiding climate change communication and policy-making.

Contribution

It presents large-scale climate questionnaire datasets, trains self-supervised models for text alignment, and establishes a benchmark for climate text classification.

Findings

1. Models generalize across organization types
2. Effective alignment of unstructured texts to questionnaires
3. Provides a new benchmark for climate NLP tasks

Abstract

The topic of Climate Change (CC) has received limited attention in NLP despite its urgency. Activists and policymakers need NLP tools to effectively process the vast and rapidly growing body of unstructured textual climate reports into structured form. To tackle this challenge, we introduce two new large-scale climate questionnaire datasets and use their existing structure to train self-supervised models. We conduct experiments to show that these models can learn to generalize to climate disclosures from organization types different from those seen during training. We then use these models to help align texts from unstructured climate documents to the semi-structured questionnaires in a human pilot study. Finally, to support further NLP research in the climate domain, we introduce a benchmark of existing climate text classification datasets to better evaluate and compare existing models.

Code & Models

Datasets

iceberg-nlp/climabench
dataset · 24 downloads

Taxonomy

Topics: Computational and Text Analysis Methods · Topic Modeling · Climate Change Communication and Perception

Methods: ALIGN