LLM Benchmark-User Need Misalignment for Climate Change

Oucheng Liu; Lexing Xie; Jing Jiang

arXiv:2603.26106·cs.CL·March 30, 2026

TL;DR

This paper analyzes the mismatch between existing climate change benchmarks for LLMs and actual user needs, proposing a framework and taxonomy to improve benchmark relevance and guide future LLM development.

Contribution

It introduces a Proactive Knowledge Behaviors Framework and a Topic-Intent-Form taxonomy to assess and address benchmark-user need misalignment in climate change LLM applications.

Findings

01

Current benchmarks do not reflect real-world user needs.

02

Knowledge interaction patterns in LLMs resemble human-human interactions.

Abstract

Climate change is a major socio-scientific issue that shapes public decision-making and policy discussions. As large language models (LLMs) increasingly serve as an interface for accessing climate knowledge, whether existing benchmarks reflect user needs is critical for evaluating LLMs in real-world settings. We propose a Proactive Knowledge Behaviors Framework that captures the different human-human and human-AI knowledge seeking and provision behaviors. We further develop a Topic-Intent-Form taxonomy and apply it to analyze climate-related data representing different knowledge behaviors. Our results reveal a substantial mismatch between current benchmarks and real-world user needs, while knowledge interaction patterns between humans and LLMs closely resemble those in human-human interactions. These findings provide actionable guidance for benchmark design, RAG system development, and LLM…
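To make the idea of a benchmark-user mismatch concrete, the sketch below labels each query along the three taxonomy axes (topic, intent, form) and compares how two collections are distributed along one axis. It is illustrative only: the abstract does not say how the authors quantify mismatch, so the use of Jensen-Shannon divergence, the `Label` class, the helper functions, and the category values such as "mitigation" or "multiple_choice" are all assumptions made for this example.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """One query annotated along the three taxonomy axes (hypothetical schema)."""
    topic: str
    intent: str
    form: str


def distribution(labels, axis):
    """Relative frequency of each category along one axis ('topic', 'intent', 'form')."""
    counts = Counter(getattr(label, axis) for label in labels)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two categorical distributions.

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    This metric is an assumption of the sketch, not the paper's stated method.
    """
    categories = set(p) | set(q)
    mix = {c: 0.5 * (p.get(c, 0.0) + q.get(c, 0.0)) for c in categories}

    def kl_to_mix(a):
        return sum(a[c] * math.log2(a[c] / mix[c]) for c in a if a[c] > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


# Made-up annotations, for illustration only.
benchmark = [
    Label("science", "factual", "multiple_choice"),
    Label("science", "factual", "multiple_choice"),
    Label("mitigation", "factual", "multiple_choice"),
]
user_queries = [
    Label("mitigation", "advice", "open_ended"),
    Label("policy", "explanation", "open_ended"),
    Label("science", "factual", "open_ended"),
]

for axis in ("topic", "intent", "form"):
    gap = js_divergence(distribution(benchmark, axis), distribution(user_queries, axis))
    print(f"{axis}: JS divergence = {gap:.3f}")
```

A value near 0 on an axis would suggest the two collections cover that axis similarly, while a value near 1 would suggest they barely overlap. The numbers produced by the made-up annotations above carry no empirical meaning.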

Code & Models

Repositories

OuchengLiu/LLM-Misalign-Climate-Change
github

Datasets

Westing/LLM-Misalign-Climate-Change
dataset· 234 dl
