LLM-based Evaluation Policy Extraction for Ecological Modeling
Qi Cheng, Licheng Liu, Qing Zhu, Runlong Yu, Zhenong Jin, Yiqun Xie, Xiaowei Jia

TL;DR
This paper introduces a novel LLM-based framework that learns interpretable evaluation policies for ecological models, improving upon traditional metrics by capturing domain-specific patterns and aligning with expert assessments.
Contribution
It presents a new method combining metric learning and LLMs to generate interpretable evaluation criteria tailored for ecological time series analysis.
Findings
Captures expert ecological assessment preferences effectively
Performs well on crop gross primary production (GPP) and CO2 flux datasets
Bridges the gap between numerical metrics and expert knowledge
Abstract
Evaluating ecological time series is critical for benchmarking model performance in many important applications, including predicting greenhouse gas fluxes, capturing carbon-nitrogen dynamics, and monitoring hydrological cycles. Traditional numerical metrics (e.g., R-squared, root mean square error) have been widely used to quantify the similarity between modeled and observed ecosystem variables, but they often fail to capture domain-specific temporal patterns critical to ecological processes. As a result, these methods are often accompanied by expert visual inspection, which requires substantial human labor and limits the applicability to large-scale evaluation. To address these challenges, we propose a novel framework that integrates metric learning with large language model (LLM)-based natural language policy extraction to develop interpretable evaluation criteria. The proposed…
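The abstract's point that pointwise metrics can miss temporal patterns can be illustrated with a minimal sketch. It is not the paper's method. The series `observed`, the error size `err`, and the two predictions `biased` and `jittery` are made-up values chosen for illustration. One prediction carries a constant bias and the other an alternating error of the same magnitude, so they behave very differently over time, yet RMSE and R-squared score them identically.

```python
import math


def rmse(obs, pred):
    """Root mean square error between two equal-length sequences."""
    n = len(obs)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical seasonal signal (illustrative values, not data from the paper).
observed = [1.0, 2.0, 4.0, 7.0, 9.0, 10.0, 9.0, 7.0, 4.0, 2.0, 1.0, 1.0]
err = 0.5

# Prediction 1: systematic overestimate at every time step.
biased = [o + err for o in observed]

# Prediction 2: error of the same size that flips sign at every time step.
jittery = [o + (err if i % 2 == 0 else -err) for i, o in enumerate(observed)]

for name, pred in (("biased", biased), ("jittery", jittery)):
    print(name, "RMSE:", rmse(observed, pred), "R2:", r_squared(observed, pred))
```

Both metrics depend only on the squared residuals, not on their order or sign in time, so a smooth seasonal bias and step-to-step oscillation are indistinguishable to them. A reviewer looking at the two curves would tell them apart immediately.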
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Simulation Techniques and Applications · Statistical and Computational Modeling
