An Expert-grounded benchmark of General Purpose LLMs in LCA

Artur Donaldson; Bharathan Balaji; Cajetan Oriekezie; Manish Kumar; Laure Patouillard

arXiv:2510.19886·cs.CL·October 24, 2025

Open Access

TL;DR

This study systematically evaluates eleven general-purpose large language models on life cycle assessment (LCA) tasks using expert reviews. It finds strengths in explanation quality but significant risks of inaccuracy and hallucination, which call for careful application.

Contribution

First expert-grounded benchmark of LLMs in LCA, providing standardized evaluation across multiple models and criteria in a field lacking consensus protocols.

Findings

01. 37% of responses contained inaccurate or misleading information

02. Hallucination rates reached up to 40% for some models

03. Open-source models perform comparably to closed-source models
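The percentages above are shares of expert-reviewed responses. As a minimal sketch of how such per-model rates could be tallied, the snippet below assumes each review is stored as a record with two boolean flags. The `Review` schema, the `flag_rates` helper, the model names, and the toy records are all hypothetical and invented for illustration; they are not the study's data or its actual scoring procedure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    """One expert judgement of one model response (hypothetical schema)."""
    model: str
    task_id: int
    inaccurate: bool    # reviewer flagged inaccurate or misleading content
    hallucinated: bool  # reviewer flagged fabricated content


def flag_rates(reviews):
    """Return {model: (inaccuracy_rate, hallucination_rate)} as fractions."""
    # Per model: [total reviews, inaccurate count, hallucinated count]
    counts = defaultdict(lambda: [0, 0, 0])
    for r in reviews:
        c = counts[r.model]
        c[0] += 1
        c[1] += r.inaccurate
        c[2] += r.hallucinated
    return {m: (c[1] / c[0], c[2] / c[0]) for m, c in counts.items()}


# Toy records, invented purely for illustration; not data from the study.
toy_reviews = [
    Review("model_a", 1, False, False),
    Review("model_a", 2, True, False),
    Review("model_a", 3, True, True),
    Review("model_a", 4, False, False),
    Review("model_b", 1, False, False),
    Review("model_b", 2, False, True),
]

rates = flag_rates(toy_reviews)
for model, (inacc, halluc) in sorted(rates.items()):
    print(f"{model}: inaccurate={inacc:.0%}, hallucinated={halluc:.0%}")
```

Keeping one record per reviewer judgement, rather than one per response, would let the same tally be repeated per task or per criterion, but that choice is an assumption of this sketch.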

Abstract

Purpose: Artificial intelligence (AI), and in particular large language models (LLMs), are increasingly being explored as tools to support life cycle assessment (LCA). While demonstrations exist across environmental and social domains, systematic evidence on their reliability, robustness, and usability remains limited. This study provides the first expert-grounded benchmark of LLMs in LCA, addressing the absence of standardized evaluation frameworks in a field where no clear ground truth or consensus protocols exist.

Methods: We evaluated eleven general-purpose LLMs, spanning both commercial and open-source families, across 22 LCA-related tasks. Seventeen experienced practitioners reviewed model outputs against criteria directly relevant to LCA practice, including scientific accuracy, explanation quality, robustness, verifiability, and adherence to instructions. We collected 168…

Taxonomy

Topics: Environmental Impact and Sustainability · Green IT and Sustainability · Machine Learning in Materials Science