ChEmREF: Evaluating Language Model Readiness for Chemical Emergency Response

Risha Surana; Qinyuan Ye; Swabha Swayamdipta

arXiv:2511.10027 · cs.AI · November 18, 2025

TL;DR

This paper introduces ChEmREF, a comprehensive benchmark to evaluate language models' ability to assist in chemical emergency response tasks, highlighting current capabilities and limitations.

Contribution

The paper presents ChEmREF, a new benchmark with tasks for chemical representation, emergency response, and knowledge question answering, to assess language models in HAZMAT scenarios.

Findings

01

The best evaluated models achieved 68.0% accuracy in chemical representation translation.

02

The best evaluated models scored 52.7% on incident response recommendations.

03

The best evaluated models reached 63.9% accuracy on chemical safety exams.

Abstract

Emergency responders managing hazardous material (HAZMAT) incidents face critical, time-sensitive decisions while manually navigating extensive chemical guidelines. We investigate whether today's language models can assist responders by rapidly and reliably understanding critical information, identifying hazards, and providing recommendations. We introduce the Chemical Emergency Response Evaluation Framework (ChEmREF), a new benchmark comprising questions on 1,035 HAZMAT chemicals from the Emergency Response Guidebook and the PubChem Database. ChEmREF is organized into three tasks: (1) translation of chemical representations between structured and unstructured forms (e.g., converting C2H6O to ethanol), (2) emergency response generation (e.g., recommending appropriate evacuation distances), and (3) domain knowledge question answering from chemical safety and certification exams. Our best…
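To make the first task concrete, here is a minimal sketch of how predictions for a formula-to-name translation might be scored. It assumes an exact-match comparison after light normalization (lowercasing and whitespace collapsing); this page does not describe the benchmark's actual scoring procedure, so the function names `normalize` and `exact_match_accuracy` and the toy data are purely illustrative.

```python
def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting
    differences are not counted as errors (an assumed convention)."""
    return " ".join(name.lower().split())


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions that equal their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Toy example (hypothetical model outputs, not data from the benchmark).
formulas = ["C2H6O", "CH4", "NH3"]
references = ["ethanol", "methane", "ammonia"]
predictions = ["Ethanol", "methanol", " ammonia "]  # the second answer is wrong

score = exact_match_accuracy(predictions, references)
print(f"Exact-match accuracy: {score:.1%}")
```

Exact match is a strict criterion: a chemically valid synonym would be counted as wrong under this sketch, so a real evaluation would likely need a synonym list or a more tolerant matching rule.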

Taxonomy

Topics: Chemical Safety and Risk Management · Risk and Safety Analysis · Machine Learning in Materials Science