SciSafeEval: A Comprehensive Benchmark for Safety Alignment of Large Language Models in Scientific Tasks

Tianhao Li; Jingyu Lu; Chuangxin Chu; Tianyu Zeng; Yujia Zheng; Mei Li; Haotian Huang; Bin Wu; Zuoxian Liu; Kai Ma; Xuejing Yuan; Xingkai Wang; Keyan Ding; Huajun Chen; Qiang Zhang

arXiv:2410.03769 · cs.CL · December 17, 2024 · 6 cites

TL;DR

SciSafeEval is a comprehensive benchmark designed to evaluate the safety alignment of large language models across diverse scientific tasks and languages, including molecular and genomic data, to promote responsible AI deployment in science.

Contribution

Introduces SciSafeEval, the first extensive benchmark covering multiple scientific languages and domains to assess safety and alignment of LLMs in scientific research.

Findings

01. LLMs show varying safety performance across scientific domains.
02. The benchmark reveals gaps in safety guardrails against malicious prompts.
03. Evaluation in multiple settings highlights strengths and weaknesses of current models.

Abstract

Large language models (LLMs) have a transformative impact on a variety of scientific tasks across disciplines including biology, chemistry, medicine, and physics. However, ensuring the safety alignment of these models in scientific research remains an underexplored area, with existing benchmarks primarily focusing on textual content and overlooking key scientific representations such as molecular, protein, and genomic languages. Moreover, the safety mechanisms of LLMs in scientific tasks are insufficiently studied. To address these limitations, we introduce SciSafeEval, a comprehensive benchmark designed to evaluate the safety alignment of LLMs across a range of scientific tasks. SciSafeEval spans multiple scientific languages (textual, molecular, protein, and genomic) and covers a wide range of scientific domains. We evaluate LLMs in zero-shot, few-shot and chain-of-thought…

Code & Models

Datasets

Tianhao0x01/SciSafeEval
dataset · 96 downloads

Taxonomy

Topics: Scientific Computing and Data Management

Methods: Focus