LongSafety: Enhance Safety for Long-Context LLMs

Mianqiu Huang; Xiaoran Liu; Shaojun Zhou; Mozhi Zhang; Qipeng Guo; Linyang Li; Chenkun Tan; Yang Gao; Pengyu Wang; Linlin Li; Qun Liu; Yaqian Zhou; Xipeng Qiu; Xuanjing Huang

arXiv:2411.06899·cs.CL·February 28, 2025

TL;DR

LongSafety introduces a comprehensive dataset to improve safety alignment in long-context large language models, addressing safety concerns unique to extended context scenarios and demonstrating enhanced safety performance.

Contribution

The paper presents LongSafety, a novel safety dataset for long-context LLMs, and shows that training with it improves safety without sacrificing general capabilities.

Findings

01 Training with LongSafety improves long-context safety performance.

02 LongSafety enhances short-context safety and preserves model capabilities.

03 LongSafety generalizes across context lengths and safety scenarios.

Abstract

Recent advancements in model architectures and length extrapolation techniques have significantly extended the context length of large language models (LLMs), paving the way for their application in increasingly complex tasks. However, despite the growing capabilities of long-context LLMs, the safety issues in long-context scenarios remain underexplored. While safety alignment in short context has been widely studied, the safety concerns of long-context LLMs have not been adequately addressed. In this work, we introduce LongSafety, a comprehensive safety alignment dataset for long-context LLMs, containing 10 tasks and 17k samples, with an average length of 40.9k tokens. Our experiments demonstrate that training with LongSafety can enhance long-context safety performance while enhancing short-context safety and preserving general capabilities. Furthermore, we demonstrate that…
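To give a rough sense of the dataset's scale, the figures quoted in the abstract can be multiplied out. The sketch below is only a back-of-the-envelope estimate: it uses the rounded numbers from the abstract (17k samples, 40.9k average tokens, 10 tasks), the variable and function names are illustrative, and the even per-task split is a hypothetical assumption, since the abstract does not give a per-task breakdown.

```python
# Rounded figures quoted in the abstract; the exact counts may differ.
NUM_SAMPLES = 17_000             # "17k samples"
AVG_TOKENS_PER_SAMPLE = 40_900   # "average length of 40.9k tokens"
NUM_TASKS = 10                   # "10 tasks"


def estimate_total_tokens(num_samples: int, avg_tokens: int) -> int:
    """Approximate total token count as samples x average length."""
    return num_samples * avg_tokens


total_tokens = estimate_total_tokens(NUM_SAMPLES, AVG_TOKENS_PER_SAMPLE)

# Hypothetical even split across tasks (an assumption, not stated in the source).
samples_per_task_if_even = NUM_SAMPLES // NUM_TASKS

print(f"Estimated total tokens: about {total_tokens / 1e6:.0f}M")
print(f"Samples per task if evenly split: {samples_per_task_if_even}")
```

Under these rounded inputs the dataset comes to several hundred million tokens, which is worth keeping in mind when budgeting long-context fine-tuning runs.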

Code & Models

Repositories

luther-sparks/longsafetybench
PyTorch · Official

Taxonomy

Topics: Research Data Management Practices · Scientific Computing and Data Management

Methods: Softmax · Attention Is All You Need · ALIGN