SafeCOMM: A Study on Safety Degradation in Fine-Tuned Telecom Large Language Models

Aladin Djuhera; Swanand Ravindra Kadhe; Farhan Ahmed; Syed Zawad; Fernando Koch; Walid Saad; Holger Boche

arXiv:2506.00062 · cs.CY · February 9, 2026

TL;DR

This paper investigates how fine-tuning large language models for telecom tasks can degrade safety, introduces a new telecom-specific safety benchmark, and evaluates methods to restore safety without harming task performance.

Contribution

The paper introduces TeleHarm, the first telecom-specific red-teaming benchmark, and uses it to evaluate both safety degradation in telecom-tuned LLMs and methods for realigning them.

Findings

01. Safety degrades even with light telecom domain adaptation.

02. Proposed defenses effectively restore safety.

03. Safety alignment is lacking in publicly available TeleLLMs.

Abstract

Fine-tuning large language models (LLMs) on telecom datasets is a common practice to adapt general-purpose models to the telecom domain. However, little attention has been paid to how this process may compromise model safety. Recent research has shown that even benign fine-tuning can degrade the safety alignment of LLMs, causing them to respond to harmful or unethical user queries. In this paper, we investigate this issue by fine-tuning LLMs on three representative telecom datasets and show that safety degrades even for light telecom domain adaptation. To this end, we introduce TeleHarm, the first telecom-specific red-teaming benchmark, which we use alongside the established DirectHarm and HexPhi datasets to systematically assess harmful behavior. We further extend our analysis to publicly available TeleLLMs that were continually pre-trained on large telecom corpora, revealing that safety…
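The before-and-after comparison the abstract describes can be illustrated with a small sketch. It assumes each model response to a red-teaming prompt has already been labeled harmful or not by some judge, and it reports the change in the harmful-response rate. The names `JudgedResponse`, `harmful_rate`, and `safety_degradation` are hypothetical, the verdicts are toy values, and this is not necessarily the metric the authors use.

```python
from dataclasses import dataclass


@dataclass
class JudgedResponse:
    prompt_id: str
    harmful: bool  # hypothetical verdict from a human or automated judge


def harmful_rate(responses):
    """Fraction of responses judged harmful (0.0 for an empty list)."""
    if not responses:
        return 0.0
    return sum(r.harmful for r in responses) / len(responses)


def safety_degradation(base, tuned):
    """Increase in harmful-response rate after fine-tuning (positive = less safe)."""
    return harmful_rate(tuned) - harmful_rate(base)


# Toy verdicts, made up for illustration only; not results from the paper.
base = [JudgedResponse("q1", False), JudgedResponse("q2", False),
        JudgedResponse("q3", False), JudgedResponse("q4", True)]
tuned = [JudgedResponse("q1", True), JudgedResponse("q2", False),
         JudgedResponse("q3", True), JudgedResponse("q4", True)]

delta = safety_degradation(base, tuned)
print(f"base={harmful_rate(base):.2f} tuned={harmful_rate(tuned):.2f} delta={delta:+.2f}")
```

Comparing rates on the same prompt set for the base and fine-tuned model keeps the comparison paired, so a change in the rate reflects the fine-tuning rather than a change in prompts.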

Code & Models

Datasets

aladinDJ/TeleHarm (dataset, 47 downloads)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Softmax · Attention Is All You Need