Backdoor Attacks on Decentralised Post-Training

Oğuzhan Ersoy; Nikolay Blagoev; Jona te Lintelo; Stefanos Koffas; Marina Krček; Stjepan Picek

arXiv:2604.02372·cs.CR·April 6, 2026

TL;DR

This paper introduces the first backdoor attack targeting pipeline parallelism in decentralised post-training of large language models, demonstrating its effectiveness even against safety alignment defenses.

Contribution

The paper presents a backdoor attack on pipeline parallelism in which the adversary controls only an intermediate stage of the pipeline. This setting was previously unexplored, and the attack is shown to be effective at misaligning the trained model.

Findings

01. Backdoor reduces model alignment from 80% to 6%.

02. Attack remains successful in 60% of cases despite safety alignment training.

03. Limited adversary control over an intermediate stage suffices for effective backdoor injection.

Abstract

Decentralised post-training of large language models utilises data and pipeline parallelism techniques to split the data and the model. Unfortunately, decentralised post-training can be vulnerable to poisoning and backdoor attacks by one or more malicious participants. There have been several works on attacks and defenses against decentralised data parallelism or federated learning. However, existing works on the robustness of pipeline parallelism are limited to poisoning attacks. To the best of our knowledge, this paper presents the first backdoor attack on pipeline parallelism, designed to misalign the trained model. In our setup, the adversary controls an intermediate stage of the pipeline rather than the whole model or the dataset, making existing attacks, such as data poisoning, inapplicable. Our experimental results show that even such a limited adversary can inject the backdoor…
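The threat model in the abstract can be pictured with a toy pipeline. The sketch below is not the paper's attack and injects no backdoor. It only illustrates, under simplifying assumptions, that a participant holding an intermediate stage sees activations rather than the dataset, yet can still change what the final stage outputs. The names `make_stage` and `run_pipeline` and the scalar "layers" are hypothetical stand-ins for real model blocks.

```python
from typing import Callable, List

Vector = List[float]
Stage = Callable[[Vector], Vector]


def make_stage(scale: float, shift: float) -> Stage:
    """Toy stand-in for a block of model layers owned by one participant."""
    def stage(activations: Vector) -> Vector:
        return [scale * a + shift for a in activations]
    return stage


def run_pipeline(stages: List[Stage], inputs: Vector) -> Vector:
    """Forward pass: each stage receives only the previous stage's output."""
    activations = inputs
    for stage in stages:
        activations = stage(activations)
    return activations


# Three participants, each holding one stage of the split model.
honest = [make_stage(2.0, 0.0), make_stage(1.0, 1.0), make_stage(0.5, 0.0)]

# Hypothetical deviation: the participant at index 1 (the intermediate stage)
# swaps in different parameters. It never touches the inputs or other stages.
tampered = list(honest)
tampered[1] = make_stage(1.0, 3.0)

x = [1.0, 2.0]
honest_out = run_pipeline(honest, x)
tampered_out = run_pipeline(tampered, x)
print(honest_out, tampered_out)
```

The first and last stages are identical in both runs, so any difference between `honest_out` and `tampered_out` comes solely from the intermediate participant. This is the limited position the abstract assigns to the adversary.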
