Trojans in Large Language Models of Code: A Critical Review through a Trigger-Based Taxonomy
Aftab Hussain, Md Rafiqul Islam Rabin, Toufique Ahmed, Bowen Xu, Premkumar Devanbu, Mohammad Amin Alipour

TL;DR
This paper reviews trojan attacks in large language models of code, introducing a trigger-based taxonomy to unify concepts and analyze how triggers influence model learning and security risks.
Contribution
It provides a comprehensive overview of trojan attacks on Code LLMs, proposing a novel trigger taxonomy and standardizing fundamental concepts in the field.
Findings
Introduces a unifying trigger taxonomy framework.
Provides a uniform definition of trojans in Code LLMs.
Analyzes how trigger design affects model learning.
Abstract
Large language models (LLMs) have provided a lot of exciting new capabilities in software development. However, the opaque nature of these models makes them difficult to reason about and inspect. Their opacity gives rise to potential security risks, as adversaries can train and deploy compromised models to disrupt the software development process in the victims' organization. This work presents an overview of the current state-of-the-art trojan attacks on large language models of code, with a focus on triggers -- the main design point of trojans -- with the aid of a novel unifying trigger taxonomy framework. We also aim to provide a uniform definition of the fundamental concepts in the area of trojans in Code LLMs. Finally, we draw implications of findings on how code models learn on trigger design.
Taxonomy
Topics: Advanced Malware Detection Techniques · Software Testing and Debugging Techniques · Software Engineering Research
Methods: Focus
