Detecting Trojaned DNNs via Spectral Regression Analysis

Samuele Pasini; Jinhan Kim; Paolo Tonella

arXiv:2605.21146 · cs.CR · May 21, 2026

TL;DR

MIST is a spectral analysis-based method for detecting Trojaned neural networks during fine-tuning by identifying spectral deviations in internal representations, outperforming existing methods without needing trigger knowledge.

Contribution

This paper introduces MIST, a novel Trojan detection technique that leverages spectral regression analysis of model updates to identify malicious fine-tuning.

Findings

01 Spectral distances reliably distinguish Trojaned from clean updates.

02 MIST outperforms state-of-the-art detection methods after a single update.

03 MIST remains effective under multi-step benign evolution, with bounded degradation.

Abstract

Modern DNNs are repeatedly fine-tuned to incorporate new data and functionality. This evolutionary workflow introduces a security risk when updated data cannot be fully trusted, as adversaries may implant Trojans during fine-tuning. We present MIST, a Trojan detection approach that analyzes how a model's internal representations change during fine-tuning. Rather than attempting to reconstruct trigger conditions, MIST characterizes benign model evolution using pre-activation spectra and flags updates whose spectral deviations are inconsistent with this reference. This framing treats Trojan detection as a regression problem over model updates. An empirical evaluation across four datasets and eight Trojan attacks shows that spectral distances reliably distinguish Trojaned updates from clean fine-tuning. MIST outperforms state-of-the-art detection accuracy after a single update, without…
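To make the idea of comparing pre-activation spectra against a benign reference more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not specify how spectra are computed, which distance is used, or how the decision is made. The sketch assumes the spectrum is the normalized singular values of a centered pre-activation matrix, the distance is Euclidean, and the threshold is the mean plus three standard deviations of distances seen on benign updates. The helper names (`preactivation_spectrum`, `spectral_distance`, `fit_threshold`) and the synthetic data are hypothetical.

```python
import numpy as np


def preactivation_spectrum(preacts: np.ndarray) -> np.ndarray:
    """Normalized singular values of a centered (samples x units) matrix.

    Assumption: this stands in for a "pre-activation spectrum"; the paper
    may define it differently.
    """
    centered = preacts - preacts.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return singular_values / singular_values.sum()


def spectral_distance(before: np.ndarray, after: np.ndarray) -> float:
    """Euclidean distance between the two spectra (assumed metric)."""
    diff = preactivation_spectrum(before) - preactivation_spectrum(after)
    return float(np.linalg.norm(diff))


def fit_threshold(benign_distances, k: float = 3.0) -> float:
    """Mean + k * std of benign distances (assumed decision rule)."""
    d = np.asarray(benign_distances, dtype=float)
    return float(d.mean() + k * d.std())


# Synthetic stand-in for one layer's pre-activations on a fixed probe set.
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 16))

# Benign updates are simulated as small random perturbations.
benign_distances = [
    spectral_distance(reference, reference + 0.01 * rng.normal(size=reference.shape))
    for _ in range(20)
]
threshold = fit_threshold(benign_distances)

# A Trojaned update is simulated as a strong shared shift on a subset of
# samples, which introduces a dominant new direction in the spectrum.
trojaned = reference.copy()
trojaned[:20] += 5.0
trojan_distance = spectral_distance(reference, trojaned)

print(f"threshold={threshold:.4f}, trojan distance={trojan_distance:.4f}")
print("flagged:", trojan_distance > threshold)
```

In this toy setting the simulated Trojaned update lands far outside the range of benign distances, so it is flagged. Real fine-tuning shifts spectra much more than the small perturbations used here, so a practical reference would have to be built from actual clean updates.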
