APEX: Probing Neural Networks via Activation Perturbation

Tao Ren; Xiaoyu Luo; Qiongxiu Li

arXiv:2602.03586·cs.LG·February 4, 2026

Open Access

TL;DR

APEX perturbs hidden activations at inference time to probe neural network representations, revealing structural information and biases that input-space or parameter-space analysis cannot reach.

Contribution

This paper presents APEX, a new inference-time probing paradigm that perturbs hidden activations to explore neural network structure, surpassing limitations of prior input-space and parameter perturbation methods.

Findings

01. APEX effectively measures sample regularity and model biases.

02. APEX distinguishes structured models from random ones using activation perturbation.

03. APEX reveals training-induced biases, such as class concentration in backdoored models.

Abstract

Prior work on probing neural networks primarily relies on input-space analysis or parameter perturbation, both of which face fundamental limitations in accessing structural information encoded in intermediate representations. We introduce Activation Perturbation for EXploration (APEX), an inference-time probing paradigm that perturbs hidden activations while keeping both inputs and model parameters fixed. We theoretically show that activation perturbation induces a principled transition from sample-dependent to model-dependent behavior by suppressing input-specific signals and amplifying representation-level structure, and further establish that input perturbation corresponds to a constrained special case of this framework. Through representative case studies, we demonstrate the practical advantages of APEX. In the small-noise regime, APEX provides a lightweight and efficient measure of…
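The core idea described above, perturbing a hidden activation while holding the input and the weights fixed, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the two-layer ReLU network, the additive Gaussian noise, the choice of a single perturbed layer, and the names `forward` and `perturbed_class_histogram` are all assumptions made for illustration, and the random weights merely stand in for a trained model.

```python
import numpy as np

def forward(x, params, sigma=0.0, rng=None):
    """Two-layer MLP forward pass with optional noise on the hidden layer."""
    W1, b1, W2, b2 = params
    h = np.maximum(0.0, x @ W1 + b1)  # hidden activation (ReLU)
    if sigma > 0.0:
        # Assumption: additive isotropic Gaussian noise on the activation.
        # The input x and the parameters are left untouched.
        h = h + sigma * rng.standard_normal(h.shape)
    return h @ W2 + b2  # logits

def perturbed_class_histogram(x, params, sigma, n_draws=200, seed=0):
    """Fraction of noise draws for which each class is predicted."""
    rng = np.random.default_rng(seed)
    n_classes = params[2].shape[1]
    counts = np.zeros(n_classes)
    for _ in range(n_draws):
        pred = int(np.argmax(forward(x, params, sigma, rng)))
        counts[pred] += 1
    return counts / n_draws

# Hypothetical toy model: random weights stand in for a trained network.
init_rng = np.random.default_rng(42)
params = (
    init_rng.standard_normal((8, 16)), np.zeros(16),
    init_rng.standard_normal((16, 3)), np.zeros(3),
)
x = init_rng.standard_normal(8)

clean_pred = int(np.argmax(forward(x, params)))
hist_small = perturbed_class_histogram(x, params, sigma=1e-8)   # small-noise regime
hist_large = perturbed_class_histogram(x, params, sigma=100.0)  # large-noise regime
```

Under these assumptions, `hist_small` reflects how stable the prediction for this particular input is, while `hist_large` is dominated by the noise passing through the later layers, so it depends mostly on the model and hardly at all on `x`. That loosely mirrors the sample-dependent to model-dependent transition the abstract describes.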

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI)