Why the Maximum Second Derivative of Activations Matters for Adversarial Robustness

Yunrui Yu; Hang Su; Jun Zhu

arXiv:2603.23860 · cs.LG · March 26, 2026

Open Access

TL;DR

This paper explores how the curvature of activation functions, measured by their maximum second derivative, influences adversarial robustness, revealing an optimal curvature range that balances expressivity and stability across models and datasets.

Contribution

It introduces the Recursive Curvature-Tunable Activation Family (RCT-AF) for precise curvature control and demonstrates the existence of an optimal curvature range for adversarial robustness.

Findings

01

Optimal robustness occurs when max|σ''| is between 4 and 10.

02

Normalized Hessian diagonal norm has a U-shaped dependence on max|σ''|.

03

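Activation curvature shapes the loss Hessian: excessive curvature amplifies the normalized Hessian diagonal norm, producing sharper minima that hinder robust generalization.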
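
As a rough illustration of why $\sigma''$ can enter a Hessian at all, consider a hypothetical one-hidden-layer scalar model $f(x) = v^\top \sigma(Wx + b)$. The model and the symbols $v$, $W$, $b$ are assumptions made for this sketch and are not taken from the paper, whose analysis concerns the Hessian of the loss and may differ in detail. Differentiating twice with respect to the input gives

$$
\nabla_x f(x) = W^\top \big( v \odot \sigma'(Wx + b) \big),
\qquad
\nabla_x^2 f(x) = W^\top \operatorname{diag}\!\big( v \odot \sigma''(Wx + b) \big)\, W ,
$$

so that

$$
\big\| \nabla_x^2 f(x) \big\|_2 \;\le\; \|W\|_2^2 \, \|v\|_\infty \, \max_z |\sigma''(z)| .
$$

In this toy setting the maximum second derivative of the activation directly caps the input curvature of the model, which is one way to read the link between activation curvature and the Hessian.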

Abstract

This work investigates the critical role of activation function curvature, quantified by the maximum second derivative $\max |\sigma''|$, in adversarial robustness. Using the Recursive Curvature-Tunable Activation Family (RCT-AF), which enables precise control over curvature through parameters $\alpha$ and $\beta$, we systematically analyze this relationship. Our study reveals a fundamental trade-off: insufficient curvature limits model expressivity, while excessive curvature amplifies the normalized Hessian diagonal norm of the loss, leading to sharper minima that hinder robust generalization. This results in a non-monotonic relationship where optimal adversarial robustness consistently occurs when $\max |\sigma''|$ falls within 4 to 10, a finding that holds across diverse network architectures, datasets, and adversarial training methods. We provide theoretical insights into how…
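
To make the quantity $\max |\sigma''|$ concrete, the following minimal sketch estimates it for a softplus activation with an adjustable sharpness. The definition of RCT-AF is not given on this page, so softplus is used purely as a stand-in; the function names, the `sharpness` parameter, and the scan interval are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softplus_second_derivative(x, sharpness=1.0):
    """Second derivative of softplus(x) = log(1 + exp(sharpness * x)) / sharpness.

    Analytically this equals sharpness * s * (1 - s) with s = sigmoid(sharpness * x),
    which peaks at x = 0 with value sharpness / 4.
    Softplus is a stand-in here, not the paper's RCT-AF.
    """
    s = sigmoid(sharpness * x)
    return sharpness * s * (1.0 - s)


def max_abs_second_derivative(second_derivative, lo=-10.0, hi=10.0, steps=20001):
    """Approximate max |sigma''| by scanning an evenly spaced grid on [lo, hi]."""
    step = (hi - lo) / (steps - 1)
    return max(abs(second_derivative(lo + i * step)) for i in range(steps))


def in_reported_range(curvature, low=4.0, high=10.0):
    """True if curvature lies in the 4-to-10 range quoted in the abstract."""
    return low <= curvature <= high


if __name__ == "__main__":
    for k in (1.0, 16.0, 40.0, 100.0):
        c = max_abs_second_derivative(lambda x, k=k: softplus_second_derivative(x, k))
        print(f"sharpness={k:6.1f}  max|sigma''|={c:.4f}  in range: {in_reported_range(c)}")
```

Under these assumptions a plain softplus (sharpness 1) has $\max |\sigma''| = 0.25$, well below the reported range, while sharpness values between 16 and 40 give values from 4 to 10. This only illustrates how the metric is measured; it does not imply that such a softplus would reproduce the robustness reported for RCT-AF.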

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Stochastic Gradient Optimization Techniques