Hypnopaedia-Aware Machine Unlearning via Psychometrics of Artificial Mental Imagery

Ching-Chun Chang; Kai Gao; Shuying Xu; Anastasia Kordoni; Christopher Leckie; Isao Echizen

arXiv:2410.05284 · cs.CR · May 1, 2026

TL;DR

This paper introduces a cybernetic framework for detecting and unlearning neural backdoors in AI systems by leveraging psychometric techniques and artificial mental imagery to enhance security and robustness.

Contribution

It proposes a novel self-aware unlearning mechanism that autonomously detaches backdoor triggers using reverse engineering, statistical inference, and model inversion techniques.

Findings

01 Developed a method for detecting backdoor patterns through statistical inference.

02 Implemented artificial mental imagery to disrupt malicious optimisation pathways.

03 Achieved a stable equilibrium between knowledge fidelity and backdoor vulnerability.
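
To make the first finding more concrete, the sketch below shows one common way that statistical inference is used to flag a backdoored class: an outlier test based on the median absolute deviation (MAD) over per-class trigger sizes. This is an illustrative assumption and not necessarily the test used in the paper. The function names, the threshold of 2.0, and the `trigger_norms` values are all hypothetical; the idea is that a class whose reverse-engineered trigger is far smaller than the rest is suspicious.

```python
from statistics import median


def anomaly_indices(norms):
    """Return a MAD-based anomaly index for each value in `norms`."""
    med = median(norms)
    abs_dev = [abs(x - med) for x in norms]
    # 1.4826 scales the MAD so it estimates the standard deviation
    # when the data are roughly normal.
    mad = 1.4826 * median(abs_dev)
    if mad == 0:
        return [0.0 for _ in norms]
    return [abs(x - med) / mad for x in norms]


def flag_suspect_classes(norms, threshold=2.0):
    """Flag classes whose trigger norm is an outlier on the small side."""
    med = median(norms)
    scores = anomaly_indices(norms)
    return [
        i
        for i, (x, s) in enumerate(zip(norms, scores))
        if x < med and s > threshold
    ]


# Hypothetical per-class norms of reverse-engineered triggers (made-up data).
trigger_norms = [95.0, 102.0, 98.0, 12.0, 101.0, 97.0, 99.0, 103.0, 100.0, 96.0]
suspects = flag_suspect_classes(trigger_norms)
print(suspects)
```

Only small-side outliers are flagged here, on the assumption that a backdoored class needs an unusually small perturbation to capture inputs from every other class.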

Abstract

Neural backdoors represent insidious cybersecurity loopholes that render learning machinery vulnerable to unauthorised manipulations, potentially enabling the weaponisation of artificial intelligence with catastrophic consequences. A backdoor attack involves the clandestine infiltration of a trigger during the learning process, metaphorically analogous to hypnopaedia, where ideas are implanted into a subject's subconscious mind under the state of hypnosis or unconsciousness. When activated by a sensory stimulus, the trigger evokes a conditioned reflex that directs a machine to mount a predetermined response. In this study, we propose a cybernetic framework for constant surveillance of backdoor threats, driven by the dynamic nature of untrustworthy data sources. We develop a self-aware unlearning mechanism to autonomously detach a machine's behaviour from the backdoor trigger. Through…
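
The attack described in the abstract can be illustrated with a minimal data-poisoning sketch. It assumes a patch-style trigger stamped into a corner of an image and a fixed target label, which is a common setting in the backdoor literature; the abstract itself does not specify the trigger form. The helper names, patch size, and poisoning rate are hypothetical.

```python
import random


def stamp_trigger(image, patch_value=1.0, patch_size=2):
    """Return a copy of `image` with a square patch in the bottom-right corner."""
    stamped = [row[:] for row in image]
    height, width = len(stamped), len(stamped[0])
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = patch_value
    return stamped


def poison_dataset(samples, target_label, rate, seed=0):
    """Stamp the trigger on a fraction `rate` of samples and relabel them.

    `samples` is a list of (image, label) pairs, where each image is a
    list of rows. Returns the poisoned dataset and the poisoned indices.
    """
    rng = random.Random(seed)
    n_poison = int(len(samples) * rate)
    chosen = set(rng.sample(range(len(samples)), n_poison))
    poisoned = []
    for i, (image, label) in enumerate(samples):
        if i in chosen:
            # Trigger present: the label is forced to the attacker's target.
            poisoned.append((stamp_trigger(image), target_label))
        else:
            poisoned.append((image, label))
    return poisoned, chosen
```

A model trained on such data may learn to associate the patch with the target label while behaving normally on clean inputs, which is the conditioned reflex that the unlearning mechanism aims to detach.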
