Adversarial Attacks on Reinforcement Learning-based Medical Questionnaire Systems: Input-level Perturbation Strategies and Medical Constraint Validation
Peizhuo Liu

TL;DR
This paper evaluates the vulnerability of RL-based medical questionnaire systems to adversarial attacks, demonstrating that even with strict medical constraints, these systems remain significantly susceptible to input perturbations.
Contribution
The study formulates the diagnosis process as an MDP, implements six attack methods, and develops a medical validation framework with 247 constraints to generate plausible adversarial examples.
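As a rough illustration of how an input-level attack can be paired with plausibility validation, the sketch below applies one FGSM step to a toy logistic model and then checks the perturbed answers against two rules. Everything here is an assumption for illustration: the logistic model, the feature layout, and the two rules (`in_range`, `severity_requires_symptom`) are hypothetical stand-ins, not the paper's RL agent or any of its 247 constraints.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, eps):
    """One FGSM step against a toy logistic model p = sigmoid(w.x + b).

    For cross-entropy loss, the gradient w.r.t. the input is (p - y) * w,
    and FGSM moves each feature by eps in the direction of its sign.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# Hypothetical plausibility rules (NOT the paper's constraints).
def in_range(x):
    # Every encoded answer must stay inside [0, 1].
    return all(0.0 <= xi <= 1.0 for xi in x)

def severity_requires_symptom(x):
    # Assumed layout: x[0] = symptom present, x[1] = symptom severity.
    return x[1] == 0.0 or x[0] > 0.0

CONSTRAINTS = [in_range, severity_requires_symptom]

def is_plausible(x):
    return all(rule(x) for rule in CONSTRAINTS)

# Toy encoded questionnaire answers, true label, and model weights.
x, y = [0.5, 0.5, 0.2], 1
w, b = [1.0, -2.0, 0.5], 0.0

x_small = fgsm(x, y, w, b, eps=0.1)  # stays within the rules
x_large = fgsm(x, y, w, b, eps=0.6)  # leaves the [0, 1] range
print(is_plausible(x_small), is_plausible(x_large))
```

In this toy setup, a small epsilon yields a perturbed sample that passes both rules, while a large epsilon pushes answers outside the valid range and is rejected. Filtering candidates this way is one simple design; projecting them back onto the feasible set is another.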
Findings
Achieved 97.6% success rate in generating plausible adversarial samples.
Attack success rates ranged from 33.08% to 64.70%.
RL-based systems are vulnerable even under strict medical input constraints.
Abstract
RL-based medical questionnaire systems have shown great potential in medical scenarios. However, their safety and robustness remain unresolved. This study comprehensively evaluates adversarial attack methods to identify and analyze the potential vulnerabilities of these systems. We formulate the diagnosis process as a Markov Decision Process (MDP), where the state consists of the patient's responses and the unasked questions, and the action is either to ask a question or to make a diagnosis. We implemented six prevailing attack methods, namely the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), the Carlini & Wagner (C&W) attack, the Basic Iterative Method (BIM), DeepFool, and AutoAttack, with seven epsilon values each. To ensure the generated adversarial examples remain clinically plausible, we developed a comprehensive medical validation framework consisting of 247 medical…
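The MDP formulation described in the abstract can be sketched as a small data structure: a state holding the answers collected so far plus the set of questions not yet asked, and an action that is either asking a question or issuing a diagnosis. This is a minimal sketch under assumptions: the class names, the `step` function, and the example questions are hypothetical, and the reward signal is omitted because the abstract does not describe it.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple, Union

@dataclass(frozen=True)
class Ask:
    question: str

@dataclass(frozen=True)
class Diagnose:
    label: str

Action = Union[Ask, Diagnose]

@dataclass(frozen=True)
class State:
    responses: Tuple[Tuple[str, int], ...]  # (question, answer) pairs so far
    unasked: FrozenSet[str]                 # questions still available

def step(state: State, action: Action, patient: Dict[str, int]):
    """Apply one action; returns (next_state, done).

    `patient` maps each question to that patient's answer (a hypothetical
    simulator standing in for a real respondent).
    """
    if isinstance(action, Diagnose):
        # A diagnosis ends the episode without changing the state.
        return state, True
    if action.question not in state.unasked:
        raise ValueError(f"question already asked or unknown: {action.question}")
    answer = patient[action.question]
    next_state = State(
        responses=state.responses + ((action.question, answer),),
        unasked=state.unasked - {action.question},
    )
    return next_state, False

# Hypothetical two-question episode.
patient = {"low_mood": 1, "sleep_problems": 0}
s0 = State(responses=(), unasked=frozenset(patient))
s1, done1 = step(s0, Ask("low_mood"), patient)
s2, done2 = step(s1, Diagnose("example_label"), patient)
```

Under this framing, an input-level attack perturbs the encoded `responses` that the agent observes, which is why plausibility constraints on those answers matter.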
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Mental Health Research Topics
