Adversarial Attacks on Reinforcement Learning-based Medical Questionnaire Systems: Input-level Perturbation Strategies and Medical Constraint Validation

Peizhuo Liu

arXiv:2508.05677 · cs.CR · August 11, 2025

Open Access

TL;DR

This paper evaluates the vulnerability of RL-based medical questionnaire systems to adversarial attacks, demonstrating that even with strict medical constraints, these systems remain significantly susceptible to input perturbations.

Contribution

The study formulates the diagnosis process as an MDP, implements six attack methods, and develops a medical validation framework with 247 constraints to generate plausible adversarial examples.
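As a rough illustration of the MDP formulation, the sketch below models a state as the patient's responses so far plus the set of unasked questions, and an action as either asking a question or making a diagnosis. The class names (`State`, `Ask`, `Diagnose`), the `step` function, and the question identifiers are hypothetical and not taken from the paper. No reward function is shown because this page does not describe one.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Ask:
    """Action: ask one of the remaining questions (hypothetical name)."""
    question: str


@dataclass(frozen=True)
class Diagnose:
    """Action: stop asking and output a diagnosis (hypothetical name)."""
    label: str


Action = Union[Ask, Diagnose]


@dataclass(frozen=True)
class State:
    """State: answers collected so far plus the questions not yet asked."""
    responses: Tuple[Tuple[str, int], ...]
    unasked: FrozenSet[str]


def step(state: State, action: Action, patient: Dict[str, int]) -> Tuple[State, bool]:
    """Apply one action and return (next_state, episode_done)."""
    if isinstance(action, Diagnose):
        # A diagnosis ends the episode; the state itself does not change.
        return state, True
    if action.question not in state.unasked:
        raise ValueError(f"question already asked or unknown: {action.question}")
    answer = patient[action.question]
    next_state = State(
        responses=state.responses + ((action.question, answer),),
        unasked=state.unasked - {action.question},
    )
    return next_state, False


# Usage: a toy episode with two made-up questions.
patient = {"q1": 1, "q2": 0}
s0 = State(responses=(), unasked=frozenset(patient))
s1, done1 = step(s0, Ask("q1"), patient)
s2, done2 = step(s1, Diagnose("positive"), patient)
```

Immutable dataclasses are used here only so that states can be compared and hashed easily; the paper's actual state encoding may differ.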

Findings

01

Achieved a 97.6% success rate in generating plausible adversarial samples.

02

Attack success rates ranged from 33.08% to 64.70%.

03

RL-based systems are vulnerable even under strict medical input constraints.

Abstract

Reinforcement learning (RL)-based medical questionnaire systems have shown great potential in medical scenarios. However, their safety and robustness remain unresolved. This study comprehensively evaluates adversarial attack methods to identify and analyze the potential vulnerabilities of these systems. We formulate the diagnosis process as a Markov Decision Process (MDP), where the state consists of the patient's responses and the unasked questions, and the action is either to ask a question or to make a diagnosis. We implemented six widely used attack methods: the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), the Carlini & Wagner (C&W) attack, the Basic Iterative Method (BIM), DeepFool, and AutoAttack, with seven epsilon values each. To ensure the generated adversarial examples remain clinically plausible, we developed a comprehensive medical validation framework consisting of 247 medical…
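To make the role of epsilon concrete, here is a minimal sketch of a single FGSM step, the simplest of the six attacks listed. It assumes a toy logistic-regression model as a stand-in for the RL policy and assumes questionnaire answers are scaled to [0, 1]; the weights, inputs, and function names are made up for illustration and are not the paper's model or data.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fgsm(x, y, w, b, epsilon):
    """One FGSM step against a logistic-regression stand-in model.

    x: input features (assumed to be answers scaled to [0, 1])
    y: true label (0.0 or 1.0)
    epsilon: maximum per-feature perturbation (L-infinity budget)
    """
    p = sigmoid(x @ w + b)
    # Gradient of the binary cross-entropy loss with respect to the input x.
    grad_x = (p - y) * w
    # Move each feature by epsilon in the direction that increases the loss.
    x_adv = x + epsilon * np.sign(grad_x)
    # Keep the perturbed answers inside the assumed valid range.
    return np.clip(x_adv, 0.0, 1.0)


# Usage with made-up weights and a made-up input.
w = np.array([2.0, -1.0, 0.5])
b = -0.5
x = np.array([0.8, 0.2, 0.5])
y = 1.0

x_adv = fgsm(x, y, w, b, epsilon=0.1)
p_clean = sigmoid(x @ w + b)
p_adv = sigmoid(x_adv @ w + b)
```

In this toy setting the perturbed input lowers the model's confidence in the true label while no feature moves by more than epsilon. The final clipping step is only a simple range check, far weaker than the medical validation framework the abstract describes.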

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Mental Health Research Topics