# Using Epidemiological Test Diagnostics to Select Fraud Detection Methods: Secondary Analysis of Quantitative Cross-Sectional Survey Data

**Authors:** Rachel Willard-Grace, Tali Klima, Mansi Dedhia, Emily Lo, Annie Nisnevich, Allison Gray, Holly Henry

PMC · DOI: 10.2196/85161 · 2026-03-05

## TL;DR

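This study uses an epidemiological diagnostic test framework to evaluate how well recommended methods detect bot attacks in web-based surveys. Most had low sensitivity, while combinations of fraud markers and blocks of repeated responses were particularly useful for protecting survey data integrity.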

## Contribution

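The paper applies an epidemiological diagnostic test framework (sensitivity, specificity, positive predictive value, and negative predictive value) to fraud detection methods in surveys, revealing their limitations and identifying which strategies were most useful for detecting bot attacks.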

## Key findings

- Most recommended methods for detecting bot attacks in surveys had low sensitivity, and only 2 performed better than chance alone at correctly identifying bot attacks.
- Combinations of fraud markers and repeated response blocks are more effective at identifying bot attacks.
- Failure to remove fraudulent records would have substantially altered the survey results, including demographics, wait times for pediatric subspecialty care, and use of health care services.

## Abstract

Survey research has the potential to elevate the experiences and opinions of marginalized populations. The rising number of bot attacks, a method of participant fraud that creates multiple records in survey data using automated software, threatens to drown out those voices and produce inaccurate findings. Rapid identification and mitigation of bot attacks are vital; however, there is limited guidance for researchers on scalable approaches to address this problem.

This study aimed to assess how well recommended methods detect fraud using an epidemiological diagnostic test framework to inform web-based survey researchers on how best to identify and shut down bot attacks.

We analyzed data from a cross-sectional web-based statewide survey on access to pediatric subspecialty care in California that used Qualtrics survey software. Caregivers of children with chronic conditions were recruited through family resource centers (FRCs), nonprofit agencies serving families with developmental delays and chronic medical conditions. The survey was sent out to 17 FRCs, whose staff distributed anonymous links to their clients through listservs and flyers. Respondents who completed the survey received a US $30 gift card. Prior to launch, we designed a protocol to identify and respond to bot attacks and reviewed responses for markers of fraudulent activity. If markers were identified or there was a spike in responses, a senior member of our research team reviewed patterns among all submitted surveys for each FRC to look for signs of bot attacks. To better understand the utility of recommended strategies for identifying bot attacks, we calculated epidemiologic measures of diagnostic test accuracy (sensitivity, specificity, positive predictive value, and negative predictive value), which describe a test’s ability to distinguish “disease” (in this case, fraudulent records) from normal cases.
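The four diagnostic accuracy measures named above can be illustrated with a minimal Python sketch. It treats each fraud marker as a "test" and compares its flags against a reference classification of each record as fraudulent or valid. The function name `diagnostic_accuracy`, the `DiagnosticAccuracy` container, and the ten toy records are assumptions made for illustration; they are not the study's code or data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DiagnosticAccuracy:
    sensitivity: Optional[float]  # share of fraudulent records that were flagged
    specificity: Optional[float]  # share of valid records that were not flagged
    ppv: Optional[float]          # share of flagged records that were fraudulent
    npv: Optional[float]          # share of unflagged records that were valid


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None rather than dividing by zero when a measure is undefined.
    return numerator / denominator if denominator else None


def diagnostic_accuracy(
    flagged: Sequence[bool], fraudulent: Sequence[bool]
) -> DiagnosticAccuracy:
    """Compare one fraud marker (flagged) against the reference labels (fraudulent)."""
    if len(flagged) != len(fraudulent):
        raise ValueError("flagged and fraudulent must have the same length")
    pairs = list(zip(flagged, fraudulent))
    tp = sum(1 for f, t in pairs if f and t)          # flagged and fraudulent
    fp = sum(1 for f, t in pairs if f and not t)      # flagged but valid
    fn = sum(1 for f, t in pairs if not f and t)      # missed fraudulent record
    tn = sum(1 for f, t in pairs if not f and not t)  # unflagged and valid
    return DiagnosticAccuracy(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


# Hypothetical toy data: 6 fraudulent and 4 valid records (not from the study).
fraudulent = [True] * 6 + [False] * 4
flagged = [True, True, True, False, False, False, True, False, False, False]

result = diagnostic_accuracy(flagged, fraudulent)
print(result)
```

In this toy example the marker catches 3 of 6 fraudulent records (sensitivity 0.5) and wrongly flags 1 of 4 valid records (specificity 0.75). Because positive and negative predictive values depend on how common fraud is in the sample, they would shift if the same marker were applied to a survey with a different share of fraudulent records.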

We received 646 valid survey records and 905 fraudulent records resulting from bot attacks. The primary indicator of a bot attack was a sudden spike in responses to the survey. Differences in demographics and outcomes, including wait times for pediatric subspecialty care and use of health care services, between the valid and fraudulent data indicated that failure to remove fraudulent records would have substantially altered the survey results. Most recommended methods in the literature for identifying fraudulent responses had low sensitivity to detect bot attacks, and only 2 were better than chance alone at correctly identifying bot attacks. Combinations of fraud markers and blocks of repeated responses were particularly useful to identify bot attacks.
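The two approaches reported as particularly useful can be sketched in Python as follows. This is a hedged illustration only: the abstract does not define what counts as a "block of repeated responses" or which markers were combined, so the sketch assumes that a block is a run of consecutive records (in submission order) with identical answers, and that a combination rule flags a record when at least a chosen number of markers are present. The function names, thresholds, and sample records are all hypothetical.

```python
from itertools import groupby
from typing import Hashable, List, Sequence


def flag_repeated_blocks(responses: Sequence[Hashable], min_run: int = 3) -> List[bool]:
    """Flag records that sit inside a run of at least min_run identical consecutive responses.

    Assumes responses are ordered by submission time and that each response is
    hashable and comparable (for example, a tuple of answers).
    """
    flags: List[bool] = []
    # groupby groups consecutive equal items, which matches the "block" assumption.
    for _, group in groupby(responses):
        run_length = len(list(group))
        flags.extend([run_length >= min_run] * run_length)
    return flags


def flag_by_marker_count(marker_rows: Sequence[Sequence[bool]], min_markers: int = 2) -> List[bool]:
    """Flag records where at least min_markers individual fraud markers are present."""
    return [sum(bool(m) for m in row) >= min_markers for row in marker_rows]


# Hypothetical records in submission order (not from the study).
responses = [("a", 1), ("b", 2), ("b", 2), ("b", 2), ("c", 3)]
block_flags = flag_repeated_blocks(responses, min_run=3)

# Hypothetical marker matrix: one row per record, one column per marker.
marker_rows = [
    (True, False, False),
    (True, True, False),
    (False, False, False),
]
combo_flags = flag_by_marker_count(marker_rows, min_markers=2)

print(block_flags)
print(combo_flags)
```

Requiring several markers at once trades sensitivity for specificity, so the threshold would need to be tuned against records whose status is already known.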

Fraudulent data entry using bots is increasing in survey research. Sharing flexible protocols to identify and mitigate bot attacks, in a way that is responsive to their ever-changing nature, is vital to ensuring that researchers elevate the voices of real people within survey research to inform policy and programmatic discussions.

## Full-text entities

- **Diseases:** developmental delays (MESH:D002658)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12978920/full.md

---
Source: https://tomesphere.com/paper/PMC12978920