BeCAPTCHA-Type: Biometric Keystroke Data Generation for Improved Bot Detection
Daniel DeAlcala, Aythami Morales, Ruben Tolosana, Alejandro Acien, Julian Fierrez, Santiago Hernandez, Miguel A. Ferrer, and Moises Diaz

TL;DR
This paper introduces a data-driven model for generating synthetic keystroke biometric data to enhance bot detection, demonstrating high realism and potential benefits in large-data scenarios.
Contribution
The paper presents a novel learning-based approach to keystroke data synthesis, compares it with two statistical models, and uses the synthetic data to improve the training of bot detectors.
Findings
Synthetic keystroke data is highly realistic.
Synthetic data improves bot detection accuracy in large datasets.
Challenges remain in few-shot learning scenarios.
Abstract
This work proposes a data-driven learning model for the synthesis of keystroke biometric data. The proposed method is compared with two statistical approaches based on Universal and User-dependent models. These approaches are validated on the bot detection task, using the synthetic keystroke data to improve the training process of keystroke-based bot detection systems. Our experimental framework considers a dataset with 136 million keystroke events from 168 thousand subjects. We have analyzed the performance of the three synthesis approaches through qualitative and quantitative experiments. Different bot detectors are considered based on several supervised classifiers (Support Vector Machine, Random Forest, Gaussian Naive Bayes and a Long Short-Term Memory network) and a learning framework including human and synthetic samples. The experiments demonstrate the realism of the synthetic…
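To make the idea of a statistical keystroke synthesizer concrete, here is a minimal Python sketch. It is not the authors' method: it assumes a keystroke is a (press, release) timestamp pair in milliseconds, that the timing features are hold time and press-to-press latency, and that a "universal" model is a single Gaussian per feature fitted on data pooled across subjects. The names timing_features, fit_universal, and synthesize are hypothetical and chosen only for illustration.

```python
import random
import statistics
from typing import Dict, List, Tuple

# Assumed representation: one keystroke = (press_time_ms, release_time_ms).
Keystroke = Tuple[float, float]


def timing_features(events: List[Keystroke]) -> Tuple[List[float], List[float]]:
    """Return (hold times, press-to-press latencies) in milliseconds."""
    hold = [release - press for press, release in events]
    latency = [events[i + 1][0] - events[i][0] for i in range(len(events) - 1)]
    return hold, latency


def fit_universal(sequences: List[List[Keystroke]]) -> Dict[str, Tuple[float, float]]:
    """Pool features over all sequences and fit one Gaussian (mean, std) per feature.

    Illustrative assumption: the paper's Universal model may be defined differently.
    """
    holds: List[float] = []
    latencies: List[float] = []
    for seq in sequences:
        h, l = timing_features(seq)
        holds.extend(h)
        latencies.extend(l)
    return {
        "hold": (statistics.mean(holds), statistics.stdev(holds)),
        "latency": (statistics.mean(latencies), statistics.stdev(latencies)),
    }


def synthesize(model: Dict[str, Tuple[float, float]], n_keys: int,
               rng: random.Random, min_ms: float = 1.0) -> List[Keystroke]:
    """Sample a synthetic keystroke sequence; durations are clamped to min_ms."""
    events: List[Keystroke] = []
    t = 0.0
    for _ in range(n_keys):
        hold = max(min_ms, rng.gauss(*model["hold"]))
        events.append((t, t + hold))
        # Advance to the next key press by a sampled press-to-press latency.
        t += max(min_ms, rng.gauss(*model["latency"]))
    return events


if __name__ == "__main__":
    # Toy "human" sequences, invented purely for this example.
    human = [
        [(0.0, 80.0), (150.0, 240.0), (310.0, 400.0)],
        [(0.0, 95.0), (180.0, 260.0), (330.0, 430.0)],
    ]
    model = fit_universal(human)
    print(synthesize(model, n_keys=5, rng=random.Random(0)))
```

In a setup like the one the abstract describes, sequences produced this way would serve as the "synthetic" class alongside human samples when training a supervised bot detector. A user-dependent variant would fit the same statistics per subject instead of on pooled data.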
Taxonomy
Topics: User Authentication and Security Systems · Hand Gesture Recognition Systems · Advanced Malware Detection Techniques
