PassTSL: Modeling Human-Created Passwords through Two-Stage Learning
Yangde Wang, Haozhang Li, Weidong Qiu, Shujun Li, Peng Tang

TL;DR
PassTSL applies two-stage (pretraining-finetuning) learning inspired by NLP to model human-created passwords, improving password guessing and strength estimation across six large leaked password databases with minimal additional training data.
Contribution
Introduces PassTSL, a novel two-stage learning framework for password modeling, and demonstrates its effectiveness in password guessing and strength estimation with minimal finetuning data.
Findings
Outperforms five state-of-the-art password cracking methods at password guessing by up to 64.69%.
Achieves more accurate password strength estimation with fewer unsafe errors.
A small amount of additional training data (0.1%) improves password guessing accuracy by over 3%.
Abstract
Textual passwords are still the most widely used user authentication mechanism. Due to the close connections between textual passwords and natural languages, advanced technologies in natural language processing (NLP) and machine learning (ML) could be used to model passwords for different purposes such as studying human password-creation behaviors and developing more advanced password cracking methods for informing better defence mechanisms. In this paper, we propose PassTSL (modeling human-created Passwords through Two-Stage Learning), inspired by the popular pretraining-finetuning framework in NLP and deep learning (DL). We report how different pretraining settings affected PassTSL and proved its effectiveness by applying it to six large leaked password databases. Experimental results showed that it outperforms five state-of-the-art (SOTA) password cracking methods on password…
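The pretraining-finetuning idea described in the abstract can be illustrated with a toy sketch. This is not PassTSL's model: it swaps in a simple character-bigram counter for the deep learning model, and the class name `BigramPasswordModel`, the smoothing constant `alpha`, the `weight` parameter, and the tiny password lists are all assumptions made for illustration. It only shows the two-stage shape: fit on a large generic set first, then adapt with a small target-specific set.

```python
import math
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical boundary markers, not from the paper


class BigramPasswordModel:
    """Toy character-bigram model; a stand-in, not the paper's neural model."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                  # add-alpha smoothing (assumed value)
        self.counts = defaultdict(Counter)  # counts[prev][cur] = weighted count
        self.vocab = {END}

    def update(self, passwords, weight=1.0):
        """Accumulate weighted bigram counts from a list of passwords."""
        for pw in passwords:
            chars = [START] + list(pw) + [END]
            self.vocab.update(chars[1:])
            for prev, cur in zip(chars, chars[1:]):
                self.counts[prev][cur] += weight

    def log_prob(self, pw):
        """Smoothed log-probability of a password under the bigram model."""
        chars = [START] + list(pw) + [END]
        v = len(self.vocab)
        total = 0.0
        for prev, cur in zip(chars, chars[1:]):
            row = self.counts.get(prev, Counter())
            num = row[cur] + self.alpha
            den = sum(row.values()) + self.alpha * v
            total += math.log(num / den)
        return total


# Illustrative data only; real experiments use large leaked databases.
pretrain_set = ["password", "123456", "qwerty", "letmein", "iloveyou"]
finetune_set = ["dragon2020", "dragon99"]

model = BigramPasswordModel()
model.update(pretrain_set)              # stage 1: "pretraining" on generic data
before = model.log_prob("dragon123")

model.update(finetune_set, weight=5.0)  # stage 2: "finetuning" on target data
after = model.log_prob("dragon123")

print(f"log-prob before finetuning: {before:.2f}")
print(f"log-prob after finetuning:  {after:.2f}")
```

In this sketch, a password resembling the finetuning set becomes more probable after the second stage, which is the effect a guessing model relies on when adapting to a target population. The same log-probability could serve as a rough strength score, with lower values suggesting a harder-to-guess password.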
