Tuning for TraceTarnish: Techniques, Trends, and Testing Tangible Traits
Robert Dilworth

TL;DR
This paper evaluates and enhances the TraceTarnish stylometry-based authorship anonymization attack, analyzing stylometric features like function words and Type-Token Ratio to improve its effectiveness and forensic utility.
Contribution
It provides a rigorous evaluation of TraceTarnish, identifies key stylometric indicators, and proposes enhancements based on these features for better authorship anonymization and detection.
Findings
Function words and content words are significant indicators of stylometric changes.
Type-Token Ratio and specific word types are reliable forensic markers.
Enhanced TraceTarnish operations leverage these features for improved attack performance.
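To make the two feature families named above concrete, the following is a minimal sketch of how a Type-Token Ratio and a function-word ratio might be computed for a single message. It is not the paper's implementation: the tokenizer, the tiny `FUNCTION_WORDS` list, and both function names are assumptions chosen for illustration only.

```python
import re

# Illustrative function-word list (an assumption, not the paper's lexicon).
FUNCTION_WORDS = {
    "the", "a", "an", "of", "in", "on", "and", "or", "but", "to", "is", "it", "that",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Number of distinct tokens divided by total tokens (0.0 for empty text)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def function_word_ratio(text):
    """Share of tokens that appear in the illustrative function-word list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(token in FUNCTION_WORDS for token in tokens) / len(tokens)
```

Comparing these values before and after a text is rewritten gives a rough sense of how much its lexical diversity and function-word usage shifted.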
Abstract
In this study, we more rigorously evaluated our attack script, TraceTarnish, which leverages adversarial stylometry principles to anonymize the authorship of text-based messages. To ensure the efficacy and utility of our attack, we sourced, processed, and analyzed Reddit comments, which were later alchemized into data, to gain valuable insights. The transformed data was then further augmented to manufacture stylometric features, which were culled using the Information Gain criterion, leaving only the most informative, predictive, and discriminative ones. Our results found that function words and function word types; content words and content word types; and the Type-Token Ratio…
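The Information Gain criterion mentioned in the abstract can be sketched as follows. This is a generic textbook formulation, not the authors' pipeline: it assumes each feature has already been discretized (for example, by thresholding a continuous stylometric score), and the function names `entropy` and `information_gain` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(
        (len(group) / total) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - conditional
```

Ranking features by this score and keeping the highest-scoring ones is one common way to retain the most discriminative features; the threshold or number of features to keep is a separate design choice not specified here.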
