Positional-Unigram Byte Models for Generalized TLS Fingerprinting

Hector A. Valdez; Sean McPherson

arXiv:2405.07848·cs.CR·May 14, 2024

TL;DR

This paper introduces positional-unigram byte models for TLS fingerprinting. It identifies the client application behind a connection from the statistical likelihood of its client hello message, and shows empirically that the approach is robust to cipher stunting.

Contribution

The paper presents a data-driven approach to TLS fingerprinting, based on positional-unigram byte models and maximum likelihood, that is robust to cipher stunting and can be updated on the fly.

Findings

01. Robustness to cipher stunting demonstrated

02. High accuracy in client application identification

03. Method is adaptable and does not rely on side-channel info

Abstract

We use positional-unigram byte models together with maximum likelihood for generalized TLS fingerprinting and show empirically that the approach is robust to cipher stunting. Our approach builds a set of positional-unigram byte models from client hello messages. Each positional-unigram byte model is a statistical model of the TLS client hello traffic generated by one client application or process. To fingerprint a TLS connection, we take its client hello and compute its likelihood under each statistical model. The statistical model that maximizes the likelihood identifies the predicted client application for that client hello. Our data-driven approach does not use side-channel information and can be updated on the fly. We experimentally validate our method on an internal dataset and show that it is robust to cipher stunting by tracking an unbiased $f_{1}$ score as we synthetically increase…
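The abstract does not spell out the model, so the following is only a minimal Python sketch of one plausible reading: each application gets a table of byte counts per position in the client hello, and a new client hello is assigned to the application whose table gives it the highest log-likelihood. The class name `PositionalUnigramModel`, the helper `predict`, the `max_len` cutoff, the add-alpha smoothing, and the toy byte strings are all assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import defaultdict

ALPHABET = 256  # number of possible byte values


class PositionalUnigramModel:
    """Per-position byte counts for one client application (hypothetical sketch)."""

    def __init__(self, max_len=512, alpha=1.0):
        self.max_len = max_len  # assumption: bytes past this position are ignored
        self.alpha = alpha      # assumption: add-alpha smoothing for unseen bytes
        self.counts = [defaultdict(int) for _ in range(max_len)]
        self.totals = [0] * max_len

    def update(self, hello: bytes) -> None:
        """Fold one observed client hello into the counts (incremental update)."""
        for pos, byte in enumerate(hello[: self.max_len]):
            self.counts[pos][byte] += 1
            self.totals[pos] += 1

    def log_likelihood(self, hello: bytes) -> float:
        """Sum of smoothed log-probabilities of each byte at its position."""
        ll = 0.0
        for pos, byte in enumerate(hello[: self.max_len]):
            num = self.counts[pos].get(byte, 0) + self.alpha
            den = self.totals[pos] + self.alpha * ALPHABET
            ll += math.log(num / den)
        return ll


def predict(models, hello):
    """Return the label of the model that maximizes the log-likelihood."""
    return max(models, key=lambda name: models[name].log_likelihood(hello))


# Toy byte strings standing in for client hellos (not real captures).
models = {"app_a": PositionalUnigramModel(), "app_b": PositionalUnigramModel()}
for sample in (b"\x16\x03\x01\x00\x10", b"\x16\x03\x01\x00\x11"):
    models["app_a"].update(sample)
for sample in (b"\x16\x03\x03\x00\x20", b"\x16\x03\x03\x00\x21"):
    models["app_b"].update(sample)

print(predict(models, b"\x16\x03\x01\x00\x12"))
```

Because `update` only increments counts, a model in this sketch can absorb new client hellos without retraining from scratch, which is one way to read the abstract's claim that the approach can be updated on the fly. The sketch does not model message length, so comparing hellos of very different lengths would need additional care.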

Taxonomy

Topics: Authorship Attribution and Profiling · Natural Language Processing Techniques · Handwritten Text Recognition Techniques