Detecting Compressed Cleartext Traffic from Consumer Internet of Things Devices

Daniel Hahn; Noah Apthorpe; Nick Feamster

arXiv:1805.02722·cs.CR·May 9, 2018·20 cites

Open Access

TL;DR

This paper introduces a machine learning approach for distinguishing encrypted traffic from compressed but unencrypted traffic sent by consumer IoT devices, addressing a gap in privacy protection and device auditing.

Contribution

The paper presents the first method to automatically differentiate encrypted from compressed unencrypted traffic at the packet level using machine learning models.

Findings

01. Achieved a maximum accuracy of 66.9% with a convolutional neural network (CNN) operating on raw packet data.

02. Provided publicly available datasets for further research.

03. Established a baseline for this new classification problem.

Abstract

Data encryption is the primary method of protecting the privacy of consumer device Internet communications from network observers. The ability to automatically detect unencrypted data in network traffic is therefore an essential tool for auditing Internet-connected devices. Existing methods identify network packets containing cleartext but cannot differentiate packets containing encrypted data from packets containing compressed unencrypted data, which can be easily recovered by reversing the compression algorithm. This makes it difficult for consumer protection advocates to identify devices that risk user privacy by sending sensitive data in a compressed unencrypted format. Here, we present the first technique to automatically distinguish encrypted from compressed unencrypted network transmissions on a per-packet basis. We apply three machine learning models and achieve a maximum 66.9%…
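The difficulty described in the abstract can be illustrated with a small sketch. It is not the paper's method: it only shows why a simple byte-entropy check, one common way to flag cleartext, struggles to separate compressed data from encrypted data, and that compressed data is trivially recoverable. The payload format, the `byte_entropy` helper, and the use of random bytes as a stand-in for ciphertext are all assumptions made for illustration.

```python
import math
import os
import random
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sensor readings, invented for this example (seeded for repeatability).
rng = random.Random(0)
cleartext = "".join(
    f'{{"device":"cam-{i}","temp":{rng.uniform(15, 30):.2f},"motion":{rng.randint(0, 1)}}}\n'
    for i in range(2000)
).encode()

# Compressed but unencrypted payload.
compressed = zlib.compress(cleartext, 9)

# Assumption: random bytes stand in for ciphertext, since well-encrypted
# data is expected to look uniformly random to an observer.
encrypted_like = os.urandom(len(compressed))

for label, payload in [
    ("cleartext", cleartext),
    ("compressed", compressed),
    ("encrypted-like", encrypted_like),
]:
    print(f"{label:15s} {len(payload):7d} bytes  {byte_entropy(payload):.3f} bits/byte")

# Anyone who captures the compressed payload can reverse the compression.
recovered = zlib.decompress(compressed)
print("recovered matches cleartext:", recovered == cleartext)
```

In a run of this sketch, the plain JSON should score well below the compressed and random payloads, which both tend to sit near the 8 bits-per-byte ceiling. That closeness is why a per-packet entropy threshold alone is a weak signal for this task and why the paper turns to learned classifiers.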

Taxonomy

Topics: Internet Traffic Analysis and Secure E-voting · Digital Media Forensic Detection · Advanced Steganography and Watermarking Techniques