T2T: Captioning Smartphone Activities Using Mobile Traffic
Jiyu Liu, Yong Huang, Yanzhao Lu, Yun Tie, Wanqing Tu

TL;DR
This paper introduces T2T, a system that generates textual descriptions of smartphone activities from encrypted mobile traffic, improving interpretability and scalability over traditional methods.
Contribution
The paper presents a novel Traffic-to-Text system that converts encrypted traffic features into readable activity captions using a combined traffic encoder and vision-language model.
Findings
T2T achieves a BLEU-4 score of 58.1 and a CIDEr score of 108.7.
T2T generates semantically accurate smartphone activity captions.
The system outperforms traditional classification methods in scalability and readability.
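To make the BLEU-4 figure above concrete, here is a minimal sketch of a sentence-level BLEU-4 computation in plain Python. It is not the authors' evaluation code: the function names `ngram_counts` and `bleu4`, the single-reference setup, the whitespace tokenization, the absence of smoothing, and the example captions are all assumptions made for illustration. Reported scores are typically computed at corpus level with a standard toolkit, so this only shows how n-gram overlap turns into a number on a 0 to 100 scale.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Sentence-level BLEU-4 against one reference, scaled to 0-100.

    Illustrative only: no smoothing, whitespace tokenization.
    """
    cand = candidate.split()
    ref = reference.split()
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100.0 * bp * math.exp(sum(log_precisions) / 4)


# Hypothetical captions, not taken from the paper's dataset.
reference = "the user opens the chat app and sends a photo"
candidate = "the user opens the chat app and sends a message"
score = bleu4(candidate, reference)
print(round(score, 2))
```

A caption that differs from its reference in only the final word still loses credit at every n-gram order, which is why BLEU-4 rewards longer runs of exactly matching wording rather than loose topical overlap.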
Abstract
This paper studies the generation of textual descriptions of user activities and interactions on smartphones. Our approach, which relies on encrypted mobile traffic, surpasses traditional smartphone activity classification methods in terms of model scalability and output readability. The paper addresses two obstacles to realizing this idea: the semantic gap between traffic features and smartphone activity captions, and the lack of textually annotated traffic data. To overcome these challenges, we introduce a novel smartphone activity captioning system called T2T (Traffic-to-Text). T2T consists of a flow feature encoder that converts low-level traffic characteristics into meaningful latent features and a caption decoder that yields readable transcripts of smartphone activities. In addition, T2T achieves the automatic textual annotation of mobile traffic by feeding synchronized screen…
