Anatomizing Deep Learning Inference in Web Browsers
Qipeng Wang, Shiqi Jiang, Zhenpeng Chen, Xu Cao, Yuanchun Li, Aoyu Li, Yun Ma, Ting Cao, Xuanzhe Liu

TL;DR
This paper provides the first comprehensive performance analysis of in-browser deep learning inference, revealing significant latency, memory, and QoE impacts across various devices and models.
Contribution
The paper introduces new QoE metrics for in-browser inference (responsiveness, smoothness, and inference accuracy) and offers an extensive empirical analysis of the performance gaps and the factors that contribute to them.
Findings
In-browser inference is substantially slower than native inference: on average 16.9 times slower on CPU and 4.9 times slower on GPU.
Memory demands for in-browser inference can exceed 334 times the model size.
In-browser inference increases GUI rendering time by 67.2%, affecting user experience.
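The three findings above are all simple ratios, and the arithmetic behind them can be sketched as follows. This is a minimal illustration, not the paper's measurement code: the record shape, the function names, and the choice of an arithmetic mean over per-model ratios are assumptions made for the example, and the sample numbers are placeholders rather than measured results.

```typescript
// Hypothetical measurement record; field names are illustrative, not from the paper.
interface LatencySample {
  model: string;
  nativeMs: number;   // inference latency with a native runtime
  browserMs: number;  // inference latency of the same model in the browser
}

// Slowdown of in-browser inference relative to native for one model.
function latencyGap(s: LatencySample): number {
  if (s.nativeMs <= 0) throw new RangeError("nativeMs must be positive");
  return s.browserMs / s.nativeMs;
}

// Arithmetic mean of the per-model gaps (assumed averaging scheme).
function averageGap(samples: LatencySample[]): number {
  if (samples.length === 0) throw new RangeError("no samples");
  const total = samples.reduce((sum, s) => sum + latencyGap(s), 0);
  return total / samples.length;
}

// Peak memory footprint expressed as a multiple of the model size.
function memoryRatio(peakBytes: number, modelBytes: number): number {
  if (modelBytes <= 0) throw new RangeError("modelBytes must be positive");
  return peakBytes / modelBytes;
}

// Relative increase of an observed value over a baseline, in percent.
function percentIncrease(baseline: number, observed: number): number {
  if (baseline <= 0) throw new RangeError("baseline must be positive");
  return ((observed - baseline) / baseline) * 100;
}

// Placeholder values chosen only to make the arithmetic easy to follow.
const placeholderSamples: LatencySample[] = [
  { model: "model-a", nativeMs: 10, browserMs: 40 },
  { model: "model-b", nativeMs: 20, browserMs: 40 },
];

console.log(averageGap(placeholderSamples)); // mean of 4x and 2x
console.log(memoryRatio(300, 100));          // footprint as a multiple of model size
console.log(percentIncrease(100, 150));      // rendering-time increase in percent
```

Read this way, a reported latency gap of 16.9 times corresponds to `averageGap` returning 16.9, and a 67.2% increase in GUI rendering time corresponds to `percentIncrease` returning 67.2.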
Abstract
Web applications have increasingly adopted Deep Learning (DL) through in-browser inference, wherein DL inference performs directly within Web browsers. The actual performance of in-browser inference and its impacts on the quality of experience (QoE) remain unexplored, and urgently require new QoE measurements beyond traditional ones, e.g., mainly focusing on page load time. To bridge this gap, we make the first comprehensive performance measurement of in-browser inference to date. Our approach proposes new metrics to measure in-browser inference: responsiveness, smoothness, and inference accuracy. Our extensive analysis involves 9 representative DL models across Web browsers of 50 popular PC devices and 20 mobile devices. The results reveal that in-browser inference exhibits a substantial latency gap, averaging 16.9 times slower on CPU and 4.9 times slower on GPU compared to native…
Taxonomy
Topics: Image and Video Quality Assessment
