RRTO: A High-Performance Transparent Offloading System for Model Inference in Mobile Edge Computing
Zekai Sun, Xiuxian Guan, Zheng Lin, Yuhao Qing, Haoze Song, Zihan Fang, Zhe Chen, Fangming Liu, Heming Cui, Wei Ni, Jun Luo

TL;DR
RRTO is a novel system that significantly reduces inference latency and energy consumption in mobile edge computing by enabling high-performance transparent offloading without source code modifications, using a record/replay mechanism and operator sequence search.
Contribution
RRTO introduces a record/replay mechanism and a novel operator sequence search algorithm to enable high-performance transparent offloading for ML inference in MEC.
Findings
Achieves up to 98% reduction in latency and energy consumption.
Matches the performance of non-transparent methods without source code changes.
Outperforms existing transparent offloading approaches significantly.
Abstract
Deploying Machine Learning (ML) applications on resource-constrained mobile devices remains challenging due to limited computational resources and poor platform compatibility. While Mobile Edge Computing (MEC) offers an offloading-based inference paradigm using GPU servers, existing approaches are divided into non-transparent and transparent methods, with the former necessitating modifications to the source code. Non-transparent offloading achieves high performance but requires intrusive code modification, limiting compatibility with diverse applications. Transparent offloading, in contrast, offers wide compatibility but introduces significant transmission delays due to per-operator remote procedure calls (RPCs). To overcome this limitation, we propose RRTO, the first high-performance transparent offloading system tailored for MEC inference. RRTO introduces a record/replay mechanism that…
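The cost of per-operator RPCs mentioned in the abstract can be illustrated with a toy model. The sketch below is not RRTO's implementation: the abstract is truncated before it describes the mechanism, so `ToyServer`, `call_operator`, `record`, and `replay` are hypothetical names invented here, and the operator sequence search is not modeled at all. It only counts network round trips, assuming that a recorded operator sequence can be replayed on the server with a single request.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical stand-in for a GPU server; it only counts round trips."""
    round_trips: int = 0
    recorded: dict = field(default_factory=dict)

    def call_operator(self, op, x):
        # Per-operator RPC: one network round trip for every operator.
        self.round_trips += 1
        return op(x)

    def record(self, name, ops):
        # Assumed one-time cost: ship the operator sequence to the server.
        self.round_trips += 1
        self.recorded[name] = list(ops)

    def replay(self, name, x):
        # One round trip runs the whole recorded sequence on the server.
        self.round_trips += 1
        for op in self.recorded[name]:
            x = op(x)
        return x


def infer_per_operator(server, ops, x):
    """Baseline transparent offloading: every operator is a separate RPC."""
    for op in ops:
        x = server.call_operator(op, x)
    return x


# A three-operator "model" used only for illustration.
ops = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]

baseline = ToyServer()
y_baseline = infer_per_operator(baseline, ops, 5)

rr = ToyServer()
rr.record("model", ops)
y_replay = rr.replay("model", 5)

print(y_baseline, baseline.round_trips)
print(y_replay, rr.round_trips)
```

In this toy model the baseline pays one round trip per operator on every inference, while the record/replay variant pays one round trip to record and then one per inference, so the gap grows with the number of operators in the model.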
