Diffusing Your Mobile Apps: Extending In-Network Function Virtualization to Mobile Function Offloading

Mario Almeida; Liang Wang; Jeremy Blackburn; Konstantina Papagiannaki; Jon Crowcroft

arXiv:1906.06240·cs.DC·June 17, 2019

TL;DR

This paper introduces INFv, a novel in-network function virtualization system that enables mobile app offloading within ISP networks, significantly improving energy efficiency and execution speed while maintaining transparency and scalability.

Contribution

INFv extends NFV to mobile applications, providing a non-intrusive, scalable offloading framework that leverages in-network resources for energy and performance gains.

Findings

1) Up to 6.9x energy reduction in mobile devices

2) Up to 4x faster execution times

3) Effective load balancing and resource utilization

Table 1: Mobile Cloud Offloading (MCO) systems and properties. The comparison reveals that INFv supersedes the previous designs in many aspects. (A cross means the feature is not supported, whereas a tick means the opposite.)

MCO	Partitions	Dynamic	No Repackage	Stock OS	Cloud	Network	Deployment
MAUI Cuervo et al. [2010]	Manual / Method	✗	✗	✓	✓	✗	✗
ThinkAir Kosta et al. [2012]	Manual / Method	✗	✗	✓	✓	✗	✓(EC2,cost)
CloneCloud Chun et al. [2011]	Auto / Thread	✓	✗	✗	✓	✗	✗
Comet Gordon et al. [2012]	Auto / Thread	–	✓	✗	✓	✗	✗
Zhang et al. Zhang et al. [2012]	Auto / Class	✗	✗	✓	✓	✗	✗
INFv	Auto / Class	✓	✓	✓	✓	✓	✓(cache,load)

Taxonomy

Topics: IoT and Edge/Fog Computing · Green IT and Sustainability · Caching and Content Delivery

Full text

Mário Almeida (Universitat Politècnica de …), Liang Wang (University of …), Jeremy Blackburn (Telefonica …), Konstantina Papagiannaki (Telefonica …), Jon Crowcroft (University of …)

Abstract

Motivated by the huge disparity between the limited battery capacity of user devices and the ever-growing energy demands of modern mobile apps, we propose INFv. It is the first offloading system able to cache, migrate and dynamically execute on demand functionality from mobile devices in ISP networks. It aims to bridge this gap by extending the promising NFV paradigm to mobile applications in order to exploit in-network resources.

In this paper, we present the overall design, state-of-the-art technologies adopted, and various engineering details in the INFv system. We also carefully study the deployment configurations by investigating over 20K Google Play apps, as well as thorough evaluations with realistic settings. In addition to a significant improvement in battery life (up to 6.9x energy reduction) and execution time (up to 4x faster), INFv has two distinct advantages over previous systems: 1) a non-intrusive offloading mechanism transparent to existing apps; 2) an inherent framework support to effectively balance computation load and exploit the proximity of in-network resources. Both advantages together enable a scalable and incremental deployment of computation offloading framework in practical ISPs’ networks.

1 Introduction

Pervasive mobile clients have given birth to complex mobile apps, many of which require a significant amount of computational power on users’ devices. Unfortunately, given current battery technology, these demanding apps impose a huge burden on energy constrained devices. While power hogging apps are responsible for 41% degradation of battery life on average Oliner et al. [2013], even popular ones such as social networks and instant messaging apps (e.g., Facebook and Skype) can drain a device’s battery up to nine times faster due only to maintaining an on-line presence Aucinas et al. [2013].

Recent work has proposed various solutions to offload and execute functionality of mobile apps remotely in a cloud, referred to as a mobile-to-cloud paradigm Cuervo et al. [2010]; Kemp et al. [2010]; Kosta et al. [2012]; Chen et al. [2012]; Zhang et al. [2012]; Gordon et al. [2012]; Chun et al. [2011]. Their evaluations have shown that the energy consumption of CPU-intensive apps, e.g., multimedia processing apps and video games, can be reduced by an order of magnitude Chun et al. [2011]; Cuervo et al. [2010]. Besides extended battery life, there are other benefits, such as faster execution, improved responsiveness, and enhanced security through dynamic patching Mulliner et al. [2013].

However, prior work suffers from two limitations. First, it overlooked the potential of exploiting ISPs’ in-network resources for functionality offloading, simply using the network as a transmission fabric. Unlike a decade ago, network middleboxes are no longer simple devices that only forward packets; they often feature multi-core general-purpose processors xeo [2016] far more capable than those of mobile devices. In fact, many ISPs’ own network services have been shifting from specialized servers to generic hardware with the adoption of the NFV (Network Function Virtualization) paradigm. (Telefonica aimed to shift 30% of their infrastructure to NFV technologies by the end of 2016 Telefonica’s view on virtualized mobile networks [2015]; Carta Blanco: NFV at Telefonica [2014]; Telefonica selects Ericsson for global UNICA program [2016]. Other providers such as AT&T AT&T Domain 2.0 Vision White Paper [2013], Vodafone Ericsson and Vodafone deploy first cloud-based VoLTE [2014], NTT Docomo DOCOMO Partners with Ericsson, Fujitsu and NEC for NFV Deployment [2015] and China Mobile Alcatel-Lucent and China Mobile conduct industry-first live field trial of a virtualized radio access network [2015] are following similar steps.) This paradigm can be naturally extended from basic network functions (e.g., packet filtering) to the more general functionality of mobile apps, exploiting “last-hop” proximity to reduce latency and network load and to improve availability compared to a centralized mobile-to-cloud deployment. When deployed close to cellular towers (Radio Access Network), offloading latency could be reduced by up to 86.7%, reducing the energy consumption and execution time by up to 21.6% and 24.5%, respectively, when compared to a popular cloud alternative (Section 4.4).
Furthermore, such a system could potentially be extended to adapt an app’s lifecycle to network conditions, further reducing devices’ energy consumption (e.g., delaying network-dependent background execution Almeida et al. [2016] in the case of congestion) and the volume of signaling offloaded to the Core Network (CN) Patel et al. [2014]. Unfortunately, previous systems either failed to address the challenges of deploying and scaling mobile code offloading systems at all, or overlooked the opportunities to effectively exploit in-network resources. Second, these solutions often utilize intrusive offloading techniques which require custom OS distributions Chun et al. [2011]; Gordon et al. [2012], app repackaging Zhang et al. [2012], or even alternative app stores, not only increasing security risks and deployment costs, but also greatly raising the barrier to market adoption.

This paper presents INFv, the first mobile computation offloading system able to cache, migrate, and dynamically execute mobile app functionality on demand in an ISP network. It uses advanced interception and automatic app partitioning based on functionality clustering, combined with in-network load balancing algorithms to perform transparent, non-intrusive, in-network functionality offloading. INFv aims to bridge the gap between the limited battery capacity of user devices and the ever-growing energy demands of modern mobile apps Oliner et al. [2013]; Aucinas et al. [2013] by extending the promising NFV paradigm to mobile apps. More specifically, we make the following contributions:

1) We present INFv’s data-driven architecture and design, along with key mechanisms and various technical details required to achieve non-intrusive offloading and adaptive in-network resource management.

2) We show that INFv is able to greatly improve apps’ energy consumption (reduced by up to 6.9x) and speed up app execution (up to 4x faster). It performs similarly to, or better than, local execution 93.2% of the time over 4G while adapting to dynamic network conditions, and is up to 24.5% faster than a cloud alternative.

3) We compare different strategies to effectively balance functionality load in the network while reducing both end-user-experienced latency and request drop rates.

4) Through a real mobile app market study we show that apps’ storage cost can be reduced by up to 93.5%, and that top apps have a median of 17 distinct functionality clusters, with up to 57% of offloadable code.

2 Background & Related Work

Mobile Code Offloading (MCO) is a reasonably well explored area (see Table 1); however, earlier work has a few limitations that we directly address with INFv. In this section we provide an overview of previous work, focusing on the lessons that informed the design of INFv. MCO systems can be differentiated based on their granularity and partitioning decisions (what to offload), offloading techniques (how to offload), and runtime decisions (when to offload).

What to offload can be defined in a manual (app developer assisted) or in an automated manner. The first can be accomplished via programming frameworks and/or code annotations. For programming frameworks Kemp et al. [2010]; Chen et al. [2012]; Kosta et al. [2012], both local and remote execution alternatives have to be implemented according to the framework’s design constraints (e.g., concurrency models). In MAUI Cuervo et al. [2010], annotations allow a partially automated offloading solution where developers select methods to offload. The benefit of these explicit systems is the level of customization; developers have a large degree of control over how their apps are offloaded.

An alternative approach taken by other work is to make automated offloading decisions Chun et al. [2011]; Gordon et al. [2012]; Zhang et al. [2012] by performing static and dynamic analysis of apps. While automated approaches give up a degree of flexibility, they benefit from being able to leverage the existing app ecosystem and general ease of use. That said, these systems do not really focus on the deployment characteristics of offloading. ThinkAir Kosta et al. [2012] and Cloudlets Satyanarayanan et al. [2009], however, do so to some extent. ThinkAir allows on-demand execution in a cloud environment. It provides 6 different VM types, with varying CPU and memory configurations. Mobile devices upload specially crafted apps to the cloud, and their local counterpart negotiates the on-demand execution on one of these VMs. A more robust approach (one that we have taken) is to support existing apps while handling resource negotiation and functionality caching in a fully automated and transparent manner. In particular, we want to ensure that INFv supports heterogeneous network topologies and load balances cached functionality in an intelligent manner. Cloudlets Satyanarayanan et al. [2009] in particular serves as a motivation for INFv, as it highlights the impact of high latency in MCO to justify the need for deploying physically proximate decentralized clusters to execute functionality. INFv directly addresses the problems and challenges raised in this work by proposing an in-network solution (“Network” in Table 1) for MCO.

The majority of offloading literature proposes a method offloading granularity Cuervo et al. [2010]; Kemp et al. [2010]; Kosta et al. [2012]; Chen et al. [2012]. CloneCloud Chun et al. [2011] and Comet Gordon et al. [2012] propose full or partial thread granularity. Automated method granularity architectures incur the cost of synchronizing the serialized method caller objects, parameters, changed state and return object. Thread granularity architectures often need to synchronize thread state, virtual state, program counters, registers, stack or locks. INFv is the first to address the challenges behind caching app functionality, complementing its granularity design with a real market study to reduce its app storage requirements.

How to offload depends on the aforementioned characteristics. Manual approaches tend to use custom compilers (e.g., AspectJ) or builders. Automated solutions often operate on compiled apps and rely on byte-code rewriting Zhang et al. [2012] or VM modifications Chun et al. [2011]; Gordon et al. [2012]. Altering an app generally requires repackaging and re-signing it, which also implies the need for a new distribution mechanism incompatible with current app markets. Each of the above architectures, except for Comet Gordon et al. [2012] (a distributed shared memory solution), needs either app repackaging, rewrites, or both, reducing its likelihood of adoption. INFv offloading differs mainly in that it does not require a custom distribution, manual intervention or app repackaging, even in the presence of app updates. It allows for dynamically loaded partitions (“Dynamic” in Table 1) and is fully reversible.

When to offload was mostly decided using thresholds Gordon et al. [2012] or Integer Linear Programming Cuervo et al. [2010]; Chun et al. [2011] based on the app profiling metrics. Similar to CloneCloud, INFv relies on UI instrumentation to profile the different execution paths of apps. INFv performs app profiling on remote servers to reduce the overhead on mobile devices and, like ThinkAir Kosta et al. [2012], bases its energy model on PowerTutor Zhang et al. [2010]. INFv improves on previous systems by offloading together functionality that communicates heavily, as measured by runtime invocation frequency.

Clone detection literature Mojica et al. [2014]; Linares-Vásquez et al. [2014]; Ruiz et al. [2012]; Desnos [2012] focused on detecting app cloning using class/method names and tends to ignore minor custom changes to the functionality. In code caching, we are more interested in the unmodified use of third-party libraries than in similar code. Unlike recent studies Linares-Vásquez et al. [2014]; Mojica et al. [2014]; Wang et al. [2015], some initial works Ruiz et al. [2012]; Desnos [2012] did not consider obfuscation, which can impact the statistical significance of their results. In Mojica et al. [2014] and Linares-Vásquez et al. [2014] the authors study obfuscation based on class names. Unfortunately, they do not consider package obfuscation, which affects the offloading routing mechanisms. Wukong Wang et al. [2015] focused on clone detection based on Android API calls, and while it highlights the challenges in overcoming obfuscation, we have found strong evidence that simple package name filtering might suffice to detect obfuscation at a package level.

Mobile Edge Computing Computing [2015] (MEC) is an industry initiative (supported by Huawei, IBM, Intel, Nokia, NTT DOCOMO and Vodafone) whose goal is to provide computing capabilities at the edge of the cellular network. Its focus is explicitly on the infrastructure and deployment and not on the potential applications. That said, INFv can be considered an obvious use case of MEC and the first full-fledged MCO architecture to exploit its potential.

Runtime patching has been used to dynamically provide updates to apps. OPUS Altekar et al. [2005] focused on providing dynamic software patching to C programs, while POLUS Chen et al. [2007] was more focused on updating long-lived server-side apps. More recently, such techniques were brought to Android with PatchDroid Mulliner et al. [2013]. It focused on security vulnerabilities and proposed a system to distribute and apply third-party in-memory security patches. Inspired by these systems, INFv modifies a single Android OS binary to extend apps’ functionality at runtime, providing mechanisms similar to those of aspect-oriented programming.

3 Design Goals & Architecture

Based on the limitations of previous work, and keeping in mind the recent availability of in-network resources, we identified four major design requirements for INFv as well as their corresponding challenges.

Instrumenting apps: App instrumentation is needed for profiling and providing offloading capabilities. How can apps be modified without performing app repackaging, using specialized compilers, custom OS or app stores?

Understanding apps: With instrumentation in place, how to detect which functionality is consuming the most energy or executing for longer? Can we analyze apps without incurring a performance penalty on mobile devices?

Offloading functionality: How to enable apps to make use of remote resources in a transparent and efficient manner with minimal adoption cost? Can offloading adapt to a dynamic network environment and still ensure its energy and performance benefits?

Caching functionality: How to cache functionality in a network, adapting to demand and reducing latency, even for arbitrary network topologies and heterogeneous nodes?

Deployment assumptions: Based on recent and expected NFV/MEC adoption by ISPs, we believe these standards will soon be widely available in ISPs’ networks. More specifically, and in compliance with the MEC proposal, we assume that INFv is placed in the Radio Access Network (RAN), where traffic offloading functions can be implemented to filter packets based on their end-point Computing [2015]; via Home Node B [2011] (IP protocol).

The overall INFv architecture that addresses these challenges is depicted in Fig. 1. It is divided into three logical subsystems (separated by dashed lines), each addressing some of the aforementioned challenges. The leftmost subsystem (Section 3.2) profiles and analyzes apps. The rightmost subsystem (Section 3.3) provides the on-device offloading capabilities. It runs on the user device and executes code on a remote VM in the network. Finally, the third subsystem (Section 3.4) runs on the network nodes (NFV) and is responsible for caching functionality and balancing the computation load. In the next sections we detail each of them and describe their technical challenges.

3.1 Dynamic instrumentation

INFv’s mechanism to extend app functionality has to have minimal impact and high coverage of devices, i.e., it cannot rely on app repackaging, specialized compilers, custom OS or app stores. To avoid modifying app binaries and changing apps’ signatures, INFv targets the minimal set of changes required to provide dynamic instrumentation – a reversible binary patch to Android’s app_process. This process is launched at boot (by the init.rc script) and launches Zygote – Android’s daemon responsible for launching apps, creating the Dalvik VM and pre-loading Java classes. When a new app is to be launched, this process is forked and the app executes on its own VM with its copy of the system libraries. By extending this process INFv can add its own classes to the classpath and redirect method invocations to a generic redirection method that allows specifying methods to be intercepted (i.e., hooks).
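The redirection idea can be illustrated outside Android. The following is a minimal Python analogy, not INFv's or Xposed's actual mechanism: `install_hook`, `offloading_hook` and `ImageFilter` are hypothetical names introduced here. It shows a method being rerouted through one generic hook and the patch being undone afterwards, mirroring the reversibility described above.

```python
def install_hook(cls, method_name, generic_hook):
    """Redirect cls.method_name to generic_hook; return a function that undoes it."""
    original = getattr(cls, method_name)

    def redirect(self, *args, **kwargs):
        # Every intercepted call goes through one generic entry point.
        return generic_hook(original, self, args, kwargs)

    setattr(cls, method_name, redirect)
    return lambda: setattr(cls, method_name, original)


class ImageFilter:  # hypothetical app class
    def blur(self, radius):
        return f"blurred({radius})"


calls = []


def offloading_hook(original, target, args, kwargs):
    # Profiling or an offloading decision would go here (assumption).
    calls.append(original.__name__)
    # This sketch always falls back to local execution.
    return original(target, *args, **kwargs)


undo = install_hook(ImageFilter, "blur", offloading_hook)
result = ImageFilter().blur(3)   # intercepted, then executed locally
undo()                           # the patch is reversible
```

Because the app class itself is never rewritten on disk, removing the hook restores the original behavior, which is the property that lets this style of instrumentation avoid repackaging.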

There are a few dynamic instrumentation frameworks available for Android that follow similar techniques, such as Xposed xpo, Cydia cyd and adbi adb. INFv is based on the first, as it is by far the most popular. As shown in Fig. 2, INFv extends it to enable app profiling (Resource Manager – RM), manipulate the app lifecycle (App/Thread Manager), perform remote method invocation (Hook Manager and the generic offloading hook – H) and instantiate a custom classloader (DexClassLoader) to dynamically load only its signed application strategies and functionality in the backend (Partition Manager). These are detailed in the next sections, and their limitations and security concerns are discussed in Section 6.

3.2 Profiling and Partitions

In MCO, apps are typically partitioned into a set of functionality to execute locally and another to execute remotely. Such systems need to detect good candidates to offload, e.g., based on energy consumption and execution time. Next, we describe how INFv partitions are designed and how the environment-dependent properties (e.g., network conditions) are considered for offloading decisions.

3.2.1 App Analysis & Profiling

Designing functionality partitions requires app knowledge. In INFv most of the app analysis is done offline (left-most subsystem in Fig. 1) and is composed of two steps. The first consists of performing static analysis of the app, i.e., the app analysis platform retrieves an app from the Google Play (GP) store, decompiles it (using apktool apk) and parses both the manifest and smali files (i.e., disassembled Dalvik bytecode). From the manifest we retrieve the app properties, such as package name, version and app starting points (e.g., activities, broadcast receivers, etc.). From the smali files we build a call graph where the vertices represent classes and the edges method calls. This call graph is used to detect accesses to methods/classes that should execute locally, such as access to local hardware (Android’s bluetooth, hardware, MTP, NFC and telephony packages) and UI (Android’s Activity, View, Widget, Gesture, Text, Transition, Animation and Graphics packages). Our static analysis tool, which we made open source bro [2016]; dro [2016], includes a DSL to tag packages, which we used to manually pre-tag most of the Android packages depending on their requirements.

The second step consists of attributing weights to the call graph edges based on their runtime invocation counts, as well as metadata regarding vertices’ energy consumption and execution times. Apps are executed in instrumented VMs and exercised using UI automation (reproducible pseudo-random input mon). These VMs use the aforementioned instrumentation mechanisms to load the call graph on app start and intercept method invocations. We had to perform some optimizations to reduce the tracing overhead, such as increasing the VM heap size and restricting the tracing to app methods. The averages of the intercepted methods’ parameters are recorded in the call graph metadata – state size, thread, execution time, invocation counts – along with the predicted energy consumption based on the PowerTutor model Zhang et al. [2010]. This model uses the device states (CPU, screen, GPS, connectivity) to predict energy consumption. It can derive a per-device power model (with a low error of up to 2.5%) by performing regression over the energy discharge patterns observed while looping through different device power states. The INFv dynamic analysis platform can accommodate physical devices to automate this process, or a one-time power discharge execution can be run on users’ devices to retrieve the specific device model. We report that the minimal information needed by our offloading mechanisms requires on average 3MB per app when uncompressed (based on the top 30 market apps), which is reasonable even considering Android’s memory constraints (i.e., small heap size – 48MB in Galaxy S2).

While INFv’s approach saves the profiling energy on mobile devices as well as the overhead of method tracing, we do acknowledge some limitations. Some of the power model metrics (e.g., screen, GPS) are not accurate or observable in a VM: the VM screen is always on and some networks/sensors are emulated. The emulated networks/sensors are unimportant, as functionality using them should execute locally anyway, but the always-on screen may lead to overestimating energy consumption, which can later prevent some functionality from being offloaded. We discuss some of these limitations in Section 6.

3.2.2 App Partitions

Partitions define which subsets of functionality should be offloaded. In MCO, partitions are generally static and their outcome is a rebuilt app (potentially two disjoint apps) with some functionality re-purposed for offloading. In INFv’s client, partitions are just information regarding the offloading subset, i.e., class names and execution properties (e.g., execution time, energy consumption, network conditions). These sets are pushed to the device, loaded on app launch and drive app’s execution.

Partitions can be devised at method or thread granularity. The latter incurs an extra cost of synchronization (e.g., thread state, virtual state, program counters, registers, stack), and is often restricted to method boundaries Chun et al. [2011]. In order to better integrate with the NFV abstraction, our solution is able to provide the granularity of method offloading. However, since most mobile apps are developed in class-based object-oriented languages, in practice we use a class offloading granularity, as methods often invoke methods of the same class and share class state (e.g., class fields). Furthermore, there is a large overlap of classes and packages across apps, which minimizes the impact of storing app functionality in the network (Section 5.1).

One of the main concerns in MCO is the balance between offloading computation and its communication overhead due to remote method invocations. To reduce this overhead, we use the app call graph (Section 3.2.1), i.e., $G=(V,E)$ where $V$ is the set of app classes and $E$ the invocations between classes, to detect communities/clusters (sub-graphs of $G$) of $V$ based on their invocation counts. INFv uses a community detection algorithm, Girvan and Newman (GN) Girvan and Newman [2002], to detect edges that are likely to lie between communities. To do so, it repeatedly finds the edge with the highest number of shortest paths between pairs of nodes passing through it (i.e., the edge with the highest betweenness) and removes it. By removing edges with high betweenness, the communities get separated and we end up with distinct functionality clusters. Community detection was first proposed in this context by Zhang et al. [2012], who used static analysis invocations as edge weights. The problem is that these do not reflect the actual runtime invocation counts between classes, and so that study relies on a weight heuristic based on class semantic similarity, i.e., class names and their textual contents. Unfortunately, our app study in Section 5.1 indicates that 82% of the apps perform class name obfuscation, and in some cases even the strings within classes can be obfuscated dex. As INFv executes apps, it records the methods’ runtime invocation counts, bypassing such limitations.
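The edge-removal procedure can be sketched in a few lines of plain Python. This is an illustrative sketch rather than INFv's implementation: the graph is treated as undirected, and dividing betweenness by the invocation count is one known way to bring weights into GN, assumed here so that heavily used edges survive longer. The class names and counts in `call_graph` are hypothetical.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an undirected, unweighted graph."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        dist, sigma = {s: 0}, {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds, order, queue = {v: [] for v in adj}, [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    return bet  # each path is counted from both endpoints; ranking is unaffected


def components(adj):
    """Connected components of the graph, as a list of sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman(weights, k):
    """Remove high-betweenness edges until the graph splits into k clusters.

    weights maps frozenset({class_a, class_b}) -> runtime invocation count.
    """
    adj = {}
    for edge in weights:
        a, b = tuple(edge)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    while len(components(adj)) < k and any(adj.values()):
        bet = edge_betweenness(adj)
        # Assumption: scale betweenness by 1/weight to respect invocation counts.
        a, b = tuple(max(bet, key=lambda e: bet[e] / weights[e]))
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Hypothetical call graph: two tightly coupled groups joined by one rare call.
call_graph = {frozenset(e): n for e, n in [
    (("A", "B"), 10), (("B", "C"), 10), (("A", "C"), 10),
    (("D", "E"), 10), (("E", "F"), 10), (("D", "F"), 10),
    (("C", "D"), 1),
]}
clusters = girvan_newman(call_graph, 2)
```

In this toy graph the rarely invoked C–D edge carries every shortest path between the two groups, so it is removed first and the two triangles become separate clusters.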

The GN algorithm receives the number of clusters as a parameter, which we iterate from two to the optimal number of clusters as calculated via Louvain’s Blondel et al. [2008] modularity (density of links within clusters) optimization. Any partition with classes tagged during static analysis as non-offloadable is discarded. In Section 3.3.1 we discuss how these pre-calculated partition sets are picked for offloading at runtime.

3.3 Offloading Functionality

Once INFv fetches the partition metadata, it needs to provide offloading capabilities and a decision model to ensure that it executes faster and consumes less energy.

The INFv end device system (Fig. 2) is composed of one or more monitors and one standalone process that provides a network stub and a resource monitoring (RM) system. Each app process transparently loads and executes an INFv monitor on start. The App Manager intercepts the app entry points declared in the manifest and binds to the per-device Network Stub. The Partition Manager (PM) loads the partitions and instructs the Hook Manager to hook their entry and exit points, enabling the interception of their respective members (methods & constructors). Once a target invocation is intercepted, the PM retrieves the current environment-dependent metrics from the RM and decides whether to offload it or not. If so, a message is created and sent by the AppConnector to the Network Stub service (via IPC), which transparently interacts with the closest network node to execute it. Network nodes abstract the network topology by providing a message queue (MQ) between the stub and the execution backend. Within the MQ abstraction, routing is done using the user, device, app and version IDs, along with the fully qualified member name and its arguments. The Thread Manager keeps a per-thread queue of the offloaded requests and suspends the threads until the invocation result is received or there is an intermediate invocation of a member in the same thread.

The INFv backend subsystem runs the same monitor functionality over a light-weight app process. It loads only the required functionality and listens for incoming requests (Request Manager) from the mobile device. On request it creates class instances and executes their respective methods while managing their state. Although mobile apps are expected to be short-lived (imposed by screen off events and CPU sleep mode), state has to be kept while there are references to it. In INFv, remote class instances are replaced locally with “light-weight” instances of the same class (bypassing their constructors obj [2016]), which, on interception, work as proxy objects. These objects are tracked using a custom weak identity hash map that allows the detection of de-referenced or garbage collected objects, which results in a state invalidation message being sent through the INFv network stub. App crashes or force quits also trigger state invalidation. If a crash is specific to INFv, the instrumentation can be disabled for that app alone, as apps are isolated. Finally, when there are no requests or remaining references, the VM can be paused after a certain period. More advanced distributed garbage collection mechanisms can be implemented that address some of the aforementioned challenges Abdullahi and Ringwood [1998].
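The proxy-tracking idea can be sketched with Python's `weakref` module as a stand-in for the Java weak identity hash map. `ProxyTracker`, `Proxy` and the `"obj-1"` identifier are hypothetical, and the callback merely records the invalidation that a real system would send through the network stub.

```python
import weakref


class ProxyTracker:
    """Tracks local proxy objects by identity and reports when they are collected."""

    def __init__(self, send_invalidation):
        self._send = send_invalidation   # callback standing in for the network stub
        self._live = {}                  # id(proxy) -> remote object id

    def track(self, proxy, remote_id):
        key = id(proxy)
        self._live[key] = remote_id
        # finalize keeps only a weak reference to proxy, so tracking an object
        # never prevents it from being garbage collected.
        weakref.finalize(proxy, self._on_collected, key)

    def _on_collected(self, key):
        remote_id = self._live.pop(key, None)
        if remote_id is not None:
            self._send(remote_id)        # tell the backend to drop the remote state


class Proxy:  # hypothetical light-weight local stand-in for a remote instance
    pass


invalidated = []
tracker = ProxyTracker(invalidated.append)
p = Proxy()
tracker.track(p, "obj-1")
before = list(invalidated)   # nothing has been invalidated while p is referenced
del p                        # drop the last reference; CPython collects it at once
```

Keying on object identity rather than equality matters here: two proxies that compare equal may still refer to different remote instances.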

While INFv’s offloading model resembles Java RMI, Android’s Java does not support RMI by default, and RMI generally requires a remote interface (e.g., extending java.rmi.Remote). Refactoring classes at runtime is possible using bytecode manipulation, but it is expensive if done at launch time, and it increases the app size by up to 40% Zhang et al. [2012] if the refactored classes are stored.

3.3.1 Runtime Decisions

At runtime the PM decides which of the (offline) pre-calculated partitions should be offloaded. This decision is based on the partition’s (offline) estimated execution parameters (Section 3.2) and the periodic device measurements and state monitored by the RM – network type, bandwidth and latency. In previous class offloading research Zhang et al. [2012], a partition would only be valid for offloading if, for each of its classes, all methods perform faster when offloaded. Unfortunately, in our experience this rarely holds, even for CPU-intensive classes. For example, simple methods such as an overridden toString() or equals() will in most situations perform worse when offloaded than when run locally, due to the RTT. In INFv, instead of considering methods individually, for each class we aggregate its methods’ execution time and energy consumption based on their observed invocation frequency ( $f_{m}$ ). Therefore, methods that perform better when offloaded can compensate for the worst performing methods if these do not occur too often. A class $c$ is valid if: $\sum_{m=1}^{M}f_{m}*t_{m\_local}>\sum_{m=1}^{M}f_{m}*(RTT+(i_{m}+o_{m})/r+t_{m\_offload})$ , where $M$ represents the number of class methods and $f_{m}$ the method’s invocation count normalized by the total number of invocations observed for all the class methods. The sizes of the method’s input and output parameters are represented respectively as $i_{m}$ and $o_{m}$ . These are divided by the transmission rate ( $r$ ) and, along with the RTT, are only counted for methods interfacing between the local and remote partitions (partition boundary). Finally, the local ( $t_{m\_local}$ ) and remote ( $t_{m\_offload}$ ) execution times are a function of the CPU frequency difference between the mobile device and the node.
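The validity condition translates directly into code. The sketch below assumes times in seconds, sizes in bytes and the rate in bytes per second; `MethodProfile`, `class_is_valid` and the sample numbers are hypothetical and chosen only to show how one heavy method can compensate for a frequently called cheap one.

```python
from dataclasses import dataclass


@dataclass
class MethodProfile:
    freq: float             # f_m: invocation count normalized over the class's methods
    t_local: float          # t_m_local, in seconds
    t_offload: float        # t_m_offload, in seconds
    in_bytes: int = 0       # i_m
    out_bytes: int = 0      # o_m
    boundary: bool = False  # True if the method crosses the partition boundary


def class_is_valid(methods, rtt, rate):
    """True if the frequency-weighted local time exceeds the offloaded time."""
    local = sum(m.freq * m.t_local for m in methods)
    remote = 0.0
    for m in methods:
        # RTT and transfer time only apply to partition-boundary methods.
        transfer = rtt + (m.in_bytes + m.out_bytes) / rate if m.boundary else 0.0
        remote += m.freq * (transfer + m.t_offload)
    return local > remote


# Hypothetical class: one heavy method plus a cheap, frequently called toString().
profile = [
    MethodProfile(freq=0.2, t_local=0.5, t_offload=0.1,
                  in_bytes=10_000, out_bytes=1_000, boundary=True),
    MethodProfile(freq=0.8, t_local=0.001, t_offload=0.001, boundary=True),
]
fast_network = class_is_valid(profile, rtt=0.05, rate=1_000_000)
slow_network = class_is_valid(profile, rtt=0.2, rate=1_000_000)
```

With a 50 ms RTT the weighted offloaded time is about 0.073 s against 0.101 s locally, so the class is valid; at 200 ms the offloaded time rises to about 0.223 s and the class stays on the device.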

Classes are similarly validated based on the energy consumption of their edges, except that a method’s energy consumption is only considered for edges that bridge partitions. For these methods (technically the edges between such classes), only the state transmission costs and the normalized invocation frequency are used when calculating the class’s energy consumption.

INFv validates the pre-calculated partitions by increasing the number of partitions ( $N$ ) until it finds a valid partition. In the worst case scenario $N$ is equal to the number of classes and valid classes are offloaded individually. [Footnote 6: In practice, a threshold is selected via modularity optimization Blondel et al. [2008] to avoid executing classes with high link density separately.]
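The search over increasing $N$ can be sketched as follows. This is an illustrative sketch only: the function name, the `partition_sets` layout, and the rule that a partition is valid when all of its classes are valid are assumptions, not details confirmed by the text.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


def first_valid_partitioning(
    partition_sets: Dict[int, List[Set[str]]],
    is_class_valid: Callable[[str], bool],
) -> Tuple[Optional[int], List[Set[str]]]:
    """Try pre-calculated partitionings in order of increasing N.

    partition_sets maps N to the list of partitions (sets of class names)
    computed offline for that N. A partition is treated as valid only if
    every class in it passes is_class_valid (an assumption of this sketch).
    """
    for n in sorted(partition_sets):
        valid = [p for p in partition_sets[n] if all(is_class_valid(c) for c in p)]
        if valid:
            return n, valid
    return None, []


# Hypothetical app with four classes; only "Math" is worth offloading.
sets = {
    2: [{"Ui", "Net"}, {"Math", "Cache"}],
    4: [{"Ui"}, {"Net"}, {"Math"}, {"Cache"}],
}
chosen_n, chosen = first_valid_partitioning(sets, lambda c: c == "Math")
```

Here no partition is valid at $N=2$, so the search falls through to the worst case where classes are considered individually.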

Finally, the existing Android diagnosis resources (dumpsys) are used to report potential anomalies (e.g., high energy consumption) and processed (offline) by a negative feedback loop to reduce the deviations between estimated and observed values. An advantage of INFv over app repackaging systems is that it is able to dynamically load new partitions and metadata to handle such cases.

3.4 Network Subsystem

The network subsystem abstracts the communication between mobile devices and the executing backend. It can be deployed on ISP’s RANs where the latency to the User Equipment (UE) is minimal (15-45ms Laner et al. [2012]). Offloading requests share a common end-point IP, which can be intercepted directly at the base station Computing [2015]; via Home Node B [2011], where INFv terminates the traffic and performs further routing. INFv uses a pub-sub MQ system (as proposed by MEC) to store the processed requests while the routing decisions take place and to perform in-network communication.

Because nodes cache functions, there will be multiple copies in the network. The first job of the network subsystem is to route user requests to the closest copy. Functionality execution consumes both CPU and memory as well as other resources (e.g., bandwidth). INFv focuses on the first two since they are usually the most dominant resources. The second job of the network subsystem is to balance the load of executing functions. The goal of load balancing is achieved by strategically dropping or forwarding computation tasks to other nodes to avoid being overloaded. However, instead of distributing load uniformly over all available nodes, a service is better executed as close to a client as possible to minimize latency.

Centralized coordination is not ideal for practical deployment for three reasons: 1) a central solver needs global knowledge of the network, and keeping such knowledge up to date is costly; 2) the optimal strategy needs to be recalculated periodically given the dynamic nature of the network and traffic; and 3) it introduces a single point of failure. Therefore, we study and implement two basic heuristic strategies in INFv: passive and proactive. Both strategies have rather straightforward implementations and try to minimize latency. For the proactive one, we apply a simple $M/M/1$ -Processor Sharing (i.e., $M/M/1-PS$ ) queuing model to estimate the future queue length. The workload on each node can then be estimated based on the predicted queue length and the periodically measured CPU and memory consumption of each function. Next we sketch the core idea behind each heuristic; please refer to Wang et al. [2016] for further algorithmic details.

Passive Control: Nodes execute as many requests as possible before being overloaded. If the node is overloaded, the requests are passed to the next hop along the path to a server, or dropped if the current node is already the last hop in the ISP network.

Proactive Control: Nodes execute requests conservatively to avoid being overloaded. To do so, a node estimates the request arrival rate to further estimate the potential consumption. If the estimate shows that the node may be overloaded, it executes some and forwards the rest to the next hop neighbor with the lightest load. NB: This strategy requires exchanging state information within a node’s one-hop neighborhood.
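The two heuristics can be illustrated with a short Python sketch. It is not the algorithm from Wang et al. [2016]: the function names, the per-request cost parameter, and the proportional split are assumptions made for illustration. The only queuing fact used is that the mean number of jobs in a stable $M/M/1-PS$ queue is $\rho/(1-\rho)$ with $\rho=\lambda/\mu$.

```python
from typing import Dict, Optional, Tuple


def expected_queue_length(arrival_rate: float, service_rate: float) -> float:
    """Mean number of jobs in an M/M/1-PS queue: rho / (1 - rho)."""
    rho = arrival_rate / service_rate
    if rho >= 1.0:
        return float("inf")  # unstable queue: the backlog grows without bound
    return rho / (1.0 - rho)


def passive_decide(current_load: float, capacity: float, is_last_hop: bool) -> str:
    """Passive control: execute until overloaded, then forward (or drop at the last hop)."""
    if current_load < capacity:
        return "execute"
    return "drop" if is_last_hop else "forward_next_hop"


def proactive_split(
    arrival_rate: float,
    service_rate: float,
    cost_per_request: float,
    capacity: float,
    neighbor_loads: Dict[str, float],
) -> Tuple[float, Optional[str]]:
    """Proactive control: return (fraction to execute locally, neighbor for the rest).

    The predicted load is the estimated queue length times a measured
    per-request resource cost (hypothetical units, normalized to capacity).
    """
    predicted = expected_queue_length(arrival_rate, service_rate) * cost_per_request
    if predicted <= capacity:
        return 1.0, None
    keep = capacity / predicted  # execute only what fits, forward the remainder
    lightest = min(neighbor_loads, key=neighbor_loads.get)
    return keep, lightest


keep, target = proactive_split(900, 1000, 0.1, 0.5, {"n2": 0.3, "n3": 0.1})
```

In this example the node predicts a load of about 0.9 against a capacity of 0.5, so it keeps roughly 56% of the requests and forwards the rest to the least loaded one-hop neighbor.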

4 Architecture Evaluation

We first describe a typical deployment of INFv and then present thorough evaluations demonstrating that INFv delivers on its promise of energy savings and faster app execution.

INFv’s use case: 1) the profiler analyzes apps and computes partition sets; 2) the user equipment (UE) installs apps and the local INFv installation downloads their partition metadata; 3) the UE launches apps, and if the network conditions (bandwidth & latency) are favorable, the execution is offloaded; 4) the INFv network subsystem forwards the offloading request to an available backend; and finally, 5) the backend executes the request.

Our experimental setup is shown in Fig. 3. Both INFv’s network subsystem and the MQ system run within Docker containers. The selected MQ protocol was MQTT (optimized for mobile devices) and our messages are encoded with protocol buffers buffers [2016]. The app profiler utilizes hardware-level virtualization (Android x86) to profile the user apps. Power measurements are taken with a Monsoon Power Monitor and latency is emulated using TC NetEm (Traffic Control Network Emulation). In all experiments, phones are factory reset with Android 4.4, no Google account, and INFv pre-installed. In Sections 4.1 and 4.2 we used a Galaxy S2 (i9100) and offloaded to an Intel Q6600 (4GB of RAM and a 100 Mbit fiber connection). Section 4.4 uses a more up-to-date setup: a Galaxy S5 and an Intel i7 4790K (16GB RAM, 300 Mbit fiber). Additionally, while the 3G/4G setups use real mobile networks, in the WiFi setup the UE and backend share a common WiFi access point.

The offloaded apps were Linpack lp [2016], FaceDetect, QuickEditor qui [a], and QuickPhoto qui [b]. Linpack is computationally intensive and FaceDetect has high state transmission costs; both have been widely used to benchmark MCO performance Kosta et al. [2012]; Zhang et al. [2012]; Shi et al. [2014]. QuickEditor and QuickPhoto both use Google Drive, and although not computationally intensive, they exemplify 1) how INFv can target common functionality across apps and 2) how INFv can provide functionality otherwise absent from a device. Each app is standalone, i.e., has no client/server counterpart, and has no special design decisions or implementation to facilitate code offloading.

4.1 Impact of Code Partitions

Linpack measures a system’s floating point computing power by randomly generating a dense matrix of floating point numbers on $[-1,1]$ and performing LU decomposition for $M$ cycles (iterations). We added support for multi-threading, a feature many previous automated offloading architectures do not or only partially support Cuervo et al. [2010]; Chun et al. [2011]; Kemp et al. [2010]. We experimented using two different offloading partition sets. For the first (All), GN provides two partitions that minimize network communication: 1) a partition that interacts with the UI, therefore invalid for offloading, and 2) a valid partition that performs computation, i.e., all computation is performed remotely and there is almost no communication costs as threading occurs on the backend. The second (Lin) represents the worst case scenario by offloading individual classes (i.e. $N$ partitions where $N$ is the number of classes in the app). For Lin, only the Linpack class and calculations are offloaded, and thus, threading, as well as pre- and post-cycle processing, occurs on the client with a higher communication penalty. For this experiment (and the next) we perform unconditional offloading (i.e. no runtime decisions) to depict the partition trade-offs.

Fig. 4A plots the speed up for the Linpack benchmark running 4 threads, showing that INFv provides the expected computational performance benefits of code offloading. For the All partition, INFv achieves a speed up over 4.0x on both WiFi and 3G since there is almost no communication. When the more restrictive partitioning (Lin) is used, the WiFi experiment achieves a 1.57x speed up. However, the 3G experiment shows reduced performance (0.73x speed up), due to the 3G latency and the high frequency of communication (up to 40 messages, 10 per thread, for each cycle).

Next, Fig. 4B plots the distribution of power consumption for both partition sets. The local execution with 2 and 4 threads had a median power consumption of 3 and 6W, respectively; for offloaded executions it was below 2W. For the WiFi All partition, the quartiles show a tight distribution of power consumption and, since the UE was mostly idle with a mean energy consumption close to the observed Android background activity (dashed line), the overall energy consumption was reduced by up to 4 times.

Offloading without modularity optimization can result in reduced performance in high latency networks despite the reduction in power consumption. Taken together, Fig. 4A and 4B show that INFv’s MCO provides real benefits, but they also demonstrate the trade-offs of different partitionings.

4.2 Cost of State Synchronization

The FaceDetect (FD) app, which finds the coordinates of faces in images, is useful for measuring the trade-offs between data transfer and computational speed up. The offloaded partition contains the classes interacting with the face detection APIs and the client device just sends the underlying Android Bitmap object and receives an array of coordinates. The execution time and power were measured from when the app starts up to when the results are drawn on the screen. To test the impact of data transfer, we use multiple images from the AT&T face database fac [2016] ranging in size from 0.02MB to 1.2MB.

Fig. 4D plots the execution time as a function of image size and we can see that local execution is faster for images $\leq$ 0.07MB. This is because detecting faces in small images does not have enough computational cost to outweigh the communication costs of offloading. For larger images, the WiFi communication costs are compensated by the VM processing speed. For example, with the 1.2MB image, offloading over WiFi executes 1.45x faster, but for a 0.2MB image offloading was 3.86x slower. Unfortunately, offloading was never justified (in terms of execution time) over 3G due to its high latency.

Fig. 4C plots the total Joules FaceDetect consumed, not counting baseline OS consumption. We note that for small images ( $\leq$ 0.25MB), local execution results in less power consumption than 3G, although after this point, offloading over 3G saves energy. Offloading over WiFi results in lower power consumption than local execution for all but the smallest image in the dataset. For example, a 1.2MB image consumes 1.9x less battery when offloaded via 3G connectivity and 6.9x less battery if offloaded via WiFi. In the case of WiFi, the decreased execution time due to the VM processing power is the significant factor in energy savings. The 3G energy consumption is higher than WiFi for two reasons: 1) there is additional radio overhead for 3G and 2) the total execution time is larger due to the higher RTT.

The main take away from these experiments is that, as in the Linpack experiments, INFv’s offloading engine provides both computational and energy benefits. However, if there is substantial interaction between local and offloaded objects that involves passing a lot of data, it can result in a net loss of performance. In Section 4.4 we show how profiling metadata can be used to prevent such scenarios.

4.3 Responsiveness to Jitter

Next, we study how INFv’s control strategies respond to sudden increases in workload (i.e., jitter). We set up a toy network topology composed of a client, two routers ( $n_{1}$ and $n_{2}$ ), and a server (acting as a catch-all for requests not handled by $n_{1}$ or $n_{2}$ ); i.e., client $\rightarrow$ router $n_{1}$ $\rightarrow$ router $n_{2}$ $\rightarrow$ server. We simulate the client’s request flow at a stable rate of $\lambda=1000/s$ but inject two instances of jitter at $6\lambda$ for 10ms at time 40ms ( $j_{1}$ ) and 70ms ( $j_{2}$ ). Fig. 5 plots the workload over time when the routers use a passive strategy (PAS $n_{1}$ and PAS $n_{2}$ in the first two rows) vs. a proactive strategy (PRO $n_{1}$ and PRO $n_{2}$ in the last two rows). The two rightmost columns zoom in on the period when $j_{1}$ and $j_{2}$ have just occurred.

For passive control, PAS $n_{1}$ takes most of the load (88%), exhibiting consistent behavior for both $j_{1}$ and $j_{2}$ . However, the proactive routers show an interesting variation. For $j_{1}$ , although PRO $n_{1}$ successfully offloads 31.8% of load to PRO $n_{2}$ , it also experiences high load for a period of 2ms (row 3, column 2). After $j_{1}$ , however, PRO $n_{1}$ enters a conservative mode. Thus, when $j_{2}$ arrives, the load curve on PRO $n_{1}$ is much flatter with no clear peak appearing at all. Instead, it proactively offloads more tasks to PRO $n_{2}$ , resulting in PRO $n_{2}$ absorbing about 36.7% of the load from $j_{2}$ . Between 80 and 130ms we see some load still transferred from PRO $n_{1}$ to PRO $n_{2}$ because PRO $n_{1}$ remains in conservative mode. After 130ms, PRO $n_{1}$ returns to normal mode and the load on PRO $n_{2}$ goes to 0.

By checking the second and third columns, we gain an even better understanding of what actually happens when jitter arrives. For both $j_{1}$ and $j_{2}$ , the proactive strategy responds faster; i.e., $n_{2}$ ’s load curve rises earlier and faster. For $j_{2}$ , the proactive strategy responds even faster since PRO $n_{2}$ is already in conservative mode: PAS $n_{2}$ only starts taking load at 74 ms, 4 ms after $j_{2}$ arrives at PAS $n_{1}$ . The major takeaway here is that INFv is highly responsive to workload jitter due to its network subsystem.

4.4 Runtime decisions

Finally we evaluate how INFv behaves in 4G, with and without additional induced latency, to stress test INFv’s runtime decisions. Fig. 6 A) and B) show the observed energy consumption distribution with and without INFv, for 20 experiments using a 1.2MB image fac . In C) we depict the power over time for three of these experiments. While the execution without INFv (continuous line) consumed over 2 W for most of the experiment time ( $\mu\approx 11.57s$ ), the execution with INFv (dashed line) mostly comprises two spikes in energy consumption. These two spikes represent two distinct phases: 1) image transmission and 2) retrieving and displaying the results; they depend on the current connectivity state, e.g., the transition to a 4G connected state. In A) and B) we show the power distribution for the 20 experiments, with and without INFv, respectively. The power distribution in A) has a higher variance but the majority of the observations are lower than 2W ( $\mu\approx 1.35W$ and within a 95% confidence interval of 0.139W ) and its execution is up to 2.8 times faster ( $\mu\approx 2.3$ times) than the execution without INFv, which results in over 66% energy savings over the 20 executions (idle time excluded). While UEs are becoming more powerful, so are commodity processors and mobile networks, as demonstrated by the execution speed improvements in this experiment compared to the 3G experiments (Section 4.2).

Fig. 7 shows the impact of latency on execution time and energy consumption of FD over 120 executions. The observed latency consists of the latency induced on the backend network interface (TC in Fig. 3) and the real 4G latency. [Footnote 7: Note that while 300ms might be unusual in 4G, it is quite common in 3G.] Although the energy consumption is always lower when offloading, the savings shrink from close to 70% less energy (no induced latency) to 40% due to higher latency. A major takeaway here has to do with in-network vs. cloud deployment: the overall LTE round-trip time (RTT) for offloading to a cloud instance is often over 100ms [Footnote 8: We measured a mean latency of 109.4 and 112.2ms from a mobile device with LTE in Barcelona, Spain to the Amazon EC2 regions with the lowest latencies (Frankfurt, eu-central-1 and Ireland, eu-west-1).], which is quite high compared to hosting the functionality at the ISP’s Radio Access Network (RAN), where devices see only 15-45ms RTT Laner et al. [2012]. [Footnote 9: We confirmed the lower bound LTE latency values using a USRP transceiver B [21] and a conservative software-based LTE protocol stack ope .] I.e., deploying to the RAN can reduce latency by 58.9% to 86.7% (over 90ms difference). Since our results indicate that a 70ms variation in latency can incur up to a 24.5% and 21.6% increase in the average execution time and energy consumption, respectively, hosting functionality in-network brings clear benefits. Further, reducing latency via in-network deployment also increases the set of viable candidate apps for offloading to include those that are particularly latency sensitive (e.g., games).

Finally in Fig. 8 we plot the execution time versus consumed energy for the FD app with the INFv offloading decision model. The RM keeps a rolling window of observed latency (last 3) to the server, which is updated whenever there is no offloading communication (i.e., no threads paused or pending messages) for $>30s$ and the screen is on. There were 260 experiments over a $>7$ hour period (100ms periods). We vary the latency (from 0–600ms, with 50ms steps) 30s after each experiment starts. INFv decides whether to offload the face detection partition based on the connection properties at runtime (latency and bandwidth vs. transmitted state) and the profiling estimates (i.e., energy and execution time). Therefore, when executed locally, its behavior should resemble the experiments in Fig. 6A.
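A rolling latency window feeding an offloading decision could look like the sketch below. The class and function names, the time-only cost model, and the sample values are assumptions for illustration; INFv's decision also uses energy estimates, which this sketch leaves out.

```python
from collections import deque
from typing import Optional


class LatencyWindow:
    """Keeps the last `size` latency probes (3 in the experiment described above)."""

    def __init__(self, size: int = 3) -> None:
        self.samples = deque(maxlen=size)  # oldest probe is evicted automatically

    def add(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def mean(self) -> Optional[float]:
        if not self.samples:
            return None
        return sum(self.samples) / len(self.samples)


def should_offload(window: LatencyWindow, state_bytes: int, bandwidth_bps: float,
                   t_local_s: float, t_remote_s: float) -> bool:
    """Offload only if latency + state transfer + remote execution beats local execution."""
    rtt_ms = window.mean()
    if rtt_ms is None:
        return False  # no measurements yet: stay local
    transfer_s = state_bytes * 8 / bandwidth_bps
    return rtt_ms / 1000.0 + transfer_s + t_remote_s < t_local_s


window = LatencyWindow(size=3)
for probe_ms in (50, 60, 70, 600):  # the 50ms probe falls out of the window
    window.add(probe_ms)

# Hypothetical estimates: 1.2MB of state over a 20 Mbit/s link.
offload_heavy = should_offload(window, 1_200_000, 20e6, t_local_s=11.5, t_remote_s=2.0)
offload_light = should_offload(window, 1_200_000, 20e6, t_local_s=2.5, t_remote_s=2.0)
```

Because the window is only refreshed periodically, a latency spike that occurs between probes is invisible to the decision, which is one way stale measurements can lead to a poor offloading choice.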

Note that the majority (86.7%) of local execution times fell within one $\sigma$ (the dark gray area in the graph) from those of the experiments in Fig. 6A, and over 96% of the observed values within two $\sigma$ (light gray). Moreover, 99% of local executions’ energy consumptions were within one $\sigma$ .

There were 4 executions that took longer than the local experiments, however, they are an artifact of the periodicity of the latency measurements: INFv was not able to detect the increase in latency prior to making an offloading decision. This is important because such cases can occur due to changes in connectivity (e.g., 4G to 3G) or ISP service degradation. While such impact can be reduced by increasing measurement periodicity, the first is already detected by monitoring changes in the default network interface.

Ultimately, it is clear that, even in the presence of high latency variance, INFv detected when computation should not be offloaded and energy consumption was greatly reduced for all offloaded experiments, performing faster than the worst local execution 98.5% of the time and faster than all local executions 93.2% of the time.

4.5 Offloading Popular Libraries and Apps

To evaluate INFv’s support for the most popular libraries and apps, we offloaded a ubiquitous library and performed a partitioning analysis of the top free apps in the GP market.

First, we studied the top 2.5K apps and found that 76.4% of apps use Google Mobile Services (GMS). We chose two applications that use a common GMS service – Google Drive (GD). The first, QuickEditor qui [a], is a text editor that allows users to create, open, and edit text files stored in their GD. The second, QuickPhoto qui [b], uses the device’s camera to take pictures and upload them to GD. To support both apps on a device without GMS installed, 26 GMS classes were offloaded, none of them app specific (code available at gms ). With our in-network solution the number of extra network hops needed to provide GMS functionality is minimal, since GD calls already trigger communication that is forwarded through the RAN. The network communication overhead is also quite small: around 46, 50, and 10B, respectively, to create a class, invoke a method, and receive a response.

Second, to address the concern of how many apps are actually offloadable, we inspected the top 24 apps regarding INFv’s partitioning and validation mechanisms (Section 3.2.2). Our first finding was that only 6% of app classes extend UI or hardware related classes. While all other classes could potentially be offloaded, we want to minimize the communication between local and offloaded code. To this end, we used our dynamic analysis platform to execute the apps for 5 minutes each and extract their runtime call graphs to discover valid offloading partitions (i.e., GN communities). Previous work Choudhary et al. [2015] has shown this to be a reasonable interval to achieve high coverage. Classes that communicate often are likely to share a purpose (e.g., handle UI interaction) and so, building communities based on their communication should separate distinct functionality. In fact, Figure 9 shows how increasing the number of partitions increases the amount of offloadable code due to this separation of purposes. At 30 GN communities, all but a single app have between 24%-68% of their code suitable for offloading (hundreds to thousands of classes). If we use the Louvain Blondel et al. [2008] algorithm to pick the optimal partitioning (based on modularity), we find that the median number of partitions per app is 17 and that their offloadable code ranges from 7 to 57% ( $\tilde{x}\approx 24\%$ ) with only two apps below 10%. While the benefits of offloading such partitions are dependent on the runtime environment (e.g., state, network connectivity, etc.), these results indicate that our offloading strategy can be applied to more popular and complex apps with huge real-world user bases.

5 Deployment considerations

To achieve low latency when serving an offloading request, functionality should optimally be already present in nodes and it should perform well under varying load. This raises two questions: what is the storage cost of hosting the most popular apps? How well does the network subsystem perform under high-load while reducing the offloading latency?

5.1 Storage requirements

In this section we give some intuition behind the idea of network functionality caching and empirically show its feasibility via a study of over 20K of the most popular apps on the Google Play Store in February 2016. As Fig. 10a shows, the market app packages (apk) are quite small ( $\mu\approx 15.3MB$ ) even considering the actual install size ( $\mu\approx 23.9MB$ ). Only a fraction of the apk is actually app functionality – dalvik executable (dex) –, which contains the app classes. For medium/big sized apps (78% of the apps), when extracted from the apk, the dex size is on average 34.1% of the apk size ( $\tilde{x}\approx 25.3$ %). Our dataset accounts for over 81% ( $\approx 15,000$ apps Viennot et al. [2014]) of all Google Play downloads and in total these apps require an aggregated storage of 307 GB (apk size), which is a manageable size.

The large overlap in app functionality can be exploited to intelligently cache functionality in the network. To understand why and how this is possible, a brief overview of Android app organization and packaging is necessary. Android apps organize functions into packages which are further identified with a hierarchical naming scheme similar to domain names. Intuitively, the hierarchical naming could help us identify shared functionality across apps. Unfortunately, naming in Android can be affected by obfuscation, a security mechanism that remaps functionality and package names. It makes it hard, if not impossible, to detect similarities based on simple name comparisons. E.g., a package “com.apple” from app A and a package “com.google” from app B can both be renamed to “a.a”. But, since in Android obfuscation (i.e., Proguard) names are attributed alphabetically, we found that it is possible to detect whether a package name is obfuscated based simply on name length. We found that 37.5% of apps have potentially obfuscated packages with name “a” (at any given depth), while 82% of the apps have at least one class named “a”. By studying the distribution of single-character names at any given depth, we found that the majority of obfuscated package names are between “a” and “p” (8% of all packages). Thus, we filtered out all package names that include a name with just one character, at any given level. Unfortunately, obfuscated class names do not follow a similar distribution and such a filter would exclude a high number of non-obfuscated classes. The remaining package names were used to estimate functionality similarity. If the same package name exists in two different apps, we consider the functionality within these packages to be similar. 
Obviously, looking at low depth names ( $N<3$ ), such as the first depth packages (e.g., “a” in “a.b.c”) which contain all other packages and functionality, many apps are likely to share the same name and therefore, most of the functionality will be considered similar (false positives). Looking deeper into the package hierarchy, however, can greatly reduce the rate of false positives. Fig. 10b shows the percentage of unique classes per app based on a comparison of their first $N$ package names. The number of unique classes is calculated as the total number of classes in the app minus the number of classes belonging to non-obfuscated package names that also exist in at least one other app. Considering that the package name depth distribution has $\mu\approx 4.7$ , $\tilde{x}\approx 5.0$ and $\sigma\approx 1.6$ , and that most package names have a depth between 4 (Q1) and 6 (Q3), even if we do a conservative comparison of packages based on their first 5 name depths (50th percentile), we can see that only 47% (mean) of apps’ classes are unique. Note that while higher values ( $N\geq 8$ ) reduce false positives, they also increase the number of false negatives, as classes within packages with smaller depths are considered unique. For a depth of 4, which is likely to include the name of the app and respective developer (e.g., “com.facebook.katana.app”), 75% (median) of apps’ functionality is common with at least one other app. The analysis thus indicates that there is a substantial overlap in app functionality in the Android ecosystem.
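The obfuscation filter and the depth-limited package comparison described above can be sketched in Python. The function names, the input layout (package name to class count per app), and the two toy apps are hypothetical; the logic follows the textual definition of unique classes.

```python
from typing import Dict


def is_obfuscated(package: str) -> bool:
    """Heuristic from the text: any single-character name component marks obfuscation."""
    return any(len(part) == 1 for part in package.split("."))


def prefix(package: str, depth: int) -> str:
    """First `depth` components of a dotted package name."""
    return ".".join(package.split(".")[:depth])


def unique_class_fraction(apps: Dict[str, Dict[str, int]], app: str, depth: int) -> float:
    """Fraction of an app's classes that are unique at a given comparison depth.

    apps maps app name -> {package name: number of classes}. Classes count as
    shared when their non-obfuscated package prefix also appears in another app.
    """
    others = set()
    for name, packages in apps.items():
        if name != app:
            others.update(prefix(p, depth) for p in packages if not is_obfuscated(p))
    total = sum(apps[app].values())
    shared = sum(
        count for package, count in apps[app].items()
        if not is_obfuscated(package) and prefix(package, depth) in others
    )
    return (total - shared) / total


toy_apps = {
    "A": {"com.google.gson.stream": 10, "com.example.alpha.ui": 30, "a.b": 10},
    "B": {"com.google.gson.internal": 5, "com.example.beta.ui": 20},
}
unique_by_depth = {d: unique_class_fraction(toy_apps, "A", d) for d in (2, 3, 4)}
```

On this toy input, depth 2 marks 80% of app A as shared purely because both apps start with “com.example” (a false positive), depth 3 isolates the genuinely shared library, and depth 4 misses it (a false negative).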

By extracting only the app classes (classes.dex), the storage requirements to host over 81% of the most downloaded apps are already reduced by 74% ( $\approx 80$ GB). If common functionality is co-located, the total reduction can be up to 93.5% ( $<20$ GB, based on the 4th depth median overlap). While our app analysis platform requires at least the full apk files for analysis, the class overlap provides a unique opportunity for deployment in a modern ISP network, allowing INFv to exploit the network topology and ensure that functionality is available as close to users as possible.
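The storage figures can be checked with a few lines of arithmetic. The variable names are illustrative, and applying the per-app median overlap (75% at depth 4) to the aggregate dex size is a simplifying assumption that matches how the estimate is described.

```python
apk_total_gb = 307.0    # aggregated apk size reported in Section 5.1
dex_reduction = 0.74    # reduction from keeping only classes.dex
median_overlap = 0.75   # depth-4 median share of functionality common with other apps

dex_gb = apk_total_gb * (1 - dex_reduction)         # about 79.8 GB
dedup_gb = dex_gb * (1 - median_overlap)            # about 19.96 GB
total_reduction = 1 - dedup_gb / apk_total_gb       # about 0.935

print(f"dex only: {dex_gb:.1f} GB, co-located: {dedup_gb:.2f} GB, "
      f"total reduction: {total_reduction:.1%}")
```

The two reductions compose multiplicatively: keeping 26% of the apk size and then 25% of that leaves 6.5% of the original 307 GB.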

5.2 Scalability to Workload

In this section we perform a large network simulation using a realistic ISP topology (Exodus Spring et al. [2002]) with Icarus Saino et al. [2014]. We use a Poisson request stream with $\lambda=1000/s$ for the arrival rate; increasing the request rate introduces more load to the network. To simplify the presentation, we assume CPU is the first bottleneck in the system for computationally intensive apps and all experiments are performed $>$ 50 times to ensure the reported results are representative.
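A Poisson request stream of this kind can be generated by drawing exponentially distributed inter-arrival times. The sketch below uses only the Python standard library and is not Icarus's own workload generator; the function name and seed are assumptions.

```python
import random
from typing import List


def poisson_arrivals(rate_per_s: float, duration_s: float, seed: int = 0) -> List[float]:
    """Arrival timestamps of a Poisson process with the given rate over [0, duration_s)."""
    rng = random.Random(seed)  # seeded for reproducible experiments
    now, arrivals = 0.0, []
    while True:
        now += rng.expovariate(rate_per_s)  # exponential inter-arrival time, mean 1/rate
        if now >= duration_s:
            return arrivals
        arrivals.append(now)


base = poisson_arrivals(rate_per_s=1000, duration_s=1.0, seed=42)     # lambda
heavy = poisson_arrivals(rate_per_s=8000, duration_s=1.0, seed=42)    # 8 * lambda
```

Scaling `rate_per_s` is how a higher-load workload such as $8\lambda$ would be expressed in this sketch.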

Fig. 11 shows the results of using three strategies (one for each row) with three workloads (one for each column). There are 375 nodes in the network and we randomly select 100 nodes of degree one as access points to receive user requests. The average load of each node is normalized by its CPU capacity and only the 50 heaviest loads are presented, in decreasing order. By examining the first column, we can see that all three strategies behave identically when the network is underutilized with a workload of $\lambda$ . The most heavily loaded node uses only about 60% of its total capacity. However, as we increase the load to $4\lambda$ and $8\lambda$ , the three strategies exhibit quite different behavior. For the experiments without a control strategy (“none”, first row), the figures keep a similar shape. Since no load is distributed and a node simply drops all requests when overloaded, this leads to a drop rate of over 54% with a load of $8\lambda$ .

For passive control (second row), we can see that both the heads and tails are fatter than with “none” control, indicating that more load is absorbed by the network and distributed across different routers. This can also be verified by checking the average load in the figure: given a load of $8\lambda$ , passive control increases the average load of the network from $0.2305$ to $0.3202$ compared to using “none” control. However, over $36\%$ of requests are still dropped at the last hop router. This can be explained by the well-known small-world effect, which makes the network diameter short, so there are only limited resources along a random path.

Among all the experiments, a network with proactive control always absorbs all the load, leading to the highest average load in the network, which further indicates the highest utilization rate. As the workload increases from $\lambda$ to $8\lambda$ , the average load increases by the same factor. One very distinct characteristic that can be easily noticed in the last two figures on the third row is that the load distribution has a very heavy tail. This is attributed to the proactive strategy’s capability of offloading to its neighbors. It is also worth pointing out that we only measured the latency of successfully executed functions, which explains why “none” control has the smallest latency: offloaded functionality gets executed immediately at an edge router connected to a client, but more than half the requests are simply dropped and not counted at all. Compared to the passive strategy, the proactive strategy achieves lower latency. Further investigation on other ISP topologies shows that latency reduction improves with larger networks.

6 Conclusion & Future work

Battery is a huge constraint for mobile devices and the ever growing demands of computation on limited capacity are unlikely to disappear any time soon. Meanwhile, in-network storage and computation resources are growing. We proposed INFv to exploit in-network resources for mobile function offloading. We described its data-driven design and implementation based on a large scale analysis of a real app market. Our evaluation demonstrates that INFv’s non-intrusive offloading technique can significantly improve mobile device’s performance (up to 6.9x energy reduction and 4x faster) and effectively execute functionality in the network while reducing latency. Our analysis shows the potential for functionality caching and popular app offloading, while also providing interesting insights into Android apps’ obfuscation and composition. INFv is a working system and many of its components are open-sourced bro [2016]; dro [2016]; lp [2016].

There are some limitations and caveats which deserve further investigation in the future. First, INFv requires attention to the security and privacy of communication and offloaded functionality. While it provides isolation and detects the use of critical OS APIs, in the future we will consider techniques for detecting vulnerabilities/malware Mariconti et al. [2016]; Onwuzurike et al. [2018] and access to privacy sensitive information Enck et al. [2014]. Although it does not require a custom OS, a one-time root is required, which can be disabled after install. If deployed by an ISP, it can be pre-installed on devices or installed in stores. For other scenarios, either root would be required or, an existing vulnerability could be leveraged to install INFv and secure the device Mulliner et al. [2013].

Second, UI automation might not cover all app code and its generated state might not be representative of real user’s input. A significant amount of work exists on improving coverage Machiry et al. [2013]; Mahmood et al. [2014]; Choudhary et al. [2015]; Almeida et al. [2018] and we are looking to further improve our dynamic analysis by exploiting crowdsourcing platforms company [2015] to test apps with real users. Additionally, an iOS implementation should be possible using similar interception mechanisms cyd and static analysis can be accomplished by dumping the decrypted apps from memory dum . We also plan to explore different interception techniques to better support native code cyd . Finally, we are working to deploy a small-scale real-user test in the coming year to gain valuable feedback to further improve INFv.

Bibliography


[1] B [21] Ub 210. URL https://www.ettus.com/product/details/UB 210-KIT .
[2] Android dynamic binary instrumentation. URL https://github.com/crmulliner/adbi .
[3] Android apktool. URL https://code.google.com/p/android-apktool/ .
[4] Cydia Substrate. URL http://www.cydiasubstrate.com/ .
[5] Dexguard. URL https://www.guardsquare.com/dexguard .
[6] Dumpdecrypted. URL https://github.com/stefanesser/dumpdecrypted .
[7] AT&T - The Database of Faces. URL http://www.cl.cam.ac.uk/research/dtg/attarchive/facesataglance.html .
[8] GMS replacement for Google Drive. URL https://github.com/4knahs/gmsreplace .