Optimizing Latency and Reliability of Pipeline Workflow Applications
Anne Benoit (INRIA Rhône-Alpes / LIP, Laboratoire d'Informatique du Parallélisme), Veronika Rehn-Sonigo (INRIA Rhône-Alpes / LIP, Laboratoire d'Informatique du Parallélisme), Yves Robert (INRIA Rhône-Alpes / LIP, Laboratoire d'Informatique du Parallélisme)

TL;DR
This paper studies the bi-criteria problem of mapping pipeline applications onto platforms whose processors may fail, optimizing both latency and reliability. Simple polynomial algorithms exist for fully homogeneous platforms, but the problem is shown to be NP-hard on heterogeneous platforms.
Contribution
The paper gives a formal complexity analysis of the bi-criteria mapping problem, proves that it is NP-hard on heterogeneous platforms, and discusses the trade-off between latency and reliability.
Findings
NP-hardness of the bi-criteria mapping problem on heterogeneous platforms
Trade-offs between latency and reliability in pipeline application mapping
Complexity increase due to platform heterogeneity
Abstract
Mapping applications onto heterogeneous platforms is a difficult challenge, even for simple application patterns such as pipeline graphs. The problem is even more complex when processors are subject to failure during the execution of the application. In this paper, we study the complexity of a bi-criteria mapping which aims at optimizing the latency (i.e., the response time) and the reliability (i.e., the probability that the computation will be successful) of the application. Latency is minimized by using faster processors, while reliability is increased by replicating computations on a set of processors. However, replication increases latency (additional communications, slower processors). The application fails to be executed only if all the processors fail during execution. While simple polynomial algorithms can be found for fully homogeneous platforms, the problem becomes NP-hard…
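The trade-off between replication and latency described in the abstract can be illustrated with a minimal sketch. It is not the paper's model: it assumes processor failures are independent, ignores communication costs, and takes the slowest replica as the worst-case bound on computation time. The function names, speeds, and failure probabilities are hypothetical values chosen for illustration.

```python
from math import prod


def replicated_failure_probability(failure_probs):
    """Probability that a replicated computation fails.

    Assumption (not from the paper's text): failures are independent,
    so the computation fails only if every replica fails.
    """
    return prod(failure_probs)


def worst_case_compute_time(work, speeds):
    """Worst-case computation time of `work` units on a replica set.

    Assumption: we cannot know in advance which replica survives,
    so the slowest processor bounds the time. Communication is ignored.
    """
    return work / min(speeds)


# Hypothetical platform: one fast but unreliable processor,
# and one slow but more reliable processor.
work = 8.0
fast_only = {"speeds": [4.0], "failure_probs": [0.2]}
replicated = {"speeds": [4.0, 1.0], "failure_probs": [0.2, 0.1]}

for name, cfg in (("fast only", fast_only), ("replicated", replicated)):
    fp = replicated_failure_probability(cfg["failure_probs"])
    t = worst_case_compute_time(work, cfg["speeds"])
    print(f"{name}: failure probability = {fp:.3f}, worst-case time = {t:.1f}")
```

Under these assumptions, adding the slow processor as a replica lowers the failure probability but raises the worst-case time, which is the tension the bi-criteria formulation has to resolve.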
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
