The workflow motif: a widely-useful performance diagnosis abstraction for distributed applications
Mania Abdi (1), Peter Desnoyers (1), Mark Crovella (2), Raja R. Sambasivan (3) ((1) Northeastern University, (2) Boston University, (3) Tufts University)

TL;DR
The paper introduces the workflow motif, an abstraction for diagnosing performance problems in distributed applications. Each motif instance represents characteristics of a pattern that repeats frequently within and among request executions.
Contribution
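It formally defines workflow motifs, uses that definition to identify which frequent-subgraph-mining algorithms are good starting points for mining them, and applies an early version of the abstraction to suggest performance-optimization points in HDFS.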
Findings
The authors argue that workflow motifs will benefit many diagnosis tasks.
The formal definition identifies which frequent-subgraph-mining algorithms are good starting points for mining motifs.
An early version of workflow motifs suggests performance-optimization points in HDFS.
Abstract
Diagnosing problems in deployed distributed applications continues to grow more challenging. A significant reason is the extreme mismatch between the powerful abstractions developers have available to build increasingly complex distributed applications versus the simple ones engineers have available to diagnose problems in them. To help, we present a novel abstraction, the workflow motif, instantiations of which represent characteristics of frequently-repeating patterns within and among request executions. We argue that workflow motifs will benefit many diagnosis tasks, formally define them, and use this definition to identify which frequent-subgraph-mining algorithms are good starting points for mining workflow motifs. We conclude by using an early version of workflow motifs to suggest performance-optimization points in HDFS.
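As a rough illustration of the "frequently-repeating patterns" idea, the sketch below counts which caller-to-callee edges recur across several request executions and keeps those that meet a support threshold. This is only the simplest (single-edge) case of frequent-pattern counting, not the paper's mining algorithm or its motif definition. The trace contents, the edge representation, and the names `traces`, `frequent_edges`, and `min_support` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical request executions: each trace is a set of directed edges
# (caller, callee). Component names are illustrative, not taken from the paper.
traces = [
    {("client", "namenode"), ("namenode", "datanode"), ("datanode", "disk")},
    {("client", "namenode"), ("namenode", "datanode"), ("datanode", "cache")},
    {("client", "namenode"), ("namenode", "journal")},
]


def frequent_edges(traces, min_support):
    """Return the edges that appear in at least min_support traces."""
    counts = Counter()
    for edges in traces:
        # Count each edge at most once per trace.
        counts.update(set(edges))
    return {edge: n for edge, n in counts.items() if n >= min_support}


result = frequent_edges(traces, min_support=2)
```

Frequent-subgraph-mining algorithms generalize this step from single edges to larger connected patterns, which is where most of the cost and the design choices lie.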
Taxonomy
Topics: Software System Performance and Reliability · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
