Search and Result Presentation in Scientific Workflow Repositories
Susan B. Davidson, Xiaocheng Huang, Julia Stoyanovich, and Xiaojie Yuan

TL;DR
This paper introduces a novel approach for searching and presenting results in scientific workflow repositories by modeling workflows with context-free bag grammars, enabling efficient retrieval and meaningful result visualization.
Contribution
It develops a new workflow model using context-free bag grammars and provides polynomial-time algorithms for matching workflows to keyword queries, along with a novel presentation method.
Findings
Efficient algorithms for workflow-query matching
Top-k grammar retrieval from repositories
Effective result visualization with representative parse-trees
Abstract
We study the problem of searching a repository of complex hierarchical workflows whose component modules, both composite and atomic, have been annotated with keywords. Since keyword search does not use the graph structure of a workflow, we develop a model of workflows using context-free bag grammars. We then give efficient polynomial-time algorithms that, given a workflow and a keyword query, determine whether some execution of the workflow matches the query. Based on these algorithms we develop a search and ranking solution that efficiently retrieves the top-k grammars from a repository. Finally, we propose a novel result presentation method for grammars matching a keyword query, based on representative parse-trees. The effectiveness of our approach is validated through an extensive experimental evaluation.
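To make the matching problem concrete, here is a minimal sketch of one way to decide whether some execution of a workflow covers a keyword query. It is not the paper's algorithm. It assumes a simplified encoding in which each module carries a set of keywords, each composite module has one or more alternative productions, and each production is an unordered bag of sub-modules. The names `matches`, `grammar`, `keywords`, and the example modules are hypothetical. The sketch runs a fixed-point iteration over the subsets of the query that each module can cover, so it is polynomial in the grammar size but exponential in the number of query keywords.

```python
def matches(grammar, keywords, start, query):
    """Return True if some finite derivation rooted at `start` covers `query`.

    grammar:  dict mapping a composite module to a list of productions;
              each production is a list (treated as a bag) of module names.
              Modules absent from `grammar` are treated as atomic.
    keywords: dict mapping every module name to its set of keywords.
    """
    query = frozenset(query)
    # covers[m] holds every subset of the query that some derivation of m covers.
    covers = {module: set() for module in keywords}
    changed = True
    while changed:
        changed = False
        for module, own in keywords.items():
            base = frozenset(own) & query
            # An atomic module behaves like a single empty production.
            for body in grammar.get(module, [[]]):
                partial = {base}
                for child in body:
                    # Combine one achievable subset per child in the bag.
                    partial = {p | c for p in partial for c in covers[child]}
                    if not partial:
                        break  # this child has no known derivation yet
                new = partial - covers[module]
                if new:
                    covers[module] |= new
                    changed = True
    return query in covers[start]


# Hypothetical repository entry: W expands to Align and Tree,
# and Align is implemented by either Blast or Clustal.
keywords = {
    "W": {"analysis"},
    "Align": {"alignment"},
    "Blast": {"blast"},
    "Clustal": {"clustal"},
    "Tree": {"phylogeny"},
}
grammar = {
    "W": [["Align", "Tree"]],
    "Align": [["Blast"], ["Clustal"]],
}
# Same workflow, but Align may repeat Blast before finishing with Clustal.
recursive_grammar = {
    "W": [["Align", "Tree"]],
    "Align": [["Blast", "Align"], ["Clustal"]],
}

print(matches(grammar, keywords, "W", {"blast", "phylogeny"}))
print(matches(grammar, keywords, "W", {"blast", "clustal"}))
print(matches(recursive_grammar, keywords, "W", {"blast", "clustal"}))
```

In the non-recursive grammar, no single execution uses both Blast and Clustal because they are alternatives, so the second query does not match even though both keywords appear in the workflow specification. The recursive variant admits an execution that contains both.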
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Database Systems and Queries · Distributed and Parallel Computing Systems
