
Indexing Execution Patterns in Workflow Provenance Graphs through Generalized Trie Structures

(1807.07346)
Published Jul 19, 2018 in cs.DB, cs.IR, and cs.LG

Abstract

In recent years, scientific workflows have become mature enough for production use. Despite this maturity, a shortage of tools for searching, adapting, and reusing workflows still hinders wider adoption by scientific communities. Because machine-readable scientific metadata is scarce and workflow specification formats and representations are heterogeneous, new approaches are needed that leverage alternative sources of information to complement existing ones. In this paper we address these limitations by applying statistically enriched generalized trie structures to workflow execution provenance information, in order to assist the analysis, indexing, and search of scientific workflows. Our method bridges the gap between what a workflow is supposed to do, according to its specification and related metadata, and what it actually does, as recorded in its provenance execution trace. In doing so, we also prove that the proposed method outperforms SPARQL 1.1 Property Paths for querying provenance graphs.
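To make the idea of indexing execution patterns concrete, here is a minimal sketch of a count-annotated trie. It is not the paper's implementation. It assumes that each provenance trace has already been flattened into an ordered list of step names, and it reads "generalized" as "indexing every suffix of several traces in one trie". The class names, method names, and workflow identifiers are all hypothetical.

```python
class TrieNode:
    """One node per step; stores simple statistics about the path to it."""

    def __init__(self):
        self.children = {}      # step name -> TrieNode
        self.count = 0          # occurrences of the path ending at this node
        self.workflows = set()  # ids of workflows whose traces contain the path


class PatternTrie:
    """Hypothetical index over flattened execution traces (an assumption)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, workflow_id, steps):
        # Index every suffix so that a pattern can match at any position
        # in the trace, not only at its first step.
        for start in range(len(steps)):
            node = self.root
            for step in steps[start:]:
                node = node.children.setdefault(step, TrieNode())
                node.count += 1
                node.workflows.add(workflow_id)

    def search(self, pattern):
        # Walk the trie along the pattern; give up on the first missing step.
        node = self.root
        for step in pattern:
            node = node.children.get(step)
            if node is None:
                return set(), 0
        return node.workflows, node.count


# Illustrative traces; the step names are made up for this example.
trie = PatternTrie()
trie.insert("wf1", ["fetch", "align", "plot"])
trie.insert("wf2", ["fetch", "align", "align", "export"])

workflows, occurrences = trie.search(["fetch", "align"])
```

A lookup costs time proportional to the pattern length, regardless of how many traces are indexed. The price is that inserting every suffix takes time and space quadratic in the trace length, which a real index would probably need to bound.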
