Similarity assessment for scientific workflow clustering and recommendation
Zhou Z B, Cheng Z H, Zhu Y Q
This article proposes to identify and recommend scientific workflows for reuse and repurposing. Specifically, a scientific workflow is represented as a layer hierarchy that specifies the hierarchical relations between this workflow, its sub-workflows, and activities. Semantic similarity is calculated between layer hierarchies of workflows. A graph-skeleton based clustering technique is adopted for grouping layer hierarchies into clusters. Barycenters in each cluster are identified, which serve as core workflows in this cluster, for facilitating the cluster identification and workflow ranking and recommendation with respect to the requirement of scientists.