VLDB 2021
A Demonstration of Relic: A System for REtrospective Lineage InferenCe of Data Workflows
Mohammed Suhail Rehman, Silu Huang, Aaron J. Elmore
Cited by 2
Abstract
The ad-hoc, heterogeneous process of modern data science typically involves loading, cleaning, and mutating dataset(s) into multiple versions recorded as artifacts by various tools within a single data science workflow. Lineage information, including the source datasets, data transformation programs or scripts, or manual annotations, is rarely captured, making it difficult to infer the relationships between artifacts in a given workflow retrospectively. We demonstrate Relic, a tool to retrospectively infer the lineage of data artifacts generated as a result of typical data science workflows, with an interactive demonstration that allows users to input artifact files and visualize the inferred lineage in a web-based setting.
PVLDB Reference Format:
Mohammed Suhail Rehman, Silu Huang, and Aaron J. Elmore. A Demonstration of RELIC: A System for REtrospective Lineage InferenCe of Data Workflows. PVLDB, 14(12): 2795-2798, 2021. doi:10.14778/3476311.3476347