VLDB 2020

ActiveDeeper: A Model-based Active Data Enrichment System

Liang Zhao, Qingcan Li, Pei Wang, Jiannan Wang, Eugene Wu

Cited by 2

Abstract

The deep web (e.g., Yelp, IMDb) is an invaluable external data source for enriching a local database with new attributes. In this paper, we present ActiveDeeper, a novel model-driven data enrichment system powered by the deep web. ActiveDeeper treats the deep web as "a labeler" and uses it to train a data enrichment model. We show that this model-based approach significantly outperforms the state-of-the-art system in real-world scenarios. We implemented ActiveDeeper as a Google Sheets add-on and made a demo video available at http://tiny.cc/activedeeper.