WWW 2021
RETA: A Schema-Aware, End-to-End Solution for Instance Completion in Knowledge Graphs
Paolo Rosso, Dingqi Yang, Natalia Ostapuk, Philippe Cudré-Mauroux
Cited by 23
Abstract
Knowledge Graph (KG) completion has been widely studied to tackle the incompleteness issue (i.e., missing facts) in modern KGs. A fact in a KG is represented as a triplet (ℎ, 𝑟, 𝑡) linking two entities ℎ and 𝑡 via a relation 𝑟. Existing work mostly considers link prediction to solve this problem, i.e., given two elements of a triplet, predicting the missing one, such as (ℎ, 𝑟, ?). This task, however, makes a strong assumption about the two given elements of a triplet: they have to be correlated, otherwise the predictions are meaningless, as in (Marie Curie, headquarters location, ?). In addition, the KG completion problem has also been formulated as a relation prediction task, i.e., predicting relations 𝑟 for a given entity ℎ. Without predicting 𝑡, however, this task remains a step away from the ultimate goal of KG completion. Against this background, this paper studies an instance completion task that suggests 𝑟-𝑡 pairs for a given ℎ, i.e., (ℎ, ?, ?). We propose an end-to-end solution called RETA (as it suggests the Relation and Tail for a given head entity) consisting of two components: a RETA-Filter and a RETA-Grader. More precisely, our RETA-Filter first generates candidate 𝑟-𝑡 pairs for a given ℎ by extracting and leveraging the schema of a KG; our RETA-Grader then evaluates and ranks the candidate 𝑟-𝑡 pairs, considering the plausibility of both the candidate triplet and its corresponding schema using a newly designed KG embedding model. We evaluate our methods against a sizable collection of state-of-the-art techniques on three real-world KG datasets. Results show that our RETA-Filter generates high-quality candidate 𝑟-𝑡 pairs, outperforming the best baseline techniques while reducing the candidate size by 10.61%-84.75% under the same candidate quality guarantees. Moreover, our RETA-Grader also significantly outperforms state-of-the-art link prediction techniques on the instance completion task by 16.25%-65.92% across different datasets.
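The two-stage idea described in the abstract (filter candidate 𝑟-𝑡 pairs by schema, then rank them) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy entities, the type labels, the function names `extract_schema`, `filter_candidates`, and `grade`, and the representation of the schema as (head type, relation, tail type) counts. In particular, the frequency-based `grade` function is only a stand-in; the paper's RETA-Grader uses a KG embedding model that is not reproduced here.

```python
from collections import Counter
from itertools import product

# Hypothetical toy KG: entity types and known facts (h, r, t).
entity_types = {
    "Marie Curie": {"person"},
    "Pierre Curie": {"person"},
    "Warsaw": {"city"},
    "Paris": {"city"},
    "Sorbonne": {"organization"},
}
triples = [
    ("Pierre Curie", "place_of_birth", "Paris"),
    ("Pierre Curie", "spouse", "Marie Curie"),
    ("Pierre Curie", "employer", "Sorbonne"),
    ("Sorbonne", "headquarters_location", "Paris"),
]


def extract_schema(triples, entity_types):
    """Count (head_type, relation, tail_type) patterns observed in the KG."""
    schema = Counter()
    for h, r, t in triples:
        for h_type, t_type in product(entity_types[h], entity_types[t]):
            schema[(h_type, r, t_type)] += 1
    return schema


def filter_candidates(head, schema, entity_types, known):
    """Keep only r-t pairs whose type pattern appears in the schema."""
    candidates = set()
    for h_type, r, t_type in schema:
        if h_type not in entity_types[head]:
            continue  # relation never observed for this kind of head
        for t, t_types in entity_types.items():
            # Assumption: skip self-links and facts already in the KG.
            if t_type in t_types and t != head and (head, r, t) not in known:
                candidates.add((r, t))
    return candidates


def grade(head, candidates, schema, entity_types):
    """Stand-in grader: rank by schema pattern frequency, not an embedding."""
    def score(pair):
        r, t = pair
        return max(
            schema[(h_type, r, t_type)]
            for h_type, t_type in product(entity_types[head], entity_types[t])
        )
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(candidates, key=lambda pair: (-score(pair), pair))


schema = extract_schema(triples, entity_types)
candidates = filter_candidates("Marie Curie", schema, entity_types, set(triples))
ranked = grade("Marie Curie", candidates, schema, entity_types)
for relation, tail in ranked:
    print(relation, tail)
```

In this toy setup, no `headquarters_location` pair is proposed for Marie Curie, because that relation was only observed with an organization as head. This mirrors the meaningless (Marie Curie, headquarters location, ?) query that the abstract uses to motivate schema-aware filtering.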