ICLR 2021

Complex Query Answering with Neural Link Predictors

Erik Arakelyan, Daniel Daza, Pasquale Minervini, Michael Cochez

Cited by 29

Abstract

Neural link predictors are immensely useful for identifying missing edges in large-scale Knowledge Graphs. However, it is still not clear how to use these models for answering more complex queries that arise in a number of domains, such as queries using logical conjunctions (∧), disjunctions (∨) and existential quantifiers (∃), while accounting for missing edges. In this work, we propose a framework for efficiently answering complex queries on incomplete Knowledge Graphs. We translate each query into an end-to-end differentiable objective, where the truth value of each atom is computed by a pre-trained neural link predictor. We then analyse two solutions to the optimisation problem: gradient-based and combinatorial search. In our experiments, the proposed approach produces more accurate results than state-of-the-art methods (black-box neural models trained on millions of generated queries), without needing to train on a large and diverse set of complex queries. Using orders of magnitude less training data, we obtain relative improvements ranging from 8% up to 40% in Hits@3 across different knowledge graphs containing factual information. Finally, we demonstrate that it is possible to explain the outcome of our model in terms of the intermediate solutions identified for each of the complex query atoms. All our source code and datasets are available online.

Neural link predictors (Nickel et al., 2016) tackle the problem of identifying missing edges in large KGs. However, in many complex domains, an open challenge is developing techniques for answering complex queries that involve multiple and potentially unobserved edges, entities, and variables, rather than just single edges. We focus on First-Order Logical Queries that use conjunctions (∧), disjunctions (∨), and existential quantifiers (∃).
A multitude of queries can be expressed using such operators. For instance, the query "Which drugs D interact with proteins associated with diseases t_1 or t_2?" can be rewritten as ?D : ∃P. interacts(D, P) ∧ [assoc(P, t_1) ∨ assoc(P, t_2)], which can be answered via sub-graph matching.
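
To make the example query concrete, the following is a minimal sketch on a toy knowledge graph. It is not the authors' implementation. All entity names, the `PREDICTED` scores, and the helper functions are hypothetical. The first function answers the query by exact sub-graph matching over observed edges. The second replaces each atom's truth value with a score in [0, 1], standing in for a pre-trained link predictor, and combines scores with a product for ∧, a probabilistic sum for ∨, and a maximum over P for ∃. That particular choice of operators is an assumption made for illustration.

```python
# Toy knowledge graph: a set of (relation, subject, object) triples.
# All names are hypothetical and exist only for this illustration.
TRIPLES = {
    ("interacts", "drug_a", "prot_1"),
    ("interacts", "drug_b", "prot_2"),
    ("interacts", "drug_c", "prot_3"),
    ("assoc", "prot_1", "t1"),
    ("assoc", "prot_2", "t2"),
}

# Hypothetical scores for unobserved edges, standing in for the output
# of a pre-trained neural link predictor.
PREDICTED = {("assoc", "prot_3", "t2"): 0.9}

ENTITIES = sorted({e for _, s, o in TRIPLES for e in (s, o)})


def holds(rel, subj, obj):
    """Exact truth value of an atom: is the edge observed in the graph?"""
    return (rel, subj, obj) in TRIPLES


def answer_exact(t1, t2):
    """?D : exists P. interacts(D, P) and [assoc(P, t1) or assoc(P, t2)]."""
    return {
        d
        for d in ENTITIES
        for p in ENTITIES
        if holds("interacts", d, p)
        and (holds("assoc", p, t1) or holds("assoc", p, t2))
    }


def atom_score(rel, subj, obj):
    """Soft truth value: 1.0 if observed, else the (assumed) predicted score."""
    if holds(rel, subj, obj):
        return 1.0
    return PREDICTED.get((rel, subj, obj), 0.0)


def answer_soft(t1, t2):
    """Score every candidate D; the max over P plays the role of 'exists'."""
    scores = {}
    for d in ENTITIES:
        best = 0.0
        for p in ENTITIES:
            a = atom_score("assoc", p, t1)
            b = atom_score("assoc", p, t2)
            disjunction = a + b - a * b  # probabilistic sum for 'or'
            conjunction = atom_score("interacts", d, p) * disjunction  # product for 'and'
            best = max(best, conjunction)
        scores[d] = best
    return scores


if __name__ == "__main__":
    print(sorted(answer_exact("t1", "t2")))
    ranked = sorted(answer_soft("t1", "t2").items(), key=lambda kv: -kv[1])
    print(ranked[:3])
```

In this toy setting, exact matching returns only the drugs whose supporting edges are all observed, whereas the soft version also assigns a non-zero score to `drug_c` through the unobserved edge with an assumed predicted score. This is the sense in which scoring atoms with a link predictor can account for missing edges.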