ACL2021

End-to-End Construction of NLP Knowledge Graph

Ishani Mondal, Yufang Hou, Charles Jochim

Abstract

This paper studies the end-to-end construction of an NLP Knowledge Graph (KG) from scientific papers. We focus on extracting four types of relations: evaluatedOn between tasks and datasets, evaluatedBy between tasks and evaluation metrics, as well as coreferent and related relations between the same type of entities. For instance, "F1 score" is coreferent with "F-measure". We introduce novel methods for each of these relation types and apply our final framework (SciNLP-KG) to 30,000 NLP papers from the ACL Anthology to build a large-scale KG, which can facilitate automatically constructing scientific leaderboards for the NLP community. The results of our experiments indicate that the resulting KG contains high-quality information.

* Work done during internship at IBM Research.
1 https://github.com/sebastianruder/NLP-progress
2 https://paperswithcode.com

[...] entities. For instance, "semantic role labeling" is related to "argument identification" and "GENIA Corpus" is related to "NCBI Corpus".

To evaluate our end-to-end SciNLP-KG framework, we manually construct a small-scale NLP KG based on our proposed schema (Section 4.2), which contains 85 nodes and 625 links. Experiments show that our system achieves reasonable results for all relation types on this small-scale graph, in which all possible meaningful links are manually annotated. We further apply our framework to 30,000 NLP papers from the ACL Anthology to build a large-scale NLP KG containing 5,374 nodes and 15,762 relations. We evaluate the quality and coverage of the KG by manually assessing random samples and comparing it with Paperswithcode. We find that our KG contains high-quality information.

Overall, the contributions of our work are threefold. First, we propose and design a new schema that represents knowledge about tasks (T), datasets (D) and metrics (M) in the NLP domain. Second, we develop a novel framework (SciNLP-KG) for constructing an NLP KG from the scientific literature in an end-to-end manner.
Finally, we automatically build a large-scale NLP KG that contains high-quality information about the Task-Dataset-Metric (TDM) entities. Although we focus on NLP, our method is general enough that it could be extended to domains such as computer vision or bioinformatics. Our code and datasets are publicly available at https://github.com/Ishani-Mondal/SciKG to fuel further research.
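
To make the schema concrete, the following is a minimal Python sketch of a typed graph with the three entity types (T, D, M) and the four relation types described above. It is an illustration under stated assumptions, not the SciNLP-KG implementation: the names `EntityType`, `Entity`, `SCHEMA`, and `KnowledgeGraph` are hypothetical, and the only constraint enforced is which entity types each relation may connect. The example edges are the ones given in the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntityType(Enum):
    """The three entity types of the schema (hypothetical encoding)."""
    TASK = "T"
    DATASET = "D"
    METRIC = "M"


@dataclass(frozen=True)
class Entity:
    name: str
    type: EntityType


# Same-type pairs, used by the coreferent and related relations.
SAME_TYPE = {(t, t) for t in EntityType}

# Allowed (head type, tail type) pairs per relation, as described in the text.
SCHEMA = {
    "evaluatedOn": {(EntityType.TASK, EntityType.DATASET)},
    "evaluatedBy": {(EntityType.TASK, EntityType.METRIC)},
    "coreferent": SAME_TYPE,
    "related": SAME_TYPE,
}


@dataclass
class KnowledgeGraph:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def add_relation(self, head: Entity, relation: str, tail: Entity) -> None:
        """Add an edge after checking it against the schema."""
        if relation not in SCHEMA:
            raise ValueError(f"unknown relation: {relation}")
        if (head.type, tail.type) not in SCHEMA[relation]:
            raise ValueError(
                f"{relation} cannot link {head.type.name} to {tail.type.name}"
            )
        self.nodes.update((head, tail))
        self.edges.add((head, relation, tail))


# Example edges taken from the text.
kg = KnowledgeGraph()
f1 = Entity("F1 score", EntityType.METRIC)
f_measure = Entity("F-measure", EntityType.METRIC)
srl = Entity("semantic role labeling", EntityType.TASK)
arg_id = Entity("argument identification", EntityType.TASK)
genia = Entity("GENIA Corpus", EntityType.DATASET)
ncbi = Entity("NCBI Corpus", EntityType.DATASET)

kg.add_relation(f1, "coreferent", f_measure)
kg.add_relation(srl, "related", arg_id)
kg.add_relation(genia, "related", ncbi)
```

In this sketch, an edge that violates the schema, such as an evaluatedOn link from a dataset to a task, is rejected with a `ValueError`. Treating coreferent and related as symmetric is left out for brevity.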