SIGMOD 2025

Burr: A Benchmark for Ontology Learning from Relational Databases

Lukas Laskowski, Michael Hladik, Jan Portisch, Fabian Panse, Felix Naumann

Abstract

Knowledge graphs and ontologies play an essential role in integrating, standardizing, and reasoning about complex data across domains. In recent studies, using knowledge graphs instead of traditional relational databases in AI use cases improved quality by up to 38 percentage points. However, learning ontologies from relational databases remains challenging because of the impedance mismatch between the two modeling paradigms. Because no established benchmark exists, it is unclear which ontology learning system performs best, and why. We present BURR, a benchmark for evaluating systems that learn ontologies from relational databases. To evaluate such systems, we introduce a novel mapping-based metric and provide a comprehensive benchmark data collection. This collection of 54 scenarios consists of real-world database-ontology mappings, including industry data, and a micro-benchmark that evaluates system behavior in encapsulated scenarios. We demonstrate the applicability of BURR by evaluating widely used ontology learning systems, both traditional rule-based and LLM-based, on the benchmark. The results emphasize the current strengths of simple rule-based approaches compared to LLM-based systems, while also highlighting the significant research potential of LLMs in ontology learning.