VLDB 2025
POLARIS: An Interactive and Scalable Data Infrastructure for Polar Science
Yuchuan Huang, Ana Elena Uribe, Kareem Eldahshoury, Youssef Hussein, Grant Ogren, Mohamed F. Mokbel
Abstract
Although polar scientists have access to huge amounts of publicly available data, working with such data is a cumbersome process that requires downloading large volumes of unnecessary data and writing various scripts on top of it. This hinders their ability to perform any kind of interactive analysis. This paper presents Polaris, a novel open-source system infrastructure for polar science that is highly interactive and scalable. Polaris is designed around three observations that distinguish the query workload of polar scientists: all queries are spatio-temporal, not all data are equal, and the large majority of queries are aggregates. Polaris is equipped with a hierarchical spatio-temporal index structure that stores precomputed aggregates for data of interest. Experimental results with a real Polaris prototype and real scientific data show that it achieves highly interactive and scalable data access, enabling interactive analysis of polar science data.
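To make the idea of a hierarchical spatio-temporal index with precomputed aggregates concrete, the following is a minimal illustrative sketch and not the Polaris implementation. It assumes a quadtree-style pyramid over the unit square with time bucketed by integer day; the names `Agg`, `PyramidIndex`, `insert`, and `query` are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Agg:
    """Running aggregate for one index cell; count and total give the mean."""
    count: int = 0
    total: float = 0.0

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value

    def merge(self, other: "Agg") -> None:
        self.count += other.count
        self.total += other.total

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else float("nan")


class PyramidIndex:
    """Hypothetical pyramid over the unit square [0, 1) x [0, 1).

    Level k splits space into 2**k x 2**k cells (an assumption of this sketch).
    Each (level, cell_x, cell_y, day) key holds a precomputed aggregate.
    """

    def __init__(self, max_level: int = 3) -> None:
        self.max_level = max_level
        self.cells: dict[tuple[int, int, int, int], Agg] = defaultdict(Agg)

    def insert(self, x: float, y: float, day: int, value: float) -> None:
        # Update the aggregate of the enclosing cell at every level,
        # so that coarse queries never have to touch raw points.
        for level in range(self.max_level + 1):
            n = 2 ** level
            self.cells[(level, int(x * n), int(y * n), day)].add(value)

    def query(self, level: int, cx: int, cy: int,
              day_from: int, day_to: int) -> Agg:
        # Combine the per-day aggregates of one cell over an inclusive day range.
        result = Agg()
        for day in range(day_from, day_to + 1):
            cell = self.cells.get((level, cx, cy, day))
            if cell is not None:
                result.merge(cell)
        return result
```

In this sketch an aggregate query reads one precomputed entry per day in the requested range, independent of how many raw measurements fall inside the cell. That is the general property that makes precomputed aggregates attractive for interactive workloads.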