VLDB 2024
Apache TsFile: An IoT-native Time Series File Format
Xin Zhao, Jialin Qiao, Xiangdong Huang, Chen Wang, Shaoxu Song, Jianmin Wang
Cited by 9
Abstract
The proliferation of the Internet of Things (IoT) has led to an exponential increase in time series data, which is distributed and applied in various contexts and demands a dedicated storage solution. Based on our observations and analysis of IoT production systems, we characterize three requirements for time series data: (1) a close association with devices and sensors, (2) continual synchronization between cloud and edge, and (3) high ingestion rates and low-latency access over large volumes of data. Despite this growing trend, current time series database systems lack a standardized file format, and existing open file formats do not adequately leverage the unique characteristics of IoT time series data. In this paper, we introduce Apache TsFile, a specialized file format tailored for IoT time series data. TsFile organizes data by device and builds indexes on device-related information. Our experiments demonstrate the efficiency of TsFile in achieving high data ingestion rates, minimizing latency, and optimizing data compactness.