ICSE2023

PILAR: Studying and Mitigating the Influence of Configurations on Log Parsing

Hetong Dai, Yiming Tang, Heng Li, Weiyi Shang

8 citations

Abstract

The significance of logs has been widely acknowledged with the adoption of various log analysis techniques that assist in software engineering tasks. Many log analysis techniques require structured logs as input while raw logs are typically unstructured. Automated log parsing is proposed to convert unstructured raw logs into structured log templates. Some log parsers achieve promising accuracy, yet they rely on significant efforts from the users to tune the parameters to achieve optimal results. In this paper, we first conduct an empirical study to understand the influence of the configurable parameters of six state-of-the-art log parsers on their parsing results on three aspects: 1) varying the parameters while using the same dataset, 2) keeping the same parameters while using different datasets, and 3) using different samples from the same dataset. Our results show that all these parsers are sensitive to the parameters, posing challenges to their adoption in practice. To mitigate such challenges, we propose PILAR (Parameter Insensitive Log Parser), an entropy-based log parsing approach. We compare PILAR with the existing log parsers on the same three aspects and find that PILAR is the most parameter-insensitive one. In addition, PILAR achieves the second highest parsing accuracy and efficiency among all the state-of-the-art log parsers. This paper paves the road for easing the adoption of log analysis in software engineer practices.