ASE 2024

SLIM: a Scalable and Interpretable Light-weight Fault Localization Algorithm for Imbalanced Data in Microservice

Rui Ren, Jingbang Yang, Linxiao Yang, Xinyue Gu, Liang Sun

1 citation

Abstract

In real-world microservice systems, a newly deployed service, which is one kind of change service, can introduce a new type of minority fault. Existing state-of-the-art (SOTA) fault localization methods rarely consider the imbalanced fault classification that arises in change services. This paper proposes a novel method that uses decision rule sets to handle highly imbalanced data by optimizing the F1 score subject to cardinality constraints. The method greedily generates the rule with maximal marginal gain and uses an efficient minorize-maximization (MM) approach to select rules iteratively, maximizing a non-monotone submodular lower bound. Compared with existing fault localization algorithms, our algorithm adapts to the imbalanced fault scenario of change services and provides interpretable fault causes that are easy to understand and verify. It can also be deployed in an online training setting, with only about 15% training overhead compared with current SOTA methods. Empirical studies demonstrate that our algorithm outperforms existing fault localization algorithms in both accuracy and model interpretability.
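To make the idea of greedily picking rules by marginal F1 gain under a cardinality constraint concrete, here is a minimal Python sketch. It is not the authors' algorithm: it omits the MM step and the submodular lower bound, and it assumes the candidate rules are already given as precomputed boolean coverage masks rather than generated on the fly. The names `f1_score`, `greedy_rule_set`, `truth`, and `rules`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def f1_score(pred: Sequence[bool], truth: Sequence[bool]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is zero."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def greedy_rule_set(
    rules: List[Sequence[bool]], truth: Sequence[bool], max_rules: int
) -> Tuple[List[int], float]:
    """Pick at most `max_rules` rules; a sample is predicted faulty if any chosen rule covers it.

    Each round adds the rule with the largest positive marginal F1 gain
    and stops early when no remaining rule improves F1.
    """
    covered = [False] * len(truth)
    chosen: List[int] = []
    current_f1 = 0.0
    for _ in range(max_rules):
        best_gain, best_idx, best_cov = 0.0, None, None
        for i, mask in enumerate(rules):
            if i in chosen:
                continue
            # Union of the samples covered so far and those covered by rule i.
            candidate = [c or m for c, m in zip(covered, mask)]
            gain = f1_score(candidate, truth) - current_f1
            if gain > best_gain:
                best_gain, best_idx, best_cov = gain, i, candidate
        if best_idx is None:
            break
        chosen.append(best_idx)
        covered = best_cov
        current_f1 += best_gain
    return chosen, current_f1


# Toy imbalanced data (assumed): 3 faulty samples out of 10.
truth = [True, True, True] + [False] * 7
rules = [
    [True, True] + [False] * 8,                   # precise, misses one fault
    [False, False, True, True] + [False] * 6,     # one hit, one false alarm
    [True] * 7 + [False] * 3,                     # broad, many false alarms
    [False, False, True] + [False] * 7,           # covers the remaining fault
]
selected, score = greedy_rule_set(rules, truth, max_rules=2)
print(selected, round(score, 3))
```

On this toy data the broad rule is never selected, because its false alarms lower F1 even though it covers every fault. This is the kind of behavior that makes F1 a more suitable objective than plain accuracy when faults are a small minority.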