WWW2024

PASS: Predictive Auto-Scaling System for Large-scale Enterprise Web Applications

Yunda Guo, Jiake Ge, Panfeng Guo, Yunpeng Chai, Tao Li, Mengnan Shi, Yang Tu, Jian Ouyang

10 citations

Abstract

We confront two challenges in the management of a vast and diverse array of online web applications deployed on enterprise-grade autoscaling infrastructure, primarily focused on ensuring Quality of Service (QoS) for large-scale applications and optimizing resource costs. Firstly, reacting to increased load with a response-based approach can temporarily degrade QoS because many web applications need a few minutes to warm up. Therefore, precise workload prediction is critical for predictive scaling. However, our analysis of real-world applications underscores the substantial challenges arising from the limited precision and robustness of existing single prediction algorithms in the context of predictive auto-scaling. Secondly, guaranteeing the QoS of online applications within a cost-effective structure is crucial, as it is inherently linked to corporate profitability. Nevertheless, our study shows that mainstream autoscaling methods exhibit various limitations, either being unsuitable for online environments or inadequately ensuring QoS. To address these issues, we introduce PASS, a Predictive Auto-Scaling System tailored for large-scale online web applications in enterprise settings. Our highly robust and accurate prediction framework dynamically integrates and calibrates appropriate prediction algorithms based on the unique characteristics of each application to effectively manage workload diversity. We further establish a performance model derived from online historical logs, enhancing auto-scaling to ensure diverse QoS without adverse impacts on online applications. Additionally, we implement a reactive strategy grounded in queuing theory to promptly address QoS violations resulting from inaccurate predictions or unexpected events. Across a wide spectrum of applications and real-world workloads, PASS outperforms state-of-the-art methods, achieving higher workload prediction accuracy and a superior QoS guarantee rate with less resource cost.
KEYWORDS auto-scaling, workload prediction, quality of service, performance model, cloud computing

The mutation features are especially prevalent and critical in enterprise-level applications. This is due to a substantial influx of queries per second (QPS) during morning, noon, and evening peak hours, leading to a considerable and undeniable impact.
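
The abstract mentions a reactive strategy grounded in queuing theory but does not say which queuing model PASS uses. As a rough illustration only, the sketch below assumes an M/M/c queue (Poisson arrivals, exponential service times, c identical instances) and uses the Erlang C formula to estimate how many instances would keep mean response time under a target during a QPS surge. The function names, the per-instance service rate, and the latency target are all hypothetical and are not taken from the paper.

```python
import math


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that a request has to queue in an M/M/c system.

    offered_load is arrival rate divided by per-instance service rate.
    """
    if offered_load >= servers:
        return 1.0  # unstable system: every request waits
    # Numerically stable Erlang B recursion, then convert to Erlang C.
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    utilization = offered_load / servers
    return blocking / (1.0 - utilization * (1.0 - blocking))


def mean_response_time(servers: int, arrival_qps: float, service_rate: float) -> float:
    """Mean response time in seconds (queueing delay plus service time)."""
    offered_load = arrival_qps / service_rate
    if offered_load >= servers:
        return math.inf
    wait = erlang_c(servers, offered_load) / (servers * service_rate - arrival_qps)
    return wait + 1.0 / service_rate


def min_instances(arrival_qps: float, service_rate: float,
                  target_latency_s: float, max_instances: int = 10_000) -> int:
    """Smallest instance count whose modeled mean latency meets the target.

    Assumption (not from the paper): QoS is expressed as a mean-latency bound.
    """
    if target_latency_s <= 1.0 / service_rate:
        raise ValueError("target is below the bare service time; unreachable")
    servers = max(1, math.ceil(arrival_qps / service_rate))
    while servers <= max_instances:
        if mean_response_time(servers, arrival_qps, service_rate) <= target_latency_s:
            return servers
        servers += 1
    raise ValueError("no instance count within max_instances meets the target")


if __name__ == "__main__":
    # Hypothetical surge: 100 QPS, each instance serves 10 requests/s,
    # and the mean latency target is 150 ms.
    print(min_instances(arrival_qps=100.0, service_rate=10.0, target_latency_s=0.15))
```

A reactive controller could call a function like `min_instances` with the currently observed QPS whenever a QoS violation is detected and scale out to the returned count. A production system would more likely bound a tail-latency percentile than the mean, and would fit the service rate from measured data.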