ICSE2024
Are Your Requests Your True Needs? Checking Excessive Data Collection in VPA App
Fuman Xie, Chuan Yan, Mark Huasong Meng, Shao-Ming Teng, Yanjun Zhang, Guangdong Bai
被引用 4 次
摘要
Virtual personal assistants (VPA) services encompass a large number of third-party applications (or apps) to enrich their functionalities. These apps have been well examined to scrutinize their data collection behaviors against their declared privacy policies. Nonetheless, it is often overlooked that most users tend to ignore privacy policies at the installation time. Dishonest developers thus can exploit this situation by embedding excessive declarations to cover their data collection behaviors during compliance auditing. In this work, we present Pico, a privacy inconsistency detector, which checks the VPA app's privacy compliance by analyzing (in)consistency between data requested and data essential for its functionality. Pico understands the app's functionality topics from its publicly available textual data, and leverages advanced GPTbased language models to address domain-specific challenges. Based on the counterparts with similar functionality, suspicious data collection can be detected through the lens of anomaly detection. We apply Pico to understand the status quo of data-functionality compliance among all 65,195 skills in the Alexa app store. Our study reveals that 21.7% of the analyzed skills exhibit suspicious data collection, including Top 10 popular Alexa skills that pose threats to 54,116 users. These findings should raise an alert to both developers and users, in the compliance with the purpose limitation principle in data regulations. CCS CONCEPTS • Security and privacy → Web application security.