AAAI 2025

Queries, Representation & Detection: The Next 100 Model Fingerprinting Schemes

Augustin Godinot, Erwan Le Merrer, Camilla Penzo, François Taïani, Gilles Trédan

Abstract

The deployment of machine learning models in operational contexts represents a significant investment for any organisation. Consequently, the risk of these models being misappropriated by competitors needs to be addressed. In recent years, numerous proposals have been put forth to detect instances of model stealing. However, these proposals operate under implicit and disparate data and model access assumptions; as a consequence, it remains unclear how they can be effectively compared to one another. Our evaluation shows that a simple baseline that we introduce performs on par with existing state-of-the-art fingerprints, which, on the other hand, are much more complex. To uncover the reasons behind this intriguing result, this paper introduces a systematic approach to both the creation of model fingerprinting schemes and their evaluation benchmarks. By dividing model fingerprinting into three core components, namely Query, Representation and Detection (QuRD), we are able to identify ∼100 previously unexplored QuRD combinations and gain insights into their performance. Finally, we introduce a set of metrics to compare and guide the creation of more representative model stealing detection benchmarks. Our approach reveals the need for more challenging benchmarks and a sound comparison with baselines. To foster the creation of new fingerprinting schemes and benchmarks, we open-source our fingerprinting toolbox.

Companies devote considerable resources (i.e. manpower, funds and energy) to developing efficient and accurate machine learning (ML) models. Many of these models are then deployed in production on online platforms to solve a wide array of business-critical tasks (e.g. recommendations or predictions of all kinds). However, it is well understood that extraction attacks, or simply infrastructure leaks, can allow competitors to access the model architecture (Oh et al. 2018), weights (Carlini et al. 2024), and hyperparameters (Wang and Gong 2018).
From financial risks, when the attacker can provide the same functionality at a fraction of the cost, to integrity risks, when the attacker can use the stolen model as a stepping stone to craft adversarial examples, model stealing attacks pose great risks for the model developer. Although efforts have been devoted to defend models against extraction attacks (