WWW 2026

EcoTune: Edge-Cloud Collaborative Model Adaptation for Budget-Constrained On-Device SLM Personalization

Gong Chen, Mingkai Lin, Xiaobin Hong, Wenzhong Li, Sanglu Lu

Abstract

The rapid growth of web content has spurred the widespread adoption of on-device AI assistants powered by large language models (LLMs). However, deploying and personalizing these assistants in real-world environments remains challenging because annotation budgets are limited and on-device fine-tuning resources are scarce. Existing edge–cloud collaboration frameworks typically rely on costly cloud-based supervision or perform full-layer fine-tuning, which makes both computation and adaptation inefficient. To address these limitations, we propose EcoTune, a budget-constrained framework for efficient edge–cloud collaborative adaptation. EcoTune jointly optimizes representative data selection for cloud annotation and selective on-device model adaptation within a unified closed-loop process. Specifically, it employs a multi-armed bandit–based strategy to identify high-value user interactions for cloud supervision and a layer importance–driven adaptation mechanism to update only the critical components of the small language model (SLM). This coordinated optimization enables dynamic, resource-efficient personalization under stringent annotation and tuning budgets. Experiments on real-world testbeds demonstrate that EcoTune reduces annotation costs by 20%-60% and significantly lowers fine-tuning memory consumption compared to state-of-the-art baselines, providing a practical and scalable solution for personalized on-device language models.
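To make the two mechanisms named in the abstract concrete, the following is a minimal illustrative sketch, not EcoTune's actual algorithm. The abstract does not say which bandit policy or importance measure is used, so this sketch substitutes the standard UCB1 rule and a simple top-k selection over precomputed scores. All names (`ucb1_select`, `select_for_annotation`, `pick_layers`, `reward_fn`) are hypothetical, the "arms" are assumed to be groups of user interactions, and the reward is assumed to be some scalar measure of how useful a cloud annotation turned out to be.

```python
import math


def ucb1_select(counts, values, t, c=2.0):
    """Pick an arm by the UCB1 rule; arms never tried are pulled first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        values[a] + math.sqrt(c * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def select_for_annotation(reward_fn, n_arms, budget):
    """Spend `budget` annotation requests across `n_arms` interaction groups.

    reward_fn(arm) is a hypothetical callback returning the observed
    utility of annotating one sample from that group.
    """
    counts = [0] * n_arms      # annotations requested per group
    values = [0.0] * n_arms    # running mean reward per group
    chosen = []
    for t in range(1, budget + 1):
        arm = ucb1_select(counts, values, t)
        reward = reward_fn(arm)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        chosen.append(arm)
    return chosen, counts, values


def pick_layers(importance, k):
    """Return the indices of the k highest-scoring layers, in layer order.

    `importance` is assumed to be one precomputed score per layer; only
    the returned layers would be unfrozen for on-device updates.
    """
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    # Toy setting: three interaction groups with fixed, made-up utilities.
    utilities = [0.1, 0.9, 0.3]
    _, counts, _ = select_for_annotation(lambda a: utilities[a], n_arms=3, budget=100)
    print("annotations per group:", counts)
    print("layers to update:", pick_layers([0.2, 0.9, 0.1, 0.7], k=2))
```

In this toy setting most of the annotation budget ends up on the group with the highest utility, while the exploration bonus still sends a few requests to the other groups. In a closed-loop system, the reward signal and the layer scores would presumably be refreshed after each adaptation round.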