KDD 2025

THEMES: An Offline Apprenticeship Learning Framework for Evolving Reward Functions

Xi Yang, Md. Mirajul Islam, Ge Gao, Min Chi

Abstract

Apprenticeship learning (AL) aims to induce decision-making policies by observing and imitating expert demonstrations. Existing AL approaches typically rely on online interactions and assume that the demonstrations follow a single reward function. Nevertheless, in real-world human-centric applications, policies are usually learned in an offline setting, with the demonstrations driven by multiple reward functions that evolve over time. To address these challenges, we introduce a novel AL framework: Time-aware Hierarchical EM Energy-based Sub-trajectory (THEMES) clustering. We evaluate the effectiveness of THEMES in two challenging human-centric domains: healthcare and education. Our experimental results across multiple datasets show that THEMES can accurately induce policies and outperforms competitive baselines and ablations, demonstrating its potential for tackling a broad range of complex, real-world human-centric tasks.