ICLR 2025
Optimal Learning of Kernel Logistic Regression for Complex Classification Scenarios
Hongwei Wen, Annika Betken, Hanyuan Hang
Abstract
Problem Setting
• Standard Classification Scenario: We observe i.i.d. data $D := (X_i, Y_i)_{i=1}^{n}$ drawn from an unknown distribution $P$, where $X_i$ denotes the input and $Y_i$ the output. The goal is to predict the output $Y$ for the input $X$.
• Complex Classification Scenarios: We observe labeled samples $D_p := (X_i, Y_i)_{i=1}^{n_p}$ drawn from a distribution $P$, while inference is required for a different distribution $Q$ on the same space.
• Label shift assumption: The two distributions $P$ and $Q$ share the same conditional probability of the input given the label but have different class probabilities, i.e., $p(x|y) = q(x|y)$ but $p(y) \neq q(y)$.
• Goals: To construct an estimator $\hat{q}(y|x)$ of the class conditional probability (CCP) under $Q$, using the class probability ratio $w^* := (w_y^*)_{y \in [K]}$ between $q(y)$ and $p(y)$, given by $w_y^* := q(y)/p(y)$, $y \in [K]$. We then induce the plug-in classifier defined as $\arg\max_{k \in [K]} \hat{q}(k|x)$.
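To make the role of the class probability ratio concrete, here is a minimal sketch of how a source-domain CCP can be turned into a target-domain CCP and then into a plug-in classifier. It relies only on Bayes' rule under the label shift assumption, which gives q(y|x) proportional to w_y * p(y|x). This may differ in form from the estimator used in the paper. The function names, the toy class probabilities, and the toy posterior values are all illustrative assumptions, and the estimation of p(y|x) and of the ratio itself is not shown.

```python
# Illustrative sketch only: all names and numbers below are hypothetical.

def class_ratio(q_y, p_y):
    """Class probability ratio w_y = q(y) / p(y) for each class y."""
    return [q / p for q, p in zip(q_y, p_y)]

def reweight_posterior(p_post, w):
    """Map a source posterior p(.|x) to a target posterior q(.|x).

    Under label shift, q(y|x) is proportional to w_y * p(y|x),
    so we reweight each class and renormalize to sum to one.
    """
    unnorm = [w_k * p_k for w_k, p_k in zip(w, p_post)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def plug_in_classifier(q_post):
    """Return the class index k that maximizes the estimated q(k|x)."""
    return max(range(len(q_post)), key=lambda k: q_post[k])

# Toy two-class example (assumed values, not from the paper).
p_y = [0.8, 0.2]            # source class probabilities p(y)
q_y = [0.5, 0.5]            # target class probabilities q(y)
w = class_ratio(q_y, p_y)   # ratio for each class

p_post = [0.7, 0.3]         # assumed source posterior p(.|x) at one input x
q_post = reweight_posterior(p_post, w)
label = plug_in_classifier(q_post)
```

In this toy case the source posterior favors class 0, but after reweighting the rarer source class is up-weighted and the plug-in prediction switches to class 1.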