WWW2026

Audit-of-Audits for the Web: Bayesian Meta-Evaluation that Yields Interval-Valued, Threshold-Aligned Fairness Claims

Dandan Liu, Aznul Qalid Md Sabri, Lihu Pan, Guangrui Fan

Abstract

For web platforms facing regulatory scrutiny, from content moderation to ad delivery and recommendations, fairness audits routinely disagree because of metric choice, subgroup granularity, sampling variance, and dataset shift. Point estimates yield brittle pass/fail narratives that are hard to defend in governance contexts. We propose a Bayesian audit-of-audits that pools heterogeneous audits, both count-based and metric-only, into interval-valued fairness claims with explicit uncertainty and policy-risk tables aligned to practitioner thresholds. The framework unifies classification and exposure metrics, enforces consistency across coarse and intersectional group definitions via soft coherence constraints, and quantifies the Value-of-Information of prospective audits. We also provide heterogeneity diagnostics and leave-one-audit-out sensitivities. Across a synthetic Audit Zoo, a content-moderation case study on CivilComments-WILDS, and an ad-delivery simulation, our meta-evaluator attains near-nominal coverage with narrower intervals and fewer decision flips than per-audit baselines, while integrating partial-information audits.