ICML 2024
CW Complex Hypothesis for Image Data
Yi Wang, Zhiren Wang
Cited 2 times
Abstract
We examine both the manifold hypothesis (Bengio et al., 2013) and the union of manifolds hypothesis (Brown et al., 2023), and argue that, in contrast to these hypotheses, the local intrinsic dimension varies from point to point even within the same connected component. We propose an alternative CW complex hypothesis, namely that image data is distributed in "manifolds with skeletons". We support the hypothesis by visualizing distributions of 2D families of synthetic image data, as well as by introducing a novel indicator function and testing it on natural image datasets. One motivation of our work is to explain why diffusion models have difficulty generating accurate higher-dimensional details such as human hands. Under the CW complex hypothesis, and with both theoretical and empirical evidence, we provide an interpretation that the mixture of higher- and lower-dimensional components in the data obstructs diffusion models from learning efficiently.