ICCV 2021

CrackFormer: Transformer Network for Fine-Grained Crack Detection

Huajun Liu, Xiangyu Miao, Christoph Mertz, Chengzhong Xu, Hui Kong

Cited by 195

Abstract

Cracks are irregular line structures that are of interest in many computer vision applications. Crack detection (e.g., from pavement images) is a challenging task due to intensity inhomogeneity, topological complexity, low contrast, and noisy backgrounds. The overall crack detection accuracy can be significantly affected by the detection performance on fine-grained cracks. In this work, we propose a Crack Transformer network (CrackFormer) for fine-grained crack detection. CrackFormer is composed of novel attention modules in a SegNet-like encoder-decoder architecture. Specifically, it consists of novel self-attention modules with 1x1 convolutional kernels for efficient contextual information extraction across feature channels, and an efficient positional embedding to capture contextual information over a large receptive field for long-range interactions. It also introduces new scaling-attention modules that combine the outputs of the corresponding encoder and decoder blocks to suppress non-semantic features and sharpen semantic ones. CrackFormer is trained and evaluated on three classical crack datasets. The experimental results show that CrackFormer achieves Optimal Dataset Scale (ODS) values of 0.871, 0.877 and 0.881, respectively, on the three datasets and outperforms the state-of-the-art methods.
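To make the reported metric concrete, the following is a minimal sketch of how an ODS score is commonly computed: a single threshold is shared across the whole dataset, true positives, false positives and false negatives are accumulated over all images at that threshold, and the best resulting F-measure is reported. The function names `f_measure` and `ods`, the flat-list input format, and the exact pixel-wise matching (no spatial tolerance) are assumptions made for illustration; the abstract does not specify the evaluation protocol used by the authors.

```python
def f_measure(tp, fp, fn):
    """F1 score from pixel counts; returns 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ods(predictions, ground_truths, thresholds):
    """Best dataset-wide F-measure over one shared threshold (hypothetical helper).

    predictions:   list of flat lists of crack probabilities in [0, 1]
    ground_truths: list of flat lists of 0/1 labels, same lengths
    thresholds:    candidate thresholds to try
    Returns (best F-measure, threshold that achieved it).
    """
    best_f, best_t = 0.0, None
    for t in thresholds:
        tp = fp = fn = 0
        # Accumulate counts over every image before computing the F-measure,
        # so the threshold is fixed for the whole dataset.
        for pred, gt in zip(predictions, ground_truths):
            for p, g in zip(pred, gt):
                detected = p >= t
                if detected and g == 1:
                    tp += 1
                elif detected and g == 0:
                    fp += 1
                elif not detected and g == 1:
                    fn += 1
        f = f_measure(tp, fp, fn)
        if f > best_f:
            best_f, best_t = f, t
    return best_f, best_t


# Toy example: two 4-pixel "images" (values are made up for illustration).
preds = [[0.9, 0.2, 0.6, 0.1], [0.8, 0.4, 0.3, 0.7]]
gts = [[1, 0, 1, 0], [1, 1, 0, 0]]
best_f, best_t = ods(preds, gts, [0.25, 0.5, 0.75])
```

Because the counts are pooled before the F-measure is taken, one threshold has to work across every image, which is why missed thin cracks in a few images can lower the score for the whole dataset.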