CVPR 2020

Hierarchical Clustering With Hard-Batch Triplet Loss for Person Re-Identification

Kaiwei Zeng, Munan Ning, Yaohua Wang, Yang Guo

Abstract

For clustering-guided fully unsupervised person re-identification (re-ID) methods, the quality of the pseudo labels generated by clustering directly determines model performance. To improve the quality of pseudo labels in existing methods, we propose HCT, which combines Hierarchical Clustering with hard-batch Triplet loss. The key idea of HCT is to make full use of the similarity among samples in the target dataset through hierarchical clustering and to reduce the influence of hard examples through hard-batch triplet loss, so as to generate high-quality pseudo labels and improve model performance. Specifically, (1) we use hierarchical clustering to generate pseudo labels, (2) we use PK sampling in each iteration to generate a new dataset for training, and (3) we train with hard-batch triplet loss and evaluate model performance in each iteration. We evaluate our model on Market-1501 and DukeMTMC-reID. Results show that HCT achieves 56.4% mAP on Market-1501 and 50.7% mAP on DukeMTMC-reID, which surpasses the state of the art in fully unsupervised re-ID by a large margin and is even better than most unsupervised domain adaptation (UDA) methods, which use a labeled source dataset. Code will be released soon at https://github.com/zengkaiwei/HCT

* Corresponding author

Figure 1. Hierarchical clustering. Each circle represents a sample, and the step represents the current merging stage. We use a bottom-up method to merge clusters step by step according to the distance between clusters in the current step.

datasets, the performance of the model trained on the source domain will significantly decline when it is directly transferred to the target domain. Besides, supervised learning requires a large amount of manually annotated data, which is costly in real life. Therefore, supervised re-ID can hardly meet the requirements of practical applications, and attention tends to shift to unsupervised re-ID.
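The three steps listed in the abstract (hierarchical clustering for pseudo labels, PK sampling, hard-batch triplet loss) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the average-linkage cluster distance, the stopping rule (a fixed `num_clusters`), the Euclidean metric, and the `margin` value are all assumptions made for illustration, since the abstract does not specify them; features are plain tuples rather than CNN embeddings.

```python
import math
import random
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hierarchical_pseudo_labels(features, num_clusters):
    """Bottom-up merging: start with one cluster per sample and repeatedly
    merge the two closest clusters until num_clusters remain."""
    clusters = [[i] for i in range(len(features))]

    def cluster_distance(c1, c2):
        # Assumption: average pairwise distance between the two clusters.
        total = sum(euclidean(features[i], features[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > num_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: cluster_distance(clusters[p[0]],
                                                         clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected

    labels = [0] * len(features)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def pk_sample(labels, p, k, rng):
    """Pick P pseudo identities and K sample indices for each of them.
    Clusters with fewer than K members are sampled with replacement."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    batch = []
    for label in rng.sample(sorted(by_label), p):
        members = by_label[label]
        if len(members) >= k:
            batch.extend(rng.sample(members, k))
        else:
            batch.extend(rng.choices(members, k=k))
    return batch


def hard_batch_triplet_loss(features, labels, batch, margin=0.5):
    """For each anchor in the batch, use its farthest positive and its
    closest negative, then average the hinge losses over all anchors."""
    losses = []
    for a in batch:
        pos = [euclidean(features[a], features[j]) for j in batch
               if labels[j] == labels[a] and j != a]
        neg = [euclidean(features[a], features[j]) for j in batch
               if labels[j] != labels[a]]
        if pos and neg:
            losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    toy_features = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    toy_labels = hierarchical_pseudo_labels(toy_features, num_clusters=2)
    toy_batch = pk_sample(toy_labels, p=2, k=2, rng=random.Random(0))
    print(toy_labels, hard_batch_triplet_loss(toy_features, toy_labels, toy_batch))
```

In a full training loop, the features would be re-extracted by the current model and the three steps repeated in each iteration, as the abstract describes. The sketch only computes the loss value and does not update a model.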
Recently, unsupervised re-ID has attracted increasing attention and made good progress. Some works focus on unsupervised domain adaptation (UDA), which usually requires manually annotated source data and unlabeled target data.