Camera-aware Proxies for Unsupervised Person Re-Identification (2012.10674v2)

Published 19 Dec 2020 in cs.CV

Abstract: This paper tackles the purely unsupervised person re-identification (Re-ID) problem that requires no annotations. Some previous methods adopt clustering techniques to generate pseudo labels and use the produced labels to train Re-ID models progressively. These methods are relatively simple but effective. However, most clustering-based methods take each cluster as a pseudo identity class, neglecting the large intra-ID variance caused mainly by the change of camera views. To address this issue, we propose to split each single cluster into multiple proxies and each proxy represents the instances coming from the same camera. These camera-aware proxies enable us to deal with large intra-ID variance and generate more reliable pseudo labels for learning. Based on the camera-aware proxies, we design both intra- and inter-camera contrastive learning components for our Re-ID model to effectively learn the ID discrimination ability within and across cameras. Meanwhile, a proxy-balanced sampling strategy is also designed, which facilitates our learning further. Extensive experiments on three large-scale Re-ID datasets show that our proposed approach outperforms most unsupervised methods by a significant margin. Especially, on the challenging MSMT17 dataset, we gain 14.3% Rank-1 and 10.2% mAP improvements when compared to the second place. Code is available at: https://github.com/Terminator8758/CAP-master


Summary

The paper introduces camera-aware proxies to split clusters by camera view, reducing intra-ID variance in unsupervised person re-identification.
It integrates a proxy-level memory bank and balanced sampling to enhance both intra-camera and inter-camera contrastive learning.
Experimental results show a 14.3% Rank-1 improvement and a 10.2% mAP improvement on MSMT17 over the second-best unsupervised method.

Camera-aware Proxies for Unsupervised Person Re-Identification

The paper presents an approach to purely unsupervised person re-identification (Re-ID) that targets the intra-ID variance caused by changes in camera view. Like earlier clustering-based methods, it assigns pseudo labels by clustering and uses them to train the Re-ID model iteratively. Unlike them, it does not treat each cluster as a single class: it introduces camera-aware proxies, which yield more reliable pseudo labels while reducing the variance within each class.

Methodology Overview

The approach builds on clustering-based techniques but uses camera-aware proxies to refine the pseudo identities. Conventional clustering-based methods treat each cluster as one pseudo identity class and so disregard the large intra-ID variance caused by changes in camera view. To address this, the authors split each cluster into multiple proxies, each representing the instances captured by the same camera. This partitioning reduces the variance within each class and produces more reliable pseudo labels. On top of the proxies, the Re-ID model is trained with intra-camera and inter-camera contrastive learning components, which teach it to discriminate identities within a camera and across cameras, respectively.
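The splitting step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `split_into_camera_proxies`, the use of `-1` to mark un-clustered samples, and the toy inputs are all assumptions made for the example.

```python
def split_into_camera_proxies(cluster_ids, camera_ids):
    """Assign one proxy label per (cluster, camera) pair.

    cluster_ids[i] is the pseudo identity of sample i; -1 marks an
    un-clustered sample (an assumption, mirroring DBSCAN-style output).
    camera_ids[i] is the camera that captured sample i.
    """
    proxy_of_pair = {}   # (cluster, camera) -> proxy index
    proxy_labels = []
    for cluster, camera in zip(cluster_ids, camera_ids):
        if cluster == -1:
            proxy_labels.append(-1)  # leave un-clustered samples unlabeled
            continue
        key = (cluster, camera)
        if key not in proxy_of_pair:
            proxy_of_pair[key] = len(proxy_of_pair)
        proxy_labels.append(proxy_of_pair[key])
    return proxy_labels, proxy_of_pair


# Toy data: cluster 0 is seen by cameras 0 and 1, cluster 1 by cameras 1 and 2.
cluster_ids = [0, 0, 0, 1, 1, -1]
camera_ids = [0, 0, 1, 1, 2, 0]
proxy_labels, proxy_of_pair = split_into_camera_proxies(cluster_ids, camera_ids)
print(proxy_labels)  # [0, 0, 1, 2, 3, -1]
```

Two clusters become four proxies here, so each proxy only has to cover the appearance of one pseudo identity under one camera.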

The implementation maintains a proxy-level memory bank to support model updating. The paper also describes a proxy-balanced sampling strategy intended to facilitate learning further.
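To make the two components concrete, here is a hedged sketch of a proxy-level memory bank and a proxy-balanced sampler in plain Python. The details are assumptions chosen for illustration, not taken from the paper: entries are initialized as normalized feature means, updated with a momentum of 0.2 and re-normalized, and each batch draws a fixed number of samples from each selected proxy. All function names are hypothetical.

```python
import math
import random
from collections import defaultdict


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def init_memory(features, proxy_labels):
    """One memory entry per proxy: the normalized mean of its features."""
    groups = defaultdict(list)
    for feat, proxy in zip(features, proxy_labels):
        if proxy != -1:  # skip un-clustered samples
            groups[proxy].append(feat)
    memory = {}
    for proxy, feats in groups.items():
        mean = [sum(col) / len(feats) for col in zip(*feats)]
        memory[proxy] = l2_normalize(mean)
    return memory


def update_memory(memory, proxy, feature, momentum=0.2):
    """Blend the stored entry with a new feature (momentum value is assumed)."""
    old = memory[proxy]
    mixed = [momentum * o + (1.0 - momentum) * f for o, f in zip(old, feature)]
    memory[proxy] = l2_normalize(mixed)


def proxy_balanced_batch(proxy_labels, num_proxies, per_proxy, rng):
    """Pick num_proxies proxies, then per_proxy sample indices from each."""
    members = defaultdict(list)
    for index, proxy in enumerate(proxy_labels):
        if proxy != -1:
            members[proxy].append(index)
    chosen = rng.sample(sorted(members), num_proxies)
    batch = []
    for proxy in chosen:
        pool = members[proxy]
        if len(pool) >= per_proxy:
            batch.extend(rng.sample(pool, per_proxy))
        else:
            # Small proxies are sampled with replacement (an assumption).
            batch.extend(rng.choices(pool, k=per_proxy))
    return batch


features = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [0.6, 0.8]]
proxy_labels = [0, 0, 1, 1]
memory = init_memory(features, proxy_labels)
update_memory(memory, 0, [0.0, 1.0])
batch = proxy_balanced_batch(proxy_labels, num_proxies=2, per_proxy=2,
                             rng=random.Random(0))
print(sorted(batch))  # [0, 1, 2, 3]
```

Sampling per proxy rather than per image keeps small proxies from being drowned out by large ones, which is the usual motivation for balanced sampling of this kind.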

Experimental Results

Evaluated on three large-scale Re-ID datasets, the proposed method outperforms most unsupervised methods by a significant margin. On the challenging MSMT17 dataset, it improves Rank-1 accuracy by 14.3% and mean average precision (mAP) by 10.2% over the second-best unsupervised method.
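For readers unfamiliar with the two metrics, the sketch below computes Rank-1 accuracy and mAP from ranked retrieval results using their standard definitions. It is not the paper's evaluation code, the helper names are hypothetical, and it omits the filtering of same-camera gallery images that Re-ID benchmark protocols typically apply.

```python
def average_precision(ranked_matches):
    """ranked_matches[i] is True when the i-th ranked gallery item
    shares the query's identity."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this hit
    return precision_sum / hits if hits else 0.0


def rank1_and_map(all_ranked_matches):
    """Rank-1: fraction of queries whose top result is correct.
    mAP: mean of the per-query average precision."""
    num_queries = len(all_ranked_matches)
    rank1 = sum(1 for m in all_ranked_matches if m and m[0]) / num_queries
    mean_ap = sum(average_precision(m) for m in all_ranked_matches) / num_queries
    return rank1, mean_ap


# Two toy queries, each with a three-item ranked gallery.
queries = [
    [True, False, True],   # AP = (1/1 + 2/3) / 2
    [False, True, False],  # AP = 1/2
]
rank1, mean_ap = rank1_and_map(queries)
print(rank1)  # 0.5
```

Rank-1 only looks at the top result, while mAP rewards ranking every correct match highly, so a method can improve one more than the other.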

Implications and Future Perspectives

The technique advances Re-ID by reducing reliance on supervised training, which requires time-consuming and costly identity annotation. By combining camera-aware proxies with intra- and inter-camera contrastive learning, the model improves ID discrimination without any labels.

The work provides a foundation for further research on unsupervised Re-ID, with potential applications in automated surveillance and security. Testing whether camera-aware proxies transfer to tasks beyond person re-identification could broaden their use in unsupervised learning. Further research could also refine the proxy generation mechanism or integrate additional components such as attention mechanisms to improve re-identification performance in complex environments.
