TY - GEN
T1 - Convolutional feature transfer via camera-specific discriminative pooling for person re-identification
AU - Matsukawa, Tetsu
AU - Suzuki, Einoshin
N1 - Publisher Copyright:
© 2021 IEEE
PY - 2020
Y1 - 2020
N2 - Modern Convolutional Neural Networks (CNNs) have improved the accuracy of person re-identification (re-id) when a large number of training samples is available. However, such re-id systems suffer from a lack of training samples when deployed in practical security applications. To address this problem, we focus on transferring features of a CNN pre-trained on a large-scale person re-id dataset to a small-scale dataset. Most existing CNN feature transfer methods use features of fully connected layers, which entangle locally pooled features from different spatial locations in an image. Unfortunately, owing to differences in view angles and biases in the walking directions of persons, each camera view in a dataset exhibits a unique spatial property in person images, which reduces the generality of local pooling across different cameras and datasets. To account for this camera- and dataset-specific spatial bias, we propose a method to learn camera- and dataset-specific position weight maps for discriminative local pooling of convolutional features. Our experiments on four public datasets confirm the effectiveness of the proposed feature transfer with a small number of training samples in the target datasets.
AB - Modern Convolutional Neural Networks (CNNs) have improved the accuracy of person re-identification (re-id) when a large number of training samples is available. However, such re-id systems suffer from a lack of training samples when deployed in practical security applications. To address this problem, we focus on transferring features of a CNN pre-trained on a large-scale person re-id dataset to a small-scale dataset. Most existing CNN feature transfer methods use features of fully connected layers, which entangle locally pooled features from different spatial locations in an image. Unfortunately, owing to differences in view angles and biases in the walking directions of persons, each camera view in a dataset exhibits a unique spatial property in person images, which reduces the generality of local pooling across different cameras and datasets. To account for this camera- and dataset-specific spatial bias, we propose a method to learn camera- and dataset-specific position weight maps for discriminative local pooling of convolutional features. Our experiments on four public datasets confirm the effectiveness of the proposed feature transfer with a small number of training samples in the target datasets.
UR - http://www.scopus.com/inward/record.url?scp=85110537818&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85110537818&partnerID=8YFLogxK
U2 - 10.1109/ICPR48806.2021.9412420
DO - 10.1109/ICPR48806.2021.9412420
M3 - Conference contribution
AN - SCOPUS:85110537818
T3 - Proceedings - International Conference on Pattern Recognition
SP - 8408
EP - 8415
BT - Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 25th International Conference on Pattern Recognition, ICPR 2020
Y2 - 10 January 2021 through 15 January 2021
ER -