Cross-Modality Person Re-Identification Method with Joint-Modality Generation and Feature Enhancement
Abstract
1. Introduction
- A lightweight network generates an intermediate modality between visible and infrared images as an auxiliary modality, allowing the network model to fully extract the features shared by visible and infrared images;
- A feature enhancement method optimizes feature extraction: key feature information in pedestrian images is weighted and enhanced sequentially along the channel and spatial dimensions, improving both the efficiency with which the network model uses pedestrian features and its representation ability;
- A gradient centralization strategy, which centralizes the gradient vectors during optimization, is introduced to improve the generalization ability and training efficiency of the network model;
- Experimental results on the SYSU-MM01 and RegDB datasets show that the proposed method reaches 54.62% Rank-1 and 51.65% mAP on SYSU-MM01 and 76.84% Rank-1 and 86.18% mAP on RegDB, exceeding the baseline (AGW) and the compared methods.
2. Related Work
3. Methods
3.1. Framework of the Proposed Method
3.2. Lightweight Modality Generator
3.3. Convolutional Block Attention Module
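The contribution list describes weighting pedestrian features sequentially along the channel and spatial dimensions, which is what CBAM (Woo et al.) does. The following is a minimal NumPy sketch of that channel-then-spatial gating for a single feature map, not the authors' implementation. The function names, the random weights, the reduction ratio `r = 4`, and the feature-map size are assumptions chosen for illustration; the 7×7 spatial kernel follows the default reported by Woo et al.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Channel gate: shared MLP over average- and max-pooled descriptors.

    feat: (C, H, W); w1: (C // r, C); w2: (C, C // r). Returns (C,).
    """
    avg = feat.mean(axis=(1, 2))
    mx = feat.max(axis=(1, 2))

    def mlp(v):
        # Shared two-layer MLP with a ReLU in between.
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))

def spatial_attention(feat, kernel):
    """Spatial gate: k x k filter over channel-wise mean and max maps.

    feat: (C, H, W); kernel: (2, k, k). Returns (H, W).
    """
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)

def cbam(feat, w1, w2, kernel):
    """Apply the channel gate first, then the spatial gate."""
    feat = feat * channel_attention(feat, w1, w2)[:, None, None]
    return feat * spatial_attention(feat, kernel)[None, :, :]

# Illustrative sizes and random weights (assumptions, not trained values).
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 4, 4
feat = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C)) * 0.1
w2 = rng.normal(size=(C, C // r)) * 0.1
kernel = rng.normal(size=(2, 7, 7)) * 0.1
refined = cbam(feat, w1, w2, kernel)
```

Both gates lie in (0, 1), so the refined map keeps the input shape and each value is a rescaled copy of the original activation. In a trained network the MLP and kernel weights are learned rather than random.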
3.4. Gradient Centralization
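Gradient centralization, as proposed by Yong et al., removes the mean from each weight vector's gradient before the optimizer update. The sketch below shows that operation in NumPy, assuming weight tensors are laid out with the output unit on the first axis; the function names, the plain SGD step, and the learning rate are illustrative assumptions, not the settings used in the experiments.

```python
import numpy as np

def centralize_gradient(grad):
    """Subtract, for each output unit, the mean of its gradient entries.

    Tensors with a single axis (e.g. biases) are returned unchanged.
    """
    if grad.ndim <= 1:
        return grad
    axes = tuple(range(1, grad.ndim))  # every axis except the output axis
    return grad - grad.mean(axis=axes, keepdims=True)

def sgd_step_with_gc(weight, grad, lr=0.1):
    """Plain SGD update that uses the centralized gradient (illustrative)."""
    return weight - lr * centralize_gradient(grad)

# Illustrative gradients for a conv kernel (out, in, kh, kw) and a bias.
rng = np.random.default_rng(0)
conv_grad = rng.normal(size=(4, 3, 3, 3))
bias_grad = rng.normal(size=(4,))

gc_grad = centralize_gradient(conv_grad)
per_filter_mean = gc_grad.mean(axis=(1, 2, 3))  # close to zero for every filter

conv_weight = rng.normal(size=(4, 3, 3, 3))
new_weight = sgd_step_with_gc(conv_weight, conv_grad)
```

After centralization every filter's gradient has zero mean, so the update leaves the mean of each filter's weights unchanged. Applying the operation twice gives the same result as applying it once.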
3.5. Loss Function
4. Experiment and Analysis
4.1. Datasets and Evaluation Metric
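The result tables report Rank-1 accuracy and mean average precision (mAP). As a simplified illustration of how these two retrieval metrics are commonly computed from a query-to-gallery distance matrix, consider the sketch below. It ignores dataset-specific protocol details (for example camera-based gallery filtering or repeated random gallery splits), and the function name and toy data are assumptions.

```python
import numpy as np

def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1, mAP) as fractions in [0, 1].

    dist: (num_query, num_gallery) distances, smaller means more similar.
    """
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    rank1_hits, average_precisions = [], []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                # gallery sorted by distance
        matches = gallery_ids[order] == query_ids[q]
        if not matches.any():
            continue                               # no true match in the gallery
        rank1_hits.append(float(matches[0]))       # is the top result correct?
        hit_ranks = np.flatnonzero(matches)        # 0-based ranks of true matches
        precisions = np.arange(1, len(hit_ranks) + 1) / (hit_ranks + 1)
        average_precisions.append(precisions.mean())
    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))

# Toy example: two queries, four gallery images, two identities.
toy_dist = np.array([[0.1, 0.5, 0.9, 0.7],
                     [0.2, 0.3, 0.4, 0.8]])
toy_rank1, toy_map = rank1_and_map(toy_dist, [0, 1], [0, 1, 0, 1])
```

In the toy example the first query retrieves a correct match at rank 1 and the second does not, giving Rank-1 = 0.5; the average precisions are 0.75 and 0.5, giving mAP = 0.625. The tables express the same quantities as percentages.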
4.2. Experimental Settings
4.3. Comparison with Existing Methods
4.4. Ablation Study
4.4.1. Using Lightweight Modality Generator
4.4.2. Incorporating Convolutional Block Attention Module
4.4.3. Introducing Gradient Centralization
4.4.4. Ablation Experiments to Verify the Effectiveness of 3 Modules
4.5. Visualization Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
1. Yang, L. A Review of Person Re-Identification Based on Deep Learning. China Water Transp. (Second. Half Mon.) 2023, 23, 57–59.
2. Wang, S.; Xiao, S. Review of Person Re-identification. J. Beijing Inst. Technol. 2022, 48, 1100–1112.
3. Liu, T.; Liu, Z. Overview of Cross Modality Person Re-Identification Research. Mod. Comput. Sci. 2021, 135–139.
4. Sun, Y.; Wang, R.; Zhang, Q.; Lin, R. A cross-modality person re-identification method for visible-infrared images. J. Beijing Univ. Aeronaut. Astronaut. 2022, 50, 2018–2025.
5. Han, C.; Pan, P.; Zheng, A.; Tang, J. Cross-Modality Person Re-Identification Based on Heterogeneous Center Loss and Non-Local Features. Entropy 2021, 23, 919.
6. Wu, A.; Zheng, W.S.; Yu, H.X.; Gong, S.; Lai, J. RGB-infrared cross-modality person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
7. Yu, W.; Zhao, Q.; Ji, T. Cross-modal pedestrian re-recognition based on color randomization and full related attention. Foreign Electron. Meas. Technol. 2023, 42, 10–16.
8. Fan, X.; Zhang, K.; Zhang, G.; Li, J. Cross-modal person re-identification algorithm based on multi-level joint clustering with feature enhancement. J. Electron. Meas. Instrum. 2024, 38, 94–103.
9. Wang, C.; Zhang, C.; Feng, Y.; Ji, Y.; Ding, J. Learning Visible Thermal Person Re-Identification via Spatial Dependence and Dual Constraint Loss. Entropy 2022, 24, 443.
10. Zou, Y.; Jiang, M. Multi-granularity cross-modality person re-identification with hetero-center angular constraints. Comput. Eng. Des. 2024, 45, 1210–1217.
11. Zhang, J.; Chen, G. Visible-infrared Person Re-Identification Via Feature Constrained Learning. Prog. Laser Optoelectron. 2024, 61, 221–228.
12. Wang, Z.; Wang, Z.; Zheng, Y.; Chuang, Y.Y.; Satoh, S.I. Learning to reduce dual-level discrepancy for infrared-visible person re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 618–626.
13. Wang, G.; Zhang, T.; Cheng, J.; Liu, S.; Yang, Y.; Hou, Z. RGB-infrared cross-modality person re-identification via joint pixel and feature alignment. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27–28 October 2019; pp. 3623–3632.
14. Zhang, Z.; Jiang, S.; Huang, C.; Li, Y.; Da Xu, R.Y. RGB-IR cross-modality person ReID based on teacher-student GAN model. Pattern Recognit. Lett. 2021, 150, 155–161.
15. Ye, M.; Shen, J.; Lin, G.; Xiang, T.; Shao, L.; Hoi, S.C. Deep learning for person re-identification: A survey and outlook. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 2872–2893.
16. Liu, Y. A Review of Cross-Modal Person Re-Identification. Telev. Technol. 2022, 46, 9–11.
17. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. arXiv 2014, arXiv:1406.2661.
18. Li, D.; Wei, X.; Hong, X.; Gong, Y. Infrared-visible cross-modal person re-identification with an x modality. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 4610–4617.
19. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
20. Yong, H.; Huang, J.; Hua, X.; Zhang, L. Gradient centralization: A new optimization technique for deep neural networks. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part I 16. Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 635–652.
21. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
22. Wang, X.; Girshick, R.; Gupta, A.; He, K. Non-local neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
23. Radenović, F.; Tolias, G.; Chum, O. Fine-tuning CNN image retrieval with no human annotation. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 1655–1668.
24. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning; PMLR: New York, NY, USA, 2015; pp. 448–456.
25. Glorot, X.; Bordes, A.; Bengio, Y. Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics; JMLR Workshop and Conference Proceedings; PMLR: New York, NY, USA, 2011; pp. 315–323.
26. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
27. Hermans, A.; Beyer, L.; Leibe, B. In defense of the triplet loss for person re-identification. arXiv 2017, arXiv:1703.07737.
28. Nguyen, D.T.; Hong, H.G.; Kim, K.W.; Park, K.R. Person recognition system based on a combination of body images from visible light and thermal cameras. Sensors 2017, 17, 605.
29. Chen, Y.C.; Zheng, W.S.; Lai, J.H.; Yuen, P.C. An asymmetric distance model for cross-view feature mapping in person reidentification. IEEE Trans. Circuits Syst. Video Technol. 2016, 27, 1661–1675.
30. Ye, M.; Lan, X.; Li, J.; Yuen, P. Hierarchical discriminative learning for visible thermal person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
31. Ye, M.; Lan, X.; Wang, Z.; Yuen, P.C. Bi-directional center-constrained top-ranking for visible thermal person re-identification. IEEE Trans. Inf. Forensics Secur. 2019, 15, 407–419.
32. Hao, Y.; Wang, N.; Li, J.; Gao, X. HSME: Hypersphere manifold embedding for visible thermal person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 8385–8392.
33. Ye, M.; Lan, X.; Leng, Q. Modality-aware collaborative learning for visible thermal person re-identification. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 347–355.
34. Feng, Z.; Lai, J.; Xie, X. Learning modality-specific representations for visible-infrared person re-identification. IEEE Trans. Image Process. 2019, 29, 579–590.
35. Choi, S.; Lee, S.; Kim, Y.; Kim, T.; Kim, C. Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022.

| Dataset | Number of Pedestrians | Number of Visible Cameras | Number of Infrared Cameras | Number of Visible Images | Number of Infrared Images |
|---|---|---|---|---|---|
| SYSU-MM01 | 491 | 4 | 2 | 30,071 | 15,792 |
| RegDB | 412 | 1 | 1 | 4,120 | 4,120 |

| Method | SYSU-MM01 Rank-1 (%) | SYSU-MM01 mAP (%) | RegDB Rank-1 (%) | RegDB mAP (%) |
|---|---|---|---|---|
| Zero-Padding [6] | 14.80 | 15.95 | 17.75 | 18.90 |
| HCML [30] | 14.32 | 16.16 | 24.44 | 20.08 |
| BDTR [31] | 27.82 | 28.42 | 34.62 | 33.46 |
| HSME [32] | 20.68 | 23.12 | 50.85 | 47.00 |
| D2RL [12] | 28.90 | 29.20 | 43.40 | 44.10 |
| MAC [33] | 33.26 | 36.22 | 36.43 | 37.03 |
| MSR [34] | 37.35 | 38.11 | 48.43 | 48.67 |
| AlignGAN [13] | 42.40 | 40.70 | 57.90 | 53.60 |
| Hi-CMD [35] | 34.90 | 35.90 | 70.93 | 66.04 |
| XIV-ReID [18] | 49.92 | 50.73 | 62.21 | 60.18 |
| baseline (AGW) | 47.50 | 47.65 | 70.50 | 80.13 |
| baseline + M_GEN + CBAM + GC (ours) | 54.62 | 51.65 | 76.84 | 86.18 |

| Method | SYSU-MM01 Rank-1 (%) | SYSU-MM01 mAP (%) | RegDB Rank-1 (%) | RegDB mAP (%) |
|---|---|---|---|---|
| baseline | 47.50 | 47.65 | 70.50 | 80.13 |
| baseline + M_GEN | 52.97 | 48.81 | 74.59 | 82.52 |

| Method | SYSU-MM01 Rank-1 (%) | SYSU-MM01 mAP (%) | RegDB Rank-1 (%) | RegDB mAP (%) |
|---|---|---|---|---|
| baseline | 47.50 | 47.65 | 70.50 | 80.13 |
| baseline + CBAM | 51.39 | 48.47 | 73.51 | 82.23 |

| Method | SYSU-MM01 Rank-1 (%) | SYSU-MM01 mAP (%) | RegDB Rank-1 (%) | RegDB mAP (%) |
|---|---|---|---|---|
| baseline | 47.50 | 47.65 | 70.50 | 80.13 |
| baseline + GC | 47.59 | 47.88 | 70.81 | 80.50 |

| Method | SYSU-MM01 Rank-1 (%) | SYSU-MM01 mAP (%) | RegDB Rank-1 (%) | RegDB mAP (%) |
|---|---|---|---|---|
| baseline | 47.50 | 47.65 | 70.50 | 80.13 |
| baseline + GC | 47.59 | 47.88 | 70.81 | 80.50 |
| baseline + CBAM | 51.39 | 48.47 | 73.51 | 82.23 |
| baseline + M_GEN | 52.97 | 48.81 | 74.59 | 82.52 |
| baseline + GC + CBAM | 51.58 | 49.27 | 73.24 | 81.92 |
| baseline + GC + M_GEN | 52.91 | 50.04 | 75.41 | 84.54 |
| baseline + CBAM + M_GEN | 53.86 | 50.63 | 75.76 | 85.89 |
| baseline + GC + CBAM + M_GEN | 54.62 | 51.65 | 76.84 | 86.18 |
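
To make the ablation results above easier to read, the gain of each variant over the baseline can be computed directly from the reported numbers. The short script below does this arithmetic; the values are copied from the ablation table, while the dictionary layout and function name are illustrative assumptions.

```python
# Each tuple: (SYSU-MM01 Rank-1, SYSU-MM01 mAP, RegDB Rank-1, RegDB mAP), in %.
ablation = {
    "baseline": (47.50, 47.65, 70.50, 80.13),
    "baseline + GC": (47.59, 47.88, 70.81, 80.50),
    "baseline + CBAM": (51.39, 48.47, 73.51, 82.23),
    "baseline + M_GEN": (52.97, 48.81, 74.59, 82.52),
    "baseline + GC + CBAM": (51.58, 49.27, 73.24, 81.92),
    "baseline + GC + M_GEN": (52.91, 50.04, 75.41, 84.54),
    "baseline + CBAM + M_GEN": (53.86, 50.63, 75.76, 85.89),
    "baseline + GC + CBAM + M_GEN": (54.62, 51.65, 76.84, 86.18),
}

def gains_over_baseline(results, baseline="baseline"):
    """Return the per-metric difference (percentage points) to the baseline."""
    base = results[baseline]
    return {
        name: tuple(round(value - ref, 2) for value, ref in zip(values, base))
        for name, values in results.items()
        if name != baseline
    }

gains = gains_over_baseline(ablation)
for name, delta in gains.items():
    print(f"{name}: " + ", ".join(f"{d:+.2f}" for d in delta))
```

With all three modules, the gains over the baseline are 7.12 (Rank-1) and 4.00 (mAP) percentage points on SYSU-MM01, and 6.34 (Rank-1) and 6.05 (mAP) on RegDB. Gradient centralization alone adds at most 0.37 points on any metric, so most of the improvement comes from the modality generator and CBAM.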
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Bi, Y.; Wang, R.; Zhou, Q.; Zeng, Z.; Lin, R.; Wang, M. Cross-Modality Person Re-Identification Method with Joint-Modality Generation and Feature Enhancement. Entropy 2024, 26, 681. https://doi.org/10.3390/e26080681