Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments

Description-based person re-identification (Re-id) is an important task in video surveillance that requires discriminative cross-modal representations to distinguish different people. It is difficult to directly measure the similarity between images and descriptions due to the modality heterogeneity...


Bibliographic Details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - (2020), 7 Apr.
Main author:	Niu, Kai (Author)
Other authors:	Huang, Yan, Ouyang, Wanli, Wang, Liang
Format:	Online article
Language:	English
Published:	2020
Access to parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article


LEADER	01000caa a22002652 4500
001	NLM308607295
003	DE-627
005	20240229162729.0
007	cr uuu---uuuuu
008	231225s2020 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TIP.2020.2984883 \|2 doi
028	5	2	\|a pubmed24n1308.xml
035			\|a (DE-627)NLM308607295
035			\|a (NLM)32275593
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Niu, Kai \|e verfasserin \|4 aut
245	1	0	\|a Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments
264		1	\|c 2020
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 27.02.2024
500			\|a published: Print-Electronic
500			\|a Citation Status Publisher
520			\|a Description-based person re-identification (Re-id) is an important task in video surveillance that requires discriminative cross-modal representations to distinguish different people. It is difficult to directly measure the similarity between images and descriptions due to the modality heterogeneity (the cross-modal problem). Moreover, all samples belong to a single category (the fine-grained problem), which makes this task even harder than the conventional image-description matching task. In this paper, we propose a Multi-granularity Image-text Alignments (MIA) model to alleviate the cross-modal fine-grained problem for better similarity evaluation in description-based person Re-id. Specifically, three different granularities, i.e., global-global, global-local and local-local alignments, are carried out hierarchically. Firstly, the global-global alignment in the Global Contrast (GC) module is for matching the global contexts of images and descriptions. Secondly, the global-local alignment employs the potential relations between local components and global contexts to highlight the distinguishable components while eliminating the uninvolved ones adaptively in the Relation-guided Global-local Alignment (RGA) module. Thirdly, as for the local-local alignment, we match visual human parts with noun phrases in the Bi-directional Fine-grained Matching (BFM) module. The whole network combining multiple granularities can be trained end-to-end without complex preprocessing. To address the difficulties in training the combination of multiple granularities, an effective step training strategy is proposed to train these granularities step by step. Extensive experiments and analysis have shown that our method obtains state-of-the-art performance on the CUHK-PEDES dataset and outperforms the previous methods by a significant margin.
650		4	\|a Journal Article
700	1		\|a Huang, Yan \|e verfasserin \|4 aut
700	1		\|a Ouyang, Wanli \|e verfasserin \|4 aut
700	1		\|a Wang, Liang \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society \|d 1992 \|g (2020) vom: 07. Apr. \|w (DE-627)NLM09821456X \|x 1941-0042 \|7 nnns
773	1	8	\|g year:2020 \|g day:07 \|g month:04
856	4	0	\|u http://dx.doi.org/10.1109/TIP.2020.2984883 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|j 2020 \|b 07 \|c 04