Cross-Modal Retrieval With Noisy Correspondence via Consistency Refining and Mining

The success of existing cross-modal retrieval (CMR) methods relies heavily on the assumption that the annotated cross-modal correspondence is faultless. In practice, however, the correspondence of some pairs is inevitably contaminated during data collection or annotation, thus leading to the so-...


Bibliographic Details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 33(2024) dated: 02., pages 2587-2598
Main author: Ma, Xinran (Author)
Other authors: Yang, Mouxing, Li, Yunfan, Hu, Peng, Lv, Jiancheng, Peng, Xi
Format: Online article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
LEADER 01000caa a22002652 4500
001 NLM369969839
003 DE-627
005 20240403000544.0
007 cr uuu---uuuuu
008 240322s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2024.3374221  |2 doi 
028 5 2 |a pubmed24n1361.xml 
035 |a (DE-627)NLM369969839 
035 |a (NLM)38507381 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Ma, Xinran  |e verfasserin  |4 aut 
245 1 0 |a Cross-Modal Retrieval With Noisy Correspondence via Consistency Refining and Mining 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 02.04.2024 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a The success of existing cross-modal retrieval (CMR) methods relies heavily on the assumption that the annotated cross-modal correspondence is faultless. In practice, however, the correspondence of some pairs is inevitably contaminated during data collection or annotation, thus leading to the so-called Noisy Correspondence (NC) problem. To alleviate the influence of NC, we propose a novel method termed Consistency REfining And Mining (CREAM) by revealing and exploiting the difference between correspondence and consistency. Specifically, the correspondence and the consistency coincide only for true positive and true negative pairs, while being distinct for false positive and false negative pairs. Based on this observation, CREAM employs a collaborative learning paradigm to detect and rectify the correspondence of positives, and a negative mining approach to explore and utilize the consistency. Thanks to the consistency refining and mining strategy of CREAM, overfitting on the false positives can be prevented and the consistency rooted in the false negatives can be exploited, thus leading to a robust CMR method. Extensive experiments verify the effectiveness of our method on three image-text benchmarks including Flickr30K, MS-COCO, and Conceptual Captions. Furthermore, we apply our method to the graph matching task and the results demonstrate the robustness of our method against the fine-grained NC problem. The code is available at https://github.com/XLearning-SCU/2024-TIP-CREAM 
650 4 |a Journal Article 
700 1 |a Yang, Mouxing  |e verfasserin  |4 aut 
700 1 |a Li, Yunfan  |e verfasserin  |4 aut 
700 1 |a Hu, Peng  |e verfasserin  |4 aut 
700 1 |a Lv, Jiancheng  |e verfasserin  |4 aut 
700 1 |a Peng, Xi  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 33(2024) vom: 02., Seite 2587-2598  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnns 
773 1 8 |g volume:33  |g year:2024  |g day:02  |g pages:2587-2598 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2024.3374221  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 33  |j 2024  |b 02  |h 2587-2598