Efficient Token-Guided Image-Text Retrieval With Consistent Multimodal Contrastive Training
Image-text retrieval is a central problem for understanding the semantic relationship between vision and language, and serves as the basis for various visual and language tasks. Most previous works either simply learn coarse-grained representations of the overall image and text, or elaborately estab...
Bibliographic Details
Published in: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023) dated: 20., pages 3622-3633
|
Main Author: |
Liu, Chong
(Author) |
Other Authors: |
Zhang, Yuqi,
Wang, Hongsong,
Chen, Weihua,
Wang, Fan,
Huang, Yan,
Shen, Yi-Dong,
Wang, Liang |
Format: | Online Article
|
Language: | English |
Published: |
2023
|
Access to the parent work: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
|
Subjects: | Journal Article |