Unpaired Image-text Matching via Multimodal Aligned Conceptual Knowledge
Recently, the accuracy of image-text matching has been greatly improved by multimodal pretrained models, all of which use millions or billions of paired images and texts for supervised model learning. Different from them, human brains can well match images with texts using their stored multimodal kn...
Bibliographic Details
Published in: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2024), dated 23 July
Main Author: | Huang, Yan (Author)
Other Authors: | Wang, Yuming; Zeng, Yunan; Huang, Junshi; Chai, Zhenhua; Wang, Liang
Format: | Online Article
Language: | English
Published: | 2024
Access to parent work: | IEEE transactions on pattern analysis and machine intelligence
Subjects: | Journal Article