End-to-End Pre-Training With Hierarchical Matching and Momentum Contrast for Text-Video Retrieval

Lately, video-language pre-training and text-video retrieval have attracted significant attention with the explosion of multimedia data on the Internet. However, existing approaches for video-language pre-training typically limit the exploitation of the hierarchical semantic information in videos, s...

Detailed description

Bibliographic details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023) dated: 26., pages 5017-5030
Main author:	Shen, Wenxue (Author)
Other authors:	Song, Jingkuan, Zhu, Xiaosu, Li, Gongfu, Shen, Heng Tao
Format:	Online article
Language:	English
Published:	2023
Access to parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article

Available online	Full text