End-to-End Pre-Training With Hierarchical Matching and Momentum Contrast for Text-Video Retrieval
Lately, video-language pre-training and text-video retrieval have attracted significant attention with the explosion of multimedia data on the Internet. However, existing approaches for video-language pre-training typically limit the exploitation of the hierarchical semantic information in videos, s...
| Published in: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023), dated 26., pages 5017-5030 |
|---|---|
| First author: | |
| Other authors: | , , , |
| Format: | Online article |
| Language: | English |
| Published: | 2023 |
| Access to the parent work: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society |
| Subject headings: | Journal Article |
| Available online: | Full text |