Cycle-Consistent Weakly Supervised Visual Grounding With Individual and Contextual Representations
Visual grounding, which aims to align image regions with textual queries, is a fundamental task in cross-modal learning. We study weakly supervised visual grounding, where only coarse-grained image-text pairs are available. Due to the lack of fine-grained correspondence information, exis...
Published in: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023) dated: 11., pages 5167-5180 |
---|---|
Main author: | |
Other authors: | , |
Format: | Online article |
Language: | English |
Published: | 2023 |
Access to the parent work: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society |
Subjects: | Journal Article |
Available online |
Full text |