Self-Paced Multi-Grained Cross-Modal Interaction Modeling for Referring Expression Comprehension

As an important and challenging problem in vision-language tasks, referring expression comprehension (REC) generally requires a large amount of multi-grained information of visual and linguistic modalities to realize accurate reasoning. In addition, due to the diversity of visual scenes and the vari...


Bibliographic Details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 33(2024), dated: 21., pages 1497-1507
Main Author: Miao, Peihan (Author)
Other Authors: Su, Wei; Wang, Gaoang; Li, Xuewei; Xi, Li
Format: Online Article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article