Language-Aware Vision Transformer for Referring Segmentation

Referring segmentation is a fundamental vision-language task that aims to segment an object from an image or video according to a natural language description. A key challenge in this task is leveraging the referring expression to highlight the relevant positions in the image...

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2024), dated 25 Sept.
Main author: Yang, Zhao (Author)
Other authors: Wang, Jiaqi; Ye, Xubing; Tang, Yansong; Chen, Kai; Zhao, Hengshuang; Torr, Philip H S
Format: Online article
Language: English
Published: 2024
Access to parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article