Referring Segmentation via Encoder-Fused Cross-Modal Attention Network

This paper focuses on referring segmentation, which aims to segment the visual region in an image (or video) that corresponds to a referring expression. However, existing methods usually consider the interaction between multi-modal features at the decoding end of the network...

Detailed description

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), issue 6 of 11 June, pages 7654-7667
Main author: Feng, Guang (Author)
Other authors: Zhang, Lihe; Sun, Jiayu; Hu, Zhiwei; Lu, Huchuan
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article