HOP+ : History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation

Recent works attempt to employ pre-training in Vision-and-Language Navigation (VLN). However, these methods neglect the importance of historical contexts or ignore predicting future actions during pre-training, limiting the learning of visual-textual correspondence and the capability of decision-mak...


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 7, dated 03 July, pages 8524-8537
Main author: Qiao, Yanyuan (Author)
Other authors: Qi, Yuankai; Hong, Yicong; Yu, Zheng; Wang, Peng; Wu, Qi
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article