Hierarchical Representation Network With Auxiliary Tasks for Video Captioning and Video Question Answering

Recently, integrating vision and language for in-depth video understanding, e.g., video captioning and video question answering, has become a promising direction for artificial intelligence. However, due to the complexity of video information, it is challenging to extract a video feature that can wel...

Bibliographic Details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022) dated: 01., pages 202-215
Main Author:	Gao, Lianli (Author)
Other Authors:	Lei, Yu; Zeng, Pengpeng; Song, Jingkuan; Wang, Meng; Shen, Heng Tao
Format:	Online Article
Language:	English
Published:	2022
Access to parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article

Available online	Full text