Hierarchical Representation Network With Auxiliary Tasks for Video Captioning and Video Question Answering
Recently, integrating vision and language for in-depth video understanding, e.g., video captioning and video question answering, has become a promising direction for artificial intelligence. However, due to the complexity of video information, it is challenging to extract a video feature that can wel...
Full description
Bibliographic details
| Published in: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022), from: 01., pages 202-215 |
| Main author: | Gao, Lianli (author) |
| Other authors: | Lei, Yu; Zeng, Pengpeng; Song, Jingkuan; Wang, Meng; Shen, Heng Tao |
| Format: | Online article |
| Language: | English |
| Published: | 2022 |
| Access to the parent work: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society |
| Subjects: | Journal Article |