Compositional Attention Networks with Two-Stream Fusion for Video Question Answering

Given a video, Video Question Answering (VideoQA) aims at answering arbitrary free-form questions about the video content in natural language. A successful VideoQA framework usually has the following two key components: 1) a discriminative video encoder that learns the effective video representation...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - (2019) vom: 16. Sept.
1. Verfasser: Yu, Ting (VerfasserIn)
Weitere Verfasser: Yu, Jun, Yu, Zhou, Tao, Dacheng
Format: Online-Aufsatz
Sprache:English
Veröffentlicht: 2019
Zugriff auf das übergeordnete Werk:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Schlagworte:Journal Article