Dynamic Spatio-Temporal Graph Reasoning for VideoQA With Self-Supervised Event Recognition

Video question answering (VideoQA) requires a comprehensive understanding of the visual content of videos. Existing VideoQA models mainly focus on scenarios involving a single event with simple object interactions and leave event-centric scenarios involving multiple events with dynamically...

Bibliographic Details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 33(2024), dated: 10., pages 4145-4158
Main author:	Nie, Jie (Author)
Other authors:	Wang, Xin, Hou, Runze, Li, Guohao, Chen, Hong, Zhu, Wenwu
Format:	Online article
Language:	English
Published:	2024
Access to the parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article

Available online	Full text