Contrastive Video Question Answering via Video Graph Transformer
We propose to perform video question answering (VideoQA) in a Contrastive manner via a Video Graph Transformer model (CoVGT). CoVGT's uniqueness and superiority are three-fold: 1) It proposes a dynamic graph transformer module which encodes video by explicitly capturing the visual objects, thei...
Bibliographic details
Published in: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), no. 11, 4 Nov., pages 13265-13280 |
First author: | Xiao, Junbin (author) |
Other authors: | Zhou, Pan; Yao, Angela; Li, Yicong; Hong, Richang; Yan, Shuicheng; Chua, Tat-Seng |
Format: | Online article |
Language: | English |
Published: | 2023 |
Access to the parent work: | IEEE transactions on pattern analysis and machine intelligence |
Subjects: | Journal Article |