Long Short-Term Relation Transformer With Global Gating for Video Captioning
Video captioning aims to generate a natural-language sentence that describes the main content of a video. Because videos contain multiple objects, fully exploring the spatial and temporal relationships among them is crucial for this task. Previous methods wrap the detected objects as...
Bibliographic details
Published in: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022), dated: 24., pages 2726-2738 |
Main author: | Li, Liang (Author) |
Other authors: | Gao, Xingyu; Deng, Jincan; Tu, Yunbin; Zha, Zheng-Jun; Huang, Qingming |
Format: | Online article |
Language: | English |
Published: | 2022 |
Access to the parent work: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society |
Subjects: | Journal Article |