Multimodal Cross-Lingual Summarization for Videos: A Revisit in Knowledge Distillation Induced Triple-Stage Training Method
Multimodal summarization (MS) for videos aims to generate summaries from multi-source information (e.g., video and text transcript), showing promising progress recently. However, existing works are limited to monolingual scenarios, neglecting non-native viewers' needs to understand videos in ot...
Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), no. 12 (19 Nov.), pages 10697-10714
Main author: Liu, Nayu (author)
Other authors: Wei, Kaiwen; Yang, Yong; Tao, Jianhua; Sun, Xian; Yao, Fanglong; Yu, Hongfeng; Jin, Li; Lv, Zhao; Fan, Cunhang
Format: Online article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article