Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
Existing visual question answering methods often suffer from cross-modal spurious correlations and oversimplified event-level reasoning processes that fail to capture the event temporality, causality, and dynamics that span the video. In this work, to address the task of event-level visual question...
| Published in: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 10, dated 27 Oct., pages 11624-11641 |
|---|---|
| First author: | |
| Other authors: | , |
| Format: | Online article |
| Language: | English |
| Published: | 2023 |
| Access to the parent work: | IEEE transactions on pattern analysis and machine intelligence |
| Keywords: | Journal Article |
| Available online | Full text |