Weakly-Supervised 3D Spatial Reasoning for Text-Based Visual Question Answering
Text-based Visual Question Answering (TextVQA) aims to produce correct answers for given questions about the images with multiple scene texts. In most cases, the texts naturally attach to the surface of the objects. Therefore, spatial reasoning between texts and objects is crucial in TextVQA. Howeve...
Bibliographic Details
Published in: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023), dated: 31., pages 3367-3382
Main author: | Li, Hao (author)
Other authors: | Huang, Jinfa; Jin, Peng; Song, Guoli; Wu, Qi; Chen, Jie
Format: | Online article
Language: | English
Published: | 2023
Access to the parent work: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: | Journal Article