Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation

Video scene graph generation (VidSGG) aims to identify objects in visual scenes and infer their relationships for a given video. It requires not only a comprehensive understanding of each object scattered on the whole scene but also a deep dive into their temporal motions and interactions. Inherentl...

Bibliographic details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - PP(2023), 28 Dec.
Main author: Pu, Tao (Author)
Other authors: Chen, Tianshui, Wu, Hefeng, Lu, Yongyi, Lin, Liang
Format: Online article
Language: English
Published: 2023
Access to parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article