Single-Frame Supervision for Spatio-Temporal Video Grounding

Spatio-Temporal Video Grounding (STVG) aims at localizing the spatio-temporal tube of a specific object in an untrimmed video given a free-form natural language query. As the annotation of tubes is labor intensive, researchers are motivated to explore weakly supervised approaches in recent works, wh...

Detailed description

Bibliographic details
Published in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2024), dated 18 June
Main author:	Liu, Kun (Author)
Other authors:	Qu, Mengxue; Liu, Yang; Wei, Yunchao; Zhe, Wenming; Zhao, Yao; Liu, Wu
Format:	Online article
Language:	English
Published:	2024
Access to the parent work:	IEEE transactions on pattern analysis and machine intelligence
Subjects:	Journal Article

Available online	Full text