End-to-End Open-Vocabulary Video Visual Relationship Detection Using Multi-Modal Prompting

Open-vocabulary video visual relationship detection aims to expand video visual relationship detection beyond annotated categories by detecting unseen relationships between both seen and unseen objects in videos. Existing methods usually use trajectory detectors trained on closed datasets to detect...

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2025), dated 16 Apr.
Main author: Wang, Yongqi (Author)
Other authors: Wu, Xinxiao; Yang, Shuo; Luo, Jiebo
Format: Online article
Language: English
Published: 2025
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article