End-to-End Open-Vocabulary Video Visual Relationship Detection Using Multi-Modal Prompting

Open-vocabulary video visual relationship detection aims to expand video visual relationship detection beyond annotated categories by detecting unseen relationships between both seen and unseen objects in videos. Existing methods usually use trajectory detectors trained on closed datasets to detect...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2025), dated 16 Apr.
Main author: Wang, Yongqi (Author)
Other authors: Wu, Xinxiao; Yang, Shuo; Luo, Jiebo
Format: Online article
Language: English
Published: 2025
Collection access: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article