Sequence as a Whole: A Unified Framework for Video Action Localization With Long-Range Text Query

Comprehensive understanding of video content requires both spatial and temporal localization. However, the field lacks a unified video action localization framework, which hinders its coordinated development. Existing 3D CNN methods take fixed and limited input length at the cost of ignori...


Bibliographic details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023) from: 04., pages 1403-1418
Main author: Su, Yuting (Author)
Other authors: Wang, Weikang, Liu, Jing, Ma, Shuang, Yang, Xiaokang
Format: Online article
Language: English
Published: 2023
Collection access: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article