Context Disentangling and Prototype Inheriting for Robust Visual Grounding

Visual grounding (VG) aims to locate a specific target in an image based on a given language query. The discriminative information from context is important for distinguishing the target from other objects, particularly for the targets that have the same category as others. However, most previous me...


Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), 5, dated 5 May, pages 3213-3229
Main author: Tang, Wei (Author)
Other authors: Li, Liang; Liu, Xuejing; Jin, Lu; Tang, Jinhui; Li, Zechao
Format: Online article
Language: English
Published: 2024
Collection access: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article