Nocal-Siam : Refining Visual Features and Response With Advanced Non-Local Blocks for Real-Time Siamese Tracking

Detailed description

Bibliographic details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 30(2021), dated: 13., pages 2656-2668
Main author: Tan, Huibin (Author)
Other authors: Zhang, Xiang; Zhang, Zhipeng; Lan, Long; Zhang, Wenju; Luo, Zhigang
Format: Online article
Language: English
Published: 2021
Access to the parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
Description
Abstract: Siamese trackers contain two core stages: first learning the features of both the target and search inputs, and then calculating response maps via the cross-correlation operation. The response maps can also be used for regression and classification to construct a typical one-shot detection tracking framework. Although these trackers have drawn continuous interest from the visual tracking community due to their trade-off between accuracy and speed, both stages are sensitive to distractors in the search branch, which induces unreliable response positions. To fill this gap, we advance Siamese trackers with two novel non-local blocks, yielding Nocal-Siam, which leverages the long-range dependency property of non-local attention in a supervised fashion from two aspects. First, a target-aware non-local block (T-Nocal) is proposed for learning target-guided feature weights, which serve to refine the visual features of both the target and search branches and thus effectively suppress noisy distractors. This block reinforces the interplay between the target and search branches in the first stage. Second, we further develop a location-aware non-local block (L-Nocal) to associate multiple response maps, which prevents them from inducing diverse candidate target positions in the upcoming frame. Experiments on five popular benchmarks show that Nocal-Siam performs favorably against well-behaved counterparts both quantitatively and qualitatively.
Description: Date Revised 08.02.2021
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN: 1941-0042
DOI: 10.1109/TIP.2021.3049970
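
The abstract's second stage, calculating a response map by cross-correlating target and search features, can be sketched as follows. This is a minimal single-channel illustration on plain 2-D lists, assuming unnormalized correlation over valid positions only. It is not the Nocal-Siam implementation: the names `cross_correlate` and `peak` are hypothetical, and the T-Nocal and L-Nocal blocks are not modeled.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def cross_correlate(search: Matrix, target: Matrix) -> Matrix:
    """Slide `target` over `search` and return the response map.

    Only positions where the target fits entirely inside the search
    region are evaluated (hypothetical helper, for illustration only).
    """
    sh, sw = len(search), len(search[0])
    th, tw = len(target), len(target[0])
    if th > sh or tw > sw:
        raise ValueError("target must not be larger than search")

    response: Matrix = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            # Inner product between the target and the current window.
            score = sum(
                search[i + di][j + dj] * target[di][dj]
                for di in range(th)
                for dj in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


def peak(response: Matrix) -> Tuple[int, int]:
    """Return the (row, col) of the highest response value."""
    positions = (
        (r, c)
        for r in range(len(response))
        for c in range(len(response[0]))
    )
    return max(positions, key=lambda rc: response[rc[0]][rc[1]])


# Toy data: the 2x2 target pattern is embedded at row 1, column 2.
target = [[1, 2], [3, 4]]
search = [
    [0, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 3, 4],
    [0, 0, 0, 0],
]
print(peak(cross_correlate(search, target)))
```

In this toy case the peak of the response map marks where the target pattern sits in the search region. A distractor with similar values elsewhere in `search` would raise competing peaks, which is the sensitivity the abstract describes.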