Two-Branch Relational Prototypical Network for Weakly Supervised Temporal Action Localization

As a challenging task in high-level video understanding, weakly supervised temporal action localization has attracted increasing attention. With only video-level category labels, this task must identify both the background and the action categories frame by frame. However, it is non-tr...

Detailed description

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 44(2022), no. 9, 28 Sept., pages 5729-5746
Main author: Huang, Linjiang (author)
Other authors: Huang, Yan; Ouyang, Wanli; Wang, Liang
Format: Online article
Language: English
Published: 2022
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM324628986
003 DE-627
005 20231225190728.0
007 cr uuu---uuuuu
008 231225s2022 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2021.3076172  |2 doi 
028 5 2 |a pubmed24n1082.xml 
035 |a (DE-627)NLM324628986 
035 |a (NLM)33909560 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Huang, Linjiang  |e verfasserin  |4 aut 
245 1 0 |a Two-Branch Relational Prototypical Network for Weakly Supervised Temporal Action Localization 
264 1 |c 2022 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 05.08.2022 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a As a challenging task in high-level video understanding, weakly supervised temporal action localization has attracted increasing attention. With only video-level category labels, this task must identify both the background and the action categories frame by frame. However, this is non-trivial in untrimmed videos because of the unconstrained background and the complex, multi-label actions. Observing that these difficulties stem mainly from the large variations within background and actions, we propose to address them from the perspective of modeling variations. Moreover, it is desirable to further reduce the variations, that is, to learn compact features, so as to cast background identification as background rejection and alleviate the contradiction between classification and detection. Accordingly, in this paper, we propose a two-branch relational prototypical network. The first branch, the action-branch, adopts class-wise prototypes and mainly acts as an auxiliary that introduces prior knowledge about label dependencies and guides the second branch. The second branch, the sub-branch, starts with multiple prototypes, called sub-prototypes, which give it a strong capacity for modeling variations. As a further benefit, we design a multi-label clustering loss based on the sub-prototypes to learn compact features under the multi-label setting. The two branches are associated through the correspondences between the two types of prototypes, which yields a special two-stage classifier in the sub-branch; in addition, the two branches serve as regularization terms for each other, improving the final performance. Ablation studies find that the proposed model is capable of modeling classes with large variations and learning compact features. Extensive experimental evaluations on the Thumos14, MultiThumos and ActivityNet datasets demonstrate the effectiveness of the proposed method and its superior performance over state-of-the-art approaches. 
650 4 |a Journal Article 
700 1 |a Huang, Yan  |e verfasserin  |4 aut 
700 1 |a Ouyang, Wanli  |e verfasserin  |4 aut 
700 1 |a Wang, Liang  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 44(2022), 9 vom: 28. Sept., Seite 5729-5746  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:44  |g year:2022  |g number:9  |g day:28  |g month:09  |g pages:5729-5746 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2021.3076172  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 44  |j 2022  |e 9  |b 28  |c 09  |h 5729-5746