Uncertainty-Aware Dual-Evidential Learning for Weakly-Supervised Temporal Action Localization

Weakly-supervised temporal action localization (WTAL) aims to localize the action instances and recognize their categories with only video-level labels. Despite great progress, existing methods suffer from severe action-background ambiguity, which mainly arises from background noise and neglect of non-salient action snippets.
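As a rough illustration of the Dirichlet-based quantities the abstract refers to, the following Python sketch computes per-class expected probabilities together with epistemic (vacuity) and aleatoric (expected-entropy) uncertainty from a vector of non-negative evidence. It follows the standard evidential deep learning formulation (evidence plus one as Dirichlet concentration parameters), not the paper's exact UDEL losses; the function name and example evidence values are illustrative assumptions.

# Minimal sketch of standard EDL uncertainty quantities (Dirichlet-based),
# for illustration only; this is not the UDEL implementation.
import numpy as np
from scipy.special import digamma

def edl_uncertainties(evidence):
    """evidence: non-negative per-class evidence vector of length K."""
    alpha = evidence + 1.0             # Dirichlet concentration parameters
    strength = alpha.sum()             # total evidence S
    k = alpha.shape[0]

    expected_prob = alpha / strength   # expected class probabilities

    # Epistemic (vacuity) uncertainty: large when total evidence is small,
    # i.e. when the model lacks knowledge about the sample.
    epistemic = k / strength

    # Aleatoric uncertainty: expected entropy of the categorical distribution
    # under the Dirichlet, i.e. the irreducible data uncertainty.
    aleatoric = -np.sum(expected_prob * (digamma(alpha + 1.0) - digamma(strength + 1.0)))

    return expected_prob, epistemic, aleatoric

# Example: weak evidence yields epistemic uncertainty close to 1.
probs, epi, alea = edl_uncertainties(np.array([0.2, 0.1, 0.1]))
print(probs, epi, alea)

In this standard formulation, low total evidence drives the epistemic term K/S toward 1, which is the general property that uncertainty-guided methods such as the one described in the abstract rely on to flag unreliable (e.g., background-dominated) predictions.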

Detailed Description

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 12, 25 Dec., pages 15896-15911
Main author: Chen, Mengyuan (Author)
Other authors: Gao, Junyu, Xu, Changsheng
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM36122950X
003 DE-627
005 20231226084712.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2023.3308571  |2 doi 
028 5 2 |a pubmed24n1204.xml 
035 |a (DE-627)NLM36122950X 
035 |a (NLM)37624714 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Chen, Mengyuan  |e verfasserin  |4 aut 
245 1 0 |a Uncertainty-Aware Dual-Evidential Learning for Weakly-Supervised Temporal Action Localization 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 07.11.2023 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Weakly-supervised temporal action localization (WTAL) aims to localize the action instances and recognize their categories with only video-level labels. Despite great progress, existing methods suffer from severe action-background ambiguity, which mainly arises from background noise and neglect of non-salient action snippets. To address this issue, we propose a generalized evidential deep learning (EDL) framework for WTAL, called Uncertainty-aware Dual-Evidential Learning (UDEL), which extends the traditional paradigm of EDL to adapt to the weakly-supervised multi-label classification goal with the guidance of epistemic and aleatoric uncertainties, of which the former comes from models lacking knowledge, while the latter comes from the inherent properties of samples themselves. Specifically, targeting excluding the undesirable background snippets, we fuse the video-level epistemic and aleatoric uncertainties to measure the interference of background noise to video-level prediction. Then, the snippet-level aleatoric uncertainty is further deduced for progressive mutual learning, which gradually focuses on the entire action instances in an "easy-to-hard" manner and encourages the snippet-level epistemic uncertainty to be complementary with the foreground attention scores. Extensive experiments show that UDEL achieves state-of-the-art performance on four public benchmarks. Our code is available in github/mengyuanchen2021/UDEL 
650 4 |a Journal Article 
700 1 |a Gao, Junyu  |e verfasserin  |4 aut 
700 1 |a Xu, Changsheng  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 45(2023), 12 vom: 25. Dez., Seite 15896-15911  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:45  |g year:2023  |g number:12  |g day:25  |g month:12  |g pages:15896-15911 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2023.3308571  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 45  |j 2023  |e 12  |b 25  |c 12  |h 15896-15911