Neighbor-Guided Consistent and Contrastive Learning for Semi-Supervised Action Recognition

Semi-supervised learning has been well established in the area of image classification but remains to be explored in video-based action recognition. FixMatch is a state-of-the-art semi-supervised method for image classification, but it does not work well when transferred directly to the video domain...


Bibliographic Details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023), day: 11, pages 2215-2227
Main author: Wu, Jianlong (Author)
Other authors: Sun, Wei, Gan, Tian, Ding, Ning, Jiang, Feijun, Shen, Jialie, Nie, Liqiang
Format: Online article
Language: English
Published: 2023
Access to parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM355447436
003 DE-627
005 20231226064350.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2023.3265261  |2 doi 
028 5 2 |a pubmed24n1184.xml 
035 |a (DE-627)NLM355447436 
035 |a (NLM)37040248 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Wu, Jianlong  |e verfasserin  |4 aut 
245 1 0 |a Neighbor-Guided Consistent and Contrastive Learning for Semi-Supervised Action Recognition 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 19.04.2023 
500 |a Date Revised 19.04.2023 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Semi-supervised learning has been well established in the area of image classification but remains to be explored in video-based action recognition. FixMatch is a state-of-the-art semi-supervised method for image classification, but it does not work well when transferred directly to the video domain since it only utilizes the single RGB modality, which contains insufficient motion information. Moreover, it only leverages highly confident pseudo-labels to explore consistency between strongly-augmented and weakly-augmented samples, resulting in limited supervised signals, long training time, and insufficient feature discriminability. To address the above issues, we propose neighbor-guided consistent and contrastive learning (NCCL), which takes both RGB and temporal gradient (TG) as input and is based on the teacher-student framework. Due to the limitation of labelled samples, we first incorporate neighbor information as a self-supervised signal to explore the consistent property, which compensates for the lack of supervised signals and the shortcoming of long training time of FixMatch. To learn more discriminative feature representations, we further propose a novel neighbor-guided category-level contrastive learning term to minimize the intra-class distance and enlarge the inter-class distance. We conduct extensive experiments on four datasets to validate the effectiveness. Compared with the state-of-the-art methods, our proposed NCCL achieves superior performance with much lower computational cost. 
650 4 |a Journal Article 
700 1 |a Sun, Wei  |e verfasserin  |4 aut 
700 1 |a Gan, Tian  |e verfasserin  |4 aut 
700 1 |a Ding, Ning  |e verfasserin  |4 aut 
700 1 |a Jiang, Feijun  |e verfasserin  |4 aut 
700 1 |a Shen, Jialie  |e verfasserin  |4 aut 
700 1 |a Nie, Liqiang  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 32(2023) vom: 11., Seite 2215-2227  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnns 
773 1 8 |g volume:32  |g year:2023  |g day:11  |g pages:2215-2227 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2023.3265261  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 32  |j 2023  |b 11  |h 2215-2227