Learning Multi-View Interactional Skeleton Graph for Action Recognition

Bibliographic Details
Published in:	IEEE Transactions on Pattern Analysis and Machine Intelligence. - 1979. - 45(2023), 6 of: 09 June, pages 6940-6954
Main author:	Wang, Minsi (Author)
Other authors:	Ni, Bingbing, Yang, Xiaokang
Format:	Online article
Language:	English
Published:	2023
Access to the parent work:	IEEE Transactions on Pattern Analysis and Machine Intelligence
Subjects:	Journal Article

Description
Summary:	Capturing the interactions of human articulations lies at the center of skeleton-based action recognition. Recent graph-based methods are inherently limited by weak spatial context modeling, owing to fixed interaction patterns and the inflexible shared weights of GCNs. To address these problems, we propose the multi-view interactional graph network (MV-IGNet), which can construct, learn and infer multi-level spatial skeleton context, including view-level (global), group-level and joint-level (local) context, in a unified way. MV-IGNet leverages different skeleton topologies as multiple views to cooperatively generate complementary action features. For each view, separable parametric graph convolution (SPG-Conv) enables multiple parameterized graphs to enrich local interaction patterns, which provides a strong graph-adaptation ability to handle irregular skeleton topologies. We also partition the skeleton into several groups, and the higher-level group contexts, both inter-group and intra-group, are then hierarchically captured by the SPG-Conv layers. A simple yet effective global context adaption (GCA) module facilitates representative feature extraction by learning input-dependent skeleton topologies. Compared with mainstream works, MV-IGNet can be readily implemented, with a smaller model size and faster inference. Experimental results show that the proposed MV-IGNet achieves impressive performance on the large-scale benchmarks NTU-RGB+D and NTU-RGB+D 120.
Description:	Date Completed: 07.05.2023; Date Revised: 07.05.2023; Published: Print-Electronic; Citation Status: PubMed-not-MEDLINE
ISSN:	1939-3539
DOI:	10.1109/TPAMI.2020.3032738