Representation Learning of Temporal Dynamics for Skeleton-Based Action Recognition

Motion characteristics of human actions can be represented by the position variation of skeleton joints. Traditional approaches generally extract the spatial-temporal representation of the skeleton sequences with well-designed hand-crafted features. In this paper, in order to recognize actions according to the relative motion between the limbs and the trunk, we propose an end-to-end hierarchical RNN for skeleton-based action recognition. We divide the human skeleton into five main parts in terms of the human physical structure, and then feed them to five independent subnets for local feature extraction. After hierarchical feature fusion and extraction from local to global, the dimensions of the final temporal dynamics representations are reduced to the number of action categories in the corresponding data set through a single-layer perceptron. In addition, the output of the perceptron is temporally accumulated as the input of a softmax layer for classification. Random scale and rotation transformations are employed to improve robustness during training. We compare with five other deep RNN variants derived from our model in order to verify the effectiveness of the proposed network. In addition, we compare with several other methods on motion capture and Kinect data sets. Furthermore, we evaluate the robustness of our model trained with random scale and rotation transformations for a multiview problem. Experimental results demonstrate that our model achieves state-of-the-art performance with high computational efficiency.
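
The abstract outlines the architecture: five part-wise recurrent subnets, hierarchical fusion from local to global, a single-layer perceptron mapping the final representation to the number of action categories, and temporal accumulation before a softmax classifier. The following is a minimal PyTorch sketch of that idea, not the authors' implementation; the use of LSTM cells, the hidden size, the per-part input dimensions, and the two-stage upper/lower-body fusion layout are assumptions made for illustration.

import torch
import torch.nn as nn

class HierarchicalRNN(nn.Module):
    def __init__(self, part_dims=(9, 12, 12, 12, 12), hidden=64, num_classes=60):
        super().__init__()
        # One recurrent subnet per body part: trunk, left/right arm, left/right leg.
        self.part_rnns = nn.ModuleList(
            [nn.LSTM(d, hidden, batch_first=True) for d in part_dims]
        )
        # Hierarchical fusion: parts -> upper/lower body -> whole body (assumed layout).
        self.fuse_upper = nn.LSTM(3 * hidden, hidden, batch_first=True)  # trunk + both arms
        self.fuse_lower = nn.LSTM(3 * hidden, hidden, batch_first=True)  # trunk + both legs
        self.fuse_body = nn.LSTM(2 * hidden, hidden, batch_first=True)
        # Single-layer perceptron reducing the dimension to the number of action categories.
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, parts):
        # parts: list of 5 tensors shaped (batch, time, part_dims[i]),
        # ordered [trunk, left_arm, right_arm, left_leg, right_leg].
        feats = [rnn(x)[0] for rnn, x in zip(self.part_rnns, parts)]
        upper = self.fuse_upper(torch.cat([feats[0], feats[1], feats[2]], dim=-1))[0]
        lower = self.fuse_lower(torch.cat([feats[0], feats[3], feats[4]], dim=-1))[0]
        body = self.fuse_body(torch.cat([upper, lower], dim=-1))[0]
        frame_logits = self.fc(body)      # (batch, time, num_classes)
        return frame_logits.mean(dim=1)   # temporal accumulation over frames

Training such a model with nn.CrossEntropyLoss applies the softmax of the final classification stage implicitly; the random scale and rotation transformations mentioned in the abstract would be applied to the raw joint coordinates before splitting them into the five parts.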

Detailed description

Bibliographic details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 25 (2016), no. 7, 01 July, pages 3010-3022
First author: Du, Yong (author)
Other authors: Fu, Yun; Wang, Liang
Format: Online article
Language: English
Published: 2016
Access to parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subject headings: Journal Article
LEADER 01000naa a22002652 4500
001 NLM259334405
003 DE-627
005 20231224190955.0
007 cr uuu---uuuuu
008 231224s2016 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2016.2552404  |2 doi 
028 5 2 |a pubmed24n0864.xml 
035 |a (DE-627)NLM259334405 
035 |a (NLM)27071176 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Du, Yong  |e verfasserin  |4 aut 
245 1 0 |a Representation Learning of Temporal Dynamics for Skeleton-Based Action Recognition 
264 1 |c 2016 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 13.12.2017 
500 |a Date Revised 10.12.2019 
500 |a published: Print-Electronic 
500 |a Citation Status MEDLINE 
520 |a Motion characteristics of human actions can be represented by the position variation of skeleton joints. Traditional approaches generally extract the spatial-temporal representation of the skeleton sequences with well-designed hand-crafted features. In this paper, in order to recognize actions according to the relative motion between the limbs and the trunk, we propose an end-to-end hierarchical RNN for skeleton-based action recognition. We divide the human skeleton into five main parts in terms of the human physical structure, and then feed them to five independent subnets for local feature extraction. After hierarchical feature fusion and extraction from local to global, the dimensions of the final temporal dynamics representations are reduced to the number of action categories in the corresponding data set through a single-layer perceptron. In addition, the output of the perceptron is temporally accumulated as the input of a softmax layer for classification. Random scale and rotation transformations are employed to improve robustness during training. We compare with five other deep RNN variants derived from our model in order to verify the effectiveness of the proposed network. In addition, we compare with several other methods on motion capture and Kinect data sets. Furthermore, we evaluate the robustness of our model trained with random scale and rotation transformations for a multiview problem. Experimental results demonstrate that our model achieves state-of-the-art performance with high computational efficiency.
650 4 |a Journal Article 
700 1 |a Fu, Yun  |e verfasserin  |4 aut 
700 1 |a Wang, Liang  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 25(2016), 7 vom: 01. Juli, Seite 3010-3022  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnns 
773 1 8 |g volume:25  |g year:2016  |g number:7  |g day:01  |g month:07  |g pages:3010-3022 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2016.2552404  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 25  |j 2016  |e 7  |b 01  |c 07  |h 3010-3022