LEADER |
01000naa a22002652 4500 |
001 |
NLM251809056 |
003 |
DE-627 |
005 |
20231224162811.0 |
007 |
cr uuu---uuuuu |
008 |
231224s2014 xx |||||o 00| ||eng c |
024 |
7 |
|
|a 10.1109/TIP.2013.2291319
|2 doi
|
028 |
5 |
2 |
|a pubmed24n0839.xml
|
035 |
|
|
|a (DE-627)NLM251809056
|
035 |
|
|
|a (NLM)26270910
|
040 |
|
|
|a DE-627
|b ger
|c DE-627
|e rakwb
|
041 |
|
|
|a eng
|
100 |
1 |
|
|a Yuan, Chunfeng
|e verfasserin
|4 aut
|
245 |
1 |
0 |
|a Modeling Geometric-Temporal Context With Directional Pyramid Co-Occurrence for Action Recognition
|
264 |
|
1 |
|c 2014
|
336 |
|
|
|a Text
|b txt
|2 rdacontent
|
337 |
|
|
|a Computermedien
|b c
|2 rdamedia
|
338 |
|
|
|a Online-Ressource
|b cr
|2 rdacarrier
|
500 |
|
|
|a Date Completed 22.10.2015
|
500 |
|
|
|a Date Revised 14.08.2015
|
500 |
|
|
|a published: Print
|
500 |
|
|
|a Citation Status PubMed-not-MEDLINE
|
520 |
|
|
|a In this paper, we present a new geometric-temporal representation for visual action recognition based on local spatio-temporal features. First, we propose a modified covariance descriptor under the log-Euclidean Riemannian metric to represent the spatio-temporal cuboids detected in the video sequences. Compared with previously proposed covariance descriptors, our descriptor can be measured and clustered in Euclidean space. Second, to capture the geometric-temporal contextual information, we construct a directional pyramid co-occurrence matrix (DPCM) to describe the spatio-temporal distribution of the vector-quantized local feature descriptors extracted from a video. DPCM characterizes the co-occurrence statistics of local features as well as the spatio-temporal positional relationships among the concurrent features. These statistics provide strong descriptive power for action recognition. To use DPCM for action recognition, we propose a directional pyramid co-occurrence matching kernel to measure the similarity of videos. The proposed method achieves state-of-the-art performance and improves on the recognition performance of bag-of-visual-words (BOVW) models by a large margin on six public data sets. For example, on the KTH data set it achieves 98.78% accuracy, while the BOVW approach achieves only 88.06%. On both the Weizmann and UCF CIL data sets, the highest possible accuracy of 100% is achieved.
|
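The abstract above hinges on the log-Euclidean trick: mapping symmetric positive-definite covariance descriptors through the matrix logarithm so that plain Euclidean distances, means, and k-means clustering become valid on the embedded vectors. A minimal sketch of that embedding follows; this is an illustration only, not the authors' code, and the function names and SciPy-based implementation are assumptions.

```python
import numpy as np
from scipy.linalg import logm

def log_euclidean_vector(cov):
    """Embed an SPD covariance matrix into a flat vector space via the
    matrix logarithm (the log-Euclidean map). Euclidean distance between
    these vectors equals the log-Euclidean distance between the matrices,
    so ordinary k-means can cluster the descriptors directly."""
    log_cov = np.real(logm(cov))  # principal log; SPD input -> real symmetric
    n = cov.shape[0]
    iu = np.triu_indices(n, k=1)
    # The sqrt(2) scaling on off-diagonal terms makes the vector 2-norm
    # match the Frobenius norm of the symmetric matrix.
    return np.concatenate([np.diag(log_cov), np.sqrt(2.0) * log_cov[iu]])

def log_euclidean_distance(cov_a, cov_b):
    """d(A, B) = ||log(A) - log(B)||_F under the log-Euclidean metric."""
    return np.linalg.norm(log_euclidean_vector(cov_a) - log_euclidean_vector(cov_b))

# Example: distance between two small (hypothetical) covariance descriptors
a = np.array([[2.0, 0.3], [0.3, 1.0]])
b = np.array([[1.5, -0.2], [-0.2, 0.8]])
print(log_euclidean_distance(a, b))
```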
650 |
|
4 |
|a Journal Article
|
650 |
|
4 |
|a Research Support, Non-U.S. Gov't
|
700 |
1 |
|
|a Li, Xi
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Hu, Weiming
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Ling, Haibin
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Maybank, Stephen J
|e verfasserin
|4 aut
|
773 |
0 |
8 |
|i Enthalten in
|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
|d 1992
|g 23(2014), 2 vom: 07. Feb., Seite 658-72
|w (DE-627)NLM09821456X
|x 1941-0042
|7 nnns
|
773 |
1 |
8 |
|g volume:23
|g year:2014
|g number:2
|g day:07
|g month:02
|g pages:658-72
|
856 |
4 |
0 |
|u http://dx.doi.org/10.1109/TIP.2013.2291319
|3 Volltext
|
912 |
|
|
|a GBV_USEFLAG_A
|
912 |
|
|
|a SYSFLAG_A
|
912 |
|
|
|a GBV_NLM
|
912 |
|
|
|a GBV_ILN_350
|
951 |
|
|
|a AR
|
952 |
|
|
|d 23
|j 2014
|e 2
|b 07
|c 02
|h 658-72
|