Human motion tracking by temporal-spatial local gaussian process experts

Human pose estimation via motion tracking systems can be considered as a regression problem within a discriminative framework. It is always a challenging task to model the mapping from observation space to state space because of the high-dimensional characteristic in the multimodal conditional distr...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 20(2011), 4 vom: 28. Apr., Seite 1141-51
1. Verfasser: Zhao, Xu (VerfasserIn)
Weitere Verfasser: Fu, Yun, Liu, Yuncai
Format: Online-Aufsatz
Sprache:English
Veröffentlicht: 2011
Zugriff auf das übergeordnete Werk:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Schlagworte:Journal Article Research Support, Non-U.S. Gov't
Beschreibung
Zusammenfassung:Human pose estimation via motion tracking systems can be considered as a regression problem within a discriminative framework. It is always a challenging task to model the mapping from observation space to state space because of the high-dimensional characteristic in the multimodal conditional distribution. In order to build the mapping, existing techniques usually involve a large set of training samples in the learning process which are limited in their capability to deal with multimodality. We propose, in this work, a novel online sparse Gaussian Process (GP) regression model to recover 3-D human motion in monocular videos. Particularly, we investigate the fact that for a given test input, its output is mainly determined by the training samples potentially residing in its local neighborhood and defined in the unified input-output space. This leads to a local mixture GP experts system composed of different local GP experts, each of which dominates a mapping behavior with the specific covariance function adapting to a local region. To handle the multimodality, we combine both temporal and spatial information therefore to obtain two categories of local experts. The temporal and spatial experts are integrated into a seamless hybrid system, which is automatically self-initialized and robust for visual tracking of nonlinear human motion. Learning and inference are extremely efficient as all the local experts are defined online within very small neighborhoods. Extensive experiments on two real-world databases, HumanEva and PEAR, demonstrate the effectiveness of our proposed model, which significantly improve the performance of existing models
Beschreibung:Date Completed 16.08.2011
Date Revised 22.03.2011
published: Print-Electronic
Citation Status MEDLINE
ISSN:1941-0042
DOI:10.1109/TIP.2010.2076820