Multi-Domain & Multi-Task Learning for Human Action Recognition
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - (2018), 28 Sept.
Author:
Other authors:
Format: Online article
Language: English
Published: 2018
Part of: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
Abstract: Domain-invariant (view-invariant & modality-invariant) feature representation is essential for human action recognition. Moreover, given a discriminative visual representation, it is critical to discover the latent correlations among multiple actions in order to facilitate action modeling. To address these problems, we propose a multi-domain & multi-task learning (MDMTL) method to (1) extract domain-invariant information for multi-view and multi-modal action representation and (2) explore the relatedness among multiple action categories. Specifically, we present a sparse transfer learning-based method to co-embed multi-domain (multi-view & multi-modality) data into a single common space for discriminative feature learning. Additionally, visual feature learning is incorporated into the multi-task learning framework, with the Frobenius-norm regularization term and the sparse constraint term, for joint task modeling and task relatedness-induced feature learning. To the best of our knowledge, MDMTL is the first supervised framework to jointly realize domain-invariant feature learning and task modeling for multi-domain action recognition. Experiments conducted on the INRIA Xmas Motion Acquisition Sequences (IXMAS) dataset, the MSR Daily Activity 3D (DailyActivity3D) dataset, and the Multi-modal & Multi-view & Interactive (M2I) dataset, which is the most recent and largest multi-view and multi-modal action recognition dataset, demonstrate the superiority of MDMTL over the state-of-the-art approaches.
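The record does not reproduce the paper's objective function, but the two regularizers named in the abstract suggest a multi-task formulation along the following lines. This is a sketch for illustration only: the task weight matrix W, the generic loss ℓ, and the trade-off parameters λ1 and λ2 are assumed notation, not the authors' formulation.

    % Hypothetical multi-task objective combining the two regularizers
    % named in the abstract; W = [w_1, ..., w_T] stacks the per-task
    % weight vectors over features in the shared embedding space.
    \min_{W}\; \sum_{t=1}^{T} \sum_{i=1}^{n_t}
        \ell\bigl(y_i^{(t)},\, \mathbf{w}_t^{\top} \mathbf{x}_i^{(t)}\bigr)
      \;+\; \lambda_1 \lVert W \rVert_F^{2}
      \;+\; \lambda_2 \lVert W \rVert_{2,1}

In such a formulation the Frobenius-norm term controls overall model complexity, while an ℓ2,1-type sparse term zeroes out entire rows of W so that all tasks jointly select a common subset of features; this row-wise sparsity is one standard way to induce the task-relatedness-driven feature learning the abstract describes.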
Description: Date Revised: 27.02.2024; Published: Print-Electronic; Citation Status: Publisher
ISSN: 1941-0042
DOI: 10.1109/TIP.2018.2872879