Harmonized Multimodal Learning with Gaussian Process Latent Variable Models

Multimodal learning aims to discover the relationship between multiple modalities. It has become an important research topic due to extensive multimodal applications such as cross-modal retrieval. This paper attempts to address the modality heterogeneity problem based on Gaussian process latent vari...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - 43(2021), 3 vom: 29. März, Seite 858-872
1. Verfasser:	Song, Guoli (VerfasserIn)
Weitere Verfasser:	Wang, Shuhui, Huang, Qingming, Tian, Qi
Format:	Online-Aufsatz
Sprache:	English
Veröffentlicht:	2021
Zugriff auf das übergeordnete Werk:	IEEE transactions on pattern analysis and machine intelligence
Schlagworte:	Journal Article Research Support, Non-U.S. Gov't


LEADER	01000naa a22002652 4500
001	NLM301536511
003	DE-627
005	20231225104833.0
007	cr uuu---uuuuu
008	231225s2021 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TPAMI.2019.2942028 \|2 doi
028	5	2	\|a pubmed24n1005.xml
035			\|a (DE-627)NLM301536511
035			\|a (NLM)31545710
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Song, Guoli \|e verfasserin \|4 aut
245	1	0	\|a Harmonized Multimodal Learning with Gaussian Process Latent Variable Models
264		1	\|c 2021
336			\|a Text \|b txt \|2 rdacontent
337			\|a ƒaComputermedien \|b c \|2 rdamedia
338			\|a ƒa Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 29.09.2021
500			\|a Date Revised 29.09.2021
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Multimodal learning aims to discover the relationship between multiple modalities. It has become an important research topic due to extensive multimodal applications such as cross-modal retrieval. This paper attempts to address the modality heterogeneity problem based on Gaussian process latent variable models (GPLVMs) to represent multimodal data in a common space. Previous multimodal GPLVM extensions generally adopt individual learning schemes on latent representations and kernel hyperparameters, which ignore their intrinsic relationship. To exploit strong complementarity among different modalities and GPLVM components, we develop a novel learning scheme called Harmonization, where latent representations and kernel hyperparameters are jointly learned from each other. Beyond the correlation fitting or intra-modal structure preservation paradigms widely used in existing studies, the harmonization is derived in a model-driven manner to encourage the agreement between modality-specific GP kernels and the similarity of latent representations. We present a range of multimodal learning models by incorporating the harmonization mechanism into several representative GPLVM-based approaches. Experimental results on four benchmark datasets show that the proposed models outperform the strong baselines for cross-modal retrieval tasks, and that the harmonized multimodal learning method is superior in discovering semantically consistent latent representation
650		4	\|a Journal Article
650		4	\|a Research Support, Non-U.S. Gov't
700	1		\|a Wang, Shuhui \|e verfasserin \|4 aut
700	1		\|a Huang, Qingming \|e verfasserin \|4 aut
700	1		\|a Tian, Qi \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on pattern analysis and machine intelligence \|d 1979 \|g 43(2021), 3 vom: 29. März, Seite 858-872 \|w (DE-627)NLM098212257 \|x 1939-3539 \|7 nnns
773	1	8	\|g volume:43 \|g year:2021 \|g number:3 \|g day:29 \|g month:03 \|g pages:858-872
856	4	0	\|u http://dx.doi.org/10.1109/TPAMI.2019.2942028 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 43 \|j 2021 \|e 3 \|b 29 \|c 03 \|h 858-872