Image Recognition by Predicted User Click Feature With Multidomain Multitask Transfer Deep Network

The click feature of an image, defined as a user click count vector based on click data, has been demonstrated to be effective for reducing the semantic gap for image recognition. Unfortunately, most of the traditional image recognition datasets do not contain click data. To address this problem, re...


Bibliographic details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 28(2019), 12, 15 Dec., pages 6047-6062
Main author:	Tan, Min (Author)
Other authors:	Yu, Jun; Zhang, Hongyuan; Rui, Yong; Tao, Dacheng
Format:	Online article
Language:	English
Published:	2019
Collection access:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article


LEADER	01000caa a22002652 4500
001	NLM298790602
003	DE-627
005	20250225134952.0
007	cr uuu---uuuuu
008	231225s2019 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TIP.2019.2921861 \|2 doi
028	5	2	\|a pubmed25n0995.xml
035			\|a (DE-627)NLM298790602
035			\|a (NLM)31265392
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Tan, Min \|e verfasserin \|4 aut
245	1	0	\|a Image Recognition by Predicted User Click Feature With Multidomain Multitask Transfer Deep Network
264		1	\|c 2019
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 09.09.2019
500			\|a Date Revised 09.09.2019
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a The click feature of an image, defined as a user click count vector based on click data, has been demonstrated to be effective for reducing the semantic gap for image recognition. Unfortunately, most of the traditional image recognition datasets do not contain click data. To address this problem, researchers have begun to develop a click prediction model using assistant datasets containing click information and have adapted this predictor to a common click-free dataset for different tasks. This method can be customized to our problem, but it has two main limitations: 1) the predicted click feature often performs badly in the recognition task since the prediction model is constructed independently of the subsequent recognition problem and 2) transferring the predictor from one dataset to another is challenging due to the large cross-domain diversity. In this paper, we devise a multitask and multidomain deep network with varied modals (MTMDD-VM) to formulate image recognition and click prediction tasks in a unified framework. Datasets with and without click information are integrated in the training. Furthermore, a nonlinear word embedding with a position-sensitive loss function is designed to discover the visual click correlation. We evaluate the proposed method on three public dog breed image datasets, and we utilize the Clickture-Dog dataset as the auxiliary dataset that provides click data. The experimental results show that: 1) the nonlinear word embedding and position-sensitive loss function largely enhance the predicted click feature in the recognition task, realizing a 32% improvement in accuracy; 2) the multitask learning framework improves accuracies in both image recognition and click prediction; and 3) the unified training using the combined dataset with and without click data further improves the performance. 
Compared with the state-of-the-art methods, the proposed approach not only performs much better in accuracy but also achieves good scalability and one-shot learning ability.
650		4	\|a Journal Article
700	1		\|a Yu, Jun \|e verfasserin \|4 aut
700	1		\|a Zhang, Hongyuan \|e verfasserin \|4 aut
700	1		\|a Rui, Yong \|e verfasserin \|4 aut
700	1		\|a Tao, Dacheng \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society \|d 1992 \|g 28(2019), 12 vom: 15. Dez., Seite 6047-6062 \|w (DE-627)NLM09821456X \|x 1941-0042 \|7 nnns
773	1	8	\|g volume:28 \|g year:2019 \|g number:12 \|g day:15 \|g month:12 \|g pages:6047-6062
856	4	0	\|u http://dx.doi.org/10.1109/TIP.2019.2921861 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 28 \|j 2019 \|e 12 \|b 15 \|c 12 \|h 6047-6062