HCP : A Flexible CNN Framework for Multi-label Image Classification

Convolutional Neural Network (CNN) has demonstrated promising performance in single-label image classification tasks. However, how CNN best copes with multi-label images still remains an open problem, mainly due to the complex underlying object layouts and insufficient multi-label training images. I...

Bibliographic details
Published in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - 38(2016), 9, 1 Sept., pages 1901-1907
Main author:	Wei, Yunchao (Author)
Other authors:	Xia, Wei, Lin, Min, Huang, Junshi, Ni, Bingbing, Dong, Jian, Zhao, Yao, Yan, Shuicheng
Format:	Online article
Language:	English
Published:	2016
In collection:	IEEE transactions on pattern analysis and machine intelligence
Subjects:	Journal Article


LEADER	01000caa a22002652 4500
001	NLM254134734
003	DE-627
005	20250219072622.0
007	cr uuu---uuuuu
008	231224s2016 xx \|\|\|\|\|o 00\| \|\|eng c
028	5	2	\|a pubmed25n0847.xml
035			\|a (DE-627)NLM254134734
035			\|a (NLM)26513778
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Wei, Yunchao \|e verfasserin \|4 aut
245	1	0	\|a HCP \|b A Flexible CNN Framework for Multi-label Image Classification
264		1	\|c 2016
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 20.11.2019
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Convolutional Neural Network (CNN) has demonstrated promising performance in single-label image classification tasks. However, how CNN best copes with multi-label images still remains an open problem, mainly due to the complex underlying object layouts and insufficient multi-label training images. In this work, we propose a flexible deep CNN infrastructure, called Hypotheses-CNN-Pooling (HCP), where an arbitrary number of object segment hypotheses are taken as the inputs, then a shared CNN is connected with each hypothesis, and finally the CNN output results from different hypotheses are aggregated with max pooling to produce the ultimate multi-label predictions. Some unique characteristics of this flexible deep CNN infrastructure include: 1) no ground-truth bounding box information is required for training; 2) the whole HCP infrastructure is robust to possibly noisy and/or redundant hypotheses; 3) the shared CNN is flexible and can be well pre-trained with a large-scale single-label image dataset, e.g., ImageNet; and 4) it may naturally output multi-label prediction results. Experimental results on Pascal VOC 2007 and VOC 2012 multi-label image datasets well demonstrate the superiority of the proposed HCP infrastructure over other state-of-the-arts. In particular, the mAP reaches 90.5% by HCP only and 93.2% after the fusion with our complementary result in [44] based on hand-crafted features on the VOC 2012 dataset
650		4	\|a Journal Article
700	1		\|a Xia, Wei \|e verfasserin \|4 aut
700	1		\|a Lin, Min \|e verfasserin \|4 aut
700	1		\|a Huang, Junshi \|e verfasserin \|4 aut
700	1		\|a Ni, Bingbing \|e verfasserin \|4 aut
700	1		\|a Dong, Jian \|e verfasserin \|4 aut
700	1		\|a Zhao, Yao \|e verfasserin \|4 aut
700	1		\|a Yan, Shuicheng \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on pattern analysis and machine intelligence \|d 1979 \|g 38(2016), 9 vom: 01. Sept., Seite 1901-1907 \|w (DE-627)NLM098212257 \|x 1939-3539 \|7 nnns
773	1	8	\|g volume:38 \|g year:2016 \|g number:9 \|g day:01 \|g month:09 \|g pages:1901-1907
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 38 \|j 2016 \|e 9 \|b 01 \|c 09 \|h 1901-1907
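
The abstract in field 520 describes the last stage of HCP: a shared CNN scores each object segment hypothesis, and the per-hypothesis outputs are aggregated with max pooling into one multi-label prediction. The following is a minimal sketch of that aggregation step only, not the authors' implementation. The function name `cross_hypothesis_max_pool`, the score values, and the 0.5 decision threshold are illustrative assumptions that do not appear in the record; the shared CNN itself is replaced by hard-coded scores.

```python
from typing import List, Sequence


def cross_hypothesis_max_pool(scores: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise max over per-hypothesis score vectors.

    scores[i][j] is the (hypothetical) score the shared CNN assigns to
    label j for hypothesis i. The result has one score per label.
    """
    if not scores:
        raise ValueError("at least one hypothesis is required")
    num_labels = len(scores[0])
    if any(len(row) != num_labels for row in scores):
        raise ValueError("all hypotheses must score the same number of labels")
    # For each label, keep the strongest response across all hypotheses.
    return [max(row[j] for row in scores) for j in range(num_labels)]


# Made-up scores for 3 hypotheses and 4 labels (stand-in for CNN outputs).
hypothesis_scores = [
    [0.9, 0.1, 0.0, 0.2],  # hypothesis 0 responds strongly to label 0
    [0.2, 0.8, 0.1, 0.1],  # hypothesis 1 responds strongly to label 1
    [0.1, 0.1, 0.1, 0.3],  # a noisy hypothesis with no strong response
]

image_scores = cross_hypothesis_max_pool(hypothesis_scores)

# Assumed decision rule: a label is predicted if its pooled score >= 0.5.
predicted = [j for j, s in enumerate(image_scores) if s >= 0.5]

print(image_scores)
print(predicted)
```

Because each label keeps only its highest score across hypotheses, the weakly scored third hypothesis does not change the pooled result, which is one way to read the abstract's claim of robustness to noisy or redundant hypotheses.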