HCP: A Flexible CNN Framework for Multi-label Image Classification

Convolutional Neural Network (CNN) has demonstrated promising performance in single-label image classification tasks. However, how CNN best copes with multi-label images still remains an open problem, mainly due to the complex underlying object layouts and insufficient multi-label training images. I...


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 38(2016), 9 (1 Sept.), pages 1901-1907
Main Author: Wei, Yunchao (Author)
Other Authors: Xia, Wei; Lin, Min; Huang, Junshi; Ni, Bingbing; Dong, Jian; Zhao, Yao; Yan, Shuicheng
Format: Online Article
Language: English
Published: 2016
Access to parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM254134734
003 DE-627
005 20231224171725.0
007 cr uuu---uuuuu
008 231224s2016 xx |||||o 00| ||eng c
028 5 2 |a pubmed24n0847.xml 
035 |a (DE-627)NLM254134734 
035 |a (NLM)26513778 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Wei, Yunchao  |e verfasserin  |4 aut 
245 1 0 |a HCP  |b A Flexible CNN Framework for Multi-label Image Classification 
264 1 |c 2016 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 20.11.2019 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Convolutional Neural Network (CNN) has demonstrated promising performance in single-label image classification tasks. However, how CNN best copes with multi-label images still remains an open problem, mainly due to the complex underlying object layouts and insufficient multi-label training images. In this work, we propose a flexible deep CNN infrastructure, called Hypotheses-CNN-Pooling (HCP), where an arbitrary number of object segment hypotheses are taken as the inputs, then a shared CNN is connected with each hypothesis, and finally the CNN output results from different hypotheses are aggregated with max pooling to produce the ultimate multi-label predictions. Some unique characteristics of this flexible deep CNN infrastructure include: 1) no ground-truth bounding box information is required for training; 2) the whole HCP infrastructure is robust to possibly noisy and/or redundant hypotheses; 3) the shared CNN is flexible and can be well pre-trained with a large-scale single-label image dataset, e.g., ImageNet; and 4) it may naturally output multi-label prediction results. Experimental results on Pascal VOC 2007 and VOC 2012 multi-label image datasets well demonstrate the superiority of the proposed HCP infrastructure over other state-of-the-arts. In particular, the mAP reaches 90.5% by HCP only and 93.2% after the fusion with our complementary result in [44] based on hand-crafted features on the VOC 2012 dataset 
650 4 |a Journal Article 
700 1 |a Xia, Wei  |e verfasserin  |4 aut 
700 1 |a Lin, Min  |e verfasserin  |4 aut 
700 1 |a Huang, Junshi  |e verfasserin  |4 aut 
700 1 |a Ni, Bingbing  |e verfasserin  |4 aut 
700 1 |a Dong, Jian  |e verfasserin  |4 aut 
700 1 |a Zhao, Yao  |e verfasserin  |4 aut 
700 1 |a Yan, Shuicheng  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 38(2016), 9 vom: 01. Sept., Seite 1901-1907  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:38  |g year:2016  |g number:9  |g day:01  |g month:09  |g pages:1901-1907 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 38  |j 2016  |e 9  |b 01  |c 09  |h 1901-1907
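
The abstract in field 520 describes aggregating the shared CNN's outputs across object hypotheses with max pooling. A minimal sketch of that aggregation step alone, in plain Python, might look like the following. The function names, the example scores, and the 0.5 decision threshold are illustrative assumptions and do not come from the record or the paper; the CNN itself and the hypothesis extraction are not modeled.

```python
from typing import List, Sequence


def cross_hypothesis_max_pool(scores: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise max over per-hypothesis score vectors.

    scores[i][j] is the (hypothetical) score the shared CNN gives
    label j for hypothesis i. Any number of hypotheses is accepted.
    """
    if not scores:
        raise ValueError("need at least one hypothesis")
    num_labels = len(scores[0])
    if any(len(row) != num_labels for row in scores):
        raise ValueError("all hypotheses must score the same label set")
    return [max(row[j] for row in scores) for j in range(num_labels)]


def predict_labels(pooled: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return indices of labels whose pooled score reaches the threshold.

    The threshold is an assumption made for this sketch only.
    """
    return [j for j, s in enumerate(pooled) if s >= threshold]


# Made-up scores: 3 hypotheses x 4 labels.
hypothesis_scores = [
    [0.9, 0.1, 0.0, 0.2],  # hypothesis covering an object of label 0
    [0.1, 0.8, 0.1, 0.3],  # hypothesis covering an object of label 1
    [0.2, 0.2, 0.1, 0.1],  # noisy hypothesis with no confident label
]

pooled = cross_hypothesis_max_pool(hypothesis_scores)
labels = predict_labels(pooled)
print(pooled, labels)
```

Because each label keeps only its highest score across hypotheses, the low-scoring third row does not change the result, which is one way to read the abstract's claim of robustness to noisy or redundant hypotheses.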