Beyond Object Proposals : Random Crop Pooling for Multi-Label Image Recognition

Learning high-level image representations using object proposals has achieved remarkable success in multi-label image recognition. However, most object proposals provide merely coarse information about the objects, and only carefully selected proposals can be helpful for boosting the performance of...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 25(2016), 12 vom: 15. Dez., Seite 5678-5688
1. Verfasser: Meng Wang (VerfasserIn)
Weitere Verfasser: Changzhi Luo, Richang Hong, Jinhui Tang, Jiashi Feng
Format: Online-Aufsatz
Sprache:English
Veröffentlicht: 2016
Zugriff auf das übergeordnete Werk:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Schlagworte:Journal Article
Beschreibung
Zusammenfassung:Learning high-level image representations using object proposals has achieved remarkable success in multi-label image recognition. However, most object proposals provide merely coarse information about the objects, and only carefully selected proposals can be helpful for boosting the performance of multi-label image recognition. In this paper, we propose an object-proposal-free framework for multi-label image recognition: random crop pooling (RCP). Basically, RCP performs stochastic scaling and cropping over images before feeding them to a standard convolutional neural network, which works quite well with a max-pooling operation for recognizing the complex contents of multi-label images. To better fit the multi-label image recognition task, we further develop a new loss function-the dynamic weighted Euclidean loss-for the training of the deep network. Our RCP approach is amazingly simple yet effective. It can achieve significantly better image recognition performance than the approaches using object proposals. Moreover, our adapted network can be easily trained in an end-to-end manner. Extensive experiments are conducted on two representative multi-label image recognition data sets (i.e., PASCAL VOC 2007 and PASCAL VOC 2012), and the results clearly demonstrate the superiority of our approach
Beschreibung:Date Revised 20.11.2019
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN:1941-0042
DOI:10.1109/TIP.2016.2612829