Regionlets for Generic Object Detection

Bibliographic Details
Published in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - 37(2015), no. 10 (1 Oct.), pages 2071-84
Main Author:	Wang, Xiaoyu (Author)
Other Authors:	Yang, Ming, Zhu, Shenghuo, Lin, Yuanqing
Format:	Online Article
Language:	English
Published:	2015
Access to parent work:	IEEE transactions on pattern analysis and machine intelligence
Subjects:	Journal Article

Description
Summary:	Generic object detection must handle differing degrees of variation, caused by viewpoints or deformations in distinct object classes, while keeping computation tractable. This demands descriptive and flexible object representations that can be evaluated efficiently at many locations. We propose to model an object class with a cascaded boosting classifier that integrates various types of features from competing local regions, each of which may consist of a group of subregions, named regionlets. A regionlet is a base feature extraction region defined proportionally to a detection window at an arbitrary resolution (i.e., size and aspect ratio). These regionlets are organized in small groups with stable relative positions so that they can delineate fine-grained spatial layouts inside objects. Their features are aggregated into a one-dimensional feature within one group so that the representation tolerates deformations. The most discriminative regionlets for each object class are selected through a boosting learning procedure. Our regionlet approach achieves very competitive performance on popular multi-class detection benchmark datasets with a single method, without any context. It achieves a detection mean average precision of 41.7 percent on the PASCAL VOC 2007 dataset, and 39.7 percent on the VOC 2010 dataset, for 20 object categories. We further develop support pixel integral images to efficiently augment regionlet features with the responses learned by deep convolutional neural networks. Our regionlet-based method won second place in the ImageNet Large Scale Visual Object Recognition Challenge (ILSVRC 2013).
Notes:	Date Completed 17.11.2015; Date Revised 10.09.2015; published: Print; Citation Status: PubMed-not-MEDLINE
ISSN:	1939-3539
DOI:	10.1109/TPAMI.2015.2389830
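
Illustrative note: the summary describes a regionlet as a region defined proportionally to a detection window, with the features of a group aggregated into a single one-dimensional feature. The following is only a minimal sketch of that idea, not the authors' implementation. The names `to_absolute`, `group_feature`, and `box_area` are hypothetical, the scalar feature extractor is a stand-in, and the use of max as the aggregation operator is an assumption, since the summary does not name the operator.

```python
from typing import Callable, Iterable, Tuple

# A box is (left, top, right, bottom).
Box = Tuple[float, float, float, float]


def to_absolute(regionlet: Box, window: Box) -> Box:
    """Map a regionlet given in window-relative [0, 1] coordinates to pixels.

    Because the regionlet is defined proportionally, the same definition
    works for a detection window of any size or aspect ratio.
    """
    wl, wt, wr, wb = window
    width, height = wr - wl, wb - wt
    l, t, r, b = regionlet
    return (wl + l * width, wt + t * height, wl + r * width, wt + b * height)


def group_feature(
    regionlets: Iterable[Box],
    window: Box,
    extract: Callable[[Box], float],
    pool: Callable[[Iterable[float]], float] = max,  # assumption: max pooling
) -> float:
    """Aggregate the features of one regionlet group into a single value."""
    return pool(extract(to_absolute(r, window)) for r in regionlets)


def box_area(box: Box) -> float:
    """Hypothetical stand-in for a real feature extractor (e.g. a HOG or LBP bin)."""
    l, t, r, b = box
    return (r - l) * (b - t)


if __name__ == "__main__":
    window = (0.0, 0.0, 100.0, 200.0)
    group = [(0.0, 0.0, 0.5, 0.5), (0.0, 0.5, 1.0, 1.0)]
    print(to_absolute((0.25, 0.5, 0.75, 1.0), (100.0, 200.0, 300.0, 400.0)))
    print(group_feature(group, window, box_area))
```

Pooling over the group means the resulting feature does not depend on which regionlet produced the strongest response, which is one way to read the summary's statement that the aggregated feature tolerates deformations.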