REMAP : Multi-layer entropy-guided pooling of dense CNN features for image retrieval

This paper addresses the problem of very large-scale image retrieval, focusing on improving its accuracy and robustness. We target enhanced robustness of search to factors such as variations in illumination, object appearance and scale, partial occlusions, and cluttered backgrounds -particularly imp...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - (2019) vom: 22. Mai
1. Verfasser: Husain, Syed Sameed (VerfasserIn)
Weitere Verfasser: Bober, Miroslaw
Format: Online-Aufsatz
Sprache:English
Veröffentlicht: 2019
Zugriff auf das übergeordnete Werk:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Schlagworte:Journal Article
LEADER 01000caa a22002652 4500
001 NLM297534033
003 DE-627
005 20240229162218.0
007 cr uuu---uuuuu
008 231225s2019 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2019.2917234  |2 doi 
028 5 2 |a pubmed24n1308.xml 
035 |a (DE-627)NLM297534033 
035 |a (NLM)31135362 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Husain, Syed Sameed  |e verfasserin  |4 aut 
245 1 0 |a REMAP  |b Multi-layer entropy-guided pooling of dense CNN features for image retrieval 
264 1 |c 2019 
336 |a Text  |b txt  |2 rdacontent 
337 |a ƒaComputermedien  |b c  |2 rdamedia 
338 |a ƒa Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 27.02.2024 
500 |a published: Print-Electronic 
500 |a Citation Status Publisher 
520 |a This paper addresses the problem of very large-scale image retrieval, focusing on improving its accuracy and robustness. We target enhanced robustness of search to factors such as variations in illumination, object appearance and scale, partial occlusions, and cluttered backgrounds -particularly important when search is performed across very large datasets with significant variability. We propose a novel CNN-based global descriptor, called REMAP, which learns and aggregates a hierarchy of deep features from multiple CNN layers, and is trained end-to-end with a triplet loss. REMAP explicitly learns discriminative features which are mutually-supportive and complementary at various semantic levels of visual abstraction. These dense local features are max-pooled spatially at each layer, within multi-scale overlapping regions, before aggregation into a single image-level descriptor. To identify the semantically useful regions and layers for retrieval, we propose to measure the information gain of each region and layer using KL-divergence. Our system effectively learns during training how useful various regions and layers are and weights them accordingly. We show that such relative entropy-guided aggregation outperforms classical CNN-based aggregation controlled by SGD. The entire framework is trained in an end-to-end fashion, outperforming the latest state-of-the-art results. On image retrieval datasets Holidays, Oxford and MPEG, the REMAP descriptor achieves mAP of 95.5%, 91.5% and 80.1% respectively, outperforming any results published to date. REMAP also formed the core of the winning submission to the Google Landmark Retrieval Challenge on Kaggle 
650 4 |a Journal Article 
700 1 |a Bober, Miroslaw  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g (2019) vom: 22. Mai  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnns 
773 1 8 |g year:2019  |g day:22  |g month:05 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2019.2917234  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |j 2019  |b 22  |c 05