PML-LocNet : Improving Object Localization with Prior-induced Multi-view Learning Network

Bibliographic details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - (2019), 28 Oct.
Main author:	Zhang, Xiaopeng (Author)
Other authors:	Yang, Yang; Xiong, Hongkai; Feng, Jiashi
Format:	Online article
Language:	English
Published:	2019
Collection access:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article

Description
Abstract:	This paper introduces a new model for Weakly Supervised Object Localization (WSOL) problems where only image-level supervision is provided. The key to solving such problems is to infer the object locations accurately. Previous methods usually model the missing object locations as latent variables, and alternate between updating their estimates and learning a detector accordingly. However, the performance of such alternating optimization is sensitive to the quality of the initial latent variables, and the resulting localization model is prone to overfitting to improper localizations. To address these issues, we develop a Prior-induced Multi-view Learning Localization Network (PML-LocNet), which exploits both view diversity and sample diversity to improve object localization. In particular, the view diversity is imposed by a two-phase multi-view learning strategy, in which the complementarity among learned features from different views and the consensus among localized instances from each view are leveraged to benefit localization. The sample diversity is pursued by harnessing coarse-to-fine priors at both image and instance levels. With these priors, more emphasis goes to the reliable samples and the contributions of the unreliable ones are decreased, so that the intrinsic characteristics of each sample can be exploited to make the model more robust during network learning. PML-LocNet can be easily combined with existing WSOL models to further improve the localization accuracy. Its effectiveness has been demonstrated experimentally. Notably, it achieves 69.3% CorLoc and 50.4% mAP on PASCAL VOC 2007, surpassing the state of the art by a large margin.
Description:	Date revised: 27.02.2024; published: Print-Electronic; citation status: Publisher
ISSN:	1941-0042
DOI:	10.1109/TIP.2019.2947155
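
Note on the CorLoc figure quoted in the abstract: CorLoc is commonly defined as the fraction of images containing a class for which the top-scoring predicted box overlaps a ground-truth box of that class with an intersection over union (IoU) of at least 0.5. The following is a minimal sketch under that common definition, not the paper's evaluation code. The function names `iou` and `corloc`, the `(x1, y1, x2, y2)` box format, and the input layout are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(top_boxes, gt_boxes, thresh=0.5):
    """Fraction of images whose top-scoring box matches a ground-truth box.

    top_boxes: one predicted box per image (hypothetical input layout).
    gt_boxes:  a list of ground-truth boxes per image, in the same order.
    """
    if not top_boxes:
        return 0.0
    hits = sum(
        any(iou(pred, gt) >= thresh for gt in gts)
        for pred, gts in zip(top_boxes, gt_boxes)
    )
    return hits / len(top_boxes)
```

In practice the metric is computed per class over the images that contain that class and then averaged across classes; the sketch covers a single class only.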