Recognition From Web Data : A Progressive Filtering Approach

Leveraging the abundant number of web data is a promising strategy in addressing the problem of data lacking when training convolutional neural networks (CNNs). However, the web images often contain incorrect tags, which may compromise the learned CNN model. To address this problem, this paper focus...

Bibliographic Details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 27(2018), no. 11 (12 Nov.), pages 5303-5315
Main author:	Yang, Jufeng (Author)
Other authors:	Sun, Xiaoxiao, Lai, Yu-Kun, Zheng, Liang, Cheng, Ming-Ming
Format:	Online article
Language:	English
Published:	2018
Access to the parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article


LEADER	01000naa a22002652 4500
001	NLM286514796
003	DE-627
005	20231225051908.0
007	cr uuu---uuuuu
008	231225s2018 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TIP.2018.2855449 \|2 doi
028	5	2	\|a pubmed24n0955.xml
035			\|a (DE-627)NLM286514796
035			\|a (NLM)30010575
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Yang, Jufeng \|e verfasserin \|4 aut
245	1	0	\|a Recognition From Web Data \|b A Progressive Filtering Approach
264		1	\|c 2018
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 31.07.2018
500			\|a Date Revised 31.07.2018
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Leveraging the abundant number of web data is a promising strategy in addressing the problem of data lacking when training convolutional neural networks (CNNs). However, the web images often contain incorrect tags, which may compromise the learned CNN model. To address this problem, this paper focuses on image classification and proposes to iterate between filtering out noisy web labels and fine-tuning the CNN model using the crawled web images. Overall, the proposed method benefits from the growing modeling capability of the learned model to correct labels for web images and learning from such new data to produce a more effective model. Our contribution is two-fold. First, we propose an iterative method that progressively improves the discriminative ability of CNNs and the accuracy of web image selection. This method is beneficial toward selecting high-quality web training images and expanding the training set as the model gets ameliorated. Second, since web images are usually complex and may not be accurately described by a single tag, we propose to assign a web image multiple labels to reduce the impact of hard label assignment. This labeling strategy mines more training samples to improve the CNN model. In the experiments, we crawl 0.5 million web images covering all categories of four public image classification data sets. Compared with the baseline which has no web images for training, we show that the proposed method brings notable improvement. We also report the competitive recognition accuracy compared with the state of the art
650		4	\|a Journal Article
700	1		\|a Sun, Xiaoxiao \|e verfasserin \|4 aut
700	1		\|a Lai, Yu-Kun \|e verfasserin \|4 aut
700	1		\|a Zheng, Liang \|e verfasserin \|4 aut
700	1		\|a Cheng, Ming-Ming \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society \|d 1992 \|g 27(2018), 11 vom: 12. Nov., Seite 5303-5315 \|w (DE-627)NLM09821456X \|x 1941-0042 \|7 nnns
773	1	8	\|g volume:27 \|g year:2018 \|g number:11 \|g day:12 \|g month:11 \|g pages:5303-5315
856	4	0	\|u http://dx.doi.org/10.1109/TIP.2018.2855449 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 27 \|j 2018 \|e 11 \|b 12 \|c 11 \|h 5303-5315