Generative Model With Coordinate Metric Learning for Object Recognition Based on 3D Models

One of the bottlenecks in acquiring a perfect database for deep learning is the tedious process of collecting and labeling data. In this paper, we propose a generative model trained with synthetic images rendered from 3D models which can reduce the burden on collecting real training data and make th...

Bibliographic details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 27(2018), 12, 23 Dec., pages 5813-5826
Main author:	Wang, Yida (author)
Other authors:	Deng, Weihong
Format:	Online article
Language:	English
Published:	2018
Access to the parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article


LEADER	01000naa a22002652 4500
001	NLM28680932X
003	DE-627
005	20231225052606.0
007	cr uuu---uuuuu
008	231225s2018 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TIP.2018.2858553 \|2 doi
028	5	2	\|a pubmed24n0956.xml
035			\|a (DE-627)NLM28680932X
035			\|a (NLM)30040643
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Wang, Yida \|e verfasserin \|4 aut
245	1	0	\|a Generative Model With Coordinate Metric Learning for Object Recognition Based on 3D Models
264		1	\|c 2018
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 07.09.2018
500			\|a Date Revised 07.09.2018
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a One of the bottlenecks in acquiring a perfect database for deep learning is the tedious process of collecting and labeling data. In this paper, we propose a generative model trained with synthetic images rendered from 3D models which can reduce the burden on collecting real training data and make the background conditions more realistic. Our architecture is composed of two sub-networks: a semantic foreground object reconstruction network based on Bayesian inference and a classification network based on multi-triplet cost training for avoiding overfitting on the monotone synthetic object surface and utilizing accurate information of synthetic images like object poses and lighting conditions which are helpful for recognizing regular photos. First, our generative model with metric learning utilizes additional foreground object channels generated from semantic foreground object reconstruction sub-network for recognizing the original input images. Multi-triplet cost function based on poses is used for metric learning which makes it possible to train an effective categorical classifier purely based on synthetic data. Second, we design a coordinate training strategy with the help of adaptive noise applied on the inputs of both of the concatenated sub-networks to make them benefit from each other and avoid inharmonious parameter tuning due to different convergence speeds of two sub-networks. Our architecture achieves the state-of-the-art accuracy of 50.5% on the ShapeNet database with data migration obstacle from synthetic images to real images. This pipeline makes it applicable to do recognition on real images only based on 3D models. Our codes are available at https://github.com/wangyida/gm-cml
650		4	\|a Journal Article
700	1		\|a Deng, Weihong \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society \|d 1992 \|g 27(2018), 12 vom: 23. Dez., Seite 5813-5826 \|w (DE-627)NLM09821456X \|x 1941-0042 \|7 nnns
773	1	8	\|g volume:27 \|g year:2018 \|g number:12 \|g day:23 \|g month:12 \|g pages:5813-5826
856	4	0	\|u http://dx.doi.org/10.1109/TIP.2018.2858553 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 27 \|j 2018 \|e 12 \|b 23 \|c 12 \|h 5813-5826
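
The abstract in field 520 mentions a multi-triplet cost function used for metric learning. The record does not give its formula, so the following is only a minimal sketch of a generic triplet margin loss averaged over several triplets, not the authors' pose-based formulation. The function names, the squared Euclidean distance, and the `margin` default are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multi_triplet_loss(anchor, positives, negatives, margin=1.0):
    """Average hinge loss over every (anchor, positive, negative) triplet.

    Hypothetical helper: each triplet contributes
    max(0, d(anchor, positive) - d(anchor, negative) + margin).
    """
    losses = [
        max(0.0, squared_distance(anchor, p) - squared_distance(anchor, n) + margin)
        for p in positives
        for n in negatives
    ]
    return sum(losses) / len(losses)


# One positive close to the anchor, one far negative and one near negative.
anchor = [0.0, 0.0]
positives = [[0.0, 1.0]]
negatives = [[3.0, 0.0], [1.0, 0.0]]
loss = multi_triplet_loss(anchor, positives, negatives)
```

A triplet whose negative already lies farther from the anchor than the positive by at least the margin contributes zero, so only the hard triplets drive the loss.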