Sketch Augmentation-Driven Shape Retrieval Learning Framework Based on Convolutional Neural Networks

Bibliographic details

Published in: IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 8 (21 Aug. 2021), pp. 3558-3570
Main author: Zhou, Wen (Author)
Other authors: Jia, Jinyuan; Jiang, Wenying; Huang, Chenxi
Format: Online article
Language: English
Published: 2021
Collection: IEEE Transactions on Visualization and Computer Graphics
Subjects: Journal Article; Research Support, Non-U.S. Gov't
Description
Abstract: In this article, we present a deep learning approach to sketch-based shape retrieval that incorporates a few novel techniques to improve the quality of the retrieval results. First, to address the scarcity of training sketch data, we present a sketch augmentation method that mimics human sketches more closely than simple image transformations do. Our method generates additional sketches from the existing training data by (i) removing a stroke, (ii) adjusting a stroke, and (iii) rotating the sketch. As such, we generate a large number of sketch samples for training our neural network. Second, we obtain the 2D renderings of each 3D model in the shape database by determining the view positions that best depict the 3D shape: i.e., avoiding self-occlusion, showing the most salient features, and following how a human would normally sketch the model. We use a convolutional neural network (CNN) to learn the best viewing positions of each 3D model and generate their 2D images for the next step. Third, our method uses a cross-domain learning strategy based on two Siamese CNNs that pair up sketches and the 2D shape images. A joint Bayesian measure is used to measure the output similarity from these CNNs, maximizing intra-class similarity and minimizing inter-class similarity. Extensive experiments show that our proposed approach comprehensively outperforms many existing state-of-the-art methods.
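The three stroke-level augmentation operations named in the abstract (removing a stroke, adjusting a stroke, and rotating the sketch) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the stroke representation (a sketch as a list of strokes, each stroke a list of (x, y) points), the jitter magnitude, and the function names are all assumptions.

```python
import math
import random

def remove_stroke(sketch, rng):
    """Drop one randomly chosen stroke (always keep at least one)."""
    if len(sketch) <= 1:
        return [list(s) for s in sketch]
    drop = rng.randrange(len(sketch))
    return [list(s) for i, s in enumerate(sketch) if i != drop]

def adjust_stroke(sketch, rng, jitter=2.0):
    """Perturb the points of one randomly chosen stroke by small offsets."""
    out = [list(s) for s in sketch]
    idx = rng.randrange(len(out))
    out[idx] = [(x + rng.uniform(-jitter, jitter),
                 y + rng.uniform(-jitter, jitter)) for x, y in out[idx]]
    return out

def rotate_sketch(sketch, angle_deg):
    """Rotate every point of every stroke about the origin."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[(x * c - y * s, x * s + y * c) for x, y in stroke]
            for stroke in sketch]

# Example: generate a few augmented variants of a two-stroke sketch.
sketch = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 0.0), (0.0, 10.0)]]
rng = random.Random(0)
variants = [remove_stroke(sketch, rng),
            adjust_stroke(sketch, rng),
            rotate_sketch(sketch, 15.0)]
```

Applying combinations of these operations to each training sketch multiplies the effective size of the training set, which is the role augmentation plays in the paper's pipeline.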
Description: Date Completed 29.09.2021; Date Revised 29.09.2021
Published: Print-Electronic
Citation Status: PubMed-not-MEDLINE
ISSN: 1941-0506
DOI: 10.1109/TVCG.2020.2975504