Recurrent Convolutional Shape Regression


Bibliographic Details
Published in:	IEEE Transactions on Pattern Analysis and Machine Intelligence. - 1979. - 40(2018), no. 11 (30 Nov.), pp. 2569-2582
Main Author:	Wang, Wei (Author)
Other Authors:	Tulyakov, Sergey; Sebe, Nicu
Format:	Online Article
Language:	English
Published:	2018
Access to parent work:	IEEE Transactions on Pattern Analysis and Machine Intelligence
Subjects:	Journal Article

Description
Summary:	The mainstream direction in face alignment is now dominated by cascaded regression methods. These methods start from an image with an initial shape and build a set of shape increments based on features computed with respect to the current estimated shape. These shape increments move the initial shape to the desired location. Despite the advantages of the cascaded methods, they all share two major limitations: (i) shape increments are learned independently from each other in a cascaded manner, and (ii) the use of standard generic computer vision features such as SIFT and HOG does not allow these methods to learn problem-specific features. In this work, we propose a novel Recurrent Convolutional Shape Regression (RCSR) method that overcomes these limitations. We formulate the standard cascaded alignment problem as a recurrent process and learn all shape increments jointly, by using a recurrent neural network with a gated recurrent unit. Importantly, by combining a convolutional neural network with a recurrent one, we avoid the hand-crafted features widely adopted in the literature and thus allow the model to learn task-specific features. In addition, we employ the convolutional gated recurrent unit, which takes feature tensors as input instead of flattened feature vectors. Therefore, the spatial structure of the features can be better preserved in the memory of the recurrent neural network. Moreover, both the convolutional and the recurrent neural networks are learned jointly. Experimental evaluation shows that the proposed method performs better than the state-of-the-art methods, and further supports the importance of learning a single end-to-end model for face alignment.
Notes:	Date Revised 20.11.2019; published: Print-Electronic; Citation Status: PubMed-not-MEDLINE
ISSN:	1939-3539
DOI:	10.1109/TPAMI.2018.2810881
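
Illustration:	The summary describes shape regression as a recurrent process in which a convolutional gated recurrent unit keeps a spatial memory and the shape is updated by one increment per step. The following is a minimal sketch of that idea, not the authors' implementation. It assumes single-channel feature maps, 3x3 kernels without bias terms, and one common GRU gate convention; `features_at`, `readout_w`, and the mean-pooling readout are hypothetical stand-ins for the learned CNN feature extractor and the regression layer.

```python
import math


def conv2d_same(x, k):
    """3x3 cross-correlation with zero padding on a 2D list (output keeps input size)."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        s += x[ii][jj] * k[di + 1][dj + 1]
            out[i][j] = s
    return out


def zip_map(f, *maps):
    """Apply f elementwise across same-sized 2D lists."""
    return [[f(*vals) for vals in zip(*rows)] for rows in zip(*maps)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv_gru_step(x, h, p):
    """One convolutional GRU update. x and h are 2D maps; p holds 3x3 kernels."""
    # Update gate z and reset gate r are themselves spatial maps.
    z = zip_map(lambda a, b: sigmoid(a + b),
                conv2d_same(x, p["wz"]), conv2d_same(h, p["uz"]))
    r = zip_map(lambda a, b: sigmoid(a + b),
                conv2d_same(x, p["wr"]), conv2d_same(h, p["ur"]))
    rh = zip_map(lambda a, b: a * b, r, h)
    cand = zip_map(lambda a, b: math.tanh(a + b),
                   conv2d_same(x, p["wh"]), conv2d_same(rh, p["uh"]))
    # Convex combination of the old memory and the candidate memory.
    return zip_map(lambda zz, hh, cc: (1.0 - zz) * hh + zz * cc, z, h, cand)


def refine_shape(features_at, shape, params, readout_w, steps=3):
    """Recurrent refinement: the same parameters are reused at every step."""
    first = features_at(shape)
    h = [[0.0] * len(first[0]) for _ in first]  # spatial memory starts at zero
    for _ in range(steps):
        x = features_at(shape)            # features depend on the current shape
        h = conv_gru_step(x, h, params)   # memory links the steps together
        mean_h = sum(map(sum, h)) / (len(h) * len(h[0]))
        delta = [w * mean_h for w in readout_w]      # hypothetical readout
        shape = [s + d for s, d in zip(shape, delta)]  # apply the increment
    return shape, h


def make_kernel(v):
    """Constant 3x3 kernel, used only to give the toy example some weights."""
    return [[v] * 3 for _ in range(3)]


# Toy usage: features are a constant 4x4 map driven by the mean coordinate.
toy_params = {name: make_kernel(0.1) for name in ("wz", "uz", "wr", "ur", "wh", "uh")}
toy_features = lambda s: [[1.0 - sum(s) / len(s)] * 4 for _ in range(4)]
final_shape, memory = refine_shape(toy_features, [0.0, 0.0, 0.0, 0.0],
                                   toy_params, readout_w=[0.5, 0.5, 0.5, 0.5])
```

Because the gates and the hidden state are maps rather than flat vectors, the memory keeps the same height and width as the input features at every step, which is the property the summary attributes to the convolutional gated recurrent unit.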