A Comprehensive Analysis of Deep Regression

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 42(2020), 9 (16 Sept.), pages 2065-2081
Main Author: Lathuiliere, Stephane (Author)
Other Authors: Mesejo, Pablo; Alameda-Pineda, Xavier; Horaud, Radu
Format: Online Article
Language: English
Published: 2020
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't
Description
Summary: Deep learning revolutionized data science, and recently its popularity has grown exponentially, as did the number of papers employing deep networks. Vision tasks, such as human pose estimation, did not escape from this trend. There is a large number of deep models, where small changes in the network architecture, or in the data pre-processing, together with the stochastic nature of the optimization procedures, produce notably different results, making it extremely difficult to sift methods that significantly outperform others. This situation motivates the current study, in which we perform a systematic evaluation and statistical analysis of vanilla deep regression, i.e., convolutional neural networks with a linear regression top layer. This is the first comprehensive analysis of deep regression techniques. We perform experiments on four vision problems, and report confidence intervals for the median performance as well as the statistical significance of the results, if any. Surprisingly, the variability due to different data pre-processing procedures generally eclipses the variability due to modifications in the network architecture. Our results reinforce the hypothesis according to which, in general, a general-purpose network (e.g., VGG-16 or ResNet-50) adequately tuned can yield results close to the state-of-the-art without having to resort to more complex and ad-hoc regression models.
Description: Date Completed 16.02.2021
Date Revised 16.02.2021
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN:1939-3539
DOI:10.1109/TPAMI.2019.2910523
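
Note on the summary: it mentions reporting confidence intervals for the median performance, but this record does not say how those intervals are computed. As an illustration only, the sketch below uses one standard distribution-free construction based on order statistics and the Binomial(n, 1/2) distribution. The function name `median_confidence_interval` and the sample error values are hypothetical and are not taken from the paper.

```python
import math


def median_confidence_interval(values, confidence=0.95):
    """Distribution-free confidence interval for the median.

    Assumes i.i.d. samples from a continuous distribution. Returns
    (lower, upper, coverage), where coverage is the exact probability
    that the interval [x_(j), x_(n+1-j)] contains the true median.
    For very small n the coverage may fall below `confidence`; the
    widest available interval is returned in that case.
    """
    x = sorted(values)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")

    # K = number of samples below the true median follows Binomial(n, 1/2).
    pmf = [math.comb(n, k) / 2**n for k in range(n + 1)]

    # Start from the widest interval [min, max] and narrow it while the
    # exact coverage 1 - 2 * P(K <= j - 1) stays at or above `confidence`.
    j = 1
    tail = pmf[0]  # P(K <= j - 1) for j = 1
    while j < n and 1 - 2 * (tail + pmf[j]) >= confidence:
        tail += pmf[j]
        j += 1

    coverage = 1 - 2 * tail
    return x[j - 1], x[n - j], coverage


# Hypothetical per-run errors of one model configuration (made-up numbers).
errors = [4.1, 3.8, 4.4, 3.9, 4.0, 4.6, 3.7, 4.2, 4.3, 4.5]
low, high, cov = median_confidence_interval(errors)
print(f"median CI: [{low}, {high}] with exact coverage {cov:.3f}")
```

Because the interval endpoints are observed values, the achieved coverage is usually somewhat above the requested level, which is why the function returns it explicitly.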