Variational HyperAdam: A Meta-Learning Approach to Network Training

Stochastic optimization algorithms have long been popular for training deep neural networks. Recently, a new class of learning-based optimizers has emerged and achieved promising performance for training neural networks. However, these black-box learning-based optimizers do not fully exploit the experience embodied in human-designed optimizers and rely heavily on learning from meta-training tasks; they therefore have limited generalization ability. In this paper, we propose a novel optimizer, dubbed Variational HyperAdam, which embeds a parametric generalized Adam algorithm, i.e., HyperAdam, in a variational framework. With Variational HyperAdam as the optimizer for training a neural network, the parameter update vector of the network at each training step is treated as a random variable whose approximate posterior distribution, given the training data and the current network parameter vector, is predicted by Variational HyperAdam. The parameter update vector used for network training is sampled from this approximate posterior distribution. Specifically, in Variational HyperAdam we design a learnable generalized Adam algorithm for estimating the expectation, paired with a VarBlock for estimating the variance, of the approximate posterior distribution of the parameter update vector. Variational HyperAdam is learned in a meta-learning framework with a meta-training loss derived by variational inference. Experiments verify that the learned Variational HyperAdam achieves state-of-the-art network training performance for various types of networks on different datasets, including multilayer perceptrons, CNNs, LSTMs and ResNets.
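Below is a minimal sketch of the sampled update step described in the abstract, under strong simplifying assumptions: the approximate posterior over the update vector is taken to be a diagonal Gaussian N(mean, diag(var)), the mean comes from a plain (non-learned) Adam-style step, and var_block is a purely illustrative stand-in for the paper's learned VarBlock. None of the function names or formulas below come from the paper itself; in the actual method both components are learned networks trained by meta-learning with a variational-inference loss.

import numpy as np

rng = np.random.default_rng(0)

def adam_mean(grad, m, v, beta1=0.9, beta2=0.999, lr=1e-3, eps=1e-8):
    # Adam-style moment tracking; in the paper this role is played by a
    # *learnable* generalized Adam that predicts the posterior mean.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    return -lr * m / (np.sqrt(v) + eps), m, v

def var_block(grad, v, scale=1e-3):
    # Hypothetical stand-in for the paper's VarBlock: maps gradient
    # statistics to a per-coordinate variance. The real VarBlock is a
    # learned module; this formula is purely illustrative.
    return (scale * np.abs(grad) / (np.sqrt(v) + 1e-8)) ** 2

# One training step on a toy quadratic loss L(theta) = ||theta||^2.
theta = rng.normal(size=10)               # current network parameters
m = np.zeros_like(theta)                  # first-moment state
v = np.zeros_like(theta)                  # second-moment state
grad = 2.0 * theta                        # gradient of the toy loss

mean, m, v = adam_mean(grad, m, v)        # posterior mean of the update
var = var_block(grad, v)                  # posterior variance of the update
update = rng.normal(mean, np.sqrt(var))   # sample update from N(mean, var)
theta = theta + update                    # apply the sampled update

Sampling the update from the predicted posterior, rather than applying the mean deterministically, is what distinguishes this scheme from standard Adam according to the abstract.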


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 44(2022), 8, 23 Aug., pages 4469-4484
Main Author: Wang, Shipeng (Author)
Other Authors: Yang, Yan, Sun, Jian, Xu, Zongben
Format: Online Article
Language: English
Published: 2022
Access to parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM321795210
003 DE-627
005 20231225180641.0
007 cr uuu---uuuuu
008 231225s2022 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2021.3061581  |2 doi 
028 5 2 |a pubmed24n1072.xml 
035 |a (DE-627)NLM321795210 
035 |a (NLM)33621172 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Wang, Shipeng  |e verfasserin  |4 aut 
245 1 0 |a Variational HyperAdam  |b A Meta-Learning Approach to Network Training 
264 1 |c 2022 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 06.07.2022 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Stochastic optimization algorithms have long been popular for training deep neural networks. Recently, a new class of learning-based optimizers has emerged and achieved promising performance for training neural networks. However, these black-box learning-based optimizers do not fully exploit the experience embodied in human-designed optimizers and rely heavily on learning from meta-training tasks; they therefore have limited generalization ability. In this paper, we propose a novel optimizer, dubbed Variational HyperAdam, which embeds a parametric generalized Adam algorithm, i.e., HyperAdam, in a variational framework. With Variational HyperAdam as the optimizer for training a neural network, the parameter update vector of the network at each training step is treated as a random variable whose approximate posterior distribution, given the training data and the current network parameter vector, is predicted by Variational HyperAdam. The parameter update vector used for network training is sampled from this approximate posterior distribution. Specifically, in Variational HyperAdam we design a learnable generalized Adam algorithm for estimating the expectation, paired with a VarBlock for estimating the variance, of the approximate posterior distribution of the parameter update vector. Variational HyperAdam is learned in a meta-learning framework with a meta-training loss derived by variational inference. Experiments verify that the learned Variational HyperAdam achieves state-of-the-art network training performance for various types of networks on different datasets, including multilayer perceptrons, CNNs, LSTMs and ResNets
650 4 |a Journal Article 
700 1 |a Yang, Yan  |e verfasserin  |4 aut 
700 1 |a Sun, Jian  |e verfasserin  |4 aut 
700 1 |a Xu, Zongben  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 44(2022), 8 vom: 23. Aug., Seite 4469-4484  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:44  |g year:2022  |g number:8  |g day:23  |g month:08  |g pages:4469-4484 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2021.3061581  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 44  |j 2022  |e 8  |b 23  |c 08  |h 4469-4484