Lottery Jackpots Exist in Pre-Trained Models

Network pruning is an effective approach to reduce network complexity with acceptable performance compromise. Existing studies achieve the sparsity of neural networks via time-consuming weight training or complex searching on networks with expanded width, which greatly limits the applications of net...


Bibliographic Details
Published in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 12, 5 Dec., pages 14990-15004
Main author:	Zhang, Yuxin (Author)
Other authors:	Lin, Mingbao, Zhong, Yunshan, Chao, Fei, Ji, Rongrong
Format:	Online article
Language:	English
Published:	2023
Access to the parent work:	IEEE transactions on pattern analysis and machine intelligence
Subjects:	Journal Article


LEADER	01000naa a22002652 4500
001	NLM361668961
003	DE-627
005	20231226085629.0
007	cr uuu---uuuuu
008	231226s2023 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TPAMI.2023.3311783 \|2 doi
028	5	2	\|a pubmed24n1205.xml
035			\|a (DE-627)NLM361668961
035			\|a (NLM)37669203
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Zhang, Yuxin \|e verfasserin \|4 aut
245	1	0	\|a Lottery Jackpots Exist in Pre-Trained Models
264		1	\|c 2023
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 07.11.2023
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Network pruning is an effective approach to reduce network complexity with acceptable performance compromise. Existing studies achieve the sparsity of neural networks via time-consuming weight training or complex searching on networks with expanded width, which greatly limits the applications of network pruning. In this paper, we show that high-performing and sparse sub-networks without the involvement of weight training, termed "lottery jackpots", exist in pre-trained models with unexpanded width. Our presented lottery jackpots are traceable through empirical and theoretical outcomes. For example, we obtain a lottery jackpot that has only 10% of the parameters and still reaches the performance of the original dense VGGNet-19 without any modifications to the pre-trained weights on CIFAR-10. Furthermore, we improve the efficiency of searching for lottery jackpots from two perspectives. First, we observe that the sparse masks derived from many existing pruning criteria have a high overlap with the searched mask of our lottery jackpot, among which magnitude-based pruning results in the mask most similar to ours. In line with this insight, we initialize our sparse mask using magnitude-based pruning, resulting in at least a 3× cost reduction in the lottery jackpot search while achieving comparable or even better performance. Second, we conduct an in-depth analysis of the searching process for lottery jackpots. Our theoretical result suggests that the decrease in training loss during weight searching can be disturbed by the dependency between weights in modern networks. To mitigate this, we propose a novel short restriction method to restrict changes of masks that may have potential negative impacts on the training loss, which leads to faster convergence and reduced oscillation when searching for lottery jackpots. Consequently, our searched lottery jackpot removes 90% of the weights in ResNet-50, while it easily obtains more than 70% top-1 accuracy using only 5 searching epochs on ImageNet.
650		4	\|a Journal Article
700	1		\|a Lin, Mingbao \|e verfasserin \|4 aut
700	1		\|a Zhong, Yunshan \|e verfasserin \|4 aut
700	1		\|a Chao, Fei \|e verfasserin \|4 aut
700	1		\|a Ji, Rongrong \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on pattern analysis and machine intelligence \|d 1979 \|g 45(2023), 12 vom: 05. Dez., Seite 14990-15004 \|w (DE-627)NLM098212257 \|x 1939-3539 \|7 nnns
773	1	8	\|g volume:45 \|g year:2023 \|g number:12 \|g day:05 \|g month:12 \|g pages:14990-15004
856	4	0	\|u http://dx.doi.org/10.1109/TPAMI.2023.3311783 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 45 \|j 2023 \|e 12 \|b 05 \|c 12 \|h 14990-15004