Efficient Layer Compression Without Pruning

Network pruning is one of the chief means of improving the computational efficiency of Deep Neural Networks (DNNs). Pruning-based methods generally discard network kernels, channels, or layers, which inevitably disrupts the original well-learned network correlation and thus leads to performa...

Detailed description

Bibliographic details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023), day 24, pages 4689-4700
Main author: Wu, Jie (author)
Other authors: Zhu, Dingshun, Fang, Leyuan, Deng, Yue, Zhong, Zhun
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM360612059
003 DE-627
005 20231226083358.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2023.3302519  |2 doi 
028 5 2 |a pubmed24n1201.xml 
035 |a (DE-627)NLM360612059 
035 |a (NLM)37561618 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Wu, Jie  |e verfasserin  |4 aut 
245 1 0 |a Efficient Layer Compression Without Pruning 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 17.08.2023 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Network pruning is one of the chief means of improving the computational efficiency of Deep Neural Networks (DNNs). Pruning-based methods generally discard network kernels, channels, or layers, which inevitably disrupts the original well-learned network correlation and thus leads to performance degeneration. In this work, we propose an Efficient Layer Compression (ELC) approach that compresses serial layers by decoupling and merging rather than pruning. Specifically, we first propose a novel decoupling module to decouple the layers, enabling us to readily merge serial layers that include both nonlinear and convolutional layers. Then, the decoupled network is losslessly merged based on the equivalent conversion of the parameters. In this way, our ELC can effectively reduce the depth of the network without destroying the correlation of the convolutional layers. To the best of our knowledge, we are the first to exploit the mergeability of serial convolutional layers for lossless network layer compression. Experiments conducted on two datasets demonstrate that our method retains superior performance with a FLOPs reduction of 74.1% for VGG-16 and 54.6% for ResNet-56. In addition, our ELC improves the inference speed by 2× on a Jetson AGX Xavier edge device. 
650 4 |a Journal Article 
700 1 |a Zhu, Dingshun  |e verfasserin  |4 aut 
700 1 |a Fang, Leyuan  |e verfasserin  |4 aut 
700 1 |a Deng, Yue  |e verfasserin  |4 aut 
700 1 |a Zhong, Zhun  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 32(2023) vom: 24., Seite 4689-4700  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnns 
773 1 8 |g volume:32  |g year:2023  |g day:24  |g pages:4689-4700 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2023.3302519  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 32  |j 2023  |b 24  |h 4689-4700
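
Illustrative note on the abstract (field 520): the abstract states that serial convolutional layers can be merged losslessly through an equivalent conversion of their parameters. The sketch below is not the authors' ELC method; it only illustrates the underlying linear-algebra fact under simplifying assumptions that are not taken from the record: one dimension, a single channel, stride 1, no padding, no bias, and no nonlinearity between the two layers. The function names `correlate_valid` and `merge_kernels` are hypothetical. The decoupling module that the abstract describes for handling nonlinear layers is not reproduced here.

```python
def correlate_valid(signal, kernel):
    """Slide `kernel` over `signal` with stride 1 and no padding."""
    n_out = len(signal) - len(kernel) + 1
    return [
        sum(kernel[a] * signal[i + a] for a in range(len(kernel)))
        for i in range(n_out)
    ]


def merge_kernels(k1, k2):
    """Return the kernel of one layer equivalent to applying k1, then k2.

    Assumes nothing nonlinear sits between the two layers. The merged
    kernel has length len(k1) + len(k2) - 1, so depth is traded for a
    wider kernel.
    """
    merged = [0.0] * (len(k1) + len(k2) - 1)
    for a, w1 in enumerate(k1):
        for b, w2 in enumerate(k2):
            merged[a + b] += w1 * w2
    return merged


# Toy data, chosen only for illustration.
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 5.0, -2.0]
k1 = [1.0, 0.0, -1.0]
k2 = [2.0, 1.0, 1.0]

# Two stacked layers versus the single merged layer.
two_layers = correlate_valid(correlate_valid(signal, k1), k2)
one_layer = correlate_valid(signal, merge_kernels(k1, k2))

# Both paths should agree up to floating-point rounding.
assert all(abs(p - q) < 1e-9 for p, q in zip(two_layers, one_layer))
```

In this toy setting the two 3-tap kernels collapse into one 5-tap kernel, so the layer count drops from two to one while the output stays the same. How such merging interacts with multi-channel kernels, padding, and the FLOPs figures quoted in the abstract is beyond what this record documents.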