REAF: Remembering Enhancement and Entropy-Based Asymptotic Forgetting for Filter Pruning
Published in: | IEEE Transactions on Image Processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023), dated: 20., pages 3912-3923 |
Author: | |
Further authors: | , , , |
Format: | Online article |
Language: | English |
Published: | 2023 |
Access to the parent work: | IEEE Transactions on Image Processing : a publication of the IEEE Signal Processing Society |
Keywords: | Journal Article |
Abstract: | Neurologically, filter pruning is a procedure of forgetting and remembering recovery. Prevailing methods directly forget less important information from an unrobust baseline first and expect to minimize the performance sacrifice. However, unsaturated base remembering imposes a ceiling on the slimmed model, leading to suboptimal performance, and forgetting significantly at the start causes unrecoverable information loss. Here, we design a novel filter pruning paradigm termed Remembering Enhancement and Entropy-based Asymptotic Forgetting (REAF). Inspired by robustness theory, we first enhance remembering by over-parameterizing the baseline with fusible compensatory convolutions, which liberates the pruned model from the bondage of the baseline at no inference cost. The collateral implication between original and compensatory filters then necessitates a bilateral-collaborated pruning criterion: a pair of filters is preserved only when the original filter has the largest intra-branch distance and its compensatory counterpart has the strongest remembering-enhancement power. Further, Ebbinghaus-curve-based asymptotic forgetting is proposed to protect the pruned model from unstable learning. The number of pruned filters increases asymptotically during training, so the remembering of the pretrained weights is gradually concentrated in the remaining filters. Extensive experiments demonstrate the superiority of REAF over many state-of-the-art (SOTA) methods. For example, REAF removes 47.55% of FLOPs and 42.98% of parameters from ResNet-50 with only a 0.98% top-1 accuracy loss on ImageNet. The code is available at https://github.com/zhangxin-xd/REAF |
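The abstract mentions two mechanisms that lend themselves to a brief sketch: compensatory convolutions that can be folded into the main convolution at inference time, and an asymptotic schedule for how many filters are pruned as training progresses. The snippet below is a minimal illustration only, assuming PyTorch; the function names, the exponential-saturation curve used as a stand-in for the Ebbinghaus schedule, and the constant `tau` are assumptions for illustration and are not taken from the paper or its released code (see the linked repository for the authors' implementation).

```python
# Minimal sketch of two ideas from the REAF abstract (not the authors' code).
# Assumptions: PyTorch; an exponential-saturation curve approximates the
# Ebbinghaus-style schedule; `tau` and all names are illustrative.
import math
import torch
import torch.nn as nn


def asymptotic_prune_count(step: int, total_steps: int, target: int, tau: float = 0.3) -> int:
    """Number of filters to prune at a given training step.

    Grows asymptotically toward `target`, so forgetting is gradual and the
    remaining filters can first absorb the pretrained "remembering".
    """
    progress = step / max(total_steps, 1)
    return int(round(target * (1.0 - math.exp(-progress / tau))))


def fuse_parallel_convs(main: nn.Conv2d, comp: nn.Conv2d) -> nn.Conv2d:
    """Fold a parallel compensatory conv into the main conv.

    For two convs of identical configuration applied to the same input and
    summed, (W1*x + b1) + (W2*x + b2) = (W1+W2)*x + (b1+b2), so the extra
    branch adds no inference cost once fused.
    """
    fused = nn.Conv2d(
        main.in_channels, main.out_channels, main.kernel_size,
        stride=main.stride, padding=main.padding, bias=True,
    )
    with torch.no_grad():
        fused.weight.copy_(main.weight + comp.weight)
        b_main = main.bias if main.bias is not None else torch.zeros(main.out_channels)
        b_comp = comp.bias if comp.bias is not None else torch.zeros(comp.out_channels)
        fused.bias.copy_(b_main + b_comp)
    return fused


if __name__ == "__main__":
    # Schedule example: prune toward 64 filters over 100 epochs.
    print([asymptotic_prune_count(e, 100, 64) for e in (0, 10, 50, 100)])

    # Fusion example: the fused conv matches the summed parallel branches.
    x = torch.randn(1, 8, 16, 16)
    a = nn.Conv2d(8, 16, 3, padding=1)
    b = nn.Conv2d(8, 16, 3, padding=1)
    fused = fuse_parallel_convs(a, b)
    print(torch.allclose(a(x) + b(x), fused(x), atol=1e-5))
```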
Description: | Date completed: 18.07.2023; date revised: 18.07.2023; published: Print-Electronic; citation status: PubMed-not-MEDLINE |
ISSN: | 1941-0042 |
DOI: | 10.1109/TIP.2023.3288986 |