Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

During image editing, existing deep generative models tend to re-synthesize the entire output from scratch, including the unedited regions. This leads to a significant waste of computation, especially for minor editing operations. In this work, we present Spatially Sparse Inference (SSI), a general-...

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 12, 5 Dec., pages 14465-14480
Main author: Li, Muyang (Author)
Other authors: Lin, Ji; Meng, Chenlin; Ermon, Stefano; Han, Song; Zhu, Jun-Yan
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM362103674
003 DE-627
005 20231226090533.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2023.3316020  |2 doi 
028 5 2 |a pubmed24n1206.xml 
035 |a (DE-627)NLM362103674 
035 |a (NLM)37713217 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Li, Muyang  |e verfasserin  |4 aut 
245 1 0 |a Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 07.11.2023 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a During image editing, existing deep generative models tend to re-synthesize the entire output from scratch, including the unedited regions. This leads to a significant waste of computation, especially for minor editing operations. In this work, we present Spatially Sparse Inference (SSI), a general-purpose technique that selectively performs computation for edited regions and accelerates various generative models, including both conditional GANs and diffusion models. Our key observation is that users tend to edit the input image gradually. This motivates us to cache and reuse the feature maps of the original image. Given an edited image, we sparsely apply the convolutional filters to the edited regions while reusing the cached features for the unedited areas. Based on our algorithm, we further propose Sparse Incremental Generative Engine (SIGE) to convert the computation reduction to latency reduction on off-the-shelf hardware. With about 1%-area edits, SIGE accelerates DDPM by 3.0× on NVIDIA RTX 3090 and 4.6× on Apple M1 Pro GPU, Stable Diffusion by 7.2× on 3090, and GauGAN by 5.6× on 3090 and 5.2× on M1 Pro GPU. Compared to our conference paper, we enhance SIGE to accommodate attention layers and apply it to Stable Diffusion. Additionally, we offer support for Apple M1 Pro GPU and include more results to substantiate the efficacy of our method. 
650 4 |a Journal Article 
700 1 |a Lin, Ji  |e verfasserin  |4 aut 
700 1 |a Meng, Chenlin  |e verfasserin  |4 aut 
700 1 |a Ermon, Stefano  |e verfasserin  |4 aut 
700 1 |a Han, Song  |e verfasserin  |4 aut 
700 1 |a Zhu, Jun-Yan  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 45(2023), 12 vom: 05. Dez., Seite 14465-14480  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:45  |g year:2023  |g number:12  |g day:05  |g month:12  |g pages:14465-14480 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2023.3316020  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 45  |j 2023  |e 12  |b 05  |c 12  |h 14465-14480
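
Illustrative note on the abstract (field 520): the caching idea described there can be sketched in a few lines. This is only a toy sketch under stated assumptions, not the authors' SIGE implementation: it uses plain Python lists, a single-channel image, one zero-padded convolution layer, and a single bounding box around the edit. The function names `conv_at`, `conv2d_same`, and `sparse_update` are hypothetical and do not come from the paper or any library.

```python
def conv_at(image, kernel, y, x):
    """One output value: kernel centred on (y, x), zeros outside the image."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    total = 0.0
    for dy in range(kh):
        for dx in range(kw):
            yy, xx = y + dy - kh // 2, x + dx - kw // 2
            if 0 <= yy < h and 0 <= xx < w:
                total += kernel[dy][dx] * image[yy][xx]
    return total


def conv2d_same(image, kernel):
    """Dense baseline: recompute every output pixel."""
    h, w = len(image), len(image[0])
    return [[conv_at(image, kernel, y, x) for x in range(w)] for y in range(h)]


def sparse_update(cached_out, original, edited, kernel):
    """Reuse cached outputs; recompute only those near edited pixels.

    Assumption: the edit is covered by one bounding box, grown by the
    kernel radius so every output whose receptive field touches an
    edited pixel is refreshed. Returns (output, recomputed_pixel_count).
    """
    h, w = len(edited), len(edited[0])
    ph, pw = len(kernel) // 2, len(kernel[0]) // 2
    changed = [(y, x) for y in range(h) for x in range(w)
               if original[y][x] != edited[y][x]]
    out = [row[:] for row in cached_out]  # start from the cached feature map
    if not changed:
        return out, 0
    y0 = max(min(y for y, _ in changed) - ph, 0)
    y1 = min(max(y for y, _ in changed) + ph + 1, h)
    x0 = max(min(x for _, x in changed) - pw, 0)
    x1 = min(max(x for _, x in changed) + pw + 1, w)
    for y in range(y0, y1):
        for x in range(x0, x1):
            out[y][x] = conv_at(edited, kernel, y, x)
    return out, (y1 - y0) * (x1 - x0)
```

In this sketch the cached map would be computed once with `conv2d_same(original, kernel)`, and each later edit would call `sparse_update`, so the work scales with the edited area rather than the image size. A multi-layer network would need the affected region to grow layer by layer, which this toy example does not cover.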