Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction

Multimodal fusion and multitask learning are two vital topics in machine learning. Despite the fruitful progress, existing methods for both problems are still brittle to the same challenge-it remains dilemmatic to integrate the common information across modalities (resp. tasks) meanwhile preserving...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 5 vom: 13. Mai, Seite 5481-5496
1. Verfasser: Wang, Yikai (VerfasserIn)
Weitere Verfasser: Sun, Fuchun, Huang, Wenbing, He, Fengxiang, Tao, Dacheng
Format: Online-Aufsatz
Sprache:English
Veröffentlicht: 2023
Zugriff auf das übergeordnete Werk:IEEE transactions on pattern analysis and machine intelligence
Schlagworte:Journal Article
LEADER 01000naa a22002652 4500
001 NLM346945135
003 DE-627
005 20231226032647.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2022.3211086  |2 doi 
028 5 2 |a pubmed24n1156.xml 
035 |a (DE-627)NLM346945135 
035 |a (NLM)36178992 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Wang, Yikai  |e verfasserin  |4 aut 
245 1 0 |a Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a ƒaComputermedien  |b c  |2 rdamedia 
338 |a ƒa Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 10.04.2023 
500 |a Date Revised 10.04.2023 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Multimodal fusion and multitask learning are two vital topics in machine learning. Despite the fruitful progress, existing methods for both problems are still brittle to the same challenge-it remains dilemmatic to integrate the common information across modalities (resp. tasks) meanwhile preserving the specific patterns of each modality (resp. task). Besides, while they are actually closely related to each other, multimodal fusion and multitask learning are rarely explored within the same methodological framework before. In this paper, we propose Channel-Exchanging-Network (CEN) which is self-adaptive, parameter-free, and more importantly, applicable for multimodal and multitask dense image prediction. At its core, CEN adaptively exchanges channels between subnetworks of different modalities. Specifically, the channel exchanging process is self-guided by individual channel importance that is measured by the magnitude of Batch-Normalization (BN) scaling factor during training. For the application of dense image prediction, the validity of CEN is tested by four different scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of CEN compared to state-of-the-art methods. Detailed ablation studies have also been carried out, which demonstrate the advantage of each component we propose. Our code is available at https://github.com/yikaiw/CEN 
650 4 |a Journal Article 
700 1 |a Sun, Fuchun  |e verfasserin  |4 aut 
700 1 |a Huang, Wenbing  |e verfasserin  |4 aut 
700 1 |a He, Fengxiang  |e verfasserin  |4 aut 
700 1 |a Tao, Dacheng  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 45(2023), 5 vom: 13. Mai, Seite 5481-5496  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:45  |g year:2023  |g number:5  |g day:13  |g month:05  |g pages:5481-5496 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2022.3211086  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 45  |j 2023  |e 5  |b 13  |c 05  |h 5481-5496