Efficient Visual Computing With Camera RAW Snapshots

Conventional cameras capture image irradiance (RAW) on a sensor and convert it to RGB images using an image signal processor (ISP). The images can then be used for photography or visual computing tasks in a variety of applications, such as public safety surveillance and autonomous driving. One can a...

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), 7, 1 June, pages 4684-4701
Main author: Li, Zhihao (Author)
Other authors: Lu, Ming, Zhang, Xu, Feng, Xin, Asif, M Salman, Ma, Zhan
Format: Online article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000caa a22002652 4500
001 NLM367760045
003 DE-627
005 20240606232334.0
007 cr uuu---uuuuu
008 240130s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2024.3359326  |2 doi 
028 5 2 |a pubmed24n1430.xml 
035 |a (DE-627)NLM367760045 
035 |a (NLM)38285590 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Li, Zhihao  |e verfasserin  |4 aut 
245 1 0 |a Efficient Visual Computing With Camera RAW Snapshots 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 06.06.2024 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Conventional cameras capture image irradiance (RAW) on a sensor and convert it to RGB images using an image signal processor (ISP). The images can then be used for photography or visual computing tasks in a variety of applications, such as public safety surveillance and autonomous driving. One can argue that since RAW images contain all the captured information, the conversion of RAW to RGB using an ISP is not necessary for visual computing. In this paper, we propose a novel ρ-Vision framework to perform high-level semantic understanding and low-level compression using RAW images without the ISP subsystem used for decades. Considering the scarcity of available RAW image datasets, we first develop an unpaired CycleR2R network based on unsupervised CycleGAN to train modular unrolled ISP and inverse ISP (invISP) models using unpaired RAW and RGB images. We can then flexibly generate simulated RAW images (simRAW) using any existing RGB image dataset and finetune different models originally trained in the RGB domain to process real-world camera RAW images. We demonstrate object detection and image compression capabilities in RAW-domain using RAW-domain YOLOv3 and RAW image compressor (RIC) on camera snapshots. Quantitative results reveal that RAW-domain task inference provides better detection accuracy and compression efficiency compared to that in the RGB domain. Furthermore, the proposed ρ-Vision generalizes across various camera sensors and different task-specific models. An added benefit of employing the ρ-Vision is the elimination of the need for ISP, leading to potential reductions in computations and processing times 
650 4 |a Journal Article 
700 1 |a Lu, Ming  |e verfasserin  |4 aut 
700 1 |a Zhang, Xu  |e verfasserin  |4 aut 
700 1 |a Feng, Xin  |e verfasserin  |4 aut 
700 1 |a Asif, M Salman  |e verfasserin  |4 aut 
700 1 |a Ma, Zhan  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 46(2024), 7 vom: 01. Juni, Seite 4684-4701  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:46  |g year:2024  |g number:7  |g day:01  |g month:06  |g pages:4684-4701 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2024.3359326  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 46  |j 2024  |e 7  |b 01  |c 06  |h 4684-4701