Fine-Grained Multilevel Fusion for Anti-Occlusion Monocular 3D Object Detection

We propose a deep fine-grained multi-level fusion architecture for monocular 3D object detection, with an additionally designed anti-occlusion optimization process. Conventional monocular 3D object detection methods usually leverage geometry constraints such as keypoints, object shape relationships,...


Bibliographic Details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022), day 10, pages 4050-4061
Main Author: Liu, He (Author)
Other Authors: Liu, Huaping; Wang, Yikai; Sun, Fuchun; Huang, Wenbing
Format: Online Article
Language: English
Published: 2022
Access to parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM342011111
003 DE-627
005 20231226013106.0
007 cr uuu---uuuuu
008 231226s2022 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2022.3180210  |2 doi 
028 5 2 |a pubmed24n1139.xml 
035 |a (DE-627)NLM342011111 
035 |a (NLM)35679375 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Liu, He  |e verfasserin  |4 aut 
245 1 0 |a Fine-Grained Multilevel Fusion for Anti-Occlusion Monocular 3D Object Detection 
264 1 |c 2022 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 15.06.2022 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a We propose a deep fine-grained multi-level fusion architecture for monocular 3D object detection, with an additionally designed anti-occlusion optimization process. Conventional monocular 3D object detection methods usually leverage geometry constraints such as keypoints, object shape relationships, and 3D to 2D optimizations to offset the lack of accurate depth information. However, these methods still struggle against directly extracting rich information for fusion from the depth estimation. To solve the problem, we integrate the monocular 3D features with the pseudo-LiDAR filter generation network between fine-grained multi-level layers. Our network utilizes the inherent multi-scale and promotes depth and semantic information flow in different stages. The new architecture can obtain features that incorporate more reliable depth information. At the same time, the problem of occlusion among objects is prevalent in natural scenes yet remains unsolved mainly. We propose a novel loss function that aims at alleviating the problem of occlusion. Extensive experiments have proved that the framework demonstrates a competitive performance, especially for the complex scenes with occlusion 
650 4 |a Journal Article 
700 1 |a Liu, Huaping  |e verfasserin  |4 aut 
700 1 |a Wang, Yikai  |e verfasserin  |4 aut 
700 1 |a Sun, Fuchun  |e verfasserin  |4 aut 
700 1 |a Huang, Wenbing  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 31(2022) vom: 10., Seite 4050-4061  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnns 
773 1 8 |g volume:31  |g year:2022  |g day:10  |g pages:4050-4061 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2022.3180210  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 31  |j 2022  |b 10  |h 4050-4061