Fine-Grained Multilevel Fusion for Anti-Occlusion Monocular 3D Object Detection

We propose a deep fine-grained multi-level fusion architecture for monocular 3D object detection, with an additionally designed anti-occlusion optimization process. Conventional monocular 3D object detection methods usually leverage geometry constraints such as keypoints, object shape relationships,...

Description complète

Détails bibliographiques
Publié dans:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022) vom: 10., Seite 4050-4061
Auteur principal: Liu, He (Auteur)
Autres auteurs: Liu, Huaping, Wang, Yikai, Sun, Fuchun, Huang, Wenbing
Format: Article en ligne
Langue:English
Publié: 2022
Accès à la collection:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Sujets:Journal Article
LEADER 01000caa a22002652c 4500
001 NLM342011111
003 DE-627
005 20250303110118.0
007 cr uuu---uuuuu
008 231226s2022 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2022.3180210  |2 doi 
028 5 2 |a pubmed25n1139.xml 
035 |a (DE-627)NLM342011111 
035 |a (NLM)35679375 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Liu, He  |e verfasserin  |4 aut 
245 1 0 |a Fine-Grained Multilevel Fusion for Anti-Occlusion Monocular 3D Object Detection 
264 1 |c 2022 
336 |a Text  |b txt  |2 rdacontent 
337 |a ƒaComputermedien  |b c  |2 rdamedia 
338 |a ƒa Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 15.06.2022 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a We propose a deep fine-grained multi-level fusion architecture for monocular 3D object detection, with an additionally designed anti-occlusion optimization process. Conventional monocular 3D object detection methods usually leverage geometry constraints such as keypoints, object shape relationships, and 3D to 2D optimizations to offset the lack of accurate depth information. However, these methods still struggle against directly extracting rich information for fusion from the depth estimation. To solve the problem, we integrate the monocular 3D features with the pseudo-LiDAR filter generation network between fine-grained multi-level layers. Our network utilizes the inherent multi-scale and promotes depth and semantic information flow in different stages. The new architecture can obtain features that incorporate more reliable depth information. At the same time, the problem of occlusion among objects is prevalent in natural scenes yet remains unsolved mainly. We propose a novel loss function that aims at alleviating the problem of occlusion. Extensive experiments have proved that the framework demonstrates a competitive performance, especially for the complex scenes with occlusion 
650 4 |a Journal Article 
700 1 |a Liu, Huaping  |e verfasserin  |4 aut 
700 1 |a Wang, Yikai  |e verfasserin  |4 aut 
700 1 |a Sun, Fuchun  |e verfasserin  |4 aut 
700 1 |a Huang, Wenbing  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 31(2022) vom: 10., Seite 4050-4061  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnas 
773 1 8 |g volume:31  |g year:2022  |g day:10  |g pages:4050-4061 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2022.3180210  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 31  |j 2022  |b 10  |h 4050-4061