Fine-Grained Multilevel Fusion for Anti-Occlusion Monocular 3D Object Detection

We propose a deep fine-grained multi-level fusion architecture for monocular 3D object detection, with an additionally designed anti-occlusion optimization process. Conventional monocular 3D object detection methods usually leverage geometry constraints such as keypoints, object shape relationships,...

Description complète

Détails bibliographiques
Publié dans:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022) vom: 10., Seite 4050-4061
Auteur principal:	Liu, He (Auteur)
Autres auteurs:	Liu, Huaping, Wang, Yikai, Sun, Fuchun, Huang, Wenbing
Format:	Article en ligne
Langue:	English
Publié:	2022
Accès à la collection:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Sujets:	Journal Article


LEADER	01000caa a22002652c 4500
001	NLM342011111
003	DE-627
005	20250303110118.0
007	cr uuu---uuuuu
008	231226s2022 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TIP.2022.3180210 \|2 doi
028	5	2	\|a pubmed25n1139.xml
035			\|a (DE-627)NLM342011111
035			\|a (NLM)35679375
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Liu, He \|e verfasserin \|4 aut
245	1	0	\|a Fine-Grained Multilevel Fusion for Anti-Occlusion Monocular 3D Object Detection
264		1	\|c 2022
336			\|a Text \|b txt \|2 rdacontent
337			\|a ƒaComputermedien \|b c \|2 rdamedia
338			\|a ƒa Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 15.06.2022
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a We propose a deep fine-grained multi-level fusion architecture for monocular 3D object detection, with an additionally designed anti-occlusion optimization process. Conventional monocular 3D object detection methods usually leverage geometry constraints such as keypoints, object shape relationships, and 3D to 2D optimizations to offset the lack of accurate depth information. However, these methods still struggle against directly extracting rich information for fusion from the depth estimation. To solve the problem, we integrate the monocular 3D features with the pseudo-LiDAR filter generation network between fine-grained multi-level layers. Our network utilizes the inherent multi-scale and promotes depth and semantic information flow in different stages. The new architecture can obtain features that incorporate more reliable depth information. At the same time, the problem of occlusion among objects is prevalent in natural scenes yet remains unsolved mainly. We propose a novel loss function that aims at alleviating the problem of occlusion. Extensive experiments have proved that the framework demonstrates a competitive performance, especially for the complex scenes with occlusion
650		4	\|a Journal Article
700	1		\|a Liu, Huaping \|e verfasserin \|4 aut
700	1		\|a Wang, Yikai \|e verfasserin \|4 aut
700	1		\|a Sun, Fuchun \|e verfasserin \|4 aut
700	1		\|a Huang, Wenbing \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society \|d 1992 \|g 31(2022) vom: 10., Seite 4050-4061 \|w (DE-627)NLM09821456X \|x 1941-0042 \|7 nnas
773	1	8	\|g volume:31 \|g year:2022 \|g day:10 \|g pages:4050-4061
856	4	0	\|u http://dx.doi.org/10.1109/TIP.2022.3180210 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 31 \|j 2022 \|b 10 \|h 4050-4061