3DCOMPAT++ : An Improved Large-scale 3D Vision Dataset for Compositional Recognition

In this work, we present 3DCOMPAT++, a multimodal 2D/3D dataset with 160 million rendered views of more than 10 million stylized 3D shapes carefully annotated at the part-instance level, alongside matching RGB point clouds, 3D textured meshes, depth maps, and segmentation masks. 3DCOMPAT++ covers 42...

Detailed description

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2025), 11 Aug.
Main author: Slim, Habib (Author)
Other authors: Li, Xiang; Li, Yuchen; Ahmed, Mahmoud; Ayman, Mohamed; Upadhyay, Ujjwal; Abdelreheem, Ahmed; Prajapati, Arpit; Pothigara, Suhail; Wonka, Peter; Elhoseiny, Mohamed
Format: Online article
Language: English
Published: 2025
Access to parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652c 4500
001 NLM390990779
003 DE-627
005 20250812232342.0
007 cr uuu---uuuuu
008 250812s2025 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2025.3597476  |2 doi 
028 5 2 |a pubmed25n1528.xml 
035 |a (DE-627)NLM390990779 
035 |a (NLM)40788793 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Slim, Habib  |e verfasserin  |4 aut 
245 1 0 |a 3DCOMPAT++  |b An Improved Large-scale 3D Vision Dataset for Compositional Recognition 
264 1 |c 2025 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 11.08.2025 
500 |a published: Print-Electronic 
500 |a Citation Status Publisher 
520 |a In this work, we present 3DCOMPAT++, a multimodal 2D/3D dataset with 160 million rendered views of more than 10 million stylized 3D shapes carefully annotated at the part-instance level, alongside matching RGB point clouds, 3D textured meshes, depth maps, and segmentation masks. 3DCOMPAT++ covers 42 shape categories, 275 fine-grained part categories, and 293 fine-grained material classes that can be compositionally applied to parts of 3D objects. We render a subset of one million stylized shapes from four equally spaced views as well as four randomized views, leading to a total of 160 million renderings. Parts are segmented at the instance level, with coarse-grained and fine-grained semantic levels. We introduce a new task, called Grounded CoMPaT Recognition (GCR), to collectively recognize and ground compositions of materials on parts of 3D objects. Additionally, we report the outcomes of a data challenge organized at the CVPR conference, showcasing the winning method's utilization of a modified PointNet++ model trained on 6D inputs, and exploring alternative techniques for GCR enhancement. We hope our work will help ease future research on compositional 3D Vision. The dataset and code have been made publicly available at https://3dcompat-dataset.org/v2/. 3D vision, dataset, 3D modeling, multimodal learning, compositional learning 
650 4 |a Journal Article 
700 1 |a Li, Xiang  |e verfasserin  |4 aut 
700 1 |a Li, Yuchen  |e verfasserin  |4 aut 
700 1 |a Ahmed, Mahmoud  |e verfasserin  |4 aut 
700 1 |a Ayman, Mohamed  |e verfasserin  |4 aut 
700 1 |a Upadhyay, Ujjwal  |e verfasserin  |4 aut 
700 1 |a Abdelreheem, Ahmed  |e verfasserin  |4 aut 
700 1 |a Prajapati, Arpit  |e verfasserin  |4 aut 
700 1 |a Pothigara, Suhail  |e verfasserin  |4 aut 
700 1 |a Wonka, Peter  |e verfasserin  |4 aut 
700 1 |a Elhoseiny, Mohamed  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g PP(2025) vom: 11. Aug.  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnas 
773 1 8 |g volume:PP  |g year:2025  |g day:11  |g month:08 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2025.3597476  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d PP  |j 2025  |b 11  |c 08