A Reconfigurable Tangram Model for Scene Representation and Categorization
Published in: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 25(2016), 1 (12 Jan.), pp. 150-66 |
---|---|
Main author: | |
Other authors: | , , , |
Format: | Online article |
Language: | English |
Published: | 2016 |
Access to the parent work: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society |
Subjects: | Journal Article; Research Support, Non-U.S. Gov't |
Abstract: | This paper presents a hierarchical and compositional scene layout (i.e., spatial configuration) representation and a method of learning a reconfigurable model for scene categorization. Three types of shape primitives (i.e., triangle, parallelogram, and trapezoid), called tans, are used to tile the scene image lattice in a hierarchical and compositional way, and a directed acyclic AND-OR graph (AOG) is proposed to organize the overcomplete dictionary of tan instances placed in the image lattice, exploring a very large number of scene layouts. With certain off-the-shelf appearance features used for grounding terminal nodes (i.e., tan instances) in the AOG, a scene layout is represented by the globally optimal parse tree learned from the AOG via a dynamic programming algorithm, which we call the tangram model. Then, a scene category is represented by a mixture of tangram models discovered with an exemplar-based clustering method. On the basis of the tangram model, we address scene categorization in two aspects: 1) building a tangram bank representation for linear classifiers, which utilizes a collection of tangram models learned from all categories, and 2) building a tangram matching kernel for kernel-based classification, which accounts for all hidden spatial configurations in the AOG. In experiments, our methods are evaluated on three scene data sets for both configuration-level and semantic-level scene categorization, and consistently outperform the spatial pyramid model. |
---|---|
Description: | Date Completed 18.03.2016; Date Revised 11.03.2016; published: Print-Electronic; Citation Status PubMed-not-MEDLINE |
ISSN: | 1941-0042 |
DOI: | 10.1109/TIP.2015.2498407 |