Non-Graph Data Clustering via O(n) Bipartite Graph Convolution

Since the representative capacity of graph-based clustering methods is usually limited by the graph constructed on the original features, it is attractive to find whether graph neural networks (GNNs), a strong extension of neural networks to graphs, can be applied to augment the capacity of graph-ba...

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 7, dated 07 July, pages 8729-8742
Main author: Zhang, Hongyuan (Author)
Other authors: Shi, Jiankun; Zhang, Rui; Li, Xuelong
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM355202794
003 DE-627
005 20231226063840.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2022.3231470  |2 doi 
028 5 2 |a pubmed24n1183.xml 
035 |a (DE-627)NLM355202794 
035 |a (NLM)37015533 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Zhang, Hongyuan  |e verfasserin  |4 aut 
245 1 0 |a Non-Graph Data Clustering via O(n) Bipartite Graph Convolution 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 06.06.2023 
500 |a Date Revised 06.06.2023 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Since the representative capacity of graph-based clustering methods is usually limited by the graph constructed on the original features, it is attractive to ask whether graph neural networks (GNNs), a strong extension of neural networks to graphs, can be applied to augment the capacity of graph-based clustering methods. The core problems come mainly from two aspects. On the one hand, the graph is unavailable in most general clustering scenarios, so how to construct a graph on non-graph data, and the quality of that graph, are usually the most important issues. On the other hand, given n samples, graph-based clustering methods usually need at least O(n^2) time to build graphs, and graph convolution requires nearly O(n^2) for a dense graph and O(|E|) for a sparse one with |E| edges. Accordingly, both graph-based clustering and GNNs suffer from severe inefficiency. To tackle these problems, we propose a novel clustering method, AnchorGAE, with self-supervised graph estimation and efficient graph convolution. We first show how to convert a non-graph dataset into a graph dataset by introducing a generative graph model and anchors. A bipartite graph is built by generating anchors and estimating the connectivity distributions between original points and anchors. We then show that the constructed bipartite graph can reduce the computational complexity of graph convolution from O(n^2) and O(|E|) to O(n). The succeeding steps for clustering can be easily designed as O(n) operations. Interestingly, the anchors naturally lead to a siamese architecture with the help of the Markov process. Furthermore, the estimated bipartite graph is updated dynamically according to the features extracted by the GNN modules, to improve the quality of the graph by exploiting the high-level information extracted by GNNs. 
However, we theoretically prove that the self-supervised paradigm frequently results in a collapse that often occurs after 2-3 update iterations in experiments, especially when the model is well-trained. A specific strategy is accordingly designed to prevent the collapse. The experiments support the theoretical analysis and show the superiority of AnchorGAE. 
650 4 |a Journal Article 
700 1 |a Shi, Jiankun  |e verfasserin  |4 aut 
700 1 |a Zhang, Rui  |e verfasserin  |4 aut 
700 1 |a Li, Xuelong  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 45(2023), 7 vom: 07. Juli, Seite 8729-8742  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:45  |g year:2023  |g number:7  |g day:07  |g month:07  |g pages:8729-8742 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2022.3231470  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 45  |j 2023  |e 7  |b 07  |c 07  |h 8729-8742
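
The abstract's claim that a point-anchor bipartite graph brings graph convolution down to O(n) can be illustrated with a small dependency-free sketch. It is not the AnchorGAE implementation. It assumes randomly sampled anchors, a fixed Gaussian-style weighting over the k nearest anchors in place of the learned connectivity distributions, and a two-step point-to-anchor-to-point propagation B Δ^{-1} B^T X with no trainable weights or nonlinearity. The names `build_bipartite` and `bipartite_conv` are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_bipartite(points, anchors, k):
    """Connect each point to its k nearest anchors (assumed weighting scheme).

    Returns one dict per point, {anchor_index: weight}, with weights summing
    to 1, i.e. a sparse row of the n-by-m bipartite matrix B.
    """
    rows = []
    for p in points:
        nearest = sorted((sq_dist(p, a), j) for j, a in enumerate(anchors))[:k]
        d_min = nearest[0][0]
        # Shift by the smallest distance so at least one weight equals 1.
        weights = [math.exp(-(dist - d_min)) for dist, _ in nearest]
        total = sum(weights)
        rows.append({j: w / total for (_, j), w in zip(nearest, weights)})
    return rows


def bipartite_conv(rows, feats, m):
    """Compute B @ inv(Delta) @ B.T @ feats without forming an n-by-n matrix.

    Delta is the diagonal matrix of anchor degrees (column sums of B).
    With k nonzeros per row and d features the cost is O(n * k * d).
    """
    d = len(feats[0])
    anchor_sum = [[0.0] * d for _ in range(m)]
    anchor_deg = [0.0] * m
    for row, f in zip(rows, feats):  # points -> anchors: B.T @ feats
        for j, w in row.items():
            anchor_deg[j] += w
            for c in range(d):
                anchor_sum[j][c] += w * f[c]
    for j in range(m):  # normalise each anchor by its degree
        if anchor_deg[j] > 0:
            anchor_sum[j] = [v / anchor_deg[j] for v in anchor_sum[j]]
    # anchors -> points: B @ (normalised anchor features)
    return [
        [sum(w * anchor_sum[j][c] for j, w in row.items()) for c in range(d)]
        for row in rows
    ]


random.seed(0)
points = [[random.random(), random.random()] for _ in range(200)]
anchors = random.sample(points, 10)  # assumption: anchors drawn from the data
rows = build_bipartite(points, anchors, k=3)
smoothed = bipartite_conv(rows, points, m=len(anchors))
```

Because every row of B sums to 1, the implied n-by-n matrix B Δ^{-1} B^T is row-stochastic, so a constant feature passes through unchanged. With k and the number of anchors held fixed, both passes touch each point a constant number of times, which is where the linear cost in n comes from in this sketch.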