Diffusion maps and coarse-graining : A unified framework for dimensionality reduction, graph partitioning, and data set parameterization

We provide evidence that nonlinear dimensionality reduction, clustering, and data set parameterization can be solved within one and the same framework. The main idea is to define a system of coordinates with an explicit metric that reflects the connectivity of a given data set and that is robust to...
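A minimal sketch of how such a coordinate system might be computed, assuming a Gaussian kernel on pairwise distances and a random walk obtained by row-normalizing it. The function name `diffusion_map` and the parameters `epsilon`, `n_coords`, and `t` are illustrative choices, not taken from the record or from the authors' code.

```python
import numpy as np

def diffusion_map(X, epsilon, n_coords=2, t=1):
    """Embed the rows of X into n_coords diffusion coordinates (illustrative sketch)."""
    # Pairwise squared Euclidean distances between data points.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian kernel weights (assumed choice; epsilon sets the neighborhood scale).
    W = np.exp(-sq / epsilon)
    d = W.sum(axis=1)
    # Symmetric matrix similar to the Markov matrix P = D^{-1} W,
    # so a symmetric eigensolver can be used.
    S = W / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    # eigh returns eigenvalues in ascending order; sort them descending.
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    # Right eigenvectors of P, scaled so the leading one is constant (all ones).
    psi = np.sqrt(d.sum()) * vecs / np.sqrt(d)[:, None]
    # Skip the trivial eigenvector (eigenvalue 1) and weight the rest
    # by eigenvalue ** t, where t is an integer number of random-walk steps.
    return psi[:, 1:n_coords + 1] * vals[1:n_coords + 1] ** t
```

Under these assumptions, points that are strongly connected through many short paths land close together in the returned coordinates, so a standard clustering routine can be run on the output rows.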


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 28(2006), 9, 23 Sept., pages 1393-403
Main author: Lafon, Stéphane (author)
Other authors: Lee, Ann B
Format: Article
Language: English
Published: 2006
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM164973338
003 DE-627
005 20231223103713.0
007 tu
008 231223s2006 xx ||||| 00| ||eng c
028 5 2 |a pubmed24n0550.xml 
035 |a (DE-627)NLM164973338 
035 |a (NLM)16929727 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Lafon, Stéphane  |e verfasserin  |4 aut 
245 1 0 |a Diffusion maps and coarse-graining  |b A unified framework for dimensionality reduction, graph partitioning, and data set parameterization 
264 1 |c 2006 
336 |a Text  |b txt  |2 rdacontent 
337 |a ohne Hilfsmittel zu benutzen  |b n  |2 rdamedia 
338 |a Band  |b nc  |2 rdacarrier 
500 |a Date Completed 20.09.2006 
500 |a Date Revised 25.08.2006 
500 |a published: Print 
500 |a Citation Status MEDLINE 
520 |a We provide evidence that nonlinear dimensionality reduction, clustering, and data set parameterization can be solved within one and the same framework. The main idea is to define a system of coordinates with an explicit metric that reflects the connectivity of a given data set and that is robust to noise. Our construction, which is based on a Markov random walk on the data, offers a general scheme of simultaneously reorganizing and subsampling graphs and arbitrarily shaped data sets in high dimensions using intrinsic geometry. We show that clustering in embedding spaces is equivalent to compressing operators. The objective of data partitioning and clustering is to coarse-grain the random walk on the data while at the same time preserving a diffusion operator for the intrinsic geometry or connectivity of the data set up to some accuracy. We show that the quantization distortion in diffusion space bounds the error of compression of the operator, thus giving a rigorous justification for k-means clustering in diffusion space and a precise measure of the performance of general clustering algorithms 
650 4 |a Journal Article 
700 1 |a Lee, Ann B  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 28(2006), 9 vom: 23. Sept., Seite 1393-403  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:28  |g year:2006  |g number:9  |g day:23  |g month:09  |g pages:1393-403 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 28  |j 2006  |e 9  |b 23  |c 09  |h 1393-403