Recursive Nearest Agglomeration (ReNA) : Fast Clustering for Approximation of Structured Signals

In this work, we revisit fast dimension reduction approaches, as with random projections and random sampling. Our goal is to summarize the data to decrease computational costs and memory footprint of subsequent analysis. Such dimension reduction can be very efficient when the signals of interest hav...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on pattern analysis and machine intelligence. - 1979. - 41(2019), 3 vom: 29. März, Seite 669-681
1. Verfasser: Hoyos-Idrobo, Andres (VerfasserIn)
Weitere Verfasser: Varoquaux, Gael, Kahn, Jonas, Thirion, Bertrand
Format: Online-Aufsatz
Sprache:English
Veröffentlicht: 2019
Zugriff auf das übergeordnete Werk:IEEE transactions on pattern analysis and machine intelligence
Schlagworte:Journal Article
LEADER 01000naa a22002652 4500
001 NLM286364042
003 DE-627
005 20231225051533.0
007 cr uuu---uuuuu
008 231225s2019 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2018.2815524  |2 doi 
028 5 2 |a pubmed24n0954.xml 
035 |a (DE-627)NLM286364042 
035 |a (NLM)29993861 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Hoyos-Idrobo, Andres  |e verfasserin  |4 aut 
245 1 0 |a Recursive Nearest Agglomeration (ReNA)  |b Fast Clustering for Approximation of Structured Signals 
264 1 |c 2019 
336 |a Text  |b txt  |2 rdacontent 
337 |a ƒaComputermedien  |b c  |2 rdamedia 
338 |a ƒa Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 20.11.2019 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a In this work, we revisit fast dimension reduction approaches, as with random projections and random sampling. Our goal is to summarize the data to decrease computational costs and memory footprint of subsequent analysis. Such dimension reduction can be very efficient when the signals of interest have a strong structure, such as with images. We focus on this setting and investigate feature clustering schemes for data reductions that capture this structure. An impediment to fast dimension reduction is then that good clustering comes with large algorithmic costs. We address it by contributing a linear-time agglomerative clustering scheme, Recursive Nearest Agglomeration (ReNA). Unlike existing fast agglomerative schemes, it avoids the creation of giant clusters. We empirically validate that it approximates the data as well as traditional variance-minimizing clustering schemes that have a quadratic complexity. In addition, we analyze signal approximation with feature clustering and show that it can remove noise, improving subsequent analysis steps. As a consequence, data reduction by clustering features with ReNA yields very fast and accurate models, enabling to process large datasets on budget. Our theoretical analysis is backed by extensive experiments on publicly-available data that illustrate the computation efficiency and the denoising properties of the resulting dimension reduction scheme 
650 4 |a Journal Article 
700 1 |a Varoquaux, Gael  |e verfasserin  |4 aut 
700 1 |a Kahn, Jonas  |e verfasserin  |4 aut 
700 1 |a Thirion, Bertrand  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 41(2019), 3 vom: 29. März, Seite 669-681  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:41  |g year:2019  |g number:3  |g day:29  |g month:03  |g pages:669-681 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2018.2815524  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 41  |j 2019  |e 3  |b 29  |c 03  |h 669-681