Classes are not Clusters : Improving Label-based Evaluation of Dimensionality Reduction

A common way to evaluate the reliability of dimensionality reduction (DR) embeddings is to quantify how well labeled classes form compact, mutually separated clusters in the embeddings. This approach is based on the assumption that the classes stay as clear clusters in the original high-dimensional...

Bibliographic Details
Published in: IEEE transactions on visualization and computer graphics. - 1996. - PP(2023) of 03 Nov.
Main author: Jeon, Hyeon (Author)
Other authors: Kuo, Yun-Hsin, Aupetit, Michael, Ma, Kwan-Liu, Seo, Jinwook
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on visualization and computer graphics
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM364140291
003 DE-627
005 20231226094820.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7 |a 10.1109/TVCG.2023.3327187  |2 doi 
028 5 2 |a pubmed24n1213.xml 
035 |a (DE-627)NLM364140291 
035 |a (NLM)37922177 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Jeon, Hyeon  |e verfasserin  |4 aut 
245 1 0 |a Classes are not Clusters  |b Improving Label-based Evaluation of Dimensionality Reduction 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 01.12.2023 
500 |a published: Print-Electronic 
500 |a Citation Status Publisher 
520 |a A common way to evaluate the reliability of dimensionality reduction (DR) embeddings is to quantify how well labeled classes form compact, mutually separated clusters in the embeddings. This approach is based on the assumption that the classes stay as clear clusters in the original high-dimensional space. However, in reality, this assumption can be violated; a single class can be fragmented into multiple separated clusters, and multiple classes can be merged into a single cluster. We thus cannot always assure the credibility of the evaluation using class labels. In this paper, we introduce two novel quality measures, Label-Trustworthiness and Label-Continuity (Label-T&C), advancing the process of DR evaluation based on class labels. Instead of assuming that classes are well-clustered in the original space, Label-T&C work by (1) estimating the extent to which classes form clusters in the original and embedded spaces and (2) evaluating the difference between the two. A quantitative evaluation showed that Label-T&C outperform widely used DR evaluation measures (e.g., Trustworthiness and Continuity, Kullback-Leibler divergence) in terms of the accuracy in assessing how well DR embeddings preserve the cluster structure, and are also scalable. Moreover, we present case studies demonstrating that Label-T&C can be successfully used for revealing the intrinsic characteristics of DR techniques and their hyperparameters. 
650 4 |a Journal Article 
700 1 |a Kuo, Yun-Hsin  |e verfasserin  |4 aut 
700 1 |a Aupetit, Michael  |e verfasserin  |4 aut 
700 1 |a Ma, Kwan-Liu  |e verfasserin  |4 aut 
700 1 |a Seo, Jinwook  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on visualization and computer graphics  |d 1996  |g PP(2023) vom: 03. Nov.  |w (DE-627)NLM098269445  |x 1941-0506  |7 nnns 
773 1 8 |g volume:PP  |g year:2023  |g day:03  |g month:11 
856 4 0 |u http://dx.doi.org/10.1109/TVCG.2023.3327187  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d PP  |j 2023  |b 03  |c 11