Synthesizing statistical knowledge from incomplete mixed-mode data

The difficulties in analyzing and clustering (synthesizing) multivariate data of the mixed type (discrete and continuous) are largely due to: 1) nonuniform scaling in different coordinates, 2) the lack of order in nominal data, and 3) the lack of a suitable similarity measure. This paper presents a...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 9(1987), no. 6 (1 June), pages 796-805
Main author: Wong, A K (author)
Other authors: Chiu, D K
Format: Article
Language: English
Published: 1987
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM211016993
003 DE-627
005 20231224012723.0
007 tu
008 231224s1987 xx ||||| 00| ||eng c
028 5 2 |a pubmed24n0703.xml 
035 |a (DE-627)NLM211016993 
035 |a (NLM)21869441 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Wong, A K  |e verfasserin  |4 aut 
245 1 0 |a Synthesizing statistical knowledge from incomplete mixed-mode data 
264 1 |c 1987 
336 |a Text  |b txt  |2 rdacontent 
337 |a ohne Hilfsmittel zu benutzen  |b n  |2 rdamedia 
338 |a Band  |b nc  |2 rdacarrier 
500 |a Date Completed 02.10.2012 
500 |a Date Revised 18.03.2022 
500 |a published: Print 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a The difficulties in analyzing and clustering (synthesizing) multivariate data of the mixed type (discrete and continuous) are largely due to: 1) nonuniform scaling in different coordinates, 2) the lack of order in nominal data, and 3) the lack of a suitable similarity measure. This paper presents a new approach which bypasses these difficulties and can acquire statistical knowledge from incomplete mixed-mode data. The proposed method adopts an event-covering approach which covers a subset of statistically relevant outcomes in the outcome space of variable-pairs. And once the covered event patterns are acquired, subsequent analysis tasks such as probabilistic inference, cluster analysis, and detection of event patterns for each cluster based on the incomplete probability scheme can be performed. There are four phases in our method: 1) the discretization of the continuous components based on a maximum entropy criterion so that the data can be treated as n-tuples of discrete-valued features; 2) the estimation of the missing values using our newly developed inference procedure; 3) the initial formation of clusters by analyzing the nearest-neighbor distance on subsets of selected samples; and 4) the reclassification of the n-tuples into more reliable clusters based on the detected interdependence relationships. For performance evaluation, experiments have been conducted using both simulated and real life data 
650 4 |a Journal Article 
700 1 |a Chiu, D K  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 9(1987), 6 vom: 01. Juni, Seite 796-805  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:9  |g year:1987  |g number:6  |g day:01  |g month:06  |g pages:796-805 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 9  |j 1987  |e 6  |b 01  |c 06  |h 796-805
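
Illustrative note on phase 1 of the abstract (field 520): the abstract says the continuous components are discretized "based on a maximum entropy criterion" but gives no procedure. One common reading, assumed here and not confirmed by this record, is equal-frequency binning: for a fixed number of intervals, the entropy of the discretized variable is largest when every interval holds the same share of the samples. The Python sketch below follows that assumption. The function names, the use of None for a missing value, and the sample data are all hypothetical and are not taken from the paper.

```python
import bisect
import math
from collections import Counter


def equal_frequency_cut_points(values, n_bins):
    """Return n_bins - 1 cut points that split the observed values into
    intervals holding as nearly equal numbers of samples as possible."""
    # Assumption: None marks a missing value and is ignored when fitting.
    observed = sorted(v for v in values if v is not None)
    if n_bins < 2 or len(observed) < n_bins:
        raise ValueError("need n_bins >= 2 and at least n_bins observed values")
    cuts = []
    for k in range(1, n_bins):
        idx = k * len(observed) // n_bins
        # Place the cut midway between two neighbouring sorted samples.
        cuts.append((observed[idx - 1] + observed[idx]) / 2)
    return cuts


def discretize(value, cuts):
    """Map a continuous value to an interval index; missing stays None."""
    if value is None:
        return None
    return bisect.bisect_right(cuts, value)


def entropy_bits(labels):
    """Shannon entropy (in bits) of the non-missing discrete labels."""
    counts = Counter(label for label in labels if label is not None)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical sample: one continuous feature with one missing entry.
sample = [0.1, 0.4, 0.5, None, 1.2, 3.0, 7.5, 8.1, 9.9]
cuts = equal_frequency_cut_points(sample, n_bins=4)
labels = [discretize(v, cuts) for v in sample]
```

With eight observed values and four intervals, each interval receives two samples, so entropy_bits(labels) reaches its upper bound of log2(4) = 2 bits. When the data contain many tied values the intervals cannot always be balanced exactly, and the entropy falls below that bound. The missing entry is left as None, since in the abstract's method its value is estimated in a separate phase.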