Synthesizing statistical knowledge from incomplete mixed-mode data

The difficulties in analyzing and clustering (synthesizing) multivariate data of the mixed type (discrete and continuous) are largely due to: 1) nonuniform scaling in different coordinates, 2) the lack of order in nominal data, and 3) the lack of a suitable similarity measure. This paper presents a...


Bibliographic details
Published in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - 9(1987), 6 (1 June), pages 796-805
Main author:	Wong, A K (author)
Other authors:	Chiu, D K
Format:	Article
Language:	English
Published:	1987
Access to the parent work:	IEEE transactions on pattern analysis and machine intelligence
Subjects:	Journal Article


LEADER	01000naa a22002652 4500
001	NLM211016993
003	DE-627
005	20231224012723.0
007	tu
008	231224s1987 xx \|\|\|\|\| 00\| \|\|eng c
028	5	2	\|a pubmed24n0703.xml
035			\|a (DE-627)NLM211016993
035			\|a (NLM)21869441
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Wong, A K \|e verfasserin \|4 aut
245	1	0	\|a Synthesizing statistical knowledge from incomplete mixed-mode data
264		1	\|c 1987
336			\|a Text \|b txt \|2 rdacontent
337			\|a ohne Hilfsmittel zu benutzen \|b n \|2 rdamedia
338			\|a Band \|b nc \|2 rdacarrier
500			\|a Date Completed 02.10.2012
500			\|a Date Revised 18.03.2022
500			\|a published: Print
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a The difficulties in analyzing and clustering (synthesizing) multivariate data of the mixed type (discrete and continuous) are largely due to: 1) nonuniform scaling in different coordinates, 2) the lack of order in nominal data, and 3) the lack of a suitable similarity measure. This paper presents a new approach which bypasses these difficulties and can acquire statistical knowledge from incomplete mixed-mode data. The proposed method adopts an event-covering approach which covers a subset of statistically relevant outcomes in the outcome space of variable-pairs. And once the covered event patterns are acquired, subsequent analysis tasks such as probabilistic inference, cluster analysis, and detection of event patterns for each cluster based on the incomplete probability scheme can be performed. There are four phases in our method: 1) the discretization of the continuous components based on a maximum entropy criterion so that the data can be treated as n-tuples of discrete-valued features; 2) the estimation of the missing values using our newly developed inference procedure; 3) the initial formation of clusters by analyzing the nearest-neighbor distance on subsets of selected samples; and 4) the reclassification of the n-tuples into more reliable clusters based on the detected interdependence relationships. For performance evaluation, experiments have been conducted using both simulated and real life data
650		4	\|a Journal Article
700	1		\|a Chiu, D K \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on pattern analysis and machine intelligence \|d 1979 \|g 9(1987), 6 vom: 01. Juni, Seite 796-805 \|w (DE-627)NLM098212257 \|x 1939-3539 \|7 nnns
773	1	8	\|g volume:9 \|g year:1987 \|g number:6 \|g day:01 \|g month:06 \|g pages:796-805
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 9 \|j 1987 \|e 6 \|b 01 \|c 06 \|h 796-805