Effects of data anonymization by cell suppression on descriptive statistics and predictive modeling performance

Protecting individual data in disclosed databases is essential. Data anonymization strategies can produce table ambiguation by suppression of selected cells. Using table ambiguation, different degrees of anonymization can be achieved, depending on the number of individuals that a particular case mus...


Bibliographic Details
Published in: Proceedings. AMIA Symposium. - 1998. - (2001), dated: 11., pages 503-7
Main author: Ohno-Machado, L (Author)
Other authors: Vinterbo, S A, Dreiseitl, S
Format: Article
Language: English
Published: 2001
Access to the parent work: Proceedings. AMIA Symposium
Subjects: Journal Article; Research Support, U.S. Gov't, P.H.S.
LEADER 01000naa a22002652 4500
001 NLM117123471
003 DE-627
005 20231222175831.0
007 tu
008 231222s2001 xx ||||| 00| ||eng c
028 5 2 |a pubmed24n0391.xml 
035 |a (DE-627)NLM117123471 
035 |a (NLM)11825239 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Ohno-Machado, L  |e verfasserin  |4 aut 
245 1 0 |a Effects of data anonymization by cell suppression on descriptive statistics and predictive modeling performance 
264 1 |c 2001 
336 |a Text  |b txt  |2 rdacontent 
337 |a ohne Hilfsmittel zu benutzen  |b n  |2 rdamedia 
338 |a Band  |b nc  |2 rdacarrier 
500 |a Date Completed 24.05.2002 
500 |a Date Revised 13.11.2018 
500 |a published: Print 
500 |a Citation Status MEDLINE 
520 |a Protecting individual data in disclosed databases is essential. Data anonymization strategies can produce table ambiguation by suppression of selected cells. Using table ambiguation, different degrees of anonymization can be achieved, depending on the number of individuals that a particular case must become indistinguishable from. This number defines the level of anonymization. Anonymization by cell suppression does not necessarily prevent inferences from being made from the disclosed data. Preventing inferences may be important to preserve confidentiality. We show that anonymized data sets can preserve descriptive characteristics of the data, but might also be used for making inferences on particular individuals, which is a feature that may not be desirable. The degradation of predictive performance is directly proportional to the degree of anonymity. As an example, we report the effect of anonymization on the predictive performance of a model constructed to estimate the probability of disease given clinical findings 
650 4 |a Journal Article 
650 4 |a Research Support, U.S. Gov't, P.H.S. 
700 1 |a Vinterbo, S A  |e verfasserin  |4 aut 
700 1 |a Dreiseitl, S  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t Proceedings. AMIA Symposium  |d 1998  |g (2001) vom: 11., Seite 503-7  |w (DE-627)NLM098642928  |x 1531-605X  |7 nnns 
773 1 8 |g year:2001  |g day:11  |g pages:503-7 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |j 2001  |b 11  |h 503-7