Improving machine learning performance by removing redundant cases in medical data sets

Neural network models and other machine learning methods have successfully been applied to several medical classification problems. These models can be periodically refined and retrained as new cases become available. Since training neural networks by backpropagation is time consuming, it is desirab...
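
The experiment summarized in the abstract (randomly removing training cases, then comparing areas under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc` and `remove_random_cases`, the seed, and the placeholder case IDs are assumptions made for illustration, and the neural network training and the statistical comparison of AUC differences are not reproduced here.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def remove_random_cases(cases, fraction, seed=0):
    """Return a random subset of `cases` with `fraction` of them removed."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    keep = len(cases) - round(fraction * len(cases))
    return rng.sample(cases, keep)

# Placeholder case IDs standing in for the 700 training cases (not real data).
training_set = list(range(700))
reduced = remove_random_cases(training_set, 0.86)
```

In such a setup, a model trained on `reduced` and one trained on `training_set` would each be scored on the held-out sets, and the two `auc` values compared with an appropriate statistical test.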

Bibliographic details
Published in: Proceedings. AMIA Symposium. - 1998. - (1998), day 13, pages 523-7
First author: Ohno-Machado, L (author)
Other authors: Fraser, H S; Ohrn, A
Format: Article
Language: English
Published: 1998
Access to the parent work: Proceedings. AMIA Symposium
Subjects: Journal Article; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, P.H.S.
LEADER 01000naa a22002652 4500
001 NLM098643924
003 DE-627
005 20231222112457.0
007 tu
008 231222s1998 xx ||||| 00| ||eng c
028 5 2 |a pubmed24n0329.xml 
035 |a (DE-627)NLM098643924 
035 |a (NLM)9929274 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Ohno-Machado, L  |e verfasserin  |4 aut 
245 1 0 |a Improving machine learning performance by removing redundant cases in medical data sets 
264 1 |c 1998 
336 |a Text  |b txt  |2 rdacontent 
337 |a ohne Hilfsmittel zu benutzen  |b n  |2 rdamedia 
338 |a Band  |b nc  |2 rdacarrier 
500 |a Date Completed 16.03.1999 
500 |a Date Revised 10.12.2019 
500 |a published: Print 
500 |a Citation Status MEDLINE 
520 |a Neural network models and other machine learning methods have successfully been applied to several medical classification problems. These models can be periodically refined and retrained as new cases become available. Since training neural networks by backpropagation is time consuming, it is desirable that a minimum number of representative cases be kept in the training set (i.e., redundant cases should be removed). The removal of redundant cases should be carefully monitored so that classification performance is not significantly affected. We made experiments on data removal on a data set of 700 patients suspected of having myocardial infarction and show that there is no statistical difference in classification performance (measured by the differences in areas under the ROC curve on two previously unknown sets of 553 and 500 cases) when as many as 86% of the cases are randomly removed. A proportional reduction in the amount of time required to train the neural network model is achieved 
650 4 |a Journal Article 
650 4 |a Research Support, Non-U.S. Gov't 
650 4 |a Research Support, U.S. Gov't, P.H.S. 
700 1 |a Fraser, H S  |e verfasserin  |4 aut 
700 1 |a Ohrn, A  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t Proceedings. AMIA Symposium  |d 1998  |g (1998) vom: 13., Seite 523-7  |w (DE-627)NLM098642928  |x 1531-605X  |7 nnns 
773 1 8 |g year:1998  |g day:13  |g pages:523-7 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |j 1998  |b 13  |h 523-7