Bootstrap techniques for error estimation

Bibliographic details
Published in:	IEEE Transactions on Pattern Analysis and Machine Intelligence. - 1979. - 9(1987), 5, 1 May, pages 628-33
Main author:	Jain, A K (author)
Other authors:	Dubes, R C, Chen, C C
Format:	Article
Language:	English
Published:	1987
Access to the parent work:	IEEE Transactions on Pattern Analysis and Machine Intelligence
Subjects:	Journal Article

Description
Summary:	The design of a pattern recognition system requires careful attention to error estimation. The error rate is the most important descriptor of a classifier's performance. The commonly used estimates of error rate are based on the holdout method, the resubstitution method, and the leave-one-out method. All suffer from either large bias or large variance, and their sampling distributions are not known. Bootstrapping refers to a class of procedures that resample given data by computer. It permits determining the statistical properties of an estimator when very little is known about the underlying distribution and no additional samples are available. Since its publication in the last decade, the bootstrap technique has been successfully applied to many statistical estimation and inference problems. However, it has not been exploited in the design of pattern recognition systems. We report results on the application of several bootstrap techniques in estimating the error rate of 1-NN and quadratic classifiers. Our experiments show that, in most cases, the confidence interval of a bootstrap estimator of classification error is smaller than that of the leave-one-out estimator. The errors of 1-NN, quadratic, and Fisher classifiers are estimated for several real data sets.
Notes:	Date completed 02.10.2012; date revised 12.11.2019; published: Print; citation status: PubMed-not-MEDLINE
ISSN:	1939-3539
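
Illustrative sketch (not part of the catalog record): the summary contrasts leave-one-out and bootstrap estimates of the error rate of a 1-NN classifier. The Python below is a minimal sketch of that comparison, under assumptions that do not come from the record: the data are synthetic two-class Gaussian points, and the bootstrap variant shown is the simple out-of-bag estimate (train on a resample drawn with replacement, test on the points left out), which may differ from the specific bootstrap techniques the authors evaluated. All function names are hypothetical.

```python
import random


def nn1_predict(train, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    nearest = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return nearest[1]


def error_rate(train, test):
    """Fraction of test points misclassified by a 1-NN rule built on train."""
    wrong = sum(1 for x, y in test if nn1_predict(train, x) != y)
    return wrong / len(test)


def leave_one_out_error(data):
    """Classify each point using all remaining points as the training set."""
    wrong = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        wrong += nn1_predict(rest, x) != y
    return wrong / len(data)


def bootstrap_oob_errors(data, n_boot=200, seed=0):
    """One out-of-bag error per bootstrap replicate (assumed variant, see lead-in)."""
    rng = random.Random(seed)
    n = len(data)
    errors = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        chosen = set(idx)
        train = [data[i] for i in idx]
        test = [data[i] for i in range(n) if i not in chosen]
        if test:  # skip the (very unlikely) replicate that leaves nothing out
            errors.append(error_rate(train, test))
    return errors


def make_toy_data(n_per_class=30, seed=1):
    """Synthetic two-class data: unit-variance Gaussians centred at (0, 0) and (2, 2)."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return data


data = make_toy_data()
loo = leave_one_out_error(data)
boot = bootstrap_oob_errors(data)
boot_mean = sum(boot) / len(boot)
boot_sd = (sum((e - boot_mean) ** 2 for e in boot) / (len(boot) - 1)) ** 0.5
print(f"leave-one-out error: {loo:.3f}")
print(f"bootstrap out-of-bag error: {boot_mean:.3f} (spread over replicates: {boot_sd:.3f})")
```

The spread printed for the bootstrap replicates is only a rough indication of variability on one data set. It is not the confidence-interval comparison reported in the summary, which would require repeating both estimators over many independently drawn samples.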