Best-subset model selection based on multitudinal assessments of likelihood improvements

© 2019 Informa UK Limited, trading as Taylor & Francis Group.

Bibliographic Details
Published in: Journal of applied statistics. - 1991. - 47(2020), 13-15 from: 01., pages 2384-2420
Main author: Carter, Knute D (Author)
Other authors: Cavanaugh, Joseph E
Format: Online article
Language: English
Published: 2020
Access to the parent work: Journal of applied statistics
Subjects: Journal Article; 62F07; Akaike information criterion; Bayesian information criterion; likelihood ratio; linear models; regression; variable selection
Description
Summary:
A common model selection approach is to select the best model, according to some criterion, from among the collection of models defined by all possible subsets of the explanatory variables. Identifying an optimal subset has proven to be a challenging problem, both statistically and computationally. Our model selection procedure allows the researcher to nominate, a priori, the probability at which models containing false or spurious variables will be selected from among all possible subsets. The procedure determines whether inclusion of each candidate variable results in a sufficiently improved fitting term - and is hence named the SIFT procedure. Two variants are proposed: a naive method based on a set of restrictive assumptions and an empirical permutation-based method. Properties of these methods are investigated within the standard linear modeling framework and performance is evaluated against other model selection techniques. The SIFT procedure behaves as designed - asymptotically selecting variables that characterize the underlying data generating mechanism, while limiting selection of spurious variables to the desired level. The SIFT methodology offers researchers a promising new approach to model selection, providing the ability to control the probability of selecting a model that includes spurious variables to a level based on the context of the application.
Description: Date Revised 26.08.2024
published: Electronic-eCollection
Citation Status PubMed-not-MEDLINE
ISSN: 0266-4763
DOI: 10.1080/02664763.2019.1645097
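
Illustrative note: the summary mentions an empirical permutation-based variant but this record does not specify it. The sketch below is therefore not the SIFT procedure itself. It only shows, under stated assumptions, one generic way a permutation scheme can calibrate how large a likelihood improvement must be before a candidate variable counts as non-spurious at a nominated level alpha. It assumes a Gaussian linear model with an intercept and a single candidate variable, and the names fit_improvement and permutation_threshold are hypothetical.

```python
import math
import random


def fit_improvement(x, y):
    """Drop in -2 * log-likelihood when x is added to an intercept-only
    Gaussian linear model: n * log(RSS0 / RSS1).
    Assumes x is not constant and the fit is not exact (RSS1 > 0)."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    rss0 = sum((yi - y_bar) ** 2 for yi in y)  # intercept-only model
    rss1 = rss0 - sxy ** 2 / sxx               # intercept + x
    return n * math.log(rss0 / rss1)


def permutation_threshold(x, y, alpha, n_perm=1000, seed=0):
    """Empirical (1 - alpha) quantile of the improvement statistic when
    x is shuffled, which breaks any real association between x and y."""
    rng = random.Random(seed)
    x_perm = list(x)
    stats = []
    for _ in range(n_perm):
        rng.shuffle(x_perm)
        stats.append(fit_improvement(x_perm, y))
    stats.sort()
    idx = max(0, min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1))
    return stats[idx]


# Simulated data (an assumption for illustration): y depends on x_true.
demo_rng = random.Random(1)
x_true = [demo_rng.gauss(0.0, 1.0) for _ in range(100)]
y = [2.0 * xi + demo_rng.gauss(0.0, 1.0) for xi in x_true]

threshold = permutation_threshold(x_true, y, alpha=0.05)
improvement_true = fit_improvement(x_true, y)
print(improvement_true > threshold)
```

In this sketch a variable would be retained only if its observed improvement exceeds the permutation threshold, so alpha plays the role of the nominated probability of admitting a spurious variable. Extending the idea to all possible subsets, as the summary describes, would need the details given in the article.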