Abstract
Akaike's information criterion (AIC) is widely used to select the best model from a given candidate set of parameterized probabilistic models. In this paper, the sampling error of AIC is taken into account, and a set of good models is constructed rather than a single model being chosen. This set is called a confidence set of models: the probability that it fails to include the minimum E{AIC} model (the model with the smallest expected AIC) is smaller than the specified significance level. The result is given as a P-value for each model, from which the confidence set is immediately obtained. A variant of Gupta's subset selection procedure is devised, in which a standardized difference of AIC is calculated for every pair of models. The critical constants are computed by the Monte Carlo method, using the asymptotic normal approximation of AIC. The proposed method neither requires a full model nor assumes a hierarchical structure of models, and it has higher power than similar existing methods.
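The abstract gives no formulas, so the following is only a rough sketch of the kind of computation it describes, not the paper's exact procedure. It assumes that per-observation maximized log-likelihoods are available for every candidate model, that the covariance of the AIC values is estimated from their sample covariance, and that the null distribution is simulated with all expected AIC values equal. The function name aic_confidence_set and all of its parameters are hypothetical.

```python
import numpy as np

def aic_confidence_set(loglik, n_params, alpha=0.05, n_draws=10000, seed=0):
    """Sketch: P-values and a confidence set of models from pairwise
    standardized AIC differences (assumes at least two models).

    loglik   : (n, M) per-observation maximized log-likelihoods (assumed input)
    n_params : length-M number of free parameters per model
    """
    loglik = np.asarray(loglik, dtype=float)
    n, m = loglik.shape
    aic = -2.0 * loglik.sum(axis=0) + 2.0 * np.asarray(n_params, dtype=float)

    # Assumption: Cov(AIC) is approximated by 4 * n * sample covariance
    # of the per-observation log-likelihoods.
    cov = 4.0 * n * np.cov(loglik, rowvar=False)
    var = np.diag(cov)
    # Standard deviation of every pairwise difference AIC_i - AIC_j.
    sd = np.sqrt(np.maximum(var[:, None] + var[None, :] - 2.0 * cov, 0.0))
    sd_safe = np.where(sd > 0, sd, np.inf)  # zero-variance pairs contribute 0
    off_diag = ~np.eye(m, dtype=bool)

    def max_std_diff(a):
        # For each model i: max over j != i of (a_i - a_j) / sd_ij.
        z = (a[..., :, None] - a[..., None, :]) / sd_safe
        return np.where(off_diag, z, -np.inf).max(axis=-1)

    t_obs = max_std_diff(aic)

    # Monte Carlo under the normal approximation with equal expected AIC
    # (an assumed null configuration for this sketch).
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(np.zeros(m), cov, size=n_draws)
    t_sim = max_std_diff(draws)

    p_values = (t_sim >= t_obs).mean(axis=0)
    return aic, p_values, np.flatnonzero(p_values >= alpha)
```

A model is kept in the set when its P-value is at least alpha, so the set follows directly from the P-values. Nearly nested pairs can have a difference with almost zero estimated variance, which makes the standardized difference unstable; this sketch does not treat that case specially.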
References
Aitkin, M. A. (1974). Simultaneous inference and the choice of variable subsets in multiple regression, Technometrics, 16, 221–227.
Akaike, H. (1974). A new look at the statistical model identification, IEEE Trans. Automat. Control, 19, 716–723.
Akaike, H. (1979). A Bayesian extension of the minimum AIC procedure of autoregressive model fitting, Biometrika, 66, 237–242.
Arvesen, J. N. and McCabe, G. P., Jr. (1975). Subset selection problems for variances with applications to regression analysis, J. Amer. Statist. Assoc., 70, 166–170.
Atkinson, A. C. (1970). A method for discriminating between models, J. Roy. Statist. Soc. Ser. B, 32, 323–353.
Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics, Wiley, New York.
Bozdogan, H. (1987). Model selection and Akaike's information criterion (AIC): the general theory and its analytical extensions, Psychometrika, 52, 345–370.
Cox, D. R. (1962). Further results on tests of separate families of hypotheses, J. Roy. Statist. Soc. Ser. B, 24, 406–424.
Dastoor, N. K. and McAleer, M. (1989). Some power comparisons of joint and paired tests for nonnested models under local hypotheses, Econometric Theory, 5, 83–94.
Draper, N. and Smith, H. (1981). Applied Regression Analysis (2nd ed.), Wiley, New York.
Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap, Chapman & Hall, New York.
Efron, B., Halloran, E. and Holmes, S. (1996). Bootstrap confidence levels for phylogenetic trees, Proc. Nat. Acad. Sci. U.S.A., 93, 13429–13434.
Felsenstein, J. (1985). Confidence limits on phylogenies: an approach using the bootstrap, Evolution, 39, 783–791.
Felsenstein, J. and Kishino, H. (1993). Is there something wrong with the bootstrap on phylogenies? A reply to Hillis and Bull, Systematic Biology, 42, 193–200.
Gupta, S. S. and Huang, D. Y. (1976). Selection procedures for the means and variances of normal populations: unequal sample sizes case, Sankhyā Ser. B, 38, 112–128.
Gupta, S. S. and Panchapakesan, S. (1979). Multiple Decision Procedures, Wiley, New York.
Hochberg, Y. and Tamhane, A. C. (1987). Multiple Comparison Procedures, Wiley, New York.
Linhart, H. (1988). A test whether two AIC's differ significantly, South African Statist. J., 22, 153–161.
Mallows, C. L. (1973). Some comments on C_p, Technometrics, 15, 661–675.
Shimodaira, H. (1993). A model search technique based on confidence set and map of models, Proc. Inst. Statist. Math., 41, 131–147 (in Japanese).
Shimodaira, H. (1997a). Assessing the error probability of the model selection test, Ann. Inst. Statist. Math., 49, 395–410.
Shimodaira, H. (1997b). A graphical technique for finding a set of good models using AIC and its variance, Ann. Inst. Statist. Math. (submitted).
Spjøtvoll, E. (1972). Multiple comparison of regression functions, Ann. Math. Statist., 43, 1076–1088.
Spjøtvoll, E. (1977). Alternatives to plotting C_p in multiple regression, Biometrika, 64, 1–8.
Vuong, Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses, Econometrica, 57, 307–333.
White, H. (1982). Maximum likelihood estimation of misspecified models, Econometrica, 50, 1–25.
About this article
Cite this article
Shimodaira, H. An Application of Multiple Comparison Techniques to Model Selection. Annals of the Institute of Statistical Mathematics 50, 1–13 (1998). https://doi.org/10.1023/A:1003483128844
DOI: https://doi.org/10.1023/A:1003483128844