For the last three weeks I’ve been experiencing the real adventure of being a researcher. Before that, I was fairly confident in my data analysis skills. I kept developing my knowledge in this area, but that meant extending methods I already knew rather than making unexpected discoveries.

But then, three weeks ago, my understanding of the field was shaken right down to its foundations.

During the last semester I attended an interesting course on *Theory and Methods of Inference* as part of my first-year PhD studies. At the end of the course, we had to prepare a project. Each of us was given a scientific paper. We had to write a review of this paper and 3-4 other papers dealing with a similar subject, in order to get a good overview of the field. I got the following publication: Kabaila, Welsh and Abeysekera (2016) – “Model-Averaged Confidence Intervals”.

This homework seemed difficult to me at the beginning. The first reading was like hitting a wall with my head. I could hardly understand the main idea. However, by digging deeper into the subject and tracking down other papers from the references, I finally reached, and understood, the core of the issue. I then retraced my steps through the referenced articles back to the starting paper. The way back made (almost) everything clear.

Even once I understood it, the paper left me rather puzzled. The articles suggested that a large part of my data analyses might have been incorrect! Of course, this started a war inside me. I had to choose between staying in my old, comfortable, well-known zone of statistics (which could lead to incorrect inference) and crossing over to the other side and admitting that my former approach was incorrect. It was hard, but I had to choose the second option.

Let me end this overlong, philosophical introduction and focus on the details. The issue is the following:

The main goal of most data analysis problems is to determine an appropriate model that explains the observed values in a given data set. The structure of this model allows researchers to understand the data. Choosing this model is a very complex task known as *model selection*.

The most common practice in applied statistics is to select a data-driven model based on preliminary hypothesis tests or by minimizing an information criterion, most often the Akaike Information Criterion (AIC). The AIC is a numerical value used to rank competing models in terms of the information lost when approximating the unknowable truth. The user then draws inferences from the single top-ranked model.

The AIC is calculated as

$$\mathrm{AIC} = 2k - 2\ln(L)$$

where *L* is the maximized value of the likelihood function for a model and *k* is the number of fitted parameters. The model with the lowest AIC value should be chosen as the best approximating model.
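To make the ranking concrete, here is a minimal Python sketch that applies the standard definition of the AIC to two candidate Gaussian models. The data values and the two candidate models are made up for illustration and do not come from any of the papers mentioned here.

```python
import math

def gaussian_log_likelihood(data, mean, variance):
    """Log-likelihood of i.i.d. Gaussian data for a given mean and variance."""
    n = len(data)
    rss = sum((x - mean) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * variance) - rss / (2 * variance)

def aic(log_likelihood, k):
    """Standard definition: AIC = 2k - 2 ln(L), with ln(L) the maximized log-likelihood."""
    return 2 * k - 2 * log_likelihood

# Hypothetical sample, invented purely for this illustration.
data = [1.2, 0.8, 1.9, 1.4, 0.3, 1.1, 1.7, 0.9]
n = len(data)

# Model A: mean fixed at 0, only the variance is fitted (k = 1).
var_a = sum(x ** 2 for x in data) / n
loglik_a = gaussian_log_likelihood(data, 0.0, var_a)

# Model B: mean and variance are both fitted by maximum likelihood (k = 2).
mean_b = sum(data) / n
var_b = sum((x - mean_b) ** 2 for x in data) / n
loglik_b = gaussian_log_likelihood(data, mean_b, var_b)

aic_values = {
    "A: mean fixed at 0": aic(loglik_a, 1),
    "B: mean estimated": aic(loglik_b, 2),
}

# The model with the lowest AIC is ranked first.
best = min(aic_values, key=aic_values.get)
for name, value in sorted(aic_values.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {value:.2f}")
print("Selected model:", best)
```

Note that the extra parameter of model B is penalized through the `2 * k` term, so B is only preferred when its likelihood gain outweighs that penalty.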

Other methods for model selection, based on cross-validation or bootstrapping, exist, but they are computationally intensive. In all cases, however, a single model is selected. The selected model is then often used for inference or for constructing confidence intervals, as if it had been given to us *a priori* as the true model.

Breiman called this the “quiet scandal of statistics”. If the selected model is used to construct a confidence interval for a given parameter, this can lead to incorrect inference. The minimum coverage probability of the interval obtained by this naive method can be far below the nominal coverage probability, as shown by Kabaila and Giri.
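The effect can be seen in a small Monte Carlo experiment. The sketch below is not the computation of Kabaila and Giri; it is a toy example under my own assumptions: a regression with two strongly correlated predictors, a choice by AIC between the model with and without the second predictor, and a nominal 95% interval for the first coefficient built from whichever model was selected. The design, the coefficient values and all function names are hypothetical, and a normal quantile is used in place of the t quantile for simplicity.

```python
import math
import random
from statistics import NormalDist

def fit_one(x1, y):
    """OLS for y = b1*x1 + error (no intercept). Returns (b1, se_b1, rss)."""
    s11 = sum(a * a for a in x1)
    b1 = sum(a * yi for a, yi in zip(x1, y)) / s11
    rss = sum((yi - b1 * a) ** 2 for a, yi in zip(x1, y))
    sigma2 = rss / (len(y) - 1)
    return b1, math.sqrt(sigma2 / s11), rss

def fit_two(x1, x2, y):
    """OLS for y = b1*x1 + b2*x2 + error. Returns (b1, se_b1, rss)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * yi for a, yi in zip(x1, y))
    s2y = sum(b * yi for b, yi in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    rss = sum((yi - b1 * a - b2 * b) ** 2 for a, b, yi in zip(x1, x2, y))
    sigma2 = rss / (len(y) - 2)
    return b1, math.sqrt(sigma2 * s22 / det), rss

def gaussian_aic(rss, n, n_coef):
    """AIC = 2k - 2 ln(L) for a Gaussian linear model; k counts the error variance too."""
    loglik = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    return 2 * (n_coef + 1) - 2 * loglik

def simulate(n_rep=2000, n=100, rho=0.9, beta1=1.0, beta2=0.35, seed=1):
    """Estimate coverage of the nominal 95% interval for beta1 (all settings hypothetical)."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)
    # Fixed design with two strongly correlated predictors.
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x1]
    naive_hits = 0
    full_hits = 0
    for _ in range(n_rep):
        y = [beta1 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
        b_small, se_small, rss_small = fit_one(x1, y)
        b_full, se_full, rss_full = fit_two(x1, x2, y)
        # Naive procedure: pick the model with the lower AIC, then ignore that choice.
        if gaussian_aic(rss_full, n, 2) < gaussian_aic(rss_small, n, 1):
            b, se = b_full, se_full
        else:
            b, se = b_small, se_small
        naive_hits += int(abs(b - beta1) <= z * se)
        # Reference: always use the full model, with no selection step.
        full_hits += int(abs(b_full - beta1) <= z * se_full)
    return naive_hits / n_rep, full_hits / n_rep

naive_cov, full_cov = simulate()
print(f"Coverage after AIC selection: {naive_cov:.3f}")
print(f"Coverage of full model only:  {full_cov:.3f}")
```

With these settings the interval built after selection should cover the true coefficient clearly less often than the nominal 95%, while the interval from the full model stays close to it. The exact figures depend on the seed and on the hypothetical design chosen here.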

This naive method is commonly taught in data analysis courses. But a close look at the approach shows that we use the data twice: once for model selection and once for building the confidence intervals. The reported variability of the estimated parameters should therefore also include some variance due to model selection uncertainty.

These arguments convinced me that the naive approach could lead to incorrect inference.

But if not the naive approach, then what?

There is quite a lot of literature on multi-model inference, which avoids having to select a single model. I’m not going to introduce it in this brief article, but I think it is an important topic to learn. For further reading I recommend the article by Symonds and Moussalli, which is well written, free of difficult mathematical terminology, and understandable for non-statisticians too.

*Feature image taken from http://www.planetminecraft.com/*

22 August 2016 at 10:34

Hi Greg, thanks for this instructive post. To avoid using the data twice, would it be OK to divide it into two parts, one for the model extraction and another for the data analysis?


22 August 2016 at 16:12

Ciao Greg,

I also have a comment. In general, besides using methods like cross-validation etcetera, it should be “normal” to test the coverage properties of the resulting intervals on the model parameters. This can be done by bootstrapping techniques. Yes, still CPU intensive, but hell, we do have CPU nowadays! Otherwise Bayesian methods would largely remain in the closet, but they are now being used everywhere….

Best,

T.


1 November 2016 at 12:50

Dear all,

I’m very glad that statisticians are increasingly aware of that phenomenon.

Usually the first impression is like that of Greg Kotkowski (shock). That was my case too, and I decided to work on this topic. Below are my papers related to model selection uncertainty.

Best,

1. Nguefack-Tsague G. and Zucchini W. (2011). Post-model selection inference and model averaging. Pakistan Journal of Statistics and Operation Research Vol 7 N°Sp, pp.347-361

http://www.pjsor.com/index.php/pjsor/article/view/120111003

2. Zucchini W., Claeskens G. and Nguefack-Tsague G. (2011). Model Selection. International Encyclopedia of Statistical Science Part 13: pp. 830-833.

URL: http://www.springerlink.com/content/n13p3q0281322h22/

3. Nguefack-Tsague G., Zucchini W. and Fotso S. (2011). On correcting the effects of model selection on inference in linear regression. Syllabus Review (Sciences) 2(3), 2011 :122-140

4. Nguefack-Tsague G. (2013). An alternative derivation of some commons distributions functions: a post-model selection approach. International Journal of Applied Mathematics and Statistics; 42(12):138-147.

MathSciNet (American Mathematical Society) and Elsevier

http://www.ams.org/mathscinet-getitem?mr=MR3093313

5. Nguefack-Tsague G. (2013). Bayesian estimation of a multivariate mean under model uncertainty. International Journal of Mathematics and Statistics; 13(1):83-92.

MathSciNet (American Mathematical Society) and Elsevier

http://www.ams.org/mathscinet-getitem?mr=MR3021499

6. Nguefack-Tsague G. (2013). On bootstrap and post-model selection inference.

International Journal of Mathematics and Computation; 21(4):51-64.

MathSciNet (American Mathematical Society) and Elsevier

http://www.ams.org/mathscinet-getitem?mr=MR3062016

7. Nguefack-Tsague G. and Bulla I. (2014). A focused Bayesian information criterion.

Advances in Statistics; Volume 2014, Article ID 504325.

http://dx.doi.org/10.1155/2014/504325

8. Nguefack-Tsague G. (2014). On optimal weighting scheme in model averaging.

American Journal of Applied Mathematics and Statistics; 2(3):150-156.

http://dx.doi.org/10.12691/ajams-2-3-9

9. Nguefack-Tsague G. (2014). Estimation of a multivariate mean under model selection uncertainty. Pakistan Journal of Statistics and Operation Research; 10(1):131-145.

http://dx.doi.org/10.18187/pjsor.v10i1.449

10. Nguefack-Tsague G. and Zucchini W. (2016). A mixture-based Bayesian model averaging method. Open Journal of Statistics; 6:220-228.

http://dx.doi.org/10.4236/ojs.2016.62019

11. Nguefack-Tsague G. and Zucchini W. (2016). Effects of Bayesian model selection on frequentist performances: an alternative approach. Applied Mathematics; 7(10):1103-1115. http://dx.doi.org/10.4236/am.2016.710098

12. Nguefack-Tsague G., Zucchini W., and Fotso S. (2016). Frequentist model averaging and applications to Bernoulli trials. Open Journal of Statistics; 6(3):545-553.

http://dx.doi.org/10.4236/ojs.2016.63046


1 November 2016 at 14:23

Thank you for the helpful links.
