Multiplicity issues arise in a number of contexts, but they generally boil down to the same thing: repeated looks at a data set in different ways, until something "statistically significant" emerges. See multiple comparisons for how to handle multiple pairwise testing in conjunction with ANOVA. In observational studies, problems arise when many different models are applied to the same data, particularly when a highly specific thesis-to-be-tested is not stated in advance. Stanley Young draws readers' attention to a study (Mostofsky et al.) in which an association is claimed between certain constituents of air pollution and ischemic stroke (American Journal of Epidemiology, December 27, 2012, letter to the editor). Young points out that, given the number of predictors and adjustors in the data set, 537 million models could be constructed. How many models were tried before a statistically significant association was found? Where multiple testing occurs and is properly disclosed, the false discovery rate (the expected proportion of false "significant" results among all results declared significant, under null models) can be used to control for inflated Type I error.
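One standard way to control the false discovery rate is the Benjamini-Hochberg procedure. The sketch below, with illustrative p-values not drawn from any study mentioned here, shows the basic step-up rule: sort the p-values, find the largest rank k whose p-value falls at or below (k/m)·q, and reject that many hypotheses.

```python
# Minimal sketch of the Benjamini-Hochberg FDR procedure.
# Function name and example p-values are illustrative, not from the article.

def benjamini_hochberg(p_values, q=0.05):
    """Return indices (into p_values) of hypotheses rejected at FDR level q."""
    m = len(p_values)
    # Sort p-values ascending, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k/m) * q.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= (rank / m) * q:
            cutoff = rank
    # Reject the hypotheses with the cutoff smallest p-values.
    return sorted(order[:cutoff])

# Eight hypothetical tests: only the two smallest p-values survive at q = 0.05.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
print(benjamini_hochberg(pvals, q=0.05))  # → [0, 1]
```

Note that a naive per-test cutoff of 0.05 would declare five of these eight results "significant"; the FDR correction keeps only the two whose p-values are small relative to their rank.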
The Institute for Statistics Education offers an extensive glossary of statistical terms, available to all for reference and research. To build your statistical knowledge, sign up here and we will deliver a statistical term to your inbox every week.
Rather not have more email? Bookmark our "Stats Word of the Week" page.