Hypothesis testing (also called “significance testing”) is a statistical procedure for discriminating between two statistical hypotheses: the null hypothesis (H_{0}) and the alternative hypothesis (H_{a}, often denoted H_{1}). In a formal logic sense, hypothesis testing rests on presuming that the null hypothesis is valid. That is, the null hypothesis is rejected only if the data at hand testify strongly enough against it.

The philosophical basis for hypothesis testing lies in the fact that random variation pervades all aspects of life, and in the desire to avoid being fooled by what might be chance variation. The alternative hypothesis typically describes some change or effect that you expect or hope to see confirmed by data: for example, that new drug A works better than standard drug B, or that the accuracy of a new weapon targeting system is better than historical standards. The null hypothesis embodies the presumption that nothing has changed, or that there is no difference.

Hypothesis testing comes into play if the observed data do, in fact, suggest that the alternative hypothesis is true (the new drug produces better survival times than the old one in an experiment, for example). We ask the question “is it possible that chance variation might have produced this result?”

As noted, the null hypothesis stands (“is not rejected”) unless the data at hand provide strong enough evidence against it. “Strong enough” means that the probability of obtaining a result at least as extreme as the observed result, given that the null hypothesis is true, is small enough (usually < 0.05). This probability is called the p-value.
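
The decision rule above can be illustrated with a small exact permutation test, one of several ways to compute the probability in question. The sketch below is only an illustration under stated assumptions: the survival times are made-up numbers, the names `permutation_p_value`, `new_drug`, `old_drug` and `ALPHA` are hypothetical, and the test is one-sided (it asks only whether the new drug does better). Under the null hypothesis the group labels are interchangeable, so the code enumerates every way of splitting the pooled data into two groups and counts how often chance alone yields a difference in means at least as large as the one observed.

```python
from itertools import combinations


def permutation_p_value(treatment, control):
    """One-sided exact permutation p-value for a difference in means.

    Returns the fraction of all relabelings of the pooled data whose
    difference in means (treatment minus control) is at least as large
    as the observed difference.
    """
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    n_ctrl = len(control)
    total = sum(pooled)
    observed = sum(treatment) / n_treat - sum(control) / n_ctrl

    extreme = 0
    splits = 0
    # Enumerate every choice of which pooled values get the "treatment" label.
    for idx in combinations(range(len(pooled)), n_treat):
        s = sum(pooled[i] for i in idx)
        diff = s / n_treat - (total - s) / n_ctrl
        # Small tolerance guards against floating-point rounding in ties.
        if diff >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical survival times (arbitrary units), invented for illustration.
new_drug = [12, 15, 14, 16, 18]
old_drug = [10, 11, 9, 13, 12]

ALPHA = 0.05  # conventional threshold; an assumption, not a universal rule

p_value = permutation_p_value(new_drug, old_drug)
reject_null = p_value < ALPHA

print(f"p-value = {p_value:.4f}, reject null: {reject_null}")
```

With these invented numbers, only 3 of the 252 possible relabelings give a difference as large as the observed one, so the p-value is about 0.012 and the null hypothesis is rejected at the 0.05 level. Full enumeration is practical only for small samples; for larger ones, a random sample of relabelings is the usual substitute.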