All students of statistics encounter confidence intervals. A confidence interval tells you, roughly, the interval within which you can be, say, 95% confident that the true population value of some statistic lies. This is not the precise technical definition, but it is how people use the intervals. Confidence intervals attach to some statistic – say, the mean – calculated from a sample. You would use a confidence interval to communicate the degree of uncertainty about a numerical estimate based on a sample.
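As a quick illustration, here is a minimal sketch in Python of a 95% confidence interval for a mean, using the t distribution via scipy (the data values are invented for illustration):

```python
import numpy as np
from scipy import stats

# Made-up sample data for illustration
data = np.array([4.2, 5.1, 4.8, 5.6, 4.9, 5.3, 4.7, 5.0])
n = len(data)
mean = data.mean()
sem = stats.sem(data)  # standard error of the mean, s / sqrt(n)

# 95% CI for the mean: mean +/- t-critical * standard error
ci_low, ci_high = stats.t.interval(0.95, n - 1, loc=mean, scale=sem)
print(f"95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

Note that the standard error shrinks as n grows, so this interval narrows with larger samples.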
A prediction interval, by contrast, is about an individual data point, not a sample statistic. It expresses the degree of uncertainty around a specific prediction from a model – say, a linear regression. It is stated in the form “on average we can expect, say, 95% of our predicted values to fall in this interval.” A prediction interval will, naturally, be much wider than a confidence interval (which gets narrower and narrower as you take bigger samples).
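To see the contrast concretely, here is a sketch using statsmodels on invented data; summary_frame reports both the confidence interval for the mean response (mean_ci_*) and the wider prediction interval for an individual new observation (obs_ci_*):

```python
import numpy as np
import statsmodels.api as sm

# Invented data: y = 2 + 0.5x plus noise
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=x.size)

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

# Intervals for the prediction at x = 5; row is [intercept, x]
new_X = np.array([[1.0, 5.0]])
frame = res.get_prediction(new_X).summary_frame(alpha=0.05)

# mean_ci_*: confidence interval for the mean response (narrows with n)
# obs_ci_*:  prediction interval for a single new observation (stays wide)
print(frame[["mean_ci_lower", "mean_ci_upper", "obs_ci_lower", "obs_ci_upper"]])
```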
A tolerance interval, like a prediction interval, is also about a single data point. It differs from a prediction interval in that we add a second quantification of uncertainty. In a prediction interval, the statement “on average we can expect, say, 95% of our predicted values to fall in this interval” implies that half the time more than 95% of the predictions will fall in the interval, and half the time fewer than 95% will. A tolerance interval quantifies that first part of the statement – i.e. it says, for example, “90% of the time, 95% of the predictions will fall in the interval.” A tolerance interval in which that first uncertainty value is set to 50% is equivalent to a prediction interval.
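There is no single standard library routine for tolerance intervals, but for normally distributed data a common approach is Howe's (1969) approximation for the k factor; the sketch below assumes normal data and uses made-up values. Per the equivalence just noted, setting confidence=0.50 gives an interval close to a prediction interval:

```python
import numpy as np
from scipy import stats

def tolerance_interval(data, coverage=0.95, confidence=0.90):
    """Approximate two-sided tolerance interval for normal data (Howe, 1969)."""
    n = len(data)
    nu = n - 1
    z = stats.norm.ppf((1 + coverage) / 2)       # covers `coverage` of the population
    chi2 = stats.chi2.ppf(1 - confidence, nu)    # lower-tail chi-square quantile
    k = np.sqrt(nu * (1 + 1 / n) * z**2 / chi2)  # Howe's k factor
    m, s = np.mean(data), np.std(data, ddof=1)
    return m - k * s, m + k * s

# Made-up sample data for illustration
data = np.array([4.2, 5.1, 4.8, 5.6, 4.9, 5.3, 4.7, 5.0])
lo, hi = tolerance_interval(data, coverage=0.95, confidence=0.90)
print(f"90%/95% tolerance interval: ({lo:.2f}, {hi:.2f})")
```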
A tolerance interval is not to be confused with engineering tolerances, which are specification limits set by a designer or customer, not intervals calculated from sample data.
Tom Ryan’s comprehensive text Modern Engineering Statistics covers these intervals in some detail. Tom developed and taught a number of courses with us at the Institute; he passed away in December 2016.