The chi-square test (or χ²-test) is a statistical test of the null hypothesis that the distribution of a discrete random variable coincides with a given distribution. It is one of the most popular goodness-of-fit tests.
For example, in a supermarket, the relative frequencies of purchases of 4 brands of tea were 0.1, 0.4, 0.2, and 0.3 over the last year; during the last week, the numbers of packets sold were 31, 41, 22, and 18 for the 4 brands, respectively. Has the preference changed, i.e. do the probabilities of purchase now differ from last year's average preferences, or are the deviations in the observed relative frequencies caused by chance alone?
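As a rough sketch of how the tea example could be checked, the snippet below computes Pearson's chi-square statistic, i.e. the sum over categories of (observed - expected)² / expected, and compares it with 7.815, the tabulated 5% critical value of the chi-square distribution with 3 degrees of freedom. The variable names and the 5% significance level are illustrative assumptions, not part of the original example.

```python
# Last year's relative purchase frequencies (the null-hypothesis distribution).
expected_probs = [0.1, 0.4, 0.2, 0.3]
# Packets sold last week for the same four brands.
observed = [31, 41, 22, 18]

n = sum(observed)                           # total packets sold last week
expected = [p * n for p in expected_probs]  # expected counts if nothing changed

# Pearson's chi-square statistic: sum of (observed - expected)^2 / expected.
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom for a fully specified distribution: categories minus one.
df = len(observed) - 1

# Tabulated 5% critical value for 3 degrees of freedom.
# The 5% level is an assumption made for this illustration.
CRITICAL_5_PERCENT_DF3 = 7.815

reject_null = chi2_stat > CRITICAL_5_PERCENT_DF3
print(f"chi-square = {chi2_stat:.2f}, df = {df}, reject H0 at 5%: {reject_null}")
```

With these numbers the statistic is about 42.6, far above the critical value, so under the stated assumptions the data would not be consistent with last year's preferences.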
Besides discrete variables, the chi-square test is often applied to problems involving continuous random variables. In this case, the values of the continuous variable are first transformed into a discrete variable with a finite number of values: e.g. the whole range of possible values is split into a finite number of intervals, and every such interval is treated as one discrete value (e.g. age groups “20…29”, “30…39”, etc.). The chi-square test is then applied to the new discrete variable.
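The discretization step might look like the following minimal sketch. The list of ages is invented purely for illustration, and the helper name age_group and the interval width of 10 years are assumptions; the resulting counts per interval are what the chi-square test would then be applied to.

```python
from collections import Counter

# Hypothetical ages, invented for illustration only.
ages = [23, 27, 31, 35, 38, 41, 29, 34, 22, 45]

def age_group(age, width=10):
    """Map an age to the label of its interval, e.g. 34 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

# Count how many observations fall into each interval; every interval
# plays the role of one value of the new discrete variable.
group_counts = Counter(age_group(a) for a in ages)

for label in sorted(group_counts):
    print(label, group_counts[label])
```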
For small samples, the classical chi-square test is not very accurate, because the chi-square distribution is only a large-sample approximation to the sampling distribution of the test statistic. In such cases, Monte Carlo simulation is a more reasonable approach. In many cases such a simulation can be carried out by creating an artificial sample with the given proportions of values and applying a resampling procedure to this sample. Besides the one-sample chi-square test, there are variants of the test for comparing the distributions of two or more samples. For these variants, a permutation version of the test is more accurate when at least one sample is small. See more on the use of resampling and permutation in short online courses and in the online book Resampling: The New Statistics.
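A Monte Carlo version of the one-sample test could be sketched as follows, reusing the tea data from the example above. Instead of building an explicit artificial sample, this sketch draws directly from the null proportions with random.choices, which should be equivalent to resampling with replacement from such a sample. The number of simulations, the seed, and the function names are arbitrary assumptions.

```python
import random

expected_probs = [0.1, 0.4, 0.2, 0.3]  # null-hypothesis distribution
observed = [31, 41, 22, 18]            # observed counts
n = sum(observed)

def chi2_statistic(counts, probs):
    """Pearson's chi-square statistic for counts against null probabilities."""
    total = sum(counts)
    return sum((c - p * total) ** 2 / (p * total) for c, p in zip(counts, probs))

def simulate_counts(rng, probs, size):
    """Draw `size` values from the null distribution and count each category."""
    draws = rng.choices(range(len(probs)), weights=probs, k=size)
    return [draws.count(i) for i in range(len(probs))]

rng = random.Random(12345)  # fixed seed so the sketch is reproducible
n_sim = 2000                # number of simulated samples (arbitrary choice)

observed_stat = chi2_statistic(observed, expected_probs)

# Count simulated samples whose statistic is at least as large as the observed one.
n_extreme = sum(
    chi2_statistic(simulate_counts(rng, expected_probs, n), expected_probs)
    >= observed_stat
    for _ in range(n_sim)
)

# Add-one correction keeps the estimated p-value strictly positive.
p_value = (n_extreme + 1) / (n_sim + 1)
print(f"observed statistic = {observed_stat:.2f}, Monte Carlo p-value = {p_value:.4f}")
```

The estimated p-value is the share of simulated samples that deviate from the null proportions at least as much as the observed one, so no reference to the chi-square distribution is needed.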
The chi-square test is based on the chi-square statistic.