Cross-validation is a general, computationally intensive approach to estimating the accuracy of statistical models. The idea is to split the data into N subsets, set one subset aside, estimate the parameters of the model from the remaining N-1 subsets, and use the held-out subset to estimate the error of the model. This process is repeated N times, with each of the N subsets used once as the validation set. The errors obtained in these N steps are then combined, typically by averaging, to give the final estimate of the model error.
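The procedure can be illustrated with a minimal sketch. It assumes a least-squares straight line as the model and mean squared error as the error measure; the text prescribes neither, and any model and error measure could be substituted. The function names (`split_into_folds`, `fit_line`, `cross_validate`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import random


def split_into_folds(n_samples, n_folds, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into n_folds subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def cross_validate(xs, ys, n_folds=5, seed=0):
    """Return the per-fold mean squared errors and their average."""
    folds = split_into_folds(len(xs), n_folds, seed)
    fold_errors = []
    for held_out in folds:
        # Put one subset aside; train on the remaining N-1 subsets.
        held_out_set = set(held_out)
        train = [i for i in range(len(xs)) if i not in held_out_set]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Estimate the model error on the subset that was set aside.
        mse = sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out) / len(held_out)
        fold_errors.append(mse)
    # Combine the N error values into one estimate (here: a plain average).
    return fold_errors, sum(fold_errors) / len(fold_errors)


if __name__ == "__main__":
    # Synthetic data (assumed for illustration): a noisy straight line.
    rng = random.Random(42)
    xs = [float(i) for i in range(50)]
    ys = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
    errors, cv_error = cross_validate(xs, ys, n_folds=5)
    print("per-fold MSE:", [round(e, 3) for e in errors])
    print("cross-validated MSE:", round(cv_error, 3))
```

The plain average weights every fold equally, which is reasonable when the folds are of (nearly) equal size; with very uneven folds, weighting each fold's error by its size may be preferable.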
Cross-validation is used in various classification and prediction procedures, such as regression analysis, discriminant analysis, neural networks, and classification and regression trees (CART).