The idea of cross-validation is to split the data into N subsets, set one subset aside, estimate the parameters of the model from the remaining N-1 subsets, and use the held-out subset to estimate the error of the model. This process is repeated N times, with each of the N subsets used once as the validation set. The error values obtained in these N steps are then combined (typically by averaging) to provide the final estimate of the model error.
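The procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (`n_fold_cross_validation`, `fit_line`, `mean_squared_error`), the straight-line model, the mean-squared-error measure, the shuffling before the split, and the synthetic data are all choices made here for the example. The per-fold errors are combined by a simple average.

```python
import random


def n_fold_cross_validation(xs, ys, n_folds, fit, error, seed=0):
    """Return the per-fold errors and their average for an N-fold split."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::n_folds] for i in range(n_folds)]

    fold_errors = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in indices if i not in held_out_set]
        # Estimate the model parameters from the remaining N-1 subsets.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        # Estimate the error on the subset that was set aside.
        fold_errors.append(
            error(model, [xs[i] for i in held_out], [ys[i] for i in held_out])
        )
    # Combine the N error values; averaging is assumed here.
    return fold_errors, sum(fold_errors) / len(fold_errors)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def mean_squared_error(model, xs, ys):
    """Average squared residual of the fitted line on the given points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data, made up for illustration: y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [float(i) for i in range(30)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

fold_errors, cv_error = n_fold_cross_validation(
    xs, ys, 5, fit_line, mean_squared_error
)
print("per-fold errors:", fold_errors)
print("cross-validated error estimate:", cv_error)
```

With N = 5 and 30 observations, each model is fitted on 24 points and evaluated on the other 6, and every observation is used for validation exactly once.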